Issue reported on discord regarding un-uploaded YPP videos #174
Comments
Let me first start by addressing this
Def. the creators have to do this on their own, but moderators can get in touch and help them. Moderator powers are probably too broad right now, see here
Sorry, I almost forgot to reply to this amid other work.
Is your concern whether, even without this specific DB exception, the side effects on Joystream happen in a fully atomic way? Yes, creating a video on Joystream is an atomic operation (if we sidestep this particular exception). The way it works is that right before sending the extrinsic, the service does a pre-commit changing the state of the video from Upon service initialization a process will run that will check all the videos in DB in Conceptually, this is how most databases internally make transactions atomic: the act of pre-committing is known as write-ahead logging (WAL) in PostgreSQL, and as transaction logs in many other databases. These transaction logs are then used to roll back or apply unfinished changes specified in the transaction (whenever the DB restarts) by looking at the state of already committed changes in the DB.
So I think handling this DB exception (not being able to commit the state to the DB), coupled with the fact that video creation is an atomic operation, will fully solve the problem.
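To make the pre-commit and startup-recovery idea above concrete, here is a minimal in-memory sketch. It is not the YT-synch implementation: the `FakeChain` class, the `create_on_chain` and `recover` helpers, and the state names `New` and `CreatingVideo` are all hypothetical (only `VideoCreated` is named in this issue), and a plain dict stands in for the DB.

```python
from dataclasses import dataclass, field

# Hypothetical state names; only "VideoCreated" appears in this issue.
NEW, CREATING, CREATED = "New", "CreatingVideo", "VideoCreated"


@dataclass
class FakeChain:
    """Stand-in for the chain: maps a YouTube video ID to an on-chain ID."""
    videos: dict = field(default_factory=dict)

    def create_video(self, yt_id: str) -> int:
        self.videos[yt_id] = len(self.videos) + 1
        return self.videos[yt_id]


def create_on_chain(db: dict, chain: FakeChain, yt_id: str,
                    crash_after_tx: bool = False) -> None:
    db[yt_id] = CREATING       # 1) pre-commit: record the intent first
    chain.create_video(yt_id)  # 2) send the extrinsic
    if crash_after_tx:
        return                 # simulate dying before the final commit
    db[yt_id] = CREATED        # 3) commit the final state


def recover(db: dict, chain: FakeChain) -> None:
    """On startup, resolve every video left in the pre-commit state."""
    for yt_id, state in db.items():
        if state == CREATING:
            # Roll forward if the chain already has the video, else roll back.
            db[yt_id] = CREATED if yt_id in chain.videos else NEW


db, chain = {"a": NEW, "b": NEW}, FakeChain()
create_on_chain(db, chain, "a", crash_after_tx=True)  # tx sent, commit lost
db["b"] = CREATING                                    # died before the tx
recover(db, chain)
print(db)  # {'a': 'VideoCreated', 'b': 'New'}
```

The point of the sketch is that recovery decides from the chain state, not from the DB alone, so a video whose final commit was lost is rolled forward instead of being created a second time.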
Yes, it's up to them, but if they use the Infrastructure-as-Code template provided in the YT-synch repo to bootstrap the database tables, the tables will be created with the For Gleev's instance of YPP, I switched to
Summary
An issue was reported: "Of the 109 videos uploaded through YPP, 49 did not upload". I started investigating, and it turned out that the failed uploads were the result of another (more serious) issue: some of the videos were duplicated on Joystream.

The duplicates happened because of two things: a database configuration issue, and a lack of checks in the YT-Synch BE to handle DB exceptions.
Explanation
Explanation of Database configuration issue
Before describing the DB configuration problem, here is some context on the DB architecture the YT-synch service uses. YT-synch uses DynamoDB (an AWS cloud-based DB) to persist the records for channels and videos and their state. DynamoDB is a fully managed solution, so as an application developer you don't have to manage any DB-related infra such as servers, disk space and memory specs, or update the infrastructure when the read/write load grows.
However, as an application developer, you still have to specify how much throughput you need, and DynamoDB will automatically scale up to that requirement. It provides two capacity mode options for this (one must be selected):
PROVISIONED: specify the number of reads and writes per second that you require for your application (default)
ON_DEMAND: pay-per-request, scales automatically as the read/write requests increase

Actual Problem
A YouTube channel with ID UC1p45mMUW1ivJ2dVN7Eo_KQ signed up for the YPP program with 48 videos on 24/03/23. At this time the DynamoDB capacity mode was set to PROVISIONED with the absolute minimum read/write capacity (1 read, 1 write). These were the default values specified in the template that was used to set up the production tables.

After the channel was added, the YT-synch service 1) started downloading the videos by querying the DB for URLs, and eventually 2) started creating these videos by sending on-chain transactions. However, by the time the service was done with step 2), the PROVISIONED capacity had already been exhausted, so no more reads/writes were possible. Because of this, the service failed to commit the state of the video it had just created to VideoCreated (hence the video was retried for on-chain creation, which led to duplicates). As the server logs show, the video state update operation failed with a ProvisionedThroughputExceededException error.

YT-Synch error handling
Despite the invalid capacity mode option, the YT-synch service should have handled the ProvisionedThroughputExceededException error gracefully, so I investigated why this was not handled/tested/caught in either fault tolerance testing or community load testing.

For infra load testing, when the community set up the YT-synch service, they used a local instance of DynamoDB (not the real AWS-based DB), and the local DynamoDB does not have this read/write capacity limitation. That's why in community testing one channel with 512 videos was added for syncing, and it got synced without any exceptions.
I looked at the fault tolerance QA plan for the reason for this discrepancy. Although it was pretty detailed and covered different failures of external APIs (e.g. RPC, QN, Storage Node & Google API), it did not have any test cases to mock/test Database API failures.
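As an illustration of the kind of DB fault test that was missing, here is a small sketch, assuming a fake store that throttles its first few writes and a retry helper with exponential backoff. `FlakyStore`, `put_with_retry`, and the locally defined exception class are hypothetical stand-ins, not code from the YT-synch repo or from the AWS SDK.

```python
import time


class ProvisionedThroughputExceededException(Exception):
    """Local stand-in for the DynamoDB error of the same name."""


class FlakyStore:
    """Fake DB whose first `failures` writes are throttled."""

    def __init__(self, failures: int):
        self.failures = failures
        self.state = {}

    def put(self, key: str, value: str) -> None:
        if self.failures > 0:
            self.failures -= 1
            raise ProvisionedThroughputExceededException(key)
        self.state[key] = value


def put_with_retry(store, key, value, attempts=5, base_delay=0.01,
                   sleep=time.sleep) -> int:
    """Retry throttled writes with exponential backoff.

    Returns the number of tries used; re-raises once attempts run out.
    """
    for attempt in range(attempts):
        try:
            store.put(key, value)
            return attempt + 1
        except ProvisionedThroughputExceededException:
            if attempt == attempts - 1:
                raise  # surface the failure instead of silently moving on
            sleep(base_delay * 2 ** attempt)


store = FlakyStore(failures=2)
tries = put_with_retry(store, "video-1", "VideoCreated", sleep=lambda s: None)
print(tries, store.state)  # 3 {'video-1': 'VideoCreated'}
```

Injecting `sleep` keeps the test fast, and the re-raise path is the important one: if the state commit ultimately fails, the caller needs to know, so that the video is not picked up again for on-chain creation.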
Also, we didn't hit this issue in the YT-synch dev setup, which was used for a considerable amount of time, because its capacity was configured to ON_DEMAND.
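For reference, the difference between the two setups can be sketched as the parameters of a DynamoDB `CreateTable` request. This is only an illustration: the `table_params` helper, the table name `videos`, and the key name `id` are assumptions, and the real tables are created by the repo's template, not by this code. Note that, as far as I know, the DynamoDB API spells the on-demand mode as `PAY_PER_REQUEST` in the `BillingMode` field.

```python
def table_params(name: str, hash_key: str, on_demand: bool,
                 rcu: int = 1, wcu: int = 1) -> dict:
    """Build CreateTable-style parameters for either capacity mode."""
    params = {
        "TableName": name,
        "AttributeDefinitions": [
            {"AttributeName": hash_key, "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": hash_key, "KeyType": "HASH"}],
    }
    if on_demand:
        # On-demand: no throughput numbers to get wrong.
        params["BillingMode"] = "PAY_PER_REQUEST"
    else:
        # Provisioned: capacity units must be sized for the expected load.
        params["BillingMode"] = "PROVISIONED"
        params["ProvisionedThroughput"] = {
            "ReadCapacityUnits": rcu,
            "WriteCapacityUnits": wcu,
        }
    return params


# Roughly the production setup described above: minimum provisioned capacity.
prod = table_params("videos", "id", on_demand=False)
# Roughly the dev setup: on-demand.
dev = table_params("videos", "id", on_demand=True)
print(prod["ProvisionedThroughput"], dev["BillingMode"])
```

A dict like this could be passed to a client call such as boto3's `create_table(**params)`; that step is left out here so the sketch runs without AWS access.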
Problem resolution
The problem eventually resolved itself. As some videos were successfully created, the number of videos whose state needed to be periodically queried and updated went down, so the reads/writes used were within the capacity limits, and the state of newly created videos was successfully committed to the DB.
State of the Affected Channel (UC1p45mMUW1ivJ2dVN7Eo_KQ)

This table shows the list of duplicate videos (~28) of the affected channel. The first column lists the YouTube video IDs, the second column shows the count of each video (duplicates), and the third column shows the Joystream video IDs of the duplicate videos (these duplicates were only created on chain; their assets couldn't get uploaded to storage nodes).

I am not sure what the best action is in this regard. Can the creator remove the video IDs that I mentioned, or can any moderator do that?
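For anyone who wants to rebuild a duplicate list like the one above, a minimal sketch follows. It assumes you already have (YouTube video ID, Joystream video ID) pairs for the channel from some source; the `find_duplicates` helper and the sample IDs are made up for illustration.

```python
from collections import defaultdict


def find_duplicates(pairs):
    """Group Joystream video IDs by YouTube video ID; keep only duplicates.

    Returns rows of (youtube_id, count, joystream_ids), matching the
    three columns described above.
    """
    by_yt = defaultdict(list)
    for yt_id, joy_id in pairs:
        by_yt[yt_id].append(joy_id)
    return [
        (yt_id, len(ids), sorted(ids))
        for yt_id, ids in sorted(by_yt.items())
        if len(ids) > 1
    ]


# Made-up sample data, not the affected channel's real IDs.
sample = [("ytA", 101), ("ytB", 102), ("ytA", 103), ("ytA", 104)]
rows = find_duplicates(sample)
print(rows)  # [('ytA', 3, [101, 103, 104])]
```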