Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize DepotManifest Serialization #1337

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

skylayer
Copy link

This PR builds on PR #1113, introducing a HashSet with ChunkIdComparer to manage unique chunks in DepotManifest serialization, significantly reducing processing time.

My comment

Changes:

  • Replaced List<byte[]> with HashSet<byte[]> for unique chunks.
  • Added ChunkIdComparer for efficient byte array comparison.

Impact:

  • Before: Serialization took up to 10 minutes.
  • After: Reduced to ~0.61 seconds.

This optimization streamlines serialization, especially for large manifests, enhancing performance without data integrity loss.

Replaced List<byte[]> with HashSet<byte[]> using a custom ChunkIdComparer for efficient chunk ID uniqueness checks
@xPaw
Copy link
Member

xPaw commented Feb 22, 2024

Why does it even need to check for duplicates?

@skylayer
Copy link
Author

I'm not entirely sure why the original author included the duplicate checks—probably had a good reason. I just worked on speeding up the process they set up. 🤷‍♂️

@skylayer
Copy link
Author

BTW, I'm thinking of making the Serialize() function in DepotManifest public. Right now, we've only got SaveToFile accessible, which kinda boxes us in. If Serialize() were public, we could directly handle more scenarios, like different formats or custom storage solutions.

@NicknineTheEagle
Copy link
Contributor

NicknineTheEagle commented Feb 22, 2024

Why does it even need to check for duplicates?

Because Valve's manifest format stores the number of unique chunks, I just followed it.

@NicknineTheEagle
Copy link
Contributor

BTW, I'm thinking of making the Serialize() function in DepotManifest public. Right now, we've only got SaveToFile accessible, which kinda boxes us in. If Serialize() were public, we could directly handle more scenarios, like different formats or custom storage solutions.

I checked the original PR and I don't know why I didn't make Serialize public, to be honest.

Serialize( fs ); // Directly pass the FileStream to the Serialize method
return true; // If serialization completes without throwing an exception, return true
}
catch ( Exception )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we expecting any exceptions, why the catch?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xPaw Thanks for pointing that out! While Serialize itself might be safe, I was thinking about scenarios where the provided filename could cause problems:

  • A user accidentally provides a file path with a non-existent directory (leading to a DirectoryNotFoundException).
  • The application might not have write permissions to the specified location (resulting in an UnauthorizedAccessException).
  • External factors like a full disk could also cause issues during file writing (IOException).

Do you think it would be helpful to handle these cases explicitly for better fault tolerance?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these should just throw.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants