[feature request] Support for zstd compression #85

Open
Jacalz opened this issue Mar 21, 2023 · 6 comments
Labels: enhancement (New feature or request), Portal V2 (Changes/Features for Portal V2)

Comments

@Jacalz

Jacalz commented Mar 21, 2023

Is your feature request related to a problem? Please describe.

Gzip is relatively slow and doesn't compress that well. Zstd is a more modern algorithm with great performance and compression ratios.

I've wanted to add it to magic-wormhole, but the process of getting it standardised is a bit more cumbersome.

Describe the solution you'd like

Implementing Zstd compression using https://pkg.go.dev/github.com/klauspost/compress/zstd.

Describe alternatives you've considered

Are there any other algorithms that are faster and compress better? I doubt it but it is always worth it to benchmark and compare :)


@Jacalz Jacalz added the enhancement New feature or request label Mar 21, 2023
@ZinoKader
Member

Hello there, and thank you for your issue!

We are currently using pgzip (parallel gzip compression) by @klauspost, who is also the author of compress/zstd, which you linked. In his own comparison spreadsheet, which you can find in the pgzip and compress READMEs, the library we currently use (pgzip) is faster by a large margin.

I'm not entirely sure I am reading the results correctly, so please let me know if I'm not. If you find any alternative zstd libraries that are provably faster, I'm happy to add them as an option!

PS: How fun to meet another Swede! And your impressive application Rymdport has basically the same name as Portal in Swedish!

@klauspost

Another factor to consider is decompression speed. In that department zstd is significantly better than anything gzip/deflate-based.

But yeah, zstd is missing a "multithreaded" compression mode. Not sure when I'll get the time for that.

@Jacalz
Author

Jacalz commented Mar 22, 2023

@klauspost Thanks. That explains things. Which version of compress is the spreadsheet based on? I try to follow your project closely and I have noticed that there have been a lot of zstd assembly improvements over the last couple of months.

@klauspost

@Jacalz It was updated about a week ago, so it is current.

| file | algo | level | insize | outsize | millis | mb/s | Reduction | MB/s |
|---|---|---|---|---|---|---|---|---|
| 10gb.tar | pgzip | 1 | 10065157632 | 5168279238 | 2205 | 4352.94 | 48.65% | 4353.23 |
| 10gb.tar | s2 | 1 | 10065157632 | 5915541066 | 1026 | 9352.14 | 41.23% | 9355.64 |
| 10gb.tar | s2 | 2 | 10065157632 | 5453844650 | 1856 | 5169.32 | 45.81% | 5171.81 |
| 10gb.tar | s2 | 3 | 10065157632 | 5192490222 | 31721 | 302.6 | 48.41% | 302.60 |
| 10gb.tar | zskp | 1 | 10065157632 | 4907785561 | 22628 | 424.2 | 51.24% | 424.20 |
| 10gb.tar | zskp | 2 | 10065157632 | 4624627076 | 33002 | 290.85 | 54.05% | 290.86 |
| 10gb.tar | zskp | 3 | 10065157632 | 4426778811 | 62535 | 153.5 | 56.02% | 153.50 |
| 10gb.tar | zskp | 4 | 10065157632 | 4198925932 | 382364 | 25.1 | 58.28% | 25.10 |

Decompression speeds:

| Algo | MB/s |
|---|---|
| gzip | 361.11 |
| zstd | 1108.13 |
| s2 | 7790.85 |
| s2 - 1cpu | 1610.48 |
gzip decompression is a bottleneck.

For local network transfer I would personally go for S2. It is built for speed in both compression and decompression. But if the expected network speed is below 3 gigabits/s, zstd would make more sense.
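The trade-off between link speed and codec speed can be explored with a rough back-of-the-envelope model. The sketch below assumes that compression, transfer and decompression overlap perfectly, so the slowest stage sets the end-to-end throughput. It plugs in the level 1 figures from the tables above, assuming that `zskp` denotes the zstd implementation and that the decompression speeds apply to level 1 output. The type and function names are made up for illustration. This is not how the 3 gigabits/s figure was derived, and the crossover it yields will vary with hardware, compression level and data.

```go
package main

import (
	"fmt"
	"math"
)

// codec holds figures copied from the benchmark tables in this thread.
type codec struct {
	name          string
	compressMBs   float64 // compression speed, MB/s of original data
	decompressMBs float64 // decompression speed, MB/s
	inSize        float64 // bytes before compression
	outSize       float64 // bytes after compression
}

// Assumption: the decompression figures apply to level 1 output.
var (
	s2Level1   = codec{"s2 level 1", 9352.14, 7790.85, 10065157632, 5915541066}
	zskpLevel1 = codec{"zskp level 1", 424.2, 1108.13, 10065157632, 4907785561}
)

// gbitToMBs converts a link speed in gigabits/s to megabytes/s.
func gbitToMBs(gbit float64) float64 { return gbit * 1000 / 8 }

// effectiveMBs estimates end-to-end throughput in MB/s of original data.
// It assumes the three stages run concurrently, so the slowest one wins.
func effectiveMBs(c codec, linkMBs float64) float64 {
	// The link carries compressed bytes, so it moves original data
	// faster than its raw speed by a factor of inSize/outSize.
	wire := linkMBs * c.inSize / c.outSize
	return math.Min(c.compressMBs, math.Min(wire, c.decompressMBs))
}

func main() {
	for _, gbit := range []float64{0.1, 1, 10} {
		link := gbitToMBs(gbit)
		for _, c := range []codec{s2Level1, zskpLevel1} {
			fmt.Printf("%5.1f Gbit/s  %-12s  ~%.0f MB/s\n",
				gbit, c.name, effectiveMBs(c, link))
		}
	}
}
```

Under these assumptions the better compression ratio matters most on slow links, while on fast links the codec's own speed becomes the limit.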

@Jacalz
Author

Jacalz commented Mar 23, 2023

Thanks for the explanation and clear benchmark results. S2 sounds like a great option then.

@ZinoKader
Member

ZinoKader commented Mar 23, 2023

Thank you for your input Klaus!

S2 does look like a good overall option indeed. Furthermore, I have noticed that gzip spends a lot of time achieving nothing on incompressible or already compressed files, and I have read that S2 is great for that use case.
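The wasted effort on incompressible input is easy to see in terms of output size. The sketch below uses the standard library's `compress/gzip` rather than pgzip (an assumption made to keep it dependency-free) and compares a pseudo-random buffer, standing in for an already compressed file, with a highly repetitive one. The helper names are hypothetical.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"math/rand"
)

// randomBytes returns n pseudo-random bytes from a fixed seed. They stand
// in for an already compressed (incompressible) file.
func randomBytes(n int) []byte {
	b := make([]byte, n)
	rand.New(rand.NewSource(1)).Read(b)
	return b
}

// repetitiveBytes returns roughly n bytes of highly compressible data.
func repetitiveBytes(n int) []byte {
	return bytes.Repeat([]byte("portal "), n/7)
}

// gzipSize returns the number of bytes produced by gzip-compressing data
// at the default compression level.
func gzipSize(data []byte) (int, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return 0, err
	}
	// Close flushes the remaining data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	const n = 1 << 20 // 1 MiB of input for each case
	inputs := map[string][]byte{
		"random":     randomBytes(n),
		"repetitive": repetitiveBytes(n),
	}
	for name, data := range inputs {
		size, err := gzipSize(data)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-10s in=%d out=%d\n", name, len(data), size)
	}
}
```

The random buffer comes out no smaller than it went in, so every CPU cycle spent compressing it is overhead for the transfer.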

The best way forward is probably to add an option/flag for choosing between several of these compression algorithms, S2 included, and possibly to change the default in the next major version of Portal. That would require communicating the algorithm used and the algorithms supported by the sender, as well as the algorithms supported by the peer.
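One possible shape for that negotiation is sketched below: the sender lists algorithms in order of preference, the peer advertises what it supports, and the first match wins, with a fallback for older peers. The function name and the algorithm identifiers are hypothetical and not part of Portal's actual protocol.

```go
package main

import "fmt"

// negotiate returns the first algorithm in the sender's preference order
// that the peer also supports. If there is no overlap it returns fallback,
// for example the algorithm every existing version already understands.
func negotiate(senderPrefs, peerSupported []string, fallback string) string {
	supported := make(map[string]bool, len(peerSupported))
	for _, algo := range peerSupported {
		supported[algo] = true
	}
	for _, algo := range senderPrefs {
		if supported[algo] {
			return algo
		}
	}
	return fallback
}

func main() {
	// Hypothetical identifiers; a real protocol would define its own.
	senderPrefs := []string{"s2", "zstd", "gzip"}

	// A newer peer that supports zstd and gzip but not S2.
	fmt.Println(negotiate(senderPrefs, []string{"gzip", "zstd"}, "gzip"))

	// An older peer that advertises nothing falls back to gzip.
	fmt.Println(negotiate(senderPrefs, nil, "gzip"))
}
```

Keeping the current algorithm as the fallback would let new senders keep working with peers that predate the negotiation step.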
