New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: Speed up Base58 encoding/decoding by 400%/200% via preliminary byte packing #29473
base: master
Are you sure you want to change the base?
refactor: Speed up Base58 encoding/decoding by 400%/200% via preliminary byte packing #29473
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
afbc980
to
321456b
Compare
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
397ae64
to
e168e62
Compare
cab8a6e
to
b930cd5
Compare
ba23707
to
c9b351d
Compare
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
c9b351d
to
83caaf7
Compare
83caaf7
to
73845ec
Compare
6b684c4
to
bc303f3
Compare
3c5435b
to
947ea98
Compare
947ea98
to
c2df10b
Compare
5789342
to
ffb81c9
Compare
@@ -11,6 +11,13 @@ | |||
["ecac89cad93923c02321", "EJDM8drfXA6uyA"], | |||
["10c8511e", "Rt5zm"], | |||
["00000000000000000000", "1111111111"], | |||
["00000000000000000000000000000000000000000000000000000000000000000000000000000000", "1111111111111111111111111111111111111111"], | |||
["00000000000000000000000000000000000000000000000000000000000000000000000000000001", "1111111111111111111111111111111111111112"], | |||
["0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ec39d04c37e71e5d591881f6", "111111111111111111111111111111111111111111111111111111111111111111111111111111111111115TYzLYH1udmLdzCLM"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fine to add tests, but I don't think this is used in a critical path, to warrant a speedup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the random test cases that failed during development to make sure they're covered deterministically.
Also added ones which are testing the boundaries of powers of 58, since those cases happen less frequently in the randomized tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I meant you can open a new pull request with the added tests, if you want.
7a3f66a
to
28c6717
Compare
Instead of collecting the values to and empty b58, we're cloning the input without leading zeros and dividing it in-place, while doing the quadratic division from the most significant, non-zero digit, forward. make && ./src/bench/bench_bitcoin --filter=Base58Encode --min-time=1000 Before: ``` | ns/byte | byte/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 59.34 | 16,851,250.88 | 0.2% | 1.10 | `Base58Encode` | 59.20 | 16,892,479.61 | 0.2% | 1.10 | `Base58Encode` | 58.97 | 16,958,226.76 | 0.2% | 1.10 | `Base58Encode` ``` After: ``` | ns/byte | byte/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 40.80 | 24,508,402.06 | 0.1% | 1.10 | `Base58Encode` | 40.83 | 24,489,762.49 | 0.2% | 1.10 | `Base58Encode` | 39.71 | 25,182,310.62 | 0.2% | 1.10 | `Base58Encode` ``` and make && ./src/bench/bench_bitcoin --filter=Base58CheckEncode --min-time=1000 Before: ``` | ns/byte | byte/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 79.93 | 12,511,576.75 | 0.4% | 1.10 | `Base58CheckEncode` | 79.49 | 12,580,136.21 | 0.1% | 1.10 | `Base58CheckEncode` | 79.65 | 12,554,785.16 | 0.1% | 1.10 | `Base58CheckEncode` ``` After: ``` | ns/byte | byte/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 55.27 | 18,094,561.35 | 0.1% | 1.10 | `Base58CheckEncode` | 55.41 | 18,046,778.32 | 0.1% | 1.10 | `Base58CheckEncode` | 55.32 | 18,075,763.16 | 0.1% | 1.10 | `Base58CheckEncode` ```
To mitigate the quadratic complexity inherent in the sequential byte division of the Base58 encoding process, this update aggregates characters into groups of 7, fitting them into 64-bit long integers. This refinement utilizes the inherent efficiency of native division and modulo operations on 64-bit architectures, significantly optimizing computational overhead. The benchmarks indicate a 4x speedup for `Base58CheckEncode` and `Base58Encode`: make && ./src/bench/bench_bitcoin --filter=Base58Encode --min-time=1000 After: ``` | ns/byte | byte/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 14.32 | 69,831,031.44 | 0.1% | 1.10 | `Base58Encode` | 14.32 | 69,811,995.18 | 0.1% | 1.10 | `Base58Encode` | 14.35 | 69,662,527.88 | 0.5% | 1.10 | `Base58Encode` ``` make && ./src/bench/bench_bitcoin --filter=Base58CheckEncode --min-time=1000 After: ``` | ns/byte | byte/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 19.15 | 52,231,968.11 | 0.1% | 1.06 | `Base58CheckEncode` | 19.13 | 52,269,345.54 | 0.4% | 1.05 | `Base58CheckEncode` | 19.14 | 52,244,117.61 | 0.3% | 1.06 | `Base58CheckEncode` ```
> make && ./src/bench/bench_bitcoin --filter=Base58Decode --min-time=1000 Before: ``` | ns/byte | byte/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 17.50 | 57,141,622.65 | 0.1% | 1.10 | `Base58Decode` | 17.42 | 57,392,223.96 | 0.0% | 1.10 | `Base58Decode` | 17.43 | 57,358,655.44 | 0.2% | 1.10 | `Base58Decode` ``` After: ``` | ns/byte | byte/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 16.37 | 61,093,842.90 | 0.2% | 1.10 | `Base58Decode` | 16.37 | 61,100,514.64 | 0.1% | 1.10 | `Base58Decode` | 16.38 | 61,046,275.93 | 0.1% | 1.10 | `Base58Decode` ```
> make && ./src/bench/bench_bitcoin --filter=Base58Decode --min-time=1000 After: ``` | ns/byte | byte/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 8.79 | 113,767,685.06 | 0.1% | 1.10 | `Base58Decode` | 8.78 | 113,831,528.33 | 0.0% | 1.10 | `Base58Decode` | 8.80 | 113,607,470.35 | 0.2% | 1.10 | `Base58Decode` ```
28c6717
to
ca21f67
Compare
Continuing the base conversion optimizations, started in TryParseHex and IsSpace and Bech32.
To mitigate the quadratic complexity inherent in the sequential byte division of the Base58 encoding process, this update aggregates characters into groups of 7 and 9, fitting them into 64-bit long integers.
This refinement utilizes the inherent efficiency of native division and modulo operations on 64-bit architectures, significantly optimizing computational overhead.
The benchmarks indicate a 4x speedup for
Base58CheckEncode
andBase58Encode
:Before:
After:
Before:
After:
The same was applied to decoding, resulting in a 2x speedup:
Before:
After:
See https://corecheck.dev/bitcoin/bitcoin/pulls/29458 for all benchmarks.