[Enhancement] Enhance performance of ob_wc_mb_utf8mb4 #1593
base: master
    r[0] = (unsigned char) w_char;
  }
  else{
    ret = OB_CS_TOOSMALLN(1);
Lines 1702 and 1703 should be joined into one line, like "} else {". Note the spaces around 'else'.
    case 4: r[3] = (unsigned char) (0x80 | (w_char & 0x3f)); w_char >>= 6; w_char |= 0x10000;
    case 3: r[2] = (unsigned char) (0x80 | (w_char & 0x3f)); w_char >>= 6; w_char |= 0x800;
    case 2: r[1] = (unsigned char) (0x80 | (w_char & 0x3f)); w_char >>= 6; w_char |= 0xc0;
    case 1: r[0] = (unsigned char) w_char;
Here, the original logic is that when bytes = 4, execution falls through all 4 cases because there is no 'break', but your modification seems to have different logic.
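To make the fall-through point concrete, here is a minimal standalone sketch of the switch-based encoding. The function name `sketch_encode_fallthrough` and the `SKETCH_*` return codes are hypothetical stand-ins for illustration only, not the actual OceanBase symbols. With bytes = 4, every case from 4 down to 1 runs, each writing one byte and OR-ing in the marker bits that the next case needs.

```c
/* Hypothetical return codes for this sketch; not the real OB_CS_* macros. */
#define SKETCH_ILLEGAL_CHAR 0
#define SKETCH_TOO_SMALL(n) (-100 - (n))

/* Encodes w_char into r (bounded by end) using the fall-through switch. */
int sketch_encode_fallthrough(unsigned long w_char, unsigned char *r,
                              const unsigned char *end)
{
  int bytes;

  if (w_char < 0x80) {
    bytes = 1;
  } else if (w_char < 0x800) {
    bytes = 2;
  } else if (w_char < 0x10000) {
    bytes = 3;
  } else if (w_char < 0x200000) {
    bytes = 4;
  } else {
    return SKETCH_ILLEGAL_CHAR;
  }

  if (end - r < bytes) {
    return SKETCH_TOO_SMALL(bytes);
  }

  switch (bytes) {
    /* No 'break' on purpose: a 4-byte character runs all four cases. */
    case 4: r[3] = (unsigned char) (0x80 | (w_char & 0x3f)); w_char >>= 6; w_char |= 0x10000;
    /* fall through */
    case 3: r[2] = (unsigned char) (0x80 | (w_char & 0x3f)); w_char >>= 6; w_char |= 0x800;
    /* fall through */
    case 2: r[1] = (unsigned char) (0x80 | (w_char & 0x3f)); w_char >>= 6; w_char |= 0xc0;
    /* fall through */
    case 1: r[0] = (unsigned char) w_char;
  }
  return bytes;
}
```

Any replacement that handles each length in its own branch has to write all of the bytes for that length, not just the byte the matching case label used to write.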
Are there any unit test cases?
Thanks for the comment! Right, the logic is different. I will fix it and do more testing.
I fixed the bug in the previous commit. The performance improvement is similar to the previous submission, because most characters in the Sysbench dataset are ASCII.
I also ran the unit tests of deps/oblib (deps/oblib/unittest/run_tests.sh). There are 2 failures:
18:test_charset
97:test_utility
I do not think these 2 failures are caused by this pull request, as they occur even without this PR. @hnwyllmm
Task Description
Enhance the performance of the function 'ob_wc_mb_utf8mb4'.
When testing OceanBase with Sysbench OLTP_READ_ONLY, ob_wc_mb_utf8mb4 is one of the hotspots, using ~3.4% of CPU time. The 'switch (bytes)' line accounts for 57% of the CPU time of the whole function. The switch is less efficient because it involves an indirect jump.
Solution Description
Eliminating the 'switch' by moving the related code into 'if/else' branches avoids the indirect jump and improves performance accordingly. In the Sysbench OLTP_READ_ONLY test, OceanBase performance improves by ~2%, and the share of CPU time spent in ob_wc_mb_utf8mb4 drops from 3.4% to 2.2%.
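As a rough illustration of the approach, the sketch below encodes a code point with plain 'if/else' branches and no 'switch'. It is not the code in this PR: the function name `sketch_wc_mb_utf8mb4_ifelse` and the `SKETCH_*` return codes are hypothetical stand-ins. The ASCII branch comes first, on the assumption that most input is ASCII, as in the Sysbench dataset.

```c
/* Hypothetical return codes for this sketch; not the real OB_CS_* macros. */
#define SKETCH_ILLEGAL_CHAR 0
#define SKETCH_TOO_SMALL(n) (-100 - (n))

/* Encodes w_char into r (bounded by end); each length has its own branch. */
int sketch_wc_mb_utf8mb4_ifelse(unsigned long w_char, unsigned char *r,
                                const unsigned char *end)
{
  int ret;

  if (w_char < 0x80) {
    /* ASCII fast path: a single byte, no continuation bytes. */
    if (end - r >= 1) {
      r[0] = (unsigned char) w_char;
      ret = 1;
    } else {
      ret = SKETCH_TOO_SMALL(1);
    }
  } else if (w_char < 0x800) {
    if (end - r >= 2) {
      r[0] = (unsigned char) (0xc0 | (w_char >> 6));
      r[1] = (unsigned char) (0x80 | (w_char & 0x3f));
      ret = 2;
    } else {
      ret = SKETCH_TOO_SMALL(2);
    }
  } else if (w_char < 0x10000) {
    if (end - r >= 3) {
      r[0] = (unsigned char) (0xe0 | (w_char >> 12));
      r[1] = (unsigned char) (0x80 | ((w_char >> 6) & 0x3f));
      r[2] = (unsigned char) (0x80 | (w_char & 0x3f));
      ret = 3;
    } else {
      ret = SKETCH_TOO_SMALL(3);
    }
  } else if (w_char < 0x200000) {
    if (end - r >= 4) {
      r[0] = (unsigned char) (0xf0 | (w_char >> 18));
      r[1] = (unsigned char) (0x80 | ((w_char >> 12) & 0x3f));
      r[2] = (unsigned char) (0x80 | ((w_char >> 6) & 0x3f));
      r[3] = (unsigned char) (0x80 | (w_char & 0x3f));
      ret = 4;
    } else {
      ret = SKETCH_TOO_SMALL(4);
    }
  } else {
    ret = SKETCH_ILLEGAL_CHAR;
  }
  return ret;
}
```

Each branch writes every byte for its length, so the output should match the fall-through 'switch' version byte for byte. Whether this is actually faster depends on the compiler and the input mix, so the numbers above should be re-measured rather than inferred from the sketch.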
Passed Regressions
Tested manually.
Upgrade Compatibility
N/A
Other Information
N/A
Release Note
N/A