Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement] Enhance performance of ob_wc_mb_utf8mb4 #1593

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

gangdeng-intel
Copy link
Contributor

@gangdeng-intel gangdeng-intel commented Oct 9, 2023

Task Description

Enhance performance of function 'ob_wc_mb_utf8mb4'.
When test Oceanbase with Sysbench OLTP_READ_ONLY, ob_wc_mb_utf8mb4 is one of hotspots. It uses ~3.4% CPU time. The line of 'switch (bytes)' uses 57% CPU time of whole function. switch is less efficient due to indirect jump involved.

image

Solution Description

Eliminating 'switch' by moving related code into 'if/else' can avoid indirect jump and improve performance accordingly. Using Sysbench OLTP_READ_ONLY test, Oceanbase performance can be improved ~2% and percentage of CPU time by ob_wc_mb_utf8mb4 can be reduced from 3.4% to 2.2%.

image

Passed Regressions

Test manually.

Upgrade Compatibility

N/A

Other Information

N/A

Release Note

N/A

@CLAassistant
Copy link

CLAassistant commented Oct 9, 2023

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ gangdeng-intel
❌ root


root seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@gangdeng-intel gangdeng-intel changed the title enhance performance of ob_wc_mb_utf8mb4 [Enhancement] Enhance performance of ob_wc_mb_utf8mb4 Oct 9, 2023
r[0] = (unsigned char) w_char;
}
else{
ret = OB_CS_TOOSMALLN(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 1702 and line 1703 should be together, like "} else {", minded that there is empty space around 'else'

case 4: r[3] = (unsigned char) (0x80 | (w_char & 0x3f)); w_char >>= 6; w_char |= 0x10000;
case 3: r[2] = (unsigned char) (0x80 | (w_char & 0x3f)); w_char >>= 6; w_char |= 0x800;
case 2: r[1] = (unsigned char) (0x80 | (w_char & 0x3f)); w_char >>= 6; w_char |= 0xc0;
case 1: r[0] = (unsigned char) w_char;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here, the original logic is when bytes = 4 , it will go through 4 case because there is no 'break', but seems ur modification have different logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does there any unittest cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for comment! Right, the logic is different. Will fix it and do more test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the bug in previous commit. Performance improvement is similar as previous submission, because most of characters in Sysbench dataset is ascii.

And I ran unittest of deps/oblib (deps/oblib/unittest/run_tests.sh). There are 2 failures:
18:test_charset
97:test_utility
I think the 2 failures are not caused by this pull request, as the failures are exist even without this PR. @hnwyllmm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants