investigate SCHED0011 failure on odroid_xu4 #1130
Comments
The |
Moving the yield won't help this particular issue though, as the time is too long, not too short. Do we even know for sure that the clock used for scheduling and the clock used for timing are correct and synchronous enough to make valid measurements? This is again on the somewhat dodgy odroidxu4_1; can't we use odroidxu4_2 for CI instead? Looking at the raw log, I'm not convinced that something weird isn't going on:
The period is 100 ms, so the whole test should take about 1 second, but it takes more than 4 seconds instead. A successful run looks like:
|
FYI, I've seen this test fail when |
The printing happens after the measurement in each loop, so this won't affect it. |
I've definitely seen that flag reliably trigger this failure on another platform. If that's not supposed to happen I'll repro it again and file an issue about it. |
I also think some of the timer drivers in libplatsupport might introduce some rounding error when converting from/to timer ticks and nanoseconds because they use a pre-calculated number of nanoseconds per timer tick, which due to using integer arithmetic is truncated to a whole integer. I don't know if this is a problem on this platform or if it would be significant enough to cause any issues, but it seems to me this small error at the nanosecond scale might get large for longer time periods. |
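To make the truncation concern concrete, here is a minimal sketch, not taken from libplatsupport. The function names are hypothetical, and the 24 MHz frequency is only an illustrative assumption, not a claim about which clock the driver on this platform actually uses. It compares a pre-calculated, truncated nanoseconds-per-tick value against a multiply-then-divide conversion:

```c
#include <stdint.h>

#define NS_IN_S 1000000000ULL

/* Hypothetical driver style: nanoseconds per tick is pre-calculated with
 * integer division, so any fractional part is dropped
 * (for example 1e9 / 24e6 = 41.67 becomes 41). */
static uint64_t ticks_to_ns_truncated(uint64_t ticks, uint64_t freq_hz)
{
    uint64_t ns_per_tick = NS_IN_S / freq_hz;
    return ticks * ns_per_tick;
}

/* Multiply first, divide last: the only truncation is in the final result.
 * Note that ticks * NS_IN_S overflows 64 bits once ticks exceeds roughly
 * 1.8e10, so a real driver would need wider arithmetic or a split. */
static uint64_t ticks_to_ns_exact(uint64_t ticks, uint64_t freq_hz)
{
    return ticks * NS_IN_S / freq_hz;
}
```

Under these assumptions, 2,400,000 ticks at 24 MHz (100 ms) convert to 98,400,000 ns with the truncated factor and to 100,000,000 ns with the exact form, an error of 1.6 ms per period. At 1 MHz the factor is exactly 1000 and both forms agree.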
Ah, the failure I saw was in a different test, SCHED0021. |
The timestamp is added by the logging system and includes the loop time, so it just shows a ridiculously slow (buggy?) UART driver (2 ms per character). The quality of the timer drivers in libplatsupport is low, but the problem is mostly bad overflow handling (I have a local patch to fix it, but didn't get to it yet). The meson timer seems to assume a 1 MHz clock, so a rounding error is unlikely. According to the schematics, there's only a 24 MHz clock and a 32 kHz clock input, so it's unlikely that a different clock source is being used either. 3% overhead for "something" is ridiculous. |
This could be just a fluke, but I have not seen this failure before, so we should investigate. The scheduler accuracy test failed for config ODROID_XU4_debug_MCS_clang_32:
The corresponding chunk of code in the test is here:
The bit that fails is
test_leq(diff, period_ns + 2 * NS_IN_MS);
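For reference, the bound in that check can be sketched as a standalone predicate. This is not the sel4test source: `within_upper_bound` is a hypothetical helper, and `NS_IN_MS` is assumed to be the number of nanoseconds in a millisecond.

```c
#include <stdbool.h>
#include <stdint.h>

#define NS_IN_MS 1000000ULL

/* Hypothetical helper mirroring the failing check: the measured interval
 * passes when it is at most one period plus 2 ms of slack. */
static bool within_upper_bound(uint64_t diff_ns, uint64_t period_ns)
{
    return diff_ns <= period_ns + 2 * NS_IN_MS;
}
```

With the 100 ms period mentioned in the discussion, anything up to 102 ms passes, while an interval 3% too long (103 ms) fails.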