Replaced parse_descriptor() function, fixed some overruns #1460
To my personal taste - I like this (explicit) way better. I've reviewed it, and the change looks good. |
The change is not just one of taste, but correctness too. The old way was assuming specific struct alignment, and that struct fields did not have padding in between.
Thanks for reviewing.
I did write pretty detailed commit messages... is it useful to repeat in the PR description? |
Oh, I didn't see the commit message(s). NOTE: libusb does not use squash-merge functionality github provides. |
Really?!?! Why? That really reduces the usefulness of |
A very simple and verified over time philosophy: a single PR should contain a single independent change and solve a single "problem". If a PR is more than one commit - those are either not independent or should be split into separate PRs. |
This PR does solve a single "problem". The problem of If the official libusb policy is indeed 1 commit per PR, then it really should be documented in the |
No, there is no rule saying a PR gets squashed. We will certainly want to squash fixup commits and other WIP artifacts, but if each commit is an independent change they go in one by one. Sometimes it is for instance good to separate preparatory commits from those commits that really make a change, it makes it easier to review up front, and to understand and verify afterwards. |
And if and when we squash commits, no commit message gets lost. When you use "squash" in git rebase it will add all commit messages, then you edit the result before committing. |
Alright, looks like my assumption, based on what I had observed, is not correct. It's a habit of mine to generalize from things I see consistently. But in any case I find it way more convenient to see the summary of the PR upfront, rather than having to look for individual commit messages. Maybe it's just my habit. |
After a quick look at these, I see no need to squash them. OK, they all improve parse_descriptor, but in different ways. Each of them would have made sense without the others. A bunch of easy-to-review commits is better than huge commits where you have to look back and forth to see what pieces play together. And the larger the commit the more sure there will be something wrong that slips in and is missed in a review. I think most regressions I have found are from mega-commits where too many things were changed at a time. |
In my experience, that's a good reason to have separate PRs, not just separate commits. |
If you click on the "..." after each commit summary on this web page (e.g. under "seanm added 6 commits"), there is a drop-down with the full commit message. Most of the time it is the opposite problem: people write a lot in the PR description (which doesn't get into the repo) while their commit messages are poor or empty. |
Sometimes they depend on each other to avoid conflicts, and it is just a saner way to work both for coder and reviewer. They fit together logically so they are best treated in the same PR. There is a risk that discussion about one commit delays the others, but it is rarely a big problem. The committer can in such cases also choose to cherry-pick the easy ones, and leave the questionable commits to be fixed up (and rebased if needed). |
This commit apparently introduces an issue:
At least on MSVC x64 compilers. |
The last commit here is marked WIP, I guess the others are ready to go? |
Let's figure out the WIP one before merging... Let me see if I remember what I was thinking those weeks ago.... |
Ah yes... The old code is this:

    // Second pass: Iterate through desc list, fill IAD structures
    consumed = 0;
    i = 0;
    while (consumed < size) {
        header.bLength = buffer[0];
        header.bDescriptorType = buffer[1];
        if (header.bDescriptorType == LIBUSB_DT_INTERFACE_ASSOCIATION) {
            iad[i].bLength = buffer[0];
            iad[i].bDescriptorType = buffer[1];
            iad[i].bFirstInterface = buffer[2];
            iad[i].bInterfaceCount = buffer[3];
            iad[i].bFunctionClass = buffer[4];
            iad[i].bFunctionSubClass = buffer[5];
            iad[i].bFunctionProtocol = buffer[6];
            iad[i].iFunction = buffer[7];
            i++;
        }
        buffer += header.bLength;
        consumed += header.bLength;
    }

We know the length of
|
I've rebased and pushed. |
"Fixed potential buffer overread" still talks about WIP. Here as well as in your other PR it would be good to see in the commit summary what part of the code you are changing. We have established a prefix convention that everybody is using. When reviewing or perusing the git log it has great value e.g. to be able to identify changes in core code vs in examples. |
Yeah, would appreciate thoughts on that. I currently call
Any docs on that? I see nothing in |
Just look at the git log, i.e. |
Sure, but if this is required of commit messages, shouldn't it be documented in HACKING, which already discusses commit messages: "Commit messages should be formatted to 72 chars width and have a Put detailed information in the commit message itself, which will end |
Another tip: To spellcheck your commit messages you can use for instance: |
Spellings fixed. |
So, is it that you want me to add the prefix |
We usually use "descriptor:" for changes in the descriptor parsing.
It is not straightforward to define formal rules for this; it is a question of taste and style, and above all usefulness, so it might change over time. The idea is that someone perusing the git log can quickly identify what is of interest, or maybe even more important, what is not of interest. So e.g. a Linux distro maintainer wanting to see what has changed since the last release can skip all commits like xcode, windows, etc. Or e.g. someone else doesn't care about examples and tests. But true, we could mention something in HACKING. For the casual contributor, I fix this up when I merge it anyway. But since you are a very productive contributor it would be good if you get into this. BTW did I mention imperative commit messages? :) That is documented in the docs specified in HACKING. |
This function had a few problems:
- it takes two buffers as parameters but knows nothing about their length, making it easy to overrun them.
- callers make unwarranted assumptions about the alignment of structures that are passed to it (it assumes there's no padding)
- it has tricky pointer arithmetic and masking

With this new formulation, it's easier to see what's being read/written, especially the destination. It's now very clear that the destination is not being overrun because we are simply assigning to struct fields. Also converted byte swapping macros to inline functions for more type safety.
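To make the "assign to struct fields" idea concrete, here is a minimal sketch of the pattern, not the code from this PR. The struct `example_device_desc`, its made-up 7-byte wire layout, and the helpers `read_le16()` and `parse_example_desc()` are all hypothetical names chosen for illustration; `idVendor` is deliberately placed at an odd wire offset so that the in-memory struct would likely contain padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor struct for illustration; NOT libusb's definition.
 * The compiler is free to insert padding before the 16-bit fields. */
struct example_device_desc {
    uint8_t  bLength;
    uint8_t  bDescriptorType;
    uint16_t bcdUSB;
    uint8_t  bDeviceClass;
    uint16_t idVendor;      /* wire offset 5 in this made-up layout */
};

/* Size of the made-up wire format, independent of sizeof(struct ...). */
#define EXAMPLE_DESC_WIRE_SIZE 7

/* Inline function instead of a macro: the argument type is checked, and the
 * value is assembled byte by byte, so neither host endianness nor pointer
 * alignment matters. */
static inline uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Parse the wire bytes into the struct one field at a time.
 * Returns 0 on success, -1 if the source buffer is too short. */
static int parse_example_desc(const uint8_t *buf, size_t len,
                              struct example_device_desc *out)
{
    assert(buf != NULL && out != NULL);

    if (len < EXAMPLE_DESC_WIRE_SIZE)
        return -1;

    out->bLength         = buf[0];
    out->bDescriptorType = buf[1];
    out->bcdUSB          = read_le16(&buf[2]);
    out->bDeviceClass    = buf[4];
    out->idVendor        = read_le16(&buf[5]);
    return 0;
}
```

Because every write goes through a named field, the destination cannot be overrun, and the only length that needs checking is that of the source buffer.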
This was checking that `size` is at least `LIBUSB_DT_CONFIG_SIZE` (9) bytes long, but then increments the pointer with `buf += header.bLength`. That could end up pointing past the end of the buffer. There is a subsequent check that would prevent dereferencing it, but it's still UB to even create such a pointer. Added a check with a similar pattern as elsewhere in this file.
All the right hand side is `dev_cap`, changed one outlier to match. Also clarified the relationships between some magic numbers. No change in behaviour here.
The first iteration of this loop was safe because the beginning of the function checked that `size` is at least LIBUSB_DT_CONFIG_SIZE (9) bytes long. But for subsequent iterations, it could advance the pointer too far (which is undefined behaviour) depending on the content of the buffer itself.
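The bounds check described in these commit messages can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: `count_iads()` and the `EXAMPLE_*` constants are hypothetical names, and the sketch tracks an index rather than advancing a pointer, so no out-of-range pointer is ever formed. The type value 0x0b is the interface association descriptor type, and 8 is the number of bytes the old loop reads per IAD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_DT_HEADER_SIZE 2    /* bLength + bDescriptorType */
#define EXAMPLE_DT_IAD         0x0b /* interface association descriptor */
#define EXAMPLE_DT_IAD_SIZE    8    /* bytes read per IAD */

/* Walk a list of descriptors and count the IADs.
 * Returns the count, or -1 if a descriptor's bLength is malformed.
 * Fewer than two trailing bytes are ignored (a choice made for this sketch). */
static int count_iads(const uint8_t *buf, size_t size)
{
    size_t consumed = 0;
    int count = 0;

    assert(buf != NULL || size == 0);

    /* Invariant: consumed <= size, so size - consumed cannot wrap. */
    while (size - consumed >= EXAMPLE_DT_HEADER_SIZE) {
        uint8_t len  = buf[consumed];
        uint8_t type = buf[consumed + 1];

        /* Validate bLength BEFORE advancing: it must cover the header
         * (which also rules out 0, avoiding an infinite loop) and must
         * not extend past the end of the buffer. */
        if (len < EXAMPLE_DT_HEADER_SIZE || len > size - consumed)
            return -1;

        if (type == EXAMPLE_DT_IAD) {
            if (len < EXAMPLE_DT_IAD_SIZE)
                return -1;  /* too short to hold all IAD fields */
            count++;
        }

        consumed += len;    /* safe: len <= size - consumed */
    }
    return count;
}
```

The key point is that the value taken from the buffer is compared against the remaining length on every iteration, not only once at the start of the function.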
Done. |
This function had a few problems:
- it takes two buffers as parameters but knows nothing about their length, making it easy to overrun them.
- callers make unwarranted assumptions about the alignment of structures that are passed to it (it assumes there's no padding)
- it has tricky pointer arithmetic and masking

With this new formulation, it's easier to see what's being read/written, especially the destination. It's now very clear that the destination is not being overrun because we are simply assigning to struct fields. Also convert byte swapping macros to inline functions for more type safety.

References #1460
Thanks! I just fixed up a few typos in the commit messages. |