One type of testing flow that is common in software development works as follows:

1. A developer adds a feature to an application.
2. The developer writes a test that runs the application with that new feature (i.e. not a "unit" test). The output gets written to "test.actual".
3. "test.actual" is compared to "test.expected": if they match, the test passes; if they differ, or "test.expected" doesn't exist, the test fails.
4. If the test fails, the developer inspects "test.actual". If the output looks correct, they "accept" it, which writes the output to "test.expected", which gets committed to the repo. Otherwise, they fix any bugs until "test.actual" matches "test.expected".
CodeQL is one example of a project with a testing framework like this, but there are many others. Note that this type of testing is usually done in addition to unit testing and integration testing, not as a replacement for them. This type of test framework is a good fit when there is user-facing output that frequently goes through minor changes, as the expected file can easily be regenerated without any code changes.
I think this type of test framework is a good fit for pwndbg commands specifically. Whenever a new command is written, we can have a script that runs the command, writes its output to ".actual", compares it to ".expected", and reports any differences. The same script can take an "accept" argument that accepts the last ".actual", or we can just `mv` it to ".expected", which then gets committed to the repo.
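A minimal sketch of such a script's core, with the "accept" behavior folded in. The function name and the way the command output is produced are hypothetical; in practice the output would come from running the pwndbg command under gdb (e.g. `gdb --batch -ex <command> <binary>`).

```python
from pathlib import Path

def check_or_accept(name: str, output: str, accept: bool = False) -> bool:
    """Write the command's output to <name>.actual and check or accept it.

    `output` is assumed to be the captured output of a pwndbg command,
    e.g. from `gdb --batch -ex <command> <binary>`.
    """
    Path(f"{name}.actual").write_text(output)
    expected = Path(f"{name}.expected")
    if accept:
        # Equivalent to: mv <name>.actual <name>.expected
        expected.write_text(output)
        return True
    return expected.exists() and expected.read_text() == output
```

The `.expected` file produced by the accept path is what gets committed to the repo.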
This type of testing framework is easy to implement, but there's an additional complication with pwndbg: addresses may change between executions of the binary (e.g. under ASLR) or between different builds of a binary (e.g. different compiler versions, libc versions, etc.). To get around this, we could either regenerate the tests when this happens (perhaps automatically with a CI job or something similar), or designate a special character that indicates a byte of the output may differ. The issue with the latter solution is that we can no longer check whether that special character is actually part of the output, as it will always be treated as a wildcard. We could choose a character we think we won't care about 99% of the time, or we could use a Unicode character (which might make the implementation slightly more complex).
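The wildcard idea is simple to sketch. The choice of `·` (U+00B7) as the wildcard marker is an assumption for illustration; any character unlikely to appear in real output would do. This version compares character-wise for clarity; a byte-wise version would be analogous.

```python
WILDCARD = "\u00b7"  # hypothetical marker: "·" in the expected file matches anything

def matches(expected: str, actual: str) -> bool:
    """Positional comparison where WILDCARD in `expected` matches any character.

    Works for fixed-width fields like hex addresses, e.g. an expected line
    "rsp 0x····" matches "rsp 0x7ffe". The trade-off noted above applies:
    a literal WILDCARD in the real output can no longer be asserted on.
    """
    if len(expected) != len(actual):
        return False
    return all(e == WILDCARD or e == a for e, a in zip(expected, actual))
```

This only handles same-length outputs; variable-width values would need something richer than a per-byte wildcard, such as regex-like patterns.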
For the first version of this, I say we just do an exact byte-by-byte comparison, use it for tests whose output shouldn't change, and decide on a solution for changing addresses later.
> For the first version of this, I say we just do an exact byte-by-byte comparison, use it for tests whose output shouldn't change, and decide on a solution for changing addresses later.
I agree with the above; it seems like a sound way to produce tests for features with consistent output, but it could be less accurate when a pwndbg command prints e.g. memory addresses, especially if that command needs to be tested with ASLR enabled.
Some of the tests that currently cover those commands also check for specific values that are unknowable ahead of time (again, e.g. memory addresses subject to ASLR), which could require more than a wildcard character.