Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added APC support #110

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

Added APC support #110

wants to merge 10 commits into from

Conversation

John-Toohey
Copy link

Fixes #109.

This adds support for APC (Application Program Command) in the same way OSC is implemented. It does, however, have the drawback of (just over) doubling the in-memory footprint of the table::STATE_CHANGES static, for it became necessary to split State and Action as the amount of variants within Action exceeded 15, meaning it was no longer possible for it to be a pseudo u4.

More detail in the commit messages.

Thank you for your time.

…ation

Several changes were made for this:

- `STATE_CHANGES` now stores a tuple of `(State, Action)` as opposed to `u8`
    - The negative aspect of this is that it will now take over twice as
      much space in memory (4 KiB) to 8.5 KiB
    - On a modern system this should not come at a reasonable preformance cost
    - The justification for this is that further `Action`s were needed to help
      manage an APC command: `ApcBegin`, `ApcEnd`, and `ApcPut`.  Therein it
      became necassary to extend the struct from a quasi-`u4` to a `u8` as
      these represent values 16, 17, and 18 respectively.
    - This also removes the functions `pack` and `unpack`, a possible benefit
      of the change.
- This means code was also changed in the macro library
  `vte_generate_state_changes`
- `Perform` now includes three new functions with signatures listed below:

    fn apc_begin(&mut self) {}
    fn apc_end(&mut self) {}
    fn apc_put(&mut self) {}

    These I believe are the best way to facilitate the use of APC.  An
    alternative method could exist without `apc_begin`, although I believe
    that from a library perspective it is more ergonomic to have such as
    function, as it elimintates a potential check required by downstream
    libraries.
- For illustration purposes there is a new examples `apc.rs`.  I will
  delete this example in the next commit and merge its functionality with
  the already existing `parselog.rs` and the testing suite.  I have merely
  kept it to help demonstrate the functionality provided in the commit, and to
  facilitate my preliminary testing of it
- This involved:
    - Removing its dedicated (and frankly useless) example
    - Writing a proper test
    - Having it store a buffer.  It currently uses its own buffer although
      it could fairly easily be merged with that of OSC without any
      obvious drawbacks.  Next commit?
    - Rather than having three functions within `Perform` it now has just
      one `acp_dispatch` which should be easier to code with.
Comment on lines 55 to 65
#[inline(always)]
pub fn unpack(delta: u8) -> (State, Action) {
unsafe {
(
// State is stored in bottom 4 bits
mem::transmute(delta & 0x0f),
// Action is stored in top 4 bits
mem::transmute(delta >> 4),
)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be benchmarked. There is no way we'll merge a patch that reduces performance for a feature nobody should be using.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running some initial benchmarks I've got the following results from cargo bench:

test bench::state_changes ... bench:     108,197 ns/iter (+/- 3,199)
test bench::testfile      ... bench:     121,021 ns/iter (+/- 3,617)

whilst the old code yields the following:

test bench::state_changes ... bench:      95,320 ns/iter (+/- 4,951)
test bench::testfile      ... bench:     101,009 ns/iter (+/- 7,342)

Which equates to roughly an 12% ~ 16% decrease in performance. I am using flamegraph now to investigate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also recommend testing with https://github.com/alacritty/vtebench. That's ultimately what decides which features get merged into Alacritty or not.

src/lib.rs Outdated
@@ -73,7 +73,8 @@ impl<'a, P: Perform> utf8::Receiver for VtUtf8Receiver<'a, P> {
/// [`Perform`]: trait.Perform.html
///
/// Generic over the value for the size of the raw Operating System Command
/// buffer. Only used when the `no_std` feature is enabled.
/// buffer. Only used when the `no_std` feature is enabled. This buffer is
/// also used for APC (Application Program Commands.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// also used for APC (Application Program Commands.)
/// also used for APC (Application Program Commands).

src/lib.rs Outdated
Comment on lines 378 to 383
Action::ApcStart => {
self.osc_raw.clear();
},
Action::ApcEnd => {
self.apc_dispatch(performer);
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Statements should be inlined.

src/lib.rs Outdated
Comment on lines 459 to 460
/// Called at the end of an APC (application program command), where
/// `bytes` is the content of the APC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Called at the end of an APC (application program command), where
/// `bytes` is the content of the APC
/// Called at the end of an APC (application program command), where
/// `bytes` is the content of the APC.

src/lib.rs Outdated
const INPUT_1: &[u8] = b"\x1b_abc\x9c";
const INPUT_2: &[u8] = b"\x1b_abc\x1b\\";

// Test with ST terminator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Test with ST terminator
// Test with ST terminator.

src/lib.rs Outdated
}
assert_eq!(dispatcher.dispatched, vec![Sequence::Apc(b"abc".to_vec()),]);

// Test with ESC \ terminator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Test with ESC \ terminator
// Test with ESC \ terminator.

src/table.rs Outdated
Comment on lines 173 to 178
0x00..=0x06 => (Anywhere, Ignore),
0x08..=0x17 => (Anywhere, Ignore),
0x19 => (Anywhere, Ignore),
0x1c..=0x1f => (Anywhere, Ignore),
0x20..=0xff => (Anywhere, ApcPut),
0x9c => (Ground, None),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't checked these yet, just for personal reference.

@John-Toohey
Copy link
Author

I've worked on performance, and have got the following results (from vtebench):

Summary Table

TL;DR About the same in both master and the patch

master (ms) John-Toohey:apc (ms)
cursor_motion 125.9 ± 23.72 119.24 ± 18.61 🟢
dense_cells 357.36 ± 23.02 335.43 ± 13.69 🟢
light_cells 157.11 ± 15.96 151.83 ± 18.97 🟢
scrolling 485.88 ± 99.5 444.17 ± 14.44 🟢
scrolling_bottom_region 449.83 ± 9.96 🟢 475.5 ± 46.53
scrolling_bottom_small_region 1486.29 ± 67.27 1436.43 ± 11.07 🟢
scrolling_fulscreen 142.09 ± 8.89 141.37 ± 16.11 🟢
scrolling_top_region 1538.14 ± 22.46 1391.38 ± 9.32 🟢
scrolling_top_small_region 1358.63 ± 29.52 1308.38 ± 14.42 🟢
unicode 150.99 ± -20.21 🟢 151.41 ± 35.04

🟢 = fastest

Raw Results

Normal Alacritty:

Results:

  cursor_motion (80 samples @ 1.41 MiB):
    125.9ms avg (90% < 151ms) +-23.72ms

  dense_cells (28 samples @ 4.1 MiB):
    357.36ms avg (90% < 398ms) +-23.02ms

  light_cells (64 samples @ 1.06 MiB):
    157.11ms avg (90% < 166ms) +-15.96ms

  scrolling (17 samples @ 1 MiB):
    485.88ms avg (90% < 651ms) +-99.5ms

  scrolling_bottom_region (23 samples @ 1 MiB):
    449.83ms avg (90% < 463ms) +-9.96ms

  scrolling_bottom_small_region (7 samples @ 1 MiB):
    1486.29ms avg (90% < 1634ms) +-67.27ms

  scrolling_fullscreen (35 samples @ 1 MiB):
    142.09ms avg (90% < 154ms) +-8.89ms

  scrolling_top_region (7 samples @ 1 MiB):
    1538.14ms avg (90% < 1577ms) +-22.36ms

  scrolling_top_small_region (8 samples @ 1 MiB):
    1358.63ms avg (90% < 1431ms) +-29.52ms

  unicode (67 samples @ 1.06 MiB):
    150.99ms avg (90% < 177ms) +-20.21ms

Alacritty + John-Toohey:apc

Results:

  cursor_motion (84 samples @ 1.41 MiB):
    119.24ms avg (90% < 153ms) +-18.61ms

  dense_cells (30 samples @ 4.1 MiB):
    335.43ms avg (90% < 347ms) +-13.69ms

  light_cells (66 samples @ 1.06 MiB):
    151.83ms avg (90% < 182ms) +-18.97ms

  scrolling (18 samples @ 1 MiB):
    444.17ms avg (90% < 455ms) +-14.44ms

  scrolling_bottom_region (22 samples @ 1 MiB):
    475.5ms avg (90% < 506ms) +-46.53ms

  scrolling_bottom_small_region (7 samples @ 1 MiB):
    1436.43ms avg (90% < 1450ms) +-11.07ms

  scrolling_fullscreen (35 samples @ 1 MiB):
    141.37ms avg (90% < 156ms) +-16.11ms

  scrolling_top_region (8 samples @ 1 MiB):
    1391.38ms avg (90% < 1412ms) +-9.32ms

  scrolling_top_small_region (8 samples @ 1 MiB):
    1308.38ms avg (90% < 1339ms) +-14.42ms

  unicode (66 samples @ 1.06 MiB):
    151.41ms avg (90% < 179ms) +-35.04ms

vte's benchmarks

TL;DR Approximately the same.

alacritty/vte:master

test bench::state_changes ... bench:     99,218 ns/iter (+/- 1,106)
test bench::testfile      ... bench:     98,814 ns/iter (+/- 1,364)

John-Toohey:apc

test bench::state_changes ... bench:     100,910 ns/iter (+/- 1,106)
test bench::testfile      ... bench:      98,791 ns/iter (+/- 1,364)

@John-Toohey
Copy link
Author

As for how this new solution works, it basically takes the implementation of OscString and makes it work over all 4 types of strings (so it actually works for SOS and PM strings too) meaning that the STATE_CHANGES is still an array of u8s. This also means there is now a Null state to replace ApcPmSos, which is no longer a special case.

@chrisduerr
Copy link
Member

@John-Toohey If you have an Alacritty revision that is built against this, could you put that up in a draft PR on the Alacritty repo? That way I can easily run performance tests against it.

Copy link
Member

@chrisduerr chrisduerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've kinda been ignoring this PR for a while, mostly because I'm not particularly happy with the implementation but I also do not have any concrete suggestions without looking into this myself (which I have no interest in doing).

But hopefully this will at least keep the conversation going and provide some feedback on why I think it's not mergeable as-is.

@@ -19,7 +19,7 @@ pub enum State {
#[default]
Ground = 12,
OscString = 13,
SosPmApcString = 14,
Null = 14,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a bad name. A "null state" already exists, it's called Ground. Looking at State::Null alone would make it impossible to tell what that state is for.

Comment on lines +309 to 343
match self.str_start {
0x1d => {
match param_idx {
// Finish last parameter if not already maxed
MAX_OSC_PARAMS => (),

// First param is special - 0 to current byte index
0 => {
self.osc_params[param_idx] = (0, idx);
self.osc_num_params += 1;
},

// All other params depend on previous indexing
_ => {
let prev = self.osc_params[param_idx - 1];
let begin = prev.1;
self.osc_params[param_idx] = (begin, idx);
self.osc_num_params += 1;
},
}
self.osc_dispatch(performer, byte);
},

// All other params depend on previous indexing
_ => {
let prev = self.osc_params[param_idx - 1];
let begin = prev.1;
self.osc_params[param_idx] = (begin, idx);
self.osc_num_params += 1;
0x18 => {
self.sos_dispatch(performer);
},
0x1e => {
self.pm_dispatch(performer);
},
0x1F => {
self.apc_dispatch(performer);
},
// This condition should never be hit under the
// current way in which the code is ran.
_ => unreachable!(),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This basically feels like an embedded parser. Most of our parser is structured in a way to stream through bytes and process them as they're coming in, however rather than creating state transitions you're storing a state transition to later then match on it instead. VTE is already too slow and I'm afraid this will just amplify that problem.

Comment on lines +340 to +342
// This condition should never be hit under the
// current way in which the code is ran.
_ => unreachable!(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is basically just a long way to say unreachable!(). It doesn't explain why it is unreachable it just states that it is, which unreachable!() does too. Should either add a useful comment or remove this.

@@ -261,7 +278,7 @@ impl<const OSC_RAW_BUF_SIZE: usize> Parser<OSC_RAW_BUF_SIZE> {
let idx = self.osc_raw.len();

// Param separator
if byte == b';' {
if self.str_start == 0x1d && byte == b';' {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One extra branch isn't going to be the end of the world, but I don't see how this isn't just going to be a net-negative for anyone who doesn't use APC escapes (which is basically everyone).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Support for APC?
2 participants