Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interceptors #31

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Interceptors #31

wants to merge 1 commit into from

Conversation

mfelsche
Copy link
Member

@mfelsche mfelsche commented Nov 4, 2020

Signed-off-by: Matthias Wahl <mwahl@wayfair.com>
@darach darach added open-review This RFC is undergoing open / public review with intent to proceed enhancement New feature or request labels Nov 4, 2020
source:
- id: ws-in
type: ws
interceptors:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep in and out separated what's the difference to using pre- and postprocesssors? i.e. what is the benefit we'd get from interceptors if a user still has to configure the chains separately. It seems to just move the current system into a more nested format.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, i first went with only 1 chain. Why this isn't satisfying, there are two elements to this:

  • There is the minor issue, that users might not always want to have the exact reversed operations. Using interceptors that are strictly bidirectional and reverse their operation from the other direction, we take that possibility from users.
  • Also, there more general and important (to me) is that interceptor operations are fundamentally different for the two directions. They need to be configured differently. Putting this into the same entity, conflates and confuses
    The most obvious example is the interceptor splitting input bytes by \n. The reverse operation on outgoing data could be appending a \n to each incoming byte array. But this is not doing exactly the reverse, it should also join those lines, to be the exact opposite. In our current implementation, some sinks do batch, some dont.
    Adding this joining/batching to the interceptor who splits in one direction and joins in the other, would require to configure the max batch size and/or match batch wait time. To me it is cleaner and more flexible to separate it out into two chains like described in the RFC. Each interceptor (split and join) has a cleaner operation, that does different things.

benefits compared to pre-/postprocessors

  • We have a clearer api. we have one entity for which the direction of processing is irrelevant.
  • Clearer purpose for each interceptor due to more generic concept/operation. split and join instead of lines.
  • They are configurable
  • Having two chains does not block people who want to do different operations on incoming and outgoing data.

If you have a good idea, to smunch both chains into one and stuff is still clear and maintains the level of flexibility this RFC exposes, I am all for it.


A `Codec` is stateless and bidirectional. All existing pre- and postprocessors define the same operation (e.g. split a given byte sequence into lines). Interceptors would join those dual operations into one entity.

It might be interesting at some point to not only allow a simple chain of interceptors, feeding bytes through them, but allow graphs with branching and joining, decisions where to continue based on investigation of the current chunk.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit worried about this, it seems to add a lot of complexity. Mentioning it in three spots somewhat sets the expectation of that being implemented.

For example how would a graph based set of processors be different from a pipeline? Would users now be able to define whole pipelines in the connector? How would we guarantee that data that flows through a graph in the inbound follows the right graph outbound? How can we make sure this is easy to understand for a user? How do we make sure it's debug-gable? Last but not least what would be use-cases for this?

Perhaps we can just take that out and include it in a new RFC if it becomes relevant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes a lot of sense!

@Licenser
Copy link
Member

related: tremor-rs/tremor-runtime#529

@Licenser Licenser added this to Triage in RFC Status Jan 20, 2021
@darach darach changed the title RFC for interceptors Interceptors Jan 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request open-review This RFC is undergoing open / public review with intent to proceed
Projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants