Interceptors #31

mfelsche · 2020-11-04T11:52:47Z

Signed-off-by: Matthias Wahl <mwahl@wayfair.com>

Licenser · 2020-11-09T14:15:34Z

text/0000-interceptors.md

+source:
+  - id: ws-in
+    type: ws
+    interceptors:


If we keep in and out separated what's the difference to using pre- and postprocesssors? i.e. what is the benefit we'd get from interceptors if a user still has to configure the chains separately. It seems to just move the current system into a more nested format.

So, i first went with only 1 chain. Why this isn't satisfying, there are two elements to this:

There is the minor issue, that users might not always want to have the exact reversed operations. Using interceptors that are strictly bidirectional and reverse their operation from the other direction, we take that possibility from users.

Also, there more general and important (to me) is that interceptor operations are fundamentally different for the two directions. They need to be configured differently. Putting this into the same entity, conflates and confuses
The most obvious example is the interceptor splitting input bytes by \n. The reverse operation on outgoing data could be appending a \n to each incoming byte array. But this is not doing exactly the reverse, it should also join those lines, to be the exact opposite. In our current implementation, some sinks do batch, some dont.
Adding this joining/batching to the interceptor who splits in one direction and joins in the other, would require to configure the max batch size and/or match batch wait time. To me it is cleaner and more flexible to separate it out into two chains like described in the RFC. Each interceptor (split and join) has a cleaner operation, that does different things.

benefits compared to pre-/postprocessors

We have a clearer api. we have one entity for which the direction of processing is irrelevant.

Clearer purpose for each interceptor due to more generic concept/operation. split and join instead of lines.

They are configurable

Having two chains does not block people who want to do different operations on incoming and outgoing data.

If you have a good idea, to smunch both chains into one and stuff is still clear and maintains the level of flexibility this RFC exposes, I am all for it.

Licenser · 2020-11-09T14:26:45Z

text/0000-interceptors.md

+
+A `Codec` is stateless and bidirectional. All existing pre- and postprocessors define the same operation (e.g. split a given byte sequence into lines). Interceptors would join those dual operations into one entity.
+
+It might be interesting at some point to not only allow a simple chain of interceptors, feeding bytes through them, but allow graphs with branching and joining, decisions where to continue based on investigation of the current chunk.


I'm a bit worried about this, it seems to add a lot of complexity. Mentioning it in three spots somewhat sets the expectation of that being implemented.

For example how would a graph based set of processors be different from a pipeline? Would users now be able to define whole pipelines in the connector? How would we guarantee that data that flows through a graph in the inbound follows the right graph outbound? How can we make sure this is easy to understand for a user? How do we make sure it's debug-gable? Last but not least what would be use-cases for this?

Perhaps we can just take that out and include it in a new RFC if it becomes relevant?

That makes a lot of sense!

Licenser · 2020-11-13T13:36:03Z

related: tremor-rs/tremor-runtime#529

RFC for interceptors

9678fac

Signed-off-by: Matthias Wahl <mwahl@wayfair.com>

darach added open-review This RFC is undergoing open / public review with intent to proceed enhancement New feature or request labels Nov 4, 2020

Licenser reviewed Nov 9, 2020

View reviewed changes

Licenser mentioned this pull request Nov 13, 2020

Investigate Introducing Peer/Transport tremor-rs/tremor-runtime#529

Closed

Licenser added this to Triage in RFC Status Jan 20, 2021

darach changed the title ~~RFC for interceptors~~ Interceptors Jan 29, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Interceptors #31

Interceptors #31

mfelsche commented Nov 4, 2020 •

edited

Licenser Nov 9, 2020

mfelsche Nov 9, 2020

Licenser Nov 9, 2020

mfelsche Nov 9, 2020

Licenser commented Nov 13, 2020


		A `Codec` is stateless and bidirectional. All existing pre- and postprocessors define the same operation (e.g. split a given byte sequence into lines). Interceptors would join those dual operations into one entity.

		It might be interesting at some point to not only allow a simple chain of interceptors, feeding bytes through them, but allow graphs with branching and joining, decisions where to continue based on investigation of the current chunk.

Interceptors #31

Are you sure you want to change the base?

Interceptors #31

Conversation

mfelsche commented Nov 4, 2020 • edited

Licenser Nov 9, 2020

Choose a reason for hiding this comment

mfelsche Nov 9, 2020

Choose a reason for hiding this comment

benefits compared to pre-/postprocessors

Licenser Nov 9, 2020

Choose a reason for hiding this comment

mfelsche Nov 9, 2020

Choose a reason for hiding this comment

Licenser commented Nov 13, 2020

mfelsche commented Nov 4, 2020 •

edited