Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add abstract Process Start/Stop, Module Load/Unload and DbgId events #1265

Open
mjsabby opened this issue Sep 12, 2020 · 4 comments · May be fixed by #1276
Open

Add abstract Process Start/Stop, Module Load/Unload and DbgId events #1265

mjsabby opened this issue Sep 12, 2020 · 4 comments · May be fixed by #1276

Comments

@mjsabby
Copy link
Contributor

mjsabby commented Sep 12, 2020

Today TraceLog depends on ProcessStart, ProcessStop (and their DCStart/DCStop equivalents), ImageLoad, Image Unload (and their DCStart/Stop equivalents) and ImageID events to be able to do useful event viewing and analysis of data.

As we add the ability to add multiple processes inside NetTrace files, it becomes important to represent these events. After discussion with @noahfalk and @brianrob it seems like having them not depend on the Windows Kernel events per-se seems useful and doesn't encumber the NetTrace format with these OS-specific concepts.

So, effectively these events will inform equivalent data but using different events. In fact, they will be like any other EventSource events, except their parsers will be part of the TraceEvent package and we will hook them up by default in the parser hook up for NetTrace and are "enlightened" to be acted up on when serializing TraceProcess/TraceModule, etc.

@mjsabby
Copy link
Contributor Author

mjsabby commented Sep 12, 2020

I'm thinking of the following:

Process Start -> Process Id, Filename, Command Line Args
Process End -> Process Id
Module Load -> ModuleLoadAddress, ModuleSize, ModuleDebugGuid, ModuleDebugAge, ModuleFilename, ModuleDebugFilename
Module Unload -> LoadAddress, Size

Thoughts?

@noahfalk
Copy link
Member

Thanks Mukul! A few thoughts came to mind, not sure where it all leads yet but just dumping it here as part of our brainstorming : )

  1. The Nettrace format might benefit from 1st class support for ThreadCreate/Destroy as well? One spot this shows up is that if the reader sees two events from the same threadId we can't actually be certain if it is the same thread or two different threads that recycled a threadId and potentially dropped events.
  2. Do want to allow these special events to be omitted if there was backpressure while creating the stream? I'm tempted to say no, they need to be reliable. This doesn't necessarily require special treatment in the file format but it implies there will be some special treatment from the trace writer regardless.
  3. Do we need these process/module/thread events to exist outside of the normal event stream so that they can be rapidly located in the file? One thought I was playing around with was allowing the format to encode multiple independent streams of events by putting a 4 byte stream index in the header of every EventBlock, MetadataBlock, and StackBlock. All of these abstract process, thread, and module events might live in stream 1 while the majority of the events are in stream 2.
  4. We probably don't want to lose OS specific data that is present in the native event formats even if we are abstracting a limited subset. That suggests either we have a way to log both the platform native and abstract event correlating them together or we have a way to fuse them into a single event.
  5. I could imagine that some of the module load fields wind up being format specific because probably a common task is searching for images or symbols? I think it is fine as long as we document clearly and one of the event payload fields encodes the image format.
  6. I am wondering if we will have future scenarios where it would be useful to encode data about processes that were running in different PID namespaces? A simple example might be a multi-container pod that has several processes all reporting PID=1. A more exotic scenario might be a trace that represents activity on multiple virtual or physical machines? We might want some extensibility to add more identifiers to a process.

I've got yet some other thoughts about improvements it might be worth making if we are to make a breaking change but I'm trying to keep this thread just on the abstract events : ) We'll probably need to create an issue that is a general purpose v.next format wishlist. Let me know if you think I am getting carried away but breaking changes have same adoption challenges whether we do a little or a lot - so I want to at least get a lot out of it : )

@mjsabby
Copy link
Contributor Author

mjsabby commented Sep 15, 2020

First, let me say that nothing you've mentioned is bad, and I want ALL of it :) That said, I'm tempering this with the reality of me wanting at least something fairly quickly. Whether that means some of it is a format change and some of it is merely a library change is for us to decide.

  1. ThreadCreate/Destroy is wonderful, I'm supportive of it, although it's not high on my priority list.
  2. Agreed.
  3. I like the general idea, although sometimes having them in the same stream is also useful. So perhaps both?
  4. Sure.
  5. Ok.
  6. I'm skeptical this is something that is needed in the format just yet (and I've been doing all my work in light of containers), so maybe we can discuss more?

@noahfalk
Copy link
Member

That said, I'm tempering this with the reality of me wanting at least something fairly quickly

Sure, we can chat a little tomorrow about timelines and options for staging the work. There probably is some natural tension for you to deliver increments of work quickly vs. my desire to make breaking changes fairly substantive. I also have relatively limited time since this is concurrent with planning .NET 6 + trying to finish .NET 5. Hopefully we can still find a path that delivers what we need : )

@mjsabby mjsabby linked a pull request Sep 18, 2020 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants