Add support for collecting CLR event through EventPipe #1291

chrisnas · 2020-10-16T08:36:14Z

Add -eventpipe option to activate the EventPipe collection
Add -providers provider:keyword:verbosity:tags,... option to allow fine-grained tuning of dotnet-trace execution
Add -sdk-path to point to an already installed dotnet SDK (otherwise it will be installed in /tmp/dotnet_sdk_tool)
When the collection is done, trace.nettrace and eventpipe.log files are included in the resulting zip file, which PerfView is now able to read

brianrob

@chrisnas, @gleocadie thank you very much for your contribution! I have a few questions and comments, but this is looking quite good.

brianrob · 2020-10-22T23:43:59Z

src/perfcollect/perfcollect

@@ -642,6 +642,10 @@ usePerf=1
 # Use LTTng
 useLTTng=1

+# Use EventPipe to collect CLR events
+useEventPipe=0
+sdkAndToolDir="/tmp/dotnet_sdk_tool"


Nit: Can we make this something like /tmp/perfcollect-dotnet-sdk so that it's clear where it came from?

brianrob · 2020-10-22T23:44:27Z

src/perfcollect/perfcollect

@@ -1377,18 +1382,33 @@ ProcessArguments()
        elif [ "-nolttng" == "$arg" ]
        then
            useLTTng=0
+        elif [ "-eventpipe" == "$arg" ]
+        then
+            useEventPipe=1


I like this - if you enable EventPipe, LTTng is disabled.

brianrob · 2020-10-22T23:45:20Z

src/perfcollect/perfcollect

+            useLTTng=0
+        elif [ "-providers" == "$arg" ]
+        then
+            providers=$rawvalue


I would like to propose that if -eventpipe isn't specified then this writes a FatalError that -eventpipe must be specified. This ensures that users of LTTng don't inadvertently specify this, but get no results.

Good one. I will add a validation step after the arguments are parsed.

brianrob · 2020-10-22T23:46:55Z

src/perfcollect/perfcollect

        elif [ "-noperf" == "$arg" ]
        then 
            usePerf=0
        elif [ "-gccollectonly" == "$arg" ]
        then
            gcCollectOnly=1
+            usePerf=0


I may have misunderstood, but I think you wanted a way to capture -gccollectonly with perf enabled.

I was thinking that I would do that in a second step (a separate PR), since we are not blocked right now. What do you think?

Sure, no problem.

brianrob · 2020-10-22T23:50:32Z

src/perfcollect/perfcollect

+BuildEventPipeArgs()
+{
+    if [ "$collectionPid" == "" ]
+    then


I'm wondering if we should have an argument validation step after the arguments are parsed to handle this and some of the other possible argument-related issues that I mentioned above. What do you think?

We are aligned :) I will add a validation step

Great, thanks!
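The validation step agreed on in this thread could look roughly like the sketch below. It is an illustration, not code from the PR: the function name `ValidateArguments` is hypothetical, the variable names are taken from the diff hunks above, and rejecting an empty `collectionPid` is an assumption about what the final check should do. The helper returns a status instead of exiting so the caller can decide how to report the failure (for example through `FatalError`).

```shell
# Hypothetical validation helper, meant to run right after ProcessArguments.
# Variable names mirror the diff; the function itself is not from the PR.
ValidateArguments()
{
    # -providers only makes sense when EventPipe collection is enabled.
    if [ -n "$providers" ] && [ "$useEventPipe" != "1" ]
    then
        echo "-providers requires -eventpipe to be specified." >&2
        return 1
    fi

    # Assumption: EventPipe collection needs a target process ID.
    if [ "$useEventPipe" = "1" ] && [ -z "$collectionPid" ]
    then
        echo "-eventpipe requires a process ID." >&2
        return 1
    fi

    return 0
}
```

Keeping every cross-argument check in one place means LTTng users who pass -providers by mistake get an immediate error instead of an empty trace.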

brianrob · 2020-10-22T23:56:07Z

src/perfcollect/perfcollect

+       WriteStatus "Installing dotnet sdk in $sdkAndToolDir"
+       ResetText
+       RunSilent "mkdir $sdkAndToolDir"
+       RunSilent "curl -OL https://dot.net/v1/dotnet-install.sh"


I think the SDK directory should be created first so that you can download dotnet-install.sh and store it in this directory as opposed to the current working directory.

Oh, I see, thanks. I missed that.
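A minimal sketch of the ordering suggested here, assuming the renamed directory proposed earlier in the review. The function name `DownloadDotNetInstallScript` is hypothetical; the point is only that the directory is created before the download and that `curl -o` writes into it rather than into the current working directory.

```shell
# Sketch only: sdkAndToolDir comes from the diff, the function name does not.
sdkAndToolDir="/tmp/perfcollect-dotnet-sdk"

DownloadDotNetInstallScript()
{
    # Create the target directory first (-p tolerates an existing directory).
    mkdir -p "$sdkAndToolDir" || return 1

    # -f fails on HTTP errors, -L follows redirects, and -o writes to the
    # given path instead of the current working directory (which -O uses).
    curl -fL -o "$sdkAndToolDir/dotnet-install.sh" https://dot.net/v1/dotnet-install.sh
}
```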

brianrob · 2020-10-22T23:56:52Z

src/perfcollect/perfcollect

+   then
+      FatalError "dotnet-trace tool was installed correctly."
+   fi
+   LogAppend 'dotnet-trace version:' `$sdkAndToolDir/dotnet trace --version`


Thanks for adding the version information to the log unconditionally!

brianrob · 2020-10-22T23:57:56Z

src/perfcollect/perfcollect

@@ -2153,6 +2312,11 @@ ProcessArguments $@
 # Ensure prerequisites are installed.
 EnsurePrereqsInstalled

+if [ "$useEventPipe" == "1" ]  && [ "$1" != "stop" ]


I would like this to follow the same pattern as we do for the other prerequisites, and make download and install of the SDK part of perfcollect install. This ensures that any disk changes other than for the traces themselves are intentional on the part of the user.

@brianrob one question: the .NET SDK and the tool will unconditionally be installed in /tmp/perfcollect-dotnet-sdk when using perfcollect install.
The DiscoverCommands and InitializeLog functions will use this path, not the one provided by the -sdk-path option.
I was wondering whether -sdk-path is still worth keeping. Maybe we can remove it. What do you think? (I might have missed something)

You bring up a good point here. I was trying to allow more flexibility, but the more I think about it, the more complicated things get, and I think we should go back to a simpler plan, as you are suggesting.

Here's what I think we should do, let me know what you think:

If there is a global dotnet SDK installed, use it. If not, install to /tmp/perfcollect-dotnet-sdk.

Install dotnet-trace using whichever SDK we have.

When collecting, discover the SDK to use based on whether or not there is a global one or one in /tmp.

This ensures that if someone doesn't want to install another SDK, but already has one that they can use it. What do you think?

That sounds good to me.
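The discovery order agreed on above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `DiscoverDotNet` and `fallbackSdkDir` are hypothetical names, and "global SDK" is interpreted here as "a `dotnet` executable found on `PATH`".

```shell
# Hypothetical helper illustrating the discovery order discussed above.
fallbackSdkDir="/tmp/perfcollect-dotnet-sdk"

# Prints the path of the dotnet executable to use, or returns non-zero.
DiscoverDotNet()
{
    # Prefer a dotnet that is already on PATH (treated here as the global SDK).
    if command -v dotnet > /dev/null 2>&1
    then
        command -v dotnet
        return 0
    fi

    # Otherwise fall back to the private install location.
    if [ -x "$fallbackSdkDir/dotnet" ]
    then
        echo "$fallbackSdkDir/dotnet"
        return 0
    fi

    # Neither is available: the caller would need to run the install step.
    return 1
}
```

Using the same helper for both install and collect keeps the two code paths from disagreeing about which SDK is in use.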

brianrob · 2020-10-23T00:01:01Z

src/perfcollect/perfcollect

@@ -642,6 +642,10 @@ usePerf=1
 # Use LTTng
 useLTTng=1

+# Use EventPipe to collect CLR events
+useEventPipe=0


I'd like to propose that instead of using the term eventpipe that we use dotnetTrace and/or dotnet-trace since the actual tool being used here is dotnet-trace. This is more of a forward-looking thing, so that it's clear what collector is being used should there be multiple, or if people don't know what eventpipe is.

brianrob · 2020-10-23T00:01:58Z

src/perfcollect/perfcollect

+then
+    EnsureDotNetTraceToolIsInstalled
+fi
+


Can you please add a regression test that uses dotnet-trace?

Ok, I missed the test folder. I will add a regression test.

@brianrob I tried to add a regression test (it helped me find something I forgot to add to the script: installing curl). However, to run the script with -dotnet-trace, we need a .NET app running in the container. I'm not sure that's possible in the current state. Do you want me to add this?

Yes, that would be great.

ezsilmar · 2021-05-20T13:26:01Z

Hi, I'd like to push this PR forward to be able to merge criteo-forks@747f2a8 so that we stop relying on the perfview fork internally :)

I discussed with @chrisnas and @gleocadie, and it seems that some tests were missing. I also see that 1 test in the current build failed, but the build result is long gone. @brianrob could you please re-trigger the test if there's such an option?

brianrob · 2021-05-24T15:33:06Z

/azp run

azure-pipelines · 2021-05-24T15:33:15Z

Azure Pipelines successfully started running 1 pipeline(s).

brianrob · 2021-05-24T15:33:41Z

Thanks @ezsilmar. Just triggered a new CI run.

ezsilmar · 2021-09-17T12:57:38Z

Hello! @brianrob I got some time to come back to this PR and would be glad to get a code review.

I mainly fought test instability:

Disabled test parallelization: this was already the case for most test projects
Improved handling of shared directories in EtlTestBase
In perfcollect install for Ubuntu, removed the packages that are missing and used linux-tools-generic instead
In the container tests for dotnet-trace, added a sleep after launching the test program

About the last point, something weird is happening. If I attach to the process with dotnet-trace right after the process is started, dotnet-trace hangs forever after printing "Stopping the trace. This may take up to minutes depending on the application being traced." If I wait a couple of seconds before attaching, it works fine. This behavior reproduces in the GitHub build pipeline, so there's probably a bug in dotnet-trace.
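One possible alternative to a fixed sleep is to poll until the target process exposes its diagnostics endpoint. The sketch below is untested against the hang described above and rests on an assumption: that the runtime creates a socket named `dotnet-diagnostic-<pid>-<key>-socket` in the temp directory once diagnostics are ready. The function name `WaitForDiagnosticSocket` and its parameters are hypothetical.

```shell
# Sketch: poll for the diagnostics socket instead of sleeping a fixed time.
# Assumption: the socket is named dotnet-diagnostic-<pid>-<key>-socket and
# lives in $TMPDIR (or /tmp). Whether this avoids the hang is not verified.
WaitForDiagnosticSocket()
{
    pid="$1"
    timeoutSeconds="$2"
    socketDir="${3:-${TMPDIR:-/tmp}}"

    elapsed=0
    while [ "$elapsed" -le "$timeoutSeconds" ]
    do
        for candidate in "$socketDir"/dotnet-diagnostic-"$pid"-*-socket
        do
            # An unmatched glob stays literal, so check that the path exists.
            if [ -e "$candidate" ]
            then
                return 0
            fi
        done
        sleep 1
        elapsed=$((elapsed + 1))
    done

    # Timed out without seeing the endpoint.
    return 1
}
```

A test script could call `WaitForDiagnosticSocket "$appPid" 30` before launching dotnet-trace and fail fast if it returns non-zero.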

ezsilmar · 2021-09-17T14:16:43Z

The test currently failing is CanReadV4EventPipeTraceBiggerThan4GB, which fails with an out-of-memory error; it passes on my machine.

ezsilmar · 2023-02-03T16:20:02Z

Hello, this PR has been open for almost 3 years, but it is still relevant in our context, and I think it would benefit the community as well.

To remind what this is all about, we often use PerfView on Windows to analyze the behavior of dotnet apps running on Linux. Relevant to this PR, PerfView can understand:

perf CPU samples, collected with perfcollect
lttng text data file, collected with perfcollect
nettrace file, collected with dotnet-trace

In the days of net2, using perf+lttng was the only way. Perfcollect greatly eased the process by combining events and CPU samples into a single .trace.zip file. Later, dotnet-trace appeared and left perfcollect almost abandoned (at least that's my feeling). While dotnet-trace is much more convenient than lttng for events, it was never intended to match the capabilities of perf. Thus today, when we need both CPU sampling and events, we deal with two separate artifacts: a perfcollect output and a nettrace file.

This PR is a quality-of-life change that allows the nettrace file to be packaged in .trace.zip, alongside the perf data. A nice side effect is that the .nettrace file gets compressed, which is important for sharing long sessions. The PR also modifies perfcollect so that it can use dotnet-trace under the hood; however, this part is not important for my particular use case, as we run perf and zip directly in our troubleshooting code.

If modifying perfcollect is not something you'd like to support, we could just merge the change in PerfViewData.cs that tries to read .nettrace from .trace.zip: it's small and beneficial on its own. Wdyt?

Mentioning @brianrob as the last reviewer

gleocadie force-pushed the PR_perfcollect_eventpipe branch from 785af82 to 4c4c6de Compare October 19, 2020 14:35

brianrob reviewed Oct 23, 2020

gleocadie force-pushed the PR_perfcollect_eventpipe branch 2 times, most recently from f307984 to e8b179f Compare November 3, 2020 20:02

gleocadie force-pushed the PR_perfcollect_eventpipe branch from e8b179f to 9f100e8 Compare December 17, 2020 15:39

Base automatically changed from master to main February 2, 2021 23:16

Christophe Nasarre and others added 5 commits February 3, 2023 14:19

Support traces recorded on Linux with dotnet-trace (in addition to LTTng)

7c95595

Add support for events collection through EventPipe

8930b23

Add Rider files to .gitignore

c188a90

Make tests more stable

9944623

Make container tests pass

954aada

ezsilmar force-pushed the PR_perfcollect_eventpipe branch 2 times, most recently from 02df7c7 to 65bd20b Compare February 3, 2023 13:42

Bump test app to net6

d1fa55b

ezsilmar force-pushed the PR_perfcollect_eventpipe branch from 65bd20b to d1fa55b Compare February 3, 2023 13:42
