{"id":1321,"date":"2014-01-22T13:00:19","date_gmt":"2014-01-22T13:00:19","guid":{"rendered":"http:\/\/writeasync.net\/?p=1321"},"modified":"2014-01-22T06:38:31","modified_gmt":"2014-01-22T06:38:31","slug":"better-tracing-with-eventsource","status":"publish","type":"post","link":"http:\/\/writeasync.net\/?p=1321","title":{"rendered":"Better tracing with EventSource"},"content":{"rendered":"<p>Starting in .NET 4.5, managed code developers finally have a simple way to interact with ETW &#8212; <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/system.diagnostics.tracing.eventsource(v=vs.110).aspx\"><code>System.Diagnostics.Tracing.EventSource<\/code><\/a>. For a quick introduction, check out Vance Morrison&#8217;s <a href=\"http:\/\/blogs.msdn.com\/b\/vancem\/archive\/2012\/07\/09\/logging-your-own-etw-events-in-c-system-diagnostics-tracing-eventsource.aspx\">inaugural blog post on <code>EventSource<\/code><\/a> and the also <a href=\"http:\/\/blogs.msdn.com\/b\/vancem\/archive\/2012\/07\/09\/more-details-on-eventsource-the-class-specification.aspx\">very helpful feature specification<\/a>.<\/p>\n<p>While <code>EventSource<\/code> is an improvement over the prior art (using <code>EventProvider<\/code> manually or via unwieldy <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/windows\/desktop\/aa385638(v=vs.85).aspx#arguments_specific_to_generating_code_that_your_provider_would_use_to_log_events\">generated code from MC.exe<\/a>), it is not perfect. In particular:<\/p>\n<ul>\n<li><strong>It forces the use of the <a href=\"http:\/\/accu.org\/index.php\/journals\/337\">singleton anti-pattern<\/a>.<\/strong> If you construct more than one instance of the same <code>EventSource<\/code>, <a href=\"http:\/\/entlib.codeplex.com\/discussions\/461975\">you are likely to encounter problems<\/a>.<\/li>\n<li><strong>Most existing tools do not understand how to decode <code>EventSource<\/code> events.<\/strong> <code>EventSource<\/code> <em>does<\/em> support the <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/system.diagnostics.tracing.eventsource.generatemanifest(v=vs.110).aspx\">generation of an instrumentation manifest<\/a> from your code (and it will even send this manifest at runtime as part of the event stream). However, the current in-box tools generally only know about registered ETW providers. Getting traces in a human-readable form is therefore a bit of a challenge, but there are solutions which I will cover in a future post.<\/li>\n<li><strong>In the initial version, it did not provide built-in support for activity\/correlation IDs.<\/strong> This has been addressed in .NET 4.5.1 (e.g. see <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/system.diagnostics.tracing.eventsource.writeeventwithrelatedactivityid(v=vs.110).aspx\"><code>WriteEventWithRelatedActivityId<\/code><\/a>) but will remain a problem for those who can&#8217;t upgrade. You can still set activity IDs yourself, however, and it will propagate them, as I show in the sample referenced below.<\/li>\n<\/ul>\n<p>But don&#8217;t let these issues dissuade you &#8212; <code>EventSource<\/code> is worth a look.<\/p>\n<p>I put together a complete sample to demonstrate some of the more challenging aspects of designing traces: <a href=\"https:\/\/github.com\/brian-dot-net\/writeasync\/tree\/master\/projects\/EventSourceSample\">EventSourceSample on GitHub<\/a>. The sample is built around a WCF calculator service client (the <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/ms733069(v=vs.110).aspx\">canonical WCF sample<\/a>). Below I describe a few important design points which may be of use to trace authors.<\/p>\n<p>I limited the number of contract methods to keep things relatively succinct (add, subtract, square root), but you can imagine in the real world you may have dozens of different client operations. The temptation might be to instrument every single operation with events like <code>Add<\/code>, <code>AddCompleted<\/code>, <code>AddError<\/code> but I don&#8217;t recommend this. Not only does it bloat your trace events, it makes writing automated analysis tools harder since they have to deal with dozens of distinct-but-really-the-same cases for start\/end\/error operations. Instead, as demonstrated in <a href=\"https:\/\/github.com\/brian-dot-net\/writeasync\/blob\/master\/projects\/EventSourceSample\/source\/EventSourceSample.Core\/CalculatorClientWithEvents.cs\"><code>CalculatorClientWithEvents<\/code><\/a>, I have a single typed event only for the start of every operation. Wrapping around this, I have <a href=\"https:\/\/github.com\/brian-dot-net\/writeasync\/blob\/master\/projects\/EventSourceSample\/source\/EventSourceSample.Core\/CalculatorClientWithActivity.cs\"><code>CalculatorClientWithActivity<\/code><\/a> which writes more generic client requests events for start, end, and error.<\/p>\n<p>The activity wrapper also handles the mostly mechanical process of generating a correlation ID before a call, resuming it on completion, and (perhaps most importantly) <em>restoring<\/em> it outside these scopes. The correlation ID is a critical piece of the tracing equation since it allows you to stitch together related calls across contexts. In this case, it is essential since the client calls are asynchronous and there is really no other way to tell which start and end\/error events are related.<\/p>\n<p>With correlation IDs I can track distinct calls but I still can&#8217;t tell where the calls originated. There might be one client or 100 clients all invoking methods at the same time. To distinguish them, it is useful to provide a client ID in the event payload. Depending on the needs of the application, this can be a friendly name (string), an integer (perhaps auto-incremented from some factory class?), or a totally unique ID (GUID). I opted for a GUID in the sample since it has the benefit of being truly unique anywhere and fewer drawbacks: integers can&#8217;t easily be synchronized across clients in different processes or machines; strings also have uniqueness issues and if allowed to be arbitrary size, can quickly bloat trace files.<\/p>\n<p>Keywords are an important filtering mechanism. In ETW, they group together broad categories of functionality within a provider. In the sample I define three keywords: &#8220;Basic&#8221; (add, subtract), &#8220;Advanced&#8221; (square root), &#8220;Request&#8221; (generic request start\/end\/error), and &#8220;Connection&#8221; (open\/abort). These come in handy in situations where you want to generate a pre-filtered trace file with only the events likely to be useful for a specific analysis scenario (e.g. to do performance analysis of requests regardless of operation type you could just enable the &#8220;Request&#8221; keyword).<\/p>\n<p>Choose event levels wisely. For most situations, <code>Informational<\/code> is appropriate. Use <code>Verbose<\/code> for events which are helpful for detailed debugging but would cause too much noise or performance degradation if always enabled. <code>Warning<\/code> and <code>Error<\/code> are tricky; I use both of them for cases of <em>unexpected<\/em> runtime exceptions with the distinction that <code>Warning<\/code> is more of a non-fatal or recoverable condition whereas <code>Error<\/code> indicates true degraded functionality or imminent failure. For <em>expected<\/em> exceptions, use <code>Informational<\/code> or even <code>Verbose<\/code> so as not to litter your traces with <a href=\"http:\/\/en.wikipedia.org\/wiki\/The_Boy_Who_Cried_Wolf\">&#8220;wolf cry&#8221;<\/a> warnings and errors; these &#8220;ignorable&#8221; errors cause heartburn for others who aren&#8217;t clued in (i.e. basically everyone except the original developer). I rarely see <code>Critical<\/code> in practice, but it could be the right choice for an event after which the app would <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/ms131100(v=vs.110).aspx\">immediately terminate<\/a>.<\/p>\n<p>Is it possible to test your traces? Yes, quite easily in fact when you use <code>EventSource<\/code>. With an appropriately configured <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/system.diagnostics.tracing.eventlistener(v=vs.110).aspx\"><code>EventListener<\/code><\/a> implementation, you too can practice TDD for trace events. This gives you a nice safety net as you evolve your system (e.g. asserting that important traces will continue to be emitted in all tested code paths, despite later refactoring) and also saves you from mistakes in initial event definitions (which are quite easy to make due to a preponderance of copy\/paste errors in such areas). Check out <a href=\"https:\/\/github.com\/brian-dot-net\/writeasync\/blob\/master\/projects\/EventSourceSample\/test\/EventSourceSample.Test.Unit\/ClientEventListener.cs\"><code>ClientEventListener<\/code><\/a> and <a href=\"https:\/\/github.com\/brian-dot-net\/writeasync\/blob\/master\/projects\/EventSourceSample\/test\/EventSourceSample.Test.Unit\/ClientWithActivityTest.cs\"><code>ClientWithActivityTest<\/code><\/a> for a starting point.<\/p>\n<p>Check out the <a href=\"https:\/\/github.com\/brian-dot-net\/writeasync\/commits\/master\/projects\/EventSourceSample\">commit history for EventSourceSample<\/a> to see how I built a complete ETW-instrumented app in steps with full TDD.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Starting in .NET 4.5, managed code developers finally have a simple way to interact with ETW &#8212; System.Diagnostics.Tracing.EventSource. For a quick introduction, check out Vance Morrison&#8217;s inaugural blog post on EventSource and the also very helpful feature specification. While EventSource&hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21,71,41],"tags":[],"class_list":["post-1321","post","type-post","status-publish","format-standard","hentry","category-async","category-diagnostics","category-tdd"],"_links":{"self":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts\/1321","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1321"}],"version-history":[{"count":0,"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts\/1321\/revisions"}],"wp:attachment":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1321"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1321"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1321"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}