{"id":5388,"date":"2018-01-28T13:00:48","date_gmt":"2018-01-28T13:00:48","guid":{"rendered":"http:\/\/writeasync.net\/?p=5388"},"modified":"2018-01-21T21:45:00","modified_gmt":"2018-01-21T21:45:00","slug":"async-holes-ziparchive","status":"publish","type":"post","link":"http:\/\/writeasync.net\/?p=5388","title":{"rendered":"Async holes: ZipArchive"},"content":{"rendered":"<p>From time to time, I encounter unexpected gaps in asynchronous object models in .NET. I&#8217;ve taken to calling these &#8220;async holes&#8221; since they usually present an unpleasant obstacle in the clear path I try to follow while executing the <a href=\"http:\/\/writeasync.net\/?p=211\">TDD + async<\/a> workflow. Today I will share an async hole I fell into recently on <code>ZipArchive<\/code>.<\/p>\n<p><a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/api\/system.io.compression.ziparchive?view=netframework-4.7.1\">System.IO.Compression.ZipArchive<\/a> is test-friendly, for the most part. Given that it only requires a <a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/api\/system.io.stream?view=netframework-4.7.1\">Stream<\/a> to access the underlying archive, you are free to pass, say, a <a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/api\/system.io.memorystream?view=netframework-4.7.1\">MemoryStream<\/a> if you are trying to <a href=\"http:\/\/www.artima.com\/weblogs\/viewpost.jsp?thread=126923\">avoid file system operations<\/a>. However, be aware that even if you construct a <code>ZipArchive<\/code> from a <code>MemoryStream<\/code>, <strong>all of the async operations will use the thread pool<\/strong>. 
To see for yourself, compile and run the <code>ZipUnzipAsync<\/code> method in this code snippet:<\/p>\n<pre class=\"brush: csharp; title: ; notranslate\" title=\"\">\r\nprivate static async Task ZipUnzipAsync()\r\n{\r\n    Encoding encoding = Encoding.UTF8;\r\n    string text = &quot;hello!&quot;;\r\n\r\n    Log(&quot;Zipping '{0}'...&quot;, text);\r\n    byte&#x5B;] output = await ZipAsync(text, encoding);\r\n\r\n    Log(&quot;Unzipping output ({0} bytes)...&quot;, output.Length);\r\n    string result = await UnzipAsync(output, encoding);\r\n\r\n    Log(&quot;Unzipped result: '{0}'&quot;, result);\r\n}\r\n\r\nprivate static void Log(string format, params object&#x5B;] args)\r\n{\r\n    string text = string.Format(CultureInfo.InvariantCulture, format, args);\r\n    Console.WriteLine(&quot;&#x5B;{0}] {1}&quot;, Thread.CurrentThread.ManagedThreadId, text);\r\n}\r\n\r\nprivate static async Task&lt;byte&#x5B;]&gt; ZipAsync(string text, Encoding encoding)\r\n{\r\n    MemoryStream outputZip = new MemoryStream();\r\n    using (ZipArchive zip = new ZipArchive(outputZip, ZipArchiveMode.Create))\r\n    {\r\n        ZipArchiveEntry entry = zip.CreateEntry(&quot;one.txt&quot;);\r\n        using (Stream entryStream = entry.Open())\r\n        {\r\n            byte&#x5B;] bytes = encoding.GetBytes(text);\r\n            await entryStream.WriteAsync(bytes, 0, bytes.Length);\r\n        }\r\n    }\r\n\r\n    return outputZip.ToArray();\r\n}\r\n<\/pre>\n<p>On my system, the output is:<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\n&#x5B;1] Zipping 'hello!'...\r\n&#x5B;3] Unzipping output (120 bytes)...\r\n&#x5B;4] Unzipped result: 'hello!'\r\n<\/pre>\n<p>You can see that we have a thread switch after each call to the inner zip and unzip methods. 
Given that there is only one <code>await<\/code> inside each of those methods, the thread switch must be caused by the <a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/api\/system.io.stream.writeasync?view=netframework-4.7.1\">WriteAsync<\/a> and <a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/api\/system.io.stream.readasync?view=netframework-4.7.1\">ReadAsync<\/a> stream methods.<\/p>\n<p>So, why does this happen? We can get a hint by looking at the <a href=\"https:\/\/github.com\/dotnet\/corefx\/blob\/master\/src\/System.IO.Compression\/src\/System\/IO\/Compression\/ZipArchiveEntry.cs\">.NET CoreFX implementation of ZipArchiveEntry<\/a>. There we see that the incoming stream is wrapped in another stream (or, more accurately, in multiple levels of streams), with <a href=\"https:\/\/github.com\/dotnet\/corefx\/blob\/master\/src\/System.IO.Compression\/src\/System\/IO\/Compression\/ZipCustomStreams.cs\">WrappedStream<\/a> or <a href=\"https:\/\/github.com\/dotnet\/corefx\/blob\/master\/src\/System.IO.Compression\/src\/System\/IO\/Compression\/DeflateZLib\/DeflateStream.cs\">DeflateStream<\/a> at the top for zip or unzip, respectively. As of .NET Framework 4.7.1, neither of these streams implements the async methods, so the &#8220;failsafe&#8221; implementation on the core <code>Stream<\/code> class kicks in and pushes the blocking calls to <code>Read<\/code> or <code>Write<\/code> to the thread pool.<\/p>\n<p>As a side note, .NET Core 2.0 has improved this situation a bit, given that it does have async overrides on <code>DeflateStream<\/code>. Running the same app code shown above on .NET Core 2.0 will thus eliminate the last thread switch on the unzip call (the zip portion still runs on the thread pool, alas). In any case, exercise caution in your unit tests, and make sure to handle the thread switch with an appropriate call to <code>Wait()<\/code> or <code>.Result<\/code>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>From time to time, I encounter unexpected gaps in asynchronous object models in .NET. 
I&#8217;ve taken to calling these &#8220;async holes&#8221; since they usually present an unpleasant obstacle in the clear path I try to follow while executing the TDD&hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21,41],"tags":[],"class_list":["post-5388","post","type-post","status-publish","format-standard","hentry","category-async","category-tdd"],"_links":{"self":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts\/5388","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5388"}],"version-history":[{"count":3,"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts\/5388\/revisions"}],"predecessor-version":[{"id":5390,"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts\/5388\/revisions\/5390"}],"wp:attachment":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5388"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5388"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5388"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}