From sync to async: file I/O

In the previous post, I described the basics of switching to async for network calls. Today I will discuss the same for file operations.

Reading and writing

To read and write files asynchronously, look no further than System.IO.FileStream. You must use the constructor with the useAsync parameter set to true. Failing to do so will cause all async operations to be inefficiently serviced via the thread pool instead of using overlapped I/O. Here is a quick example of how to read the last N bytes of one file and write them to a new file:

private static async Task ReadAndWriteLastBytesAsync(string inputFile, string outputFile, int byteCount)
{
    // This is the default buffer size.
    // (see <http://msdn.microsoft.com/en-us/library/47ek66wy(v=vs.110).aspx>)
    const int BufferSize = 4096;
    using (FileStream inputStream = new FileStream(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read, BufferSize, true))
    using (FileStream outputStream = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.Read, BufferSize, true))
    {
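        // Clamp to the file length; seeking before the start of the file throws an IOException.
        int count = (int)Math.Min(byteCount, inputStream.Length);
        byte[] bytes = new byte[count];
        inputStream.Seek(-count, SeekOrigin.End);

        // A single ReadAsync call may return fewer bytes than requested, so loop until done.
        int actualCount = 0;
        int bytesRead;
        while (actualCount < count &&
            (bytesRead = await inputStream.ReadAsync(bytes, actualCount, count - actualCount)) > 0)
        {
            actualCount += bytesRead;
        }

        await outputStream.WriteAsync(bytes, 0, actualCount);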
    }
}
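
As a side note, the same overlapped behavior can also be requested through the FileStream constructor overload that takes a FileOptions value instead of the useAsync flag. The sketch below is only a minimal illustration; the names AsyncStreamFactory and OpenForAsyncRead are hypothetical and not part of the samples in this post.

```csharp
using System.IO;

public static class AsyncStreamFactory
{
    // Hypothetical helper: opens a file for reading with asynchronous (overlapped) I/O requested.
    public static FileStream OpenForAsyncRead(string path)
    {
        const int BufferSize = 4096;

        // FileOptions.Asynchronous plays the same role as passing useAsync: true.
        // SequentialScan is an optional hint for front-to-back reads.
        return new FileStream(
            path,
            FileMode.Open,
            FileAccess.Read,
            FileShare.Read,
            BufferSize,
            FileOptions.Asynchronous | FileOptions.SequentialScan);
    }
}
```

After opening a stream this way, the IsAsync property can be checked to confirm that the stream was opened for asynchronous I/O.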

Unfortunately there aren’t many convenience methods for dealing with async file I/O, but it’s easy enough to write your own. Here’s an example showing a possible asynchronous counterpart for File.ReadAllLines:

private static async Task<string[]> ReadAllLinesAsync(string path, Encoding encoding)
{
    const int BufferSize = 4096;
    List<string> lines = new List<string>();
    using (StreamReader reader = new StreamReader(new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, BufferSize, true), encoding))
    {
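        // Checking reader.EndOfStream can trigger a synchronous read, so rely on
        // ReadLineAsync returning null at the end of the file instead.
        string line;
        while ((line = await reader.ReadLineAsync()) != null)
        {
            lines.Add(line);
        }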
    }

    return lines.ToArray();
}
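
The write side can be handled the same way. Here is a rough sketch of a possible counterpart for File.WriteAllLines, assuming the caller supplies the encoding; the class name AsyncFileHelpers is hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class AsyncFileHelpers
{
    // Hypothetical asynchronous counterpart for File.WriteAllLines.
    public static async Task WriteAllLinesAsync(string path, IEnumerable<string> lines, Encoding encoding)
    {
        const int BufferSize = 4096;
        using (FileStream stream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None, BufferSize, true))
        using (StreamWriter writer = new StreamWriter(stream, encoding))
        {
            foreach (string line in lines)
            {
                await writer.WriteLineAsync(line);
            }

            // Flush buffered text asynchronously so that Dispose has little left to write synchronously.
            await writer.FlushAsync();
        }
    }
}
```

The explicit FlushAsync is a design choice: without it, whatever remains in the writer's buffer is written during Dispose, which is a blocking call.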

Enumerating files/directories

Now for some bad news… there are no asynchronous APIs for traversing files and directories in .NET. It is theoretically possible to make an asynchronous request for a file/directory listing in Windows (e.g. see ZwQueryDirectoryFile), but good luck trying to do it, let alone from managed code.

But not to worry: as it turns out, Directory.EnumerateFiles can do a good enough job here. It defers its work, returning entries as the directory is read rather than after the whole tree has been scanned, although each step of the enumeration is still a blocking call. With a strategically placed Task.Yield it will work fine for most circumstances. Here is a sample method which walks a directory tree and executes an async method for each file:

private static async Task ForEachFileAsync(string path, string searchPattern, SearchOption searchOption, Func<string, Task> doAsync)
{
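    // Avoid blocking the caller for the initial enumerate call.
    // Note: if a SynchronizationContext is present (e.g. on a UI thread), the code after
    // Task.Yield resumes on that same context; wrap the call in Task.Run in that case.
    await Task.Yield();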

    foreach (string file in Directory.EnumerateFiles(path, searchPattern, searchOption))
    {
        await doAsync(file);
    }
}
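
One caveat worth spelling out: await Task.Yield() resumes on the captured SynchronizationContext when there is one, so on a UI thread the blocking enumeration would still run on that thread. A possible variant, sketched below under the assumption that doAsync does not need to run on the caller's context, moves the whole walk to the thread pool with Task.Run. The names FileWalker and ForEachFileOnThreadPoolAsync are hypothetical.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class FileWalker
{
    // Hypothetical variant of ForEachFileAsync for callers that have a SynchronizationContext.
    public static Task ForEachFileOnThreadPoolAsync(string path, string searchPattern, SearchOption searchOption, Func<string, Task> doAsync)
    {
        // Task.Run queues the delegate to the thread pool and unwraps the inner Task,
        // so the blocking enumeration never runs on the calling thread.
        return Task.Run(async () =>
        {
            foreach (string file in Directory.EnumerateFiles(path, searchPattern, searchOption))
            {
                await doAsync(file).ConfigureAwait(false);
            }
        });
    }
}
```

The trade-off is that doAsync now runs on thread pool threads, so it cannot touch UI elements directly.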

And here is a sample use case which recursively searches for all text files in a directory tree and computes the Adler-32 checksum for each file:

private static Task SampleAsync()
{
    return ForEachFileAsync(
        @"Some\Path\Here",
        "*.txt",
        SearchOption.AllDirectories,
        f => ComputeAdler32Async(f));
}

private static async Task ComputeAdler32Async(string file)
{
    const int BufferSize = 4096;
    const int ModAdler = 65521;
    int a = 1;
    int b = 0;
    using (FileStream stream = new FileStream(file, FileMode.Open, FileAccess.Read, FileShare.Read, BufferSize, true))
    {
        byte[] buffer = new byte[BufferSize];
        int bytesRead;
        do
        {
            bytesRead = await stream.ReadAsync(buffer, 0, BufferSize);
            for (int i = 0; i < bytesRead; ++i)
            {
                a = (a + buffer[i]) % ModAdler;
                b = (b + a) % ModAdler;
            }
        }
        while (bytesRead > 0);
    }

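    // Use an unsigned value: b can be as large as 65520, so (b << 16) does not fit in a signed int.
    uint adler32 = ((uint)b << 16) | (uint)a;
    Console.WriteLine("{0}: 0x{1:X8}", file, adler32);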
}
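
To sanity-check the checksum arithmetic without touching the file system, the same update loop can be run over an in-memory buffer. This is only a verification sketch; the class Adler32Check is hypothetical and not part of the sample above.

```csharp
using System.Text;

public static class Adler32Check
{
    // Synchronous, in-memory version of the same update loop, for verification only.
    public static uint Compute(byte[] data)
    {
        const uint ModAdler = 65521;
        uint a = 1;
        uint b = 0;
        foreach (byte value in data)
        {
            a = (a + value) % ModAdler;
            b = (b + a) % ModAdler;
        }

        return (b << 16) | a;
    }

    // Convenience overload for ASCII test strings.
    public static uint Compute(string asciiText)
    {
        return Compute(Encoding.ASCII.GetBytes(asciiText));
    }
}
```

For the ASCII string "Wikipedia" the loop ends with a = 920 (0x0398) and b = 4582 (0x11E6), so Adler32Check.Compute("Wikipedia") returns 0x11E60398.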

Deleting files

More bad news… there is no asynchronous API to delete a file. However, this might do in a pinch: a method that opens the file in truncate mode, flushes it asynchronously, and then deletes it:

private static async Task DeleteFileAsync(string file)
{
    using (FileStream stream = new FileStream(file, FileMode.Truncate, FileAccess.Write, FileShare.Delete, 4096, true))
    {
        await stream.FlushAsync();
        File.Delete(file);
    }
}

Creating/deleting directories

Alas, there are also no asynchronous APIs for creation and deletion of directories. You might consider pushing these operations after an already async step so that at worst you’re only blocking a thread pool thread. For example, consider this code which recursively deletes an entire directory tree; it takes advantage of the fact that the file truncate/flush operations are async and does the delete as a final step:

public static class Recursive
{
    public static async Task DeleteAsync(string path)
    {
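        // Don't block calling thread
        // Note: if a SynchronizationContext is present (e.g. on a UI thread), the code after
        // Task.Yield resumes on that same context; call this method via Task.Run in that case.
        await Task.Yield();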
        await DeleteInnerAsync(path);
    }

    private static async Task DeleteInnerAsync(string path)
    {
        foreach (string file in Directory.EnumerateFiles(path, "*", SearchOption.TopDirectoryOnly))
        {
            await DeleteFileAsync(file);
        }

        foreach (string directory in Directory.EnumerateDirectories(path, "*", SearchOption.TopDirectoryOnly))
        {
            await DeleteInnerAsync(directory);
        }

        Directory.Delete(path);
    }

    private static async Task DeleteFileAsync(string file)
    {
        using (FileStream stream = new FileStream(file, FileMode.Truncate, FileAccess.Write, FileShare.Delete, 4096, true))
        {
            await stream.FlushAsync();
            File.Delete(file);
        }
    }
}
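
Directory creation has no async step to piggyback on, so one fallback is simply to run the blocking call on the thread pool. Note that this is a different technique from the one above: it is not true asynchronous I/O, it only keeps the calling thread free. The names DirectoryAsync and CreateAsync are hypothetical.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class DirectoryAsync
{
    // Hypothetical wrapper: runs the blocking Directory.CreateDirectory call on a thread pool thread.
    public static Task<DirectoryInfo> CreateAsync(string path)
    {
        return Task.Run(() => Directory.CreateDirectory(path));
    }
}
```

A caller would then write something like DirectoryInfo info = await DirectoryAsync.CreateAsync(path); and continue with the async file operations from there.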

Despite the limitation of relatively few truly asynchronous APIs, “good enough” async file I/O is possible and relatively painless in .NET.

3 thoughts on “From sync to async: file I/O”

  1. An *actual* async developer

    When the first thing “writeasync.net” tells me is that it’s “good enough” when enumerating directories to just yield between directories, that’s a sure sign it’s time to move along. As soon as a user navigates to a network directory and the network is offline, your UI freezes while the first call times out. Avoiding that is the entire point of doing it async.

    If you don’t even know the basics, please learn the basics of async I/O yourself first before running a site named “writeasync.net”.

    1. Brian Rogers Post author

      Hello, thanks for the comment. The initial Task.Yield (marked by the comments “// Avoid blocking the caller for the initial enumerate call.” and “Don’t block calling thread”) in my code samples mitigates the negative effect of freezing the UI that you speak of — then you are free to call a potentially long blocking method without degrading the user experience. As we both know, it is never good to execute any I/O operations directly on the UI thread. Let me know if I missed your point.

  2. Pingback: A real async GetFiles? – WriteAsync .NET
