A real async GetFiles?

I’ve lamented in the past that there is no real async GetFiles. But that’s okay — we’re problem solvers! Perhaps if we could drop down to the core native API, we could fill in a gap like this. Let’s start by figuring out how GetFiles is implemented in .NET. Searching in the .NET Core source code, we eventually find a FileSystemEnumerator class for Win32 which calls NtQueryDirectoryFile.

Ah, but there are some problems. First, since this is an API from the NT native layer, we are not working with the kinds of constructs that we are used to, such as OVERLAPPED structures and the like. (From previous experience, we know that overlapped I/O is actually quite doable in .NET.) Instead, we would have to do some sort of complex APC magic, which is not so straightforward in managed code! Second, it seems this might not even work if we tried, if the wisdom of Stack Overflow is to be believed.

Do we give up hope? Not necessarily! Checking the documentation for the structures returned from NtQueryDirectoryFile (say, FILE_FULL_DIR_INFORMATION), we see that there is another way to get this information. Specifically, we could “create an IRP with major function code IRP_MJ_DIRECTORY_CONTROL and minor function code IRP_MN_QUERY_DIRECTORY.” I guess that clears it up, aside from three questions I would have about that sentence.

An IRP (pronounced “urp”) is an I/O request packet. These are mainly used by drivers to perform requests on the file system or network. IRP_MJ_DIRECTORY_CONTROL is a major function code, which tells a driver what general type of function needs to be performed (think: “class”); in this case it refers to the set of directory handling routines. IRP_MN_QUERY_DIRECTORY is a minor function code, which describes the exact operation requested (think: “method”).
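To make the “class” and “method” analogy a bit more concrete, here is a minimal, portable sketch of a dispatch table keyed by a (major, minor) pair. Everything in it (ToyDriver, the Major and Minor enums, the handler strings) is hypothetical and only models the idea; it is not how a real driver or the I/O manager is written, and the enum values are not the real IRP codes.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for IRP function codes; names and values are illustrative only.
enum class Major : std::uint8_t { DirectoryControl, DeviceControl };
enum class Minor : std::uint8_t { QueryDirectory, NotifyChangeDirectory };

// A toy "driver": the major code picks the family of routines (the "class"),
// and the minor code picks the exact operation within it (the "method").
class ToyDriver
{
public:
    using Handler = std::function<std::string()>;

    void Register(Major major, Minor minor, Handler handler)
    {
        handlers_[std::make_pair(major, minor)] = std::move(handler);
    }

    // Runs the matching handler, or reports "unsupported" if none was registered.
    std::string Dispatch(Major major, Minor minor) const
    {
        auto it = handlers_.find(std::make_pair(major, minor));
        return it == handlers_.end() ? "unsupported" : it->second();
    }

private:
    std::map<std::pair<Major, Minor>, Handler> handlers_;
};
```

Registering a handler for (Major::DirectoryControl, Minor::QueryDirectory) and then dispatching that pair runs the handler; any other pair falls through to "unsupported", which is roughly the position we are in when a user-mode API cannot send the code we want.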

This is all well and good, but how would we even send an IRP from user mode? Unfortunately, there aren’t any ways to send arbitrary requests. We have DeviceIoControl, but that sends IRP_MJ_DEVICE_CONTROL and does not support the directory query command we want. There are also some WMI functions to send IRPs (e.g. the command line "wmic.exe sysdriver ... call ...") but this is only for IRP_MJ_SYSTEM_CONTROL.

It seems that all signs are pointing to “figure out this APC thing.” Before attempting this in managed code, let’s take a look at a possible implementation in native C++. The basic idea will be to use PPL to define a real-ish async “get files” operation that returns a task holding a list (vector) of path names (wstring). I say real-ish because the use of APCs means that we will be forced to stay on the same thread to query the information and handle the async callback. Yes, I realize that means this is hardly worth it, but we’re doing this just to show that we can!

First things first: we need to install the Windows Driver Kit. It contains the necessary headers and libraries to call NT APIs without a lot of hassle. After that, we can create a simple static library project to help isolate all the NT junk from our main application — mixing Windows.h and NT headers doesn’t really work out well. Note that the WDK adds several Visual Studio templates for creating driver projects, but we are not going to use one here, as we are writing user mode code that just happens to use NT APIs. Because of this, we’ll have to add a few project settings to make the compiler happy:

  • C/C++ : Additional Include Directories: add $(FrameworkSdkDir)\Include\$(TargetPlatformVersion)\km\
  • C/C++ : Preprocessor Definitions: add _AMD64_ and make sure you are compiling for the x64 platform

To help visualize the operation, we will create a simple log callback function definition. It will be used by the inner operation later:

// LogCallback.h
#pragma once

namespace nt
{
    typedef void(*LogCallback)(const wchar_t* message);
}

Now we’ll define a detail header/namespace to hold all the gory details:

// detail.h
#pragma once

#include "LogCallback.h"
#include <exception>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>
#include <ppltasks.h>
#include <ntifs.h>

namespace nt
{
    namespace detail
    {
        class NTError : public std::runtime_error
        {
        public:
            NTError(NTSTATUS status)
                : runtime_error(Message(status))
            {
            }

            static std::string Message(NTSTATUS status)
            {
                std::stringstream ss;
                ss << "Error 0x" << std::hex << status;
                return ss.str();
            }
        };

        class DirectoryHandle
        {
        private:
            HANDLE handle_;

        public:
            DirectoryHandle(const std::wstring& path)
                : handle_(Open(path.c_str()))
            { }

            HANDLE get() const
            {
                return handle_;
            }

            ~DirectoryHandle()
            {
                NtClose(handle_);
            }

        private:
            static HANDLE Open(LPCWSTR path)
            {
                UNICODE_STRING name;
                RtlInitUnicodeString(&name, path);
                OBJECT_ATTRIBUTES attributes;
                InitializeObjectAttributes(&attributes, &name, 0, nullptr, nullptr);
                HANDLE handle;
                IO_STATUS_BLOCK statusBlock = { 0 };
                NTSTATUS status = NtOpenFile(
                    &handle,
                    FILE_LIST_DIRECTORY,
                    &attributes,
                    &statusBlock,
                    FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                    FILE_DIRECTORY_FILE);
                if (status != STATUS_SUCCESS)
                {
                    throw NTError(status);
                }

                return handle;
            }

            DirectoryHandle & operator=(const DirectoryHandle& other) = delete;
            DirectoryHandle(const DirectoryHandle& other) = delete;
        };

        class GetFilesOperation
        {
        private:
            LogCallback log_;
            DirectoryHandle handle_;
            std::wstring path_;
            UCHAR buffer_[1024];
            std::vector<std::wstring> files_;
            concurrency::task_completion_event<std::vector<std::wstring>> taskEvent_;

        public:
            GetFilesOperation(LogCallback log, const std::wstring& absolutePath)
                : log_(log),
                handle_(L"\\DosDevices\\" + absolutePath),
                path_(absolutePath),
                buffer_(),
                files_(),
                taskEvent_()
            {
            }

            concurrency::task<std::vector<std::wstring>> RunAsync()
            {
                concurrency::create_task([]() {}).then([this]() { Run(); });
                return concurrency::create_task(taskEvent_);
            }

        private:
            static void NTAPI OnCompleted(
                _In_ PVOID ApcContext,
                _In_ PIO_STATUS_BLOCK /* IoStatusBlock */,
                _In_ ULONG /* Reserved */)
            {
                static_cast<GetFilesOperation*>(ApcContext)->Log(L"<OnCompleted>");
            }

            GetFilesOperation & operator=(const GetFilesOperation& other) = delete;
            GetFilesOperation(const GetFilesOperation& other) = delete;

            void Log(LPCWSTR message) const
            {
                if (log_)
                {
                    log_(message);
                }
            }

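            void Run()
            {
                try
                {
                    bool done;
                    do
                    {
                        done = Next();
                    } while (!done);

                    // Only publish the result if enumeration finished without throwing.
                    taskEvent_.set(files_);
                }
                catch (...)
                {
                    // Forward the original exception (no slicing) to whoever awaits the task.
                    taskEvent_.set_exception(std::current_exception());
                }
            }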

            bool Next()
            {
                return Query() || ReadEntries();
            }

            bool Query()
            {
                RtlZeroMemory(buffer_, sizeof(buffer_));
                IO_STATUS_BLOCK statusBlock = { 0 };
                Log(L"<NtQueryDirectoryFile>");
                NTSTATUS status = NtQueryDirectoryFile(
                    handle_.get(),
                    nullptr,
                    OnCompleted,
                    this,
                    &statusBlock,
                    buffer_,
                    sizeof(buffer_),
                    FileNamesInformation,
                    FALSE,
                    nullptr,
                    FALSE);

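                switch (status)
                {
                case STATUS_PENDING:
                    // The request is in flight. Wait alertably so that the
                    // completion APC (OnCompleted) can run on this thread.
                    Log(L"<ZwWaitForSingleObject>");
                    ZwWaitForSingleObject(NtCurrentThread(), TRUE, nullptr);
                    break;
                default:
                    // Any other status, including a synchronous completion,
                    // is treated as an error in this sample.
                    throw NTError(status);
                }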

                bool done = static_cast<size_t>(statusBlock.Information) == 0;
                if (done)
                {
                    Log(L"< done! >");
                }

                return done;
            }

            bool ReadEntries()
            {
                size_t offset = 0;
                do
                {
                    FILE_NAMES_INFORMATION* fileInfo = reinterpret_cast<FILE_NAMES_INFORMATION*>(&buffer_[offset]);
                    // FileNameLength is in bytes, so convert it to a character count.
                    std::wstring fileName(fileInfo->FileName, fileInfo->FileNameLength / sizeof(WCHAR));
                    std::wstring fullPath(path_ + L"\\" + fileName);
                    files_.push_back(fullPath);

                    ULONG next = fileInfo->NextEntryOffset;
                    if (next == 0)
                    {
                        return false;
                    }

                    offset += next;
                } while (true);
            }
        };
    }
}

There is a lot going on here, but here is the gist:

  • DirectoryHandle holds an open handle to the directory object, retrieved using the native NtOpenFile call. We are not specifying either of the synchronous I/O flags (FILE_SYNCHRONOUS_IO_ALERT or FILE_SYNCHRONOUS_IO_NONALERT; see NtCreateFile), so we expect the underlying calls to complete asynchronously.
  • GetFilesOperation has all the actual query logic. Note that it passes the special “DosDevices” namespace when opening the handle since we’re dealing with NT native API and have to be specific.
  • Before we start enumerating in GetFilesOperation::RunAsync, we have to get off the main thread, which we accomplish by using task::then, which schedules a continuation on the thread pool. This ensures that we can block while waiting for APC callbacks.
  • Now we get to the more interesting GetFilesOperation::Run method which eventually calls into all the real NT stuff. It calls the inner Next method in a loop, taking care to handle exceptions and set the task result.
  • Next is just a thin wrapper around Query (which finally calls the directory query and waits for the APC) and ReadEntries (which interprets the data).
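The entry-walking part of ReadEntries can be tried out away from the NT headers. The sketch below is a portable approximation under a few assumptions: EntryHeader is a hypothetical mirror of the fixed fields of FILE_NAMES_INFORMATION, names are held as char16_t to stand in for 16-bit WCHAR, and AppendEntry exists only to build sample data with 8-byte padding. It also adds bounds checks that the sample above skips.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical mirror of the fixed part of FILE_NAMES_INFORMATION.
// The UTF-16 name follows immediately after these three fields.
struct EntryHeader
{
    std::uint32_t nextEntryOffset; // bytes from this entry to the next; 0 marks the last entry
    std::uint32_t fileIndex;
    std::uint32_t fileNameLength;  // length of the name in bytes, not characters
};

// Test helper: appends one entry, padded to a multiple of 8 bytes.
void AppendEntry(std::vector<unsigned char>& buffer, const std::u16string& name, bool last)
{
    const std::size_t nameBytes = name.size() * sizeof(char16_t);
    const std::size_t paddedSize = (sizeof(EntryHeader) + nameBytes + 7) / 8 * 8;

    EntryHeader header = {};
    header.nextEntryOffset = last ? 0 : static_cast<std::uint32_t>(paddedSize);
    header.fileNameLength = static_cast<std::uint32_t>(nameBytes);

    const std::size_t start = buffer.size();
    buffer.resize(start + paddedSize, 0);
    std::memcpy(buffer.data() + start, &header, sizeof(header));
    std::memcpy(buffer.data() + start + sizeof(header), name.data(), nameBytes);
}

// Follows the nextEntryOffset chain and collects every name in the buffer.
std::vector<std::u16string> ReadNames(const std::vector<unsigned char>& buffer)
{
    std::vector<std::u16string> names;
    std::size_t offset = 0;
    while (offset + sizeof(EntryHeader) <= buffer.size())
    {
        EntryHeader header;
        std::memcpy(&header, buffer.data() + offset, sizeof(header));

        const std::size_t nameStart = offset + sizeof(header);
        if (header.fileNameLength > buffer.size() - nameStart)
        {
            break; // malformed entry: stop rather than read past the end
        }

        std::u16string name(header.fileNameLength / sizeof(char16_t), u'\0');
        std::memcpy(&name[0], buffer.data() + nameStart, name.size() * sizeof(char16_t));
        names.push_back(name);

        if (header.nextEntryOffset == 0)
        {
            break; // last entry in this buffer
        }

        offset += header.nextEntryOffset;
    }

    return names;
}
```

Copying the header out with memcpy instead of casting the buffer pointer avoids alignment and aliasing questions in the portable version; the real code can cast because the kernel lays the entries out with suitable alignment.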

Now, for the “public” API, a wrapper class called AsyncDirectory:

// AsyncDirectory.h
#pragma once

#include "LogCallback.h"
#include <string>
#include <vector>
#include <ppltasks.h>

namespace nt
{
    class AsyncDirectory
    {
    private:
        std::wstring path_;
        LogCallback log_;

    public:
        AsyncDirectory(const std::wstring& absolutePath, LogCallback log = nullptr);

        concurrency::task<std::vector<std::wstring>> GetFilesAsync() const;
    };
}

// AsyncDirectory.cpp
#include "AsyncDirectory.h"
#include "detail.h"
#include <memory>

using namespace concurrency;
using namespace nt;
using namespace nt::detail;
using namespace std;

AsyncDirectory::AsyncDirectory(const std::wstring& absolutePath, LogCallback log)
    : path_(absolutePath),
    log_(log)
{ }

task<vector<wstring>> AsyncDirectory::GetFilesAsync() const
{
    auto op = make_shared<GetFilesOperation>(log_, path_);
    return op->RunAsync().then([op](task<vector<wstring>> t) { return t; });
}

The only semi-interesting thing here is that we have to keep GetFilesOperation alive until we return the final result. This is achieved by using a shared_ptr which is captured in a pass-through continuation at the end, ensuring its lifetime.
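The same ownership trick can be shown without PPL. In this rough sketch, a std::function plays the role of the pass-through continuation, Operation is a hypothetical stand-in for GetFilesOperation, and the weak_ptr is there purely so we can observe when the object is destroyed.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for GetFilesOperation: it just owns a result.
struct Operation
{
    std::vector<std::string> files{ "00", "01" };
};

// Returns a deferred "continuation" that owns the operation through a shared_ptr copy.
// 'observer' lets the caller check whether the operation is still alive.
std::function<std::vector<std::string>()> StartOperation(std::weak_ptr<Operation>& observer)
{
    auto op = std::make_shared<Operation>();
    observer = op;

    // Capturing 'op' by value bumps the reference count, so the operation
    // outlives this function even though the local 'op' goes out of scope.
    return [op]() { return op->files; };
}
```

As long as the returned callable exists, the operation stays alive; once the callable is destroyed or reset, the last reference goes away and the operation is freed. That is the property the extra .then in GetFilesAsync relies on.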

Whew. It was lots of effort, but amazingly, this actually works! Below is a sample app which references the above static lib and reads the files in a test directory I prepared beforehand. To ensure this compiles and links properly, make sure the include path points to the static lib header folder and put ntdll.lib as an additional dependency in the Linker options.

#include "AsyncDirectory.h"
#include <Windows.h>
#include <iostream>

using namespace concurrency;
using namespace nt;
using namespace std;

double Now()
{
    LARGE_INTEGER c;
    QueryPerformanceCounter(&c);
    LARGE_INTEGER f;
    QueryPerformanceFrequency(&f);
    return (double)c.QuadPart / f.QuadPart;
}

int ElapsedMilliseconds()
{
    static double Start = Now();
    return static_cast<int>(1000 * (Now() - Start));
}

void Log(const wchar_t* message)
{
    wcout << L"[" << ElapsedMilliseconds() << L", T=" << GetCurrentThreadId() << L"] " << message << endl;
}

int main()
{
    Log(L"Starting.");
    AsyncDirectory dir(L"G:\\temp\\dir", Log);

    Log(L"Getting files...");
    task<vector<wstring>> task = dir.GetFilesAsync();
    auto files = task.get();
    for (auto it = files.cbegin(); it != files.cend(); ++it)
    {
        Log(it->c_str());
    }

    return 0;
}

The output shows that everything functions basically as we’d expect:

[0, T=13992] Starting.
[0, T=13992] Getting files...
[1, T=19740] <NtQueryDirectoryFile>
[1, T=19740] <ZwWaitForSingleObject>
[2, T=19740] <OnCompleted>
[2, T=19740] <NtQueryDirectoryFile>
[3, T=19740] <ZwWaitForSingleObject>
[4, T=19740] <OnCompleted>
[4, T=19740] <NtQueryDirectoryFile>
[5, T=19740] <ZwWaitForSingleObject>
[5, T=19740] <OnCompleted>
[6, T=19740] < done! >
[6, T=13992] G:\temp\dir\.
[7, T=13992] G:\temp\dir\..
[7, T=13992] G:\temp\dir\00
[8, T=13992] G:\temp\dir\01
[9, T=13992] G:\temp\dir\02
[9, T=13992] G:\temp\dir\03
[10, T=13992] G:\temp\dir\04
 . . . 
[82, T=13992] G:\temp\dir\96
[83, T=13992] G:\temp\dir\97
[84, T=13992] G:\temp\dir\98
[85, T=13992] G:\temp\dir\99

Obviously there are about 1000 caveats here. Not all potential errors are handled. The actual asynchrony of this mechanism is highly debatable — we’re switching threads and using alertable waits, so every operation still burns one thread, just like it would have for the synchronous version. Yet, it does show that an async GetFiles is at least achievable. I have a feeling we can further explore this space and come up with something more useful, however. But more on that at a later time….
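As one small illustration of the error-handling caveat: an NTSTATUS carries its severity in the top two bits, so a more careful query loop could tell “still pending” and “no more files” apart from genuine failures. The sketch below is portable and uses hypothetical names (Status, SeverityOf, Classify, Action); the numeric values are the documented codes for STATUS_SUCCESS, STATUS_PENDING, STATUS_NO_MORE_FILES and STATUS_ACCESS_DENIED, but the policy in Classify is only an assumption about what such a loop might want.

```cpp
#include <cstdint>

// Hypothetical portable alias for NTSTATUS, plus a few well-known codes.
using Status = std::int32_t;
constexpr Status kSuccess      = 0x00000000;
constexpr Status kPending      = 0x00000103;
constexpr Status kNoMoreFiles  = static_cast<Status>(0x80000006);
constexpr Status kAccessDenied = static_cast<Status>(0xC0000022);

// Severity is encoded in bits 31-30: 0 = success, 1 = informational, 2 = warning, 3 = error.
enum class Severity { Success, Informational, Warning, Error };

constexpr Severity SeverityOf(Status status)
{
    return static_cast<Severity>((static_cast<std::uint32_t>(status) >> 30) & 0x3);
}

// An illustrative policy for a directory query loop (not taken from the sample above).
enum class Action { ReadBuffer, WaitForCompletion, Stop, Fail };

constexpr Action Classify(Status status)
{
    return status == kPending ? Action::WaitForCompletion
        : status == kNoMoreFiles ? Action::Stop
        : (SeverityOf(status) == Severity::Warning || SeverityOf(status) == Severity::Error) ? Action::Fail
        : Action::ReadBuffer;
}
```

Note that STATUS_PENDING has success severity and STATUS_NO_MORE_FILES is only a warning, which is why a plain comparison against STATUS_SUCCESS is too blunt for this kind of loop.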

2 thoughts on “A real async GetFiles?”

  1. tobi

    Do IO Completion ports not work for this API? They are normally the mechanism to do async IO. APCs are, as far as I understand it, an obsolete technology that’s a design flaw in the kernel.

    1. Brian Rogers Post author

      Theoretically IOCP will work here, but it requires more investigation on how to do it properly when using the NT native API… I haven’t had time yet. 🙂 I might write a follow up post at a later date, since as you mention, APCs are not a great mechanism — especially if the goal is managed code interop.
