A faster TryFormat

Spread the love

As part of my ongoing DHCP adventure, I needed to start thinking about string formatting for some of the core data types like MacAddress. After all, these values are likely to make their way into a trace file eventually and we will need to turn them into characters somehow.

Obviously, ToString is off the table. It will allocate memory which is a no-no in this library. Instead, the .NET Core way is TryFormat. So let’s explore a few ways we could implement this method. We’ll assume the intended output string is in the common MAC address format with bytes in hex delimited by hyphens, e.g. “AA-BB-CC-DD-EE-FF”.

Let’s start by having .NET do the heavy lifting for us. All .NET numeric types support hexadecimal formatting with precision, and conveniently the internal representation of MAC address here is a 64-bit unsigned integer. If we use “X12” on the value, we can get half of what we need. The only gap to fill is with the hyphens, which we can insert into the hex digit sequence in the right places. The trick is doing this without allocating any memory. Here is one such way:

        public bool TryFormat(Span<char> destination, out int charsWritten, ReadOnlySpan<char> format = default)
        {
            if (destination.Length < 17)
            {
                charsWritten = 0;
                return false;
            }

            this.value.TryFormat(destination, out _, "X12");
            for (int i = 5; i >= 1; --i)
            {
                destination[(3 * i) + 1] = destination[1 + (2 * i)];
                destination[(3 * i) + 0] = destination[0 + (2 * i)];
                destination[(3 * i) - 1] = '-';
            }

            charsWritten = 17;
            return true;
        }

We start from the end of the buffer and project the characters ahead in groups of two, placing hyphens before them. The buffer ends up going through these transformations:

AABBCCDDEEFF.....
AABBCCDDEEFF..-FF
AABBCCDDEEF-EE-FF
AABBCCDD-DD-EE-FF
AABBC-CC-DD-EE-FF
AA-BB-CC-DD-EE-FF

The first part of the buffer is already correct so we stop here. Here is a benchmark to measure the speed:

    [SimpleJob(RuntimeMoniker.NetCoreApp31)]
    [MemoryDiagnoser]
    public class StringBenchmarks
    {
        private char[] buffer;

        [GlobalSetup]
        public void Setup()
        {
            this.buffer = new char[32];
        }

        [Benchmark]
        public int Mac()
        {
            new MacAddress(0xFEDCBA987654).TryFormat(new Span<char>(this.buffer), out int c);
            return c;
        }
    }

The benchmark results:

| Method |     Mean |    Error |   StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------- |---------:|---------:|---------:|------:|------:|------:|----------:|
|    Mac | 80.90 ns | 1.184 ns | 0.925 ns |     - |     - |     - |         - |

It’s a nice little algorithm, and even better, it results in no memory allocations. But given how complex the .NET formatting scenarios are, I get the feeling we could make this faster. (That’s right, 80 ns is still too slow!)

Instead of this hyphen insertion business, let’s just painstakingly format each byte and put the hyphens inline:

        public bool TryFormat(Span<char> destination, out int charsWritten, ReadOnlySpan<char> format = default)
        {
            if (destination.Length < 17)
            {
                charsWritten = 0;
                return false;
            }

            WriteHexByte(destination, 0, this.value >> 40);
            destination[2] = '-';
            WriteHexByte(destination, 3, this.value >> 32);
            destination[5] = '-';
            WriteHexByte(destination, 6, this.value >> 24);
            destination[8] = '-';
            WriteHexByte(destination, 9, this.value >> 16);
            destination[11] = '-';
            WriteHexByte(destination, 12, this.value >> 8);
            destination[14] = '-';
            WriteHexByte(destination, 15, this.value & 0xFF);
            charsWritten = 17;
            return true;
        }

        private static void WriteHexByte(Span<char> destination, int start, ulong v)
        {
            byte b = (byte)v;
            b.TryFormat(destination.Slice(start), out _, "X2");
        }

Sadly, the benchmark results are far worse with this implementation:

| Method |     Mean |   Error |  StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------- |---------:|--------:|--------:|------:|------:|------:|----------:|
|    Mac | 236.3 ns | 1.16 ns | 1.08 ns |     - |     - |     - |         - |

We’re looking at a 3X slowdown here. It makes sense that adding multiple calls to TryFormat instead of just the one comes at a higher overall cost. Let’s just go all the way and rewrite WriteHexByte to do the formatting completely on our own:

        private static void WriteHexByte(Span<char> destination, int start, ulong v)
        {
            byte b = (byte)v;
            destination[start] = HexDigit((b >> 4) & 0xF);
            destination[start + 1] = HexDigit(b & 0xF);
        }

        private static char HexDigit(int d)
        {
            switch (d)
            {
                case 0x0:
                case 0x1:
                case 0x2:
                case 0x3:
                case 0x4:
                case 0x5:
                case 0x6:
                case 0x7:
                case 0x8:
                case 0x9:
                    return (char)(d + '0');
                default:
                    return (char)(d - 0xA + 'A');
            }
        }

How did we do?

| Method |     Mean |    Error |   StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------- |---------:|---------:|---------:|------:|------:|------:|----------:|
|    Mac | 34.34 ns | 0.340 ns | 0.301 ns |     - |     - |     - |         - |

Success! We have a fast zero-allocation TryFormat method. Considering its flexibility, the .NET implementation is quite fast. But there is often room for improvement when you are allowed to sacrifice generality.

Leave a Reply

Your email address will not be published. Required fields are marked *