We’ve been talking about high performance datagram scenarios but have so far used a contrived sample of sending mostly fixed size strings — not so realistic. That will change today, as we turn to one of the more famous datagram scenarios, the Dynamic Host Configuration Protocol (DHCP).
There are perhaps hundreds of .NET DHCP server implementations out there. Why build another one? There are many potential justifications (practice, learning, arrogance) but one actually good reason: performance!
While most .NET code in the wild is built for convenience, this implementation will be heavily biased towards performance, even if the resulting API is not quite as straightforward. To illustrate, let’s walk through the initial code for dealing with the DHCP header. By header, I mean all the parts of the message other than the trailing variable length options (we’ll deal with that pain later):
| Field | Size (bytes) |
|------- |-------------:|
| op | 1 |
| htype | 1 |
| hlen | 1 |
| hops | 1 |
| xid | 4 |
| secs | 2 |
| flags | 2 |
| ciaddr | 4 |
| yiaddr | 4 |
| siaddr | 4 |
| giaddr | 4 |
| chaddr | 16 |
| sname | 64 |
| file | 128 |
| options | variable |
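To make the layout concrete, here is a minimal sketch of the byte offsets that follow from summing the field sizes above, plus two big-endian reads. The `DhcpHeaderLayout` class and its member names are hypothetical, invented for illustration; they are not taken from the library discussed in this post.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper (not part of the post's library): byte offsets of the
// fixed DHCP header fields, obtained by summing the sizes in the layout above.
public static class DhcpHeaderLayout
{
    public const int Op = 0;        // 1 byte
    public const int HType = 1;     // 1 byte
    public const int HLen = 2;      // 1 byte
    public const int Hops = 3;      // 1 byte
    public const int Xid = 4;       // 4 bytes
    public const int Secs = 8;      // 2 bytes
    public const int Flags = 10;    // 2 bytes
    public const int CiAddr = 12;   // 4 bytes
    public const int YiAddr = 16;   // 4 bytes
    public const int SiAddr = 20;   // 4 bytes
    public const int GiAddr = 24;   // 4 bytes
    public const int ChAddr = 28;   // 16 bytes
    public const int SName = 44;    // 64 bytes
    public const int File = 108;    // 128 bytes
    public const int Options = 236; // variable length, starts after the fixed part

    // Multi-byte fields travel in network byte order (big-endian).
    public static uint ReadXid(ReadOnlySpan<byte> message) =>
        BinaryPrimitives.ReadUInt32BigEndian(message.Slice(Xid, 4));

    public static ushort ReadSecs(ReadOnlySpan<byte> message) =>
        BinaryPrimitives.ReadUInt16BigEndian(message.Slice(Secs, 2));
}
```

Reading a field this way touches only the caller's bytes; nothing is copied or allocated.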
Standard practice in .NET would be to define a class with all the fields above. The obvious choices would be String for sname (server host name) and file (boot file name), IPAddress for the *iaddr fields, and perhaps a raw byte array for chaddr (client hardware address). But this is already unthinkable from a performance perspective: we’re talking about heap allocation galore. We would need a heap object for the message header class itself, two strings, four IP addresses (yes, a four-byte IPv4 address is actually backed by a reference type!), and the hardware address byte array. Heap allocation is of course very convenient, typically pretty fast, and done without much fanfare in .NET’s garbage-collected runtime. But if we want the highest performance, we must think a bit differently.
First, consider that a DHCP message does not come from just anywhere. Rather, you would generally be reading it from a datagram socket. A socket already requires that you give it a buffer where it will place the received data. The same is generally true even for a constructed DHCP response message, since you are almost certainly going to send it via a socket eventually. With that in mind, the core concept here need not be a DHCP message but a DHCP message buffer. This buffer can still provide quick access to common header properties but should be parsimonious when it comes to the larger and variable-length fields. Since the header is relatively large (10+ fields), it doesn’t make sense to use a struct. That being the case, we should presume that this buffer is instantiated only once, before the first request, and reused for every message after that. Here is the implementation so far: DhcpMessageBuffer.cs
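As a rough illustration of the "allocate once, reuse for every datagram" idea, here is a small loopback sketch. Everything in it is an assumption made for demonstration: the `ReusedBufferDemo` name, the 1500-byte buffer size, and the use of the blocking `Socket.ReceiveFrom` call. It is not how the library in this post wires up its sockets, and `ReceiveFrom` may still allocate internally; the point is only that the receive buffer itself is created a single time.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Hypothetical demo: one receive buffer shared by every datagram.
public static class ReusedBufferDemo
{
    // Sends 'datagramCount' datagrams over loopback and returns the received lengths.
    public static int[] Run(int datagramCount)
    {
        using var receiver = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
        receiver.Bind(new IPEndPoint(IPAddress.Loopback, 0)); // port 0: let the OS pick a free port
        receiver.ReceiveTimeout = 2000;                       // fail instead of hanging forever
        EndPoint target = receiver.LocalEndPoint
            ?? throw new InvalidOperationException("Receiver socket is not bound.");

        using var sender = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);

        byte[] buffer = new byte[1500]; // allocated once (size is an arbitrary assumption)
        EndPoint remote = new IPEndPoint(IPAddress.Any, 0);
        var lengths = new int[datagramCount];

        for (int i = 0; i < datagramCount; i++)
        {
            // The sender allocates here only to keep the demo short.
            sender.SendTo(new byte[240 + i], target);

            // The socket writes into the same buffer every time; only the
            // returned length says how many bytes belong to this message.
            lengths[i] = receiver.ReceiveFrom(buffer, ref remote);
        }

        return lengths;
    }
}
```

A message buffer object wrapped around `buffer` would then be told the received length on each iteration rather than being recreated.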
Basically the DhcpMessageBuffer is just a wrapper around a Memory<byte>. Why Memory and not just a byte array? Because we want to give the user the choice of the underlying buffer implementation. Use a byte array if you want, or go ahead and slice bytes out of a huge shared pool — it’s up to you. To avoid string allocations, the ServerHostName and BootFileName fields are represented as computed Span<byte> properties, to be used just-in-time when the caller needs these values. The IP address fields use a custom struct (see IPAddressV4.cs) backed by an unsigned 32-bit int. The ClientHardwareAddress is also a computed Span, but a convenience struct (see MacAddress.cs) is provided for the usual case of 6-byte MAC addresses.
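To show the shape of this design without reproducing the real files, here is a stripped-down sketch. `MessageBufferSketch` and `Ipv4Sketch` are hypothetical stand-ins for DhcpMessageBuffer and IPAddressV4. One deliberate simplification: the sketch reads and writes the underlying bytes directly on every property access, whereas the real class exposes explicit Load and Save steps, so treat this as an illustration of the idea rather than the actual implementation.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical stand-in for IPAddressV4: a value type wrapping a single uint,
// so creating one never touches the heap.
public readonly struct Ipv4Sketch
{
    public Ipv4Sketch(uint value) => this.Value = value;

    public Ipv4Sketch(byte a, byte b, byte c, byte d)
        : this(((uint)a << 24) | ((uint)b << 16) | ((uint)c << 8) | d)
    {
    }

    // First octet lives in the most significant byte.
    public uint Value { get; }
}

// Hypothetical stand-in for DhcpMessageBuffer: a thin view over caller-owned memory.
public sealed class MessageBufferSketch
{
    private const int FixedHeaderLength = 236;
    private readonly Memory<byte> memory;

    public MessageBufferSketch(Memory<byte> memory)
    {
        if (memory.Length < FixedHeaderLength)
        {
            throw new ArgumentException("Need at least 236 bytes for the fixed header.", nameof(memory));
        }

        this.memory = memory;
    }

    // Computed on demand: slices of the same memory, no copies and no strings.
    public Span<byte> ClientHardwareAddress => this.memory.Span.Slice(28, 16);
    public Span<byte> ServerHostName => this.memory.Span.Slice(44, 64);
    public Span<byte> BootFileName => this.memory.Span.Slice(108, 128);

    // yiaddr sits at offset 16 and is stored in network byte order.
    public Ipv4Sketch YourIPAddress
    {
        get => new Ipv4Sketch(BinaryPrimitives.ReadUInt32BigEndian(this.memory.Span.Slice(16, 4)));
        set => BinaryPrimitives.WriteUInt32BigEndian(this.memory.Span.Slice(16, 4), value.Value);
    }
}
```

Because the constructor takes a `Memory<byte>`, the caller can pass a plain `byte[]` (it converts implicitly) or a slice of a larger pooled block.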
This unit test excerpt provides a good overview of how the header can be read and written in practice:
    public void SaveAndLoad()
    {
        byte[] raw = new byte[500];

        // Populate the header fields through the first buffer.
        DhcpMessageBuffer output = new DhcpMessageBuffer(new Memory<byte>(raw))
        {
            Opcode = DhcpOpcode.Reply,
            HardwareAddressType = DhcpHardwareAddressType.Ethernet10Mb,
            HardwareAddressLength = 6,
            Hops = 1,
            TransactionId = 0x12345678,
            Seconds = 34,
            Flags = DhcpFlags.Broadcast,
            ClientIPAddress = new IPAddressV4(1, 2, 3, 4),
            YourIPAddress = new IPAddressV4(5, 6, 7, 8),
            ServerIPAddress = new IPAddressV4(9, 10, 11, 12),
            GatewayIPAddress = new IPAddressV4(13, 14, 15, 16),
            MagicCookie = MagicCookie.Dhcp,
        };
        output.ClientHardwareAddress[0] = 0xAA;
        output.ServerHostName[0] = (byte)'S';
        output.BootFileName[0] = (byte)'B';

        // A second buffer over the same bytes reads back what the first one saved.
        DhcpMessageBuffer buffer = new DhcpMessageBuffer(new Memory<byte>(raw));

        output.Save();
        buffer.Load(500);

        buffer.Opcode.Should().Be(DhcpOpcode.Reply);
        buffer.HardwareAddressType.Should().Be(DhcpHardwareAddressType.Ethernet10Mb);
        buffer.HardwareAddressLength.Should().Be(6);
        buffer.Hops.Should().Be(1);
        buffer.TransactionId.Should().Be(0x12345678);
        buffer.Seconds.Should().Be(34);
        buffer.Flags.Should().Be(DhcpFlags.Broadcast);
        buffer.ClientIPAddress.Should().Be(new IPAddressV4(1, 2, 3, 4));
        buffer.YourIPAddress.Should().Be(new IPAddressV4(5, 6, 7, 8));
        buffer.ServerIPAddress.Should().Be(new IPAddressV4(9, 10, 11, 12));
        buffer.GatewayIPAddress.Should().Be(new IPAddressV4(13, 14, 15, 16));
        new MacAddress(buffer.ClientHardwareAddress).Should().Be(new MacAddress(0xAA, 0x00, 0x00, 0x00, 0x00, 0x00));
        buffer.MagicCookie.Should().Be(MagicCookie.Dhcp);
    }
We keep talking about performance but it is just an empty slogan if we don’t have measurements. As such, the library includes a BenchmarkDotNet project to time some of the important operations:
| Method |     Mean |    Error |   StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------- |---------:|---------:|---------:|------:|------:|------:|----------:|
|   Load | 65.51 ns | 0.356 ns | 0.297 ns |     - |     - |     - |         - |
|   Save | 71.46 ns | 0.290 ns | 0.242 ns |     - |     - |     - |         - |
Under 100 ns for load and save, and zero allocations! This is a good start, but we have yet to tackle the most complicated parsing challenges. Stay tuned…
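For a quick sanity check of an allocation claim outside a full benchmark run, one option is to compare `GC.GetAllocatedBytesForCurrentThread()` before and after a loop. This is a different and much cruder technique than BenchmarkDotNet (it measures no timings and does no statistical analysis), and the `AllocationCheck` helper below is hypothetical, written only for this illustration.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper: counts bytes allocated on the current thread
// while an action runs repeatedly.
public static class AllocationCheck
{
    // Results are written here so the measured work has an observable effect.
    private static uint sink;

    public static long Measure(Action action, int iterations)
    {
        action(); // warm up so one-time initialization is not counted

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }

        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Span-based read of a 4-byte big-endian field at offset 4 (xid in the
    // layout above); slicing an existing array creates no heap objects.
    public static long MeasureSpanParse(byte[] raw, int iterations) =>
        Measure(() => sink = BinaryPrimitives.ReadUInt32BigEndian(raw.AsSpan(4, 4)), iterations);
}
```

Calling `AllocationCheck.MeasureSpanParse(new byte[240], 1000)` and comparing the result with a deliberately allocating action gives a feel for the difference, though the exact numbers depend on the runtime.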