My day-to-day programming environment is C# + Visual Studio + ReSharper. I like to think I’m fairly productive in this setup and, after years of experience, have picked up a few fast and easy ways to refactor my way into design improvements (even in legacy code). This is unfortunately not the case with C++, where a combination of my relative inexperience, the lack of tooling, and fundamental language design decisions makes it harder to do what I want.
Consider Extract Method, one of the first refactorings that I reach for when evolving a codebase. In languages like C# or Java, it is no big deal to move some code from an existing method on a class into a new `private` method. Existing clients and callers of the code, being limited to the public surface area, will never know or care. C++ classes, however, are a different beast than their counterparts in newer managed languages. A C++ class definition must accurately reflect all data members and functions, both `public` and `private`, in order for calling code to use the class successfully. Adding a new `private` function to a C++ class is therefore a significant event, requiring an update of the definition and, in turn, recompilation of all calling code.
The severe implications of this led to some popular dependency management techniques known as “compilation firewalls” AKA the “private implementation” or “pimpl idiom.” Still, there are notable drawbacks (overhead in the form of extra code, heap allocations, and pointer indirection), and thus in practice a large amount of existing code is not pimpl-ized. This leaves us in the typical situation where we have a big ball of mud that we want to organize into smaller, more comprehensible pieces, and yet we are thwarted as each small change requires an update to header files and class definitions that we would rather not make.
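For readers who have not met the idiom, a minimal pimpl sketch might look like the following. The `Counter` class and all of its members are hypothetical names chosen for illustration, and the header and implementation parts are shown in one listing for brevity.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// --- Would live in the header (a hypothetical Counter.h) ---
class Counter {
public:
    Counter();
    ~Counter();  // defined later, where Impl is a complete type
    void Add(int value);
    int Total() const;

private:
    struct Impl;                  // incomplete here: callers never see its members
    std::unique_ptr<Impl> impl_;  // the extra heap allocation and indirection
};

// --- Would live in the implementation file (a hypothetical Counter.cpp) ---
struct Counter::Impl {
    std::vector<int> values;

    // Helpers like this can be added or changed without touching the header.
    int Sum() const {
        int total = 0;
        for (int v : values) total += v;
        return total;
    }
};

Counter::Counter() : impl_(new Impl()) {}
Counter::~Counter() = default;
void Counter::Add(int value) { impl_->values.push_back(value); }
int Counter::Total() const { return impl_->Sum(); }

// Usage sketch: callers only ever touch the public surface.
inline void CounterExample() {
    Counter counter;
    counter.Add(2);
    counter.Add(3);
    assert(counter.Total() == 5);
}
```

Every public call forwards through `impl_`, which is where the extra code, the allocation, and the pointer hop mentioned above come from.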
While mulling over this problem recently, I came up with one solution that minimizes some of the drawbacks above while enabling a simple and mechanical Extract Method procedure. I’ll call it `friend struct This`, for reasons which will soon become obvious. To demonstrate it, I will use a small sample from Day 5 of Advent of Code 2017. We’ll start with the function that returns the answer (invoked from `main` somewhere):
    #include "JumpTable.h"

    // std:: is spelled out because no using-directive is in scope here.
    int Day5A(const std::string& input) {
        JumpTable jumpTable(input);
        return jumpTable.Eval();
    }
Now the header file `JumpTable.h` containing the required definitions:
    #pragma once

    #include <string>
    #include <vector>

    class JumpTable {
    private:
        JumpTable(const JumpTable&) = delete;
        JumpTable& operator=(const JumpTable&) = delete;
        std::vector<int> offsets_;

    public:
        JumpTable(const std::string& input);
        int Eval();
    };
Finally, the actual code from `JumpTable.cpp`:
    #include "JumpTable.h"
    #include <sstream>

    using namespace std;

    JumpTable::JumpTable(const string& input) : offsets_() {
        vector<char> next;
        for (auto it = input.cbegin(); it != input.cend(); ++it) {
            char c(*it);
            switch (c) {
            case '\r':
            case '\n':
                if (!next.empty()) {
                    string str(&next[0], next.size());
                    int i = stoi(str);
                    offsets_.push_back(i);
                    next.clear();
                }
                break;
            default:
                next.push_back(c);
                break;
            }
        }
        if (!next.empty()) {
            string str(&next[0], next.size());
            int i = stoi(str);
            offsets_.push_back(i);
            next.clear();
        }
    }

    int JumpTable::Eval() {
        int steps = 0;
        int i = 0;
        while ((i >= 0) && (i < offsets_.size())) {
            int offset = offsets_[i];
            ++offsets_[i];
            i += offset;
            ++steps;
        }
        return steps;
    }
This wouldn’t be a bad solution, if we could just get rid of that duplication in the line splitting code in the constructor. How do we do this without adding more function definitions to our header file and without going “full pimpl”? Let’s start by adding one simple line to the `JumpTable` class definition:
    //...
    class JumpTable {
    private:
        friend struct This;
    //...
The idea is that we’re going to implement a completely private `struct` (declared above as an incomplete type) which is a `friend`, and thus given full access to the internals of the main class. We are now free to add code to our `This` struct at will without ever changing the header file again! We’ll start by defining `This` in the implementation file `JumpTable.cpp`:
    //...
    struct This {
    };
    //... JumpTable code follows
Okay, that was easy. Now let’s use Extract Method on the duplicated code in the constructor and put it inside a new function in `This`:
    //...
    struct This {
        static void Process(JumpTable* This, vector<char>& next) {
            if (!next.empty()) {
                string str(&next[0], next.size());
                int i = stoi(str);
                This->offsets_.push_back(i);
                next.clear();
            }
        }
    };

    JumpTable::JumpTable(const string& input) : offsets_() {
        vector<char> next;
        for (auto it = input.cbegin(); it != input.cend(); ++it) {
            char c(*it);
            switch (c) {
            case '\r':
            case '\n':
                This::Process(this, next);
                break;
            default:
                next.push_back(c);
                break;
            }
        }
        This::Process(this, next);
    }
    //...
A few things to note:
- `This` (the struct) is just a holder for “extended” private functions of the main class. As such it has no state of its own and should not be instantiated, and will therefore declare only static methods.
- In order for the struct to access the private state, you must pass to it the actual `this` pointer of the main class.
- Since `this` is already reserved, we have to use a different name, hence `This` as the pointer argument.
The result is some slightly unusual syntax but it gets the job done; you can think of it as exposing some of the internals of what a class method actually is with the normally hidden pointer on display.
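Pulling the pieces together, the whole technique can be sketched in a single listing. This is assembled from the snippets above and condensed (a range-based loop replaces the iterator loop and `switch`, and the cast in `Eval` is my own tweak to silence a signed/unsigned comparison warning); in a real project the class definition would stay in `JumpTable.h` and the rest in `JumpTable.cpp`. `Day5AExample` is a hypothetical helper added only to show usage.

```cpp
#include <cassert>
#include <string>
#include <vector>

// --- Header part (JumpTable.h): only the friend declaration is new ---
class JumpTable {
private:
    friend struct This;  // declares This as an incomplete type with full access
    JumpTable(const JumpTable&) = delete;
    JumpTable& operator=(const JumpTable&) = delete;
    std::vector<int> offsets_;

public:
    JumpTable(const std::string& input);
    int Eval();
};

// --- Implementation part (JumpTable.cpp) ---
struct This {
    // Extracted helper: converts the buffered characters to an int,
    // appends it to offsets_, and clears the buffer.
    static void Process(JumpTable* This, std::vector<char>& next) {
        if (!next.empty()) {
            std::string str(&next[0], next.size());
            This->offsets_.push_back(std::stoi(str));
            next.clear();
        }
    }
};

JumpTable::JumpTable(const std::string& input) : offsets_() {
    std::vector<char> next;
    for (char c : input) {
        if (c == '\r' || c == '\n') {
            This::Process(this, next);  // end of line: flush the buffer
        } else {
            next.push_back(c);
        }
    }
    This::Process(this, next);  // flush a final line with no trailing newline
}

int JumpTable::Eval() {
    int steps = 0;
    int i = 0;
    // Assumption: the table is small enough for its size to fit in an int.
    while (i >= 0 && i < static_cast<int>(offsets_.size())) {
        int offset = offsets_[i];
        ++offsets_[i];
        i += offset;
        ++steps;
    }
    return steps;
}

// Usage sketch with the example offsets from the Day 5 puzzle description.
inline void Day5AExample() {
    JumpTable jumpTable("0\n3\n0\n1\n-3\n");
    assert(jumpTable.Eval() == 5);
}
```

One thing to watch: `This` is declared at namespace scope, so if several classes in the same namespace each bring their own `struct This`, the differing definitions would collide under the one-definition rule. A per-class name, or keeping each class in its own namespace, is probably the simplest way around that.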
Just to make sure that this technique doesn’t have any unforeseen overhead, I experimented with Matt Godbolt’s excellent Compiler Explorer to inspect the assembly listings from the code above. I compared it to the “normal” code one would write with `Process` as a direct private method as follows:
    class JumpTable {
    private:
        JumpTable(const JumpTable&) = delete;
        JumpTable& operator=(const JumpTable&) = delete;
        std::vector<int> offsets_;
        void Process(std::vector<char>& next);

    public:
        JumpTable(const std::string& input);
        int Eval();
    };
I somewhat arbitrarily chose x86-64 gcc 7.2 as the compiler and was pleased to note that the assembly output was basically identical. This should not be surprising, given that a private instance method under the hood is no different than a static method with an explicit `this` pointer as its first argument.
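To make that equivalence concrete, here is a small sketch that puts the two forms side by side; pasting something like it into Compiler Explorer is one way to repeat the comparison. `Tally`, `Bump`, and the two `AddVia...` wrappers are hypothetical names, not part of the original code.

```cpp
#include <cassert>
#include <vector>

class Tally {
private:
    friend struct This;
    std::vector<int> values_;

    // Conventional private method: the object pointer is passed implicitly.
    void Bump(int v);

public:
    void AddViaMember(int v);
    void AddViaFriend(int v);
    int Last() const { return values_.back(); }
};

struct This {
    // Same body, but the object pointer is an explicit first argument.
    static void Bump(Tally* This, int v) { This->values_.push_back(v + 1); }
};

void Tally::Bump(int v) { values_.push_back(v + 1); }
void Tally::AddViaMember(int v) { Bump(v); }
void Tally::AddViaFriend(int v) { This::Bump(this, v); }

// Usage sketch: both routes leave the object in the same state.
inline void TallyExample() {
    Tally a, b;
    a.AddViaMember(41);
    b.AddViaFriend(41);
    assert(a.Last() == 42 && b.Last() == 42);
}
```

The only difference visible in the source is where the object pointer appears: implied in `Tally::Bump`, spelled out in `This::Bump`.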
What do you think? Could `friend struct This` be a useful, low-overhead addition to the C++ refactoring toolbox?