{"id":5380,"date":"2018-01-21T13:00:59","date_gmt":"2018-01-21T13:00:59","guid":{"rendered":"http:\/\/writeasync.net\/?p=5380"},"modified":"2018-01-14T03:49:03","modified_gmt":"2018-01-14T03:49:03","slug":"refactoring-c-extract-method-using-friend-struct-this","status":"publish","type":"post","link":"http:\/\/writeasync.net\/?p=5380","title":{"rendered":"Refactoring C++: Extract Method using `friend struct This`"},"content":{"rendered":"<p>My day to day programming environment is <a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/csharp\/quick-starts\/hello-world\">C#<\/a> + <a href=\"https:\/\/www.visualstudio.com\/\">Visual Studio<\/a> + <a href=\"https:\/\/www.jetbrains.com\/resharper\/\">ReSharper<\/a>. I like to think I&#8217;m fairly productive in this setup and after years of experience have picked up a few fast and easy ways to refactor my way into design improvements (even in <a href=\"http:\/\/blog.thecodewhisperer.com\/permalink\/rescuing-legacy-code-by-extracting-pure-functions\">legacy code<\/a>). This is unfortunately not the case with C++ where, due to a combination of my relative inexperience, the lack of tooling, and fundamental design decisions, it is just harder to do what I want.<\/p>\n<p>Consider <a href=\"https:\/\/refactoring.com\/catalog\/extractMethod.html\">Extract Method<\/a>, one of the first refactorings that I reach for when evolving a codebase. In languages like C# or Java, it is no big deal to move some code from an existing method on a class into a new <code>private<\/code> method. Existing clients and callers of the code, being limited to the public surface area, will never know or care. C++ classes, however, are a different beast than their counterparts in newer managed languages. A C++ class definition must accurately reflect <em>all<\/em> data members and functions &#8212; <code>public<\/code> and <code>private<\/code> &#8212; in order for calling code to use the class successfully. 
Adding a new <code>private<\/code> function to a C++ class is therefore a significant event, requiring an update of the definition and, in turn, recompilation of all calling code.<\/p>\n<p>The severe implications of this led to some popular dependency management techniques known as &#8220;<a href=\"https:\/\/herbsutter.com\/gotw\/_100\/\">compilation firewalls<\/a>&#8221; AKA the &#8220;private implementation&#8221; or &#8220;<a href=\"http:\/\/wiki.c2.com\/?PimplIdiom\">pimpl idiom<\/a>.&#8221; Still, there are notable drawbacks (overhead in the form of extra code, heap allocations, and pointer indirection), and thus in practice a large amount of existing code is not pimpl-ized. This leaves us in the typical situation where we have a <a href=\"http:\/\/laputan.org\/mud\/\">big ball of mud<\/a> that we want to organize into smaller, more comprehensible pieces, and yet we are thwarted as each small change requires an update to header files and class definitions that we would rather not make.<\/p>\n<p>While mulling over this problem recently, I came up with one solution that minimizes some of the drawbacks above while enabling a simple and mechanical Extract Method procedure &#8212; I&#8217;ll call it <code>friend struct This<\/code> for reasons which will soon become obvious. To demonstrate it, I will use a small sample from <a href=\"http:\/\/adventofcode.com\/2017\/day\/5\">Day 5 of Advent of Code 2017<\/a>. 
We&#8217;ll start with the function that returns the answer (invoked from <code>main<\/code> somewhere):<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n#include &quot;JumpTable.h&quot;\r\n\r\nint Day5A(const string&amp; input)\r\n{\r\n    JumpTable jumpTable(input);\r\n    return jumpTable.Eval();\r\n}\r\n<\/pre>\n<p>Now the header file <code>JumpTable.h<\/code> containing the required definitions:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n#pragma once\r\n\r\n#include &lt;string&gt;\r\n#include &lt;vector&gt;\r\n\r\nclass JumpTable\r\n{\r\nprivate:\r\n    JumpTable(const JumpTable&amp;) = delete;\r\n    JumpTable&amp; operator=(const JumpTable&amp;) = delete;\r\n    std::vector&lt;int&gt; offsets_;\r\npublic:\r\n    JumpTable(const std::string&amp; input);\r\n    int Eval();\r\n};\r\n<\/pre>\n<p>Finally, the actual code from <code>JumpTable.cpp<\/code>:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n#include &quot;JumpTable.h&quot;\r\n#include &lt;sstream&gt;\r\n\r\nusing namespace std;\r\n\r\nJumpTable::JumpTable(const string&amp; input)\r\n    : offsets_()\r\n{\r\n    vector&lt;char&gt; next;\r\n    for (auto it = input.cbegin(); it != input.cend(); ++it)\r\n    {\r\n        char c(*it);\r\n        switch (c)\r\n        {\r\n        case '\\r':\r\n        case '\\n':\r\n            if (!next.empty())\r\n            {\r\n                string str(&amp;next&#x5B;0], next.size());\r\n                int i = stoi(str);\r\n                offsets_.push_back(i);\r\n                next.clear();\r\n            }\r\n\r\n            break;\r\n\r\n        default:\r\n            next.push_back(c);\r\n            break;\r\n        }\r\n    }\r\n\r\n    if (!next.empty())\r\n    {\r\n        string str(&amp;next&#x5B;0], next.size());\r\n        int i = stoi(str);\r\n        offsets_.push_back(i);\r\n        next.clear();\r\n    }\r\n}\r\n\r\nint JumpTable::Eval()\r\n{\r\n    int steps = 0;\r\n    int i 
= 0;\r\n    while ((i &gt;= 0) &amp;&amp; (i &lt; offsets_.size()))\r\n    {\r\n        int offset = offsets_&#x5B;i];\r\n        ++offsets_&#x5B;i];\r\n        i += offset;\r\n        ++steps;\r\n    }\r\n\r\n    return steps;\r\n}\r\n<\/pre>\n<p>This wouldn&#8217;t be a bad solution if we could just get rid of the duplication in the line-splitting code in the constructor. How do we do this without adding more function declarations to our header file <em>and<\/em> without going &#8220;full pimpl&#8221;? Let&#8217;s start by adding one simple line to the <code>JumpTable<\/code> class definition:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n\/\/...\r\nclass JumpTable\r\n{\r\nprivate:\r\n    friend struct This;\r\n\/\/...\r\n<\/pre>\n<p>The idea is that we&#8217;re going to implement a completely private <code>struct<\/code> &#8212; declared above as an <a href=\"https:\/\/docs.microsoft.com\/en-us\/cpp\/c-language\/incomplete-types\">incomplete type<\/a> &#8212; which is a <a href=\"https:\/\/docs.microsoft.com\/en-us\/cpp\/cpp\/friend-cpp\"><code>friend<\/code><\/a>, and thus given full access to the internals of the main class. We are now free to add code to our <code>This<\/code> struct at will without ever changing the header file again! We&#8217;ll start by defining <code>This<\/code> in the implementation file <code>JumpTable.cpp<\/code>:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n\/\/...\r\nstruct This\r\n{\r\n};\r\n\/\/... JumpTable code follows\r\n<\/pre>\n<p>Okay, that was easy. 
Now let&#8217;s use Extract Method on the duplicated code in the constructor and put it inside a new function in <code>This<\/code>:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n\/\/...\r\nstruct This\r\n{\r\n    static void Process(JumpTable* This, vector&lt;char&gt;&amp; next)\r\n    {\r\n        if (!next.empty())\r\n        {\r\n            string str(&amp;next&#x5B;0], next.size());\r\n            int i = stoi(str);\r\n            This-&gt;offsets_.push_back(i);\r\n            next.clear();\r\n        }\r\n    }\r\n};\r\n\r\nJumpTable::JumpTable(const string&amp; input)\r\n    : offsets_()\r\n{\r\n    vector&lt;char&gt; next;\r\n\r\n    for (auto it = input.cbegin(); it != input.cend(); ++it)\r\n    {\r\n        char c(*it);\r\n        switch (c)\r\n        {\r\n        case '\\r':\r\n        case '\\n':\r\n            This::Process(this, next);\r\n            break;\r\n\r\n        default:\r\n            next.push_back(c);\r\n            break;\r\n        }\r\n    }\r\n\r\n    This::Process(this, next);\r\n}\r\n\/\/...\r\n<\/pre>\n<p>A few things to note:<\/p>\n<ol>\n<li><code>This<\/code> (the struct) is just a holder for &#8220;extended&#8221; private functions of the main class. 
As such it has no state of its own and should not be instantiated, and will therefore declare only static methods.<\/li>\n<li>In order for the struct to access the private state, you must pass to it the actual <code>this<\/code> pointer of the main class.<\/li>\n<li>Since <code>this<\/code> is already reserved, we have to use a different name &#8212; hence <code>This<\/code> as the pointer argument.<\/li>\n<\/ol>\n<p>The result is some slightly unusual syntax but it gets the job done; you can think of it as exposing some of the internals of what a class method actually is with the normally hidden pointer on display.<\/p>\n<p>Just to make sure that this technique doesn&#8217;t have any unforeseen overhead, I experimented with <a href=\"https:\/\/xania.org\/MattGodbolt\">Matt Godbolt<\/a>&#8217;s excellent <a href=\"https:\/\/godbolt.org\/\">Compiler Explorer<\/a> to inspect the assembly listings from the code above. I compared it to the &#8220;normal&#8221; code one would write with <code>Process<\/code> as a direct private method as follows:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nclass JumpTable\r\n{\r\nprivate:\r\n    JumpTable(const JumpTable&amp;) = delete;\r\n    JumpTable&amp; operator=(const JumpTable&amp;) = delete;\r\n    std::vector&lt;int&gt; offsets_;\r\n    void Process(std::vector&lt;char&gt;&amp; next);\r\n\r\npublic:\r\n    JumpTable(const std::string&amp; input);\r\n    int Eval();\r\n};\r\n<\/pre>\n<p>I somewhat arbitrarily chose <strong>x86-64 gcc 7.2<\/strong> as the compiler and was pleased to note that the <strong>assembly output was basically identical<\/strong>. This should not be surprising, given that a private instance method under the hood is no different than a static method with an explicit <code>this<\/code> pointer as its first argument.<\/p>\n<p>What do you think? 
Could <code>friend struct This<\/code> be a useful, low-overhead addition to the C++ refactoring toolbox?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>My day to day programming environment is C# + Visual Studio + ReSharper. I like to think I&#8217;m fairly productive in this setup and after years of experience have picked up a few fast and easy ways to refactor my&hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[91,101],"tags":[],"class_list":["post-5380","post","type-post","status-publish","format-standard","hentry","category-design","category-native"],"_links":{"self":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts\/5380","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5380"}],"version-history":[{"count":6,"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts\/5380\/revisions"}],"predecessor-version":[{"id":5386,"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts\/5380\/revisions\/5386"}],"wp:attachment":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5380"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5380"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5380"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}