{"id":5894,"date":"2024-04-08T07:00:26","date_gmt":"2024-04-08T14:00:26","guid":{"rendered":"http:\/\/writeasync.net\/?p=5894"},"modified":"2024-04-08T07:07:06","modified_gmt":"2024-04-08T14:07:06","slug":"di-tricks-in-c-ref-or-ptr","status":"publish","type":"post","link":"http:\/\/writeasync.net\/?p=5894","title":{"rendered":"DI tricks in C++: ref or ptr?"},"content":{"rendered":"<p>Dependency injection, AKA &#8220;DI&#8221;, <a href=\"https:\/\/blog.ploeh.dk\/2017\/01\/27\/dependency-injection-is-passing-an-argument\/\">AKA &#8220;passing arguments&#8221;<\/a>, is commonly used in modern software design. You&#8217;ll see this approach quite often in <a href=\"https:\/\/learn.microsoft.com\/en-us\/aspnet\/core\/fundamentals\/dependency-injection?view=aspnetcore-8.0\">C#<\/a> and <a href=\"https:\/\/docs.spring.io\/spring-framework\/reference\/core\/beans\/dependencies\/factory-collaborators.html\">Java<\/a> applications, mainly because these languages have specific support for interfaces and <a href=\"https:\/\/softwareengineering.stackexchange.com\/questions\/439977\/di-injecting-interfaces-vs-actual-classes\">interfaces are a big part of DI in practice<\/a>. C++, however, supports bonafide <a href=\"https:\/\/isocpp.org\/wiki\/faq\/multiple-inheritance\">multiple inheritance<\/a>; you could say <a href=\"https:\/\/stackoverflow.com\/questions\/318064\/how-do-you-declare-an-interface-in-c\">C++ has &#8220;interfaces,&#8221; but only by convention<\/a>. Here is a C++ &#8220;interface&#8221; type:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\ntemplate &lt;typename T&gt;\r\nclass ISerializer\r\n{\r\npublic:\r\n    virtual void serialize(const T&amp; value, std::ostream&amp; out) = 0;\r\n\r\n    virtual ~ISerializer() = default;\r\n};\r\n<\/pre>\n<p>This is a pure abstract base class with a virtual destructor. Voila, it&#8217;s an interface type! 
This one represents a way to serialize a given data type <code>T<\/code> to an output stream.<\/p>\n<p>If we want to map from DI in C#\/Java to DI in C++, we need two things: one of those interface-like classes we just saw and the knowledge to use it correctly. On the second point, we need to be aware that we have to use <a href=\"https:\/\/isocpp.org\/wiki\/faq\/value-vs-ref-semantics\">reference semantics<\/a> to get proper virtual behavior. That means that the following pattern is almost always wrong in a C++ program:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n\/\/ WRONG WRONG WRONG\r\nISerializer&lt;MyData&gt; ser = get_data_serializer();\r\nser.serialize(MyData{}, std::cout);\r\n<\/pre>\n<p>The problem is subtle: the <code>ISerializer<\/code> type is returned by value here from the <code>get_data_serializer<\/code> function. With a pure abstract class like ours, the compiler rejects this outright, because an abstract type cannot be instantiated. With a concrete base class, it compiles but results in <a href=\"https:\/\/localcoder.net\/understanding-and-avoiding-object-slicing-in-c\">object slicing<\/a> (as in, we are &#8220;slicing off&#8221; the derived class information and leaving behind only the base class). Either way, it will thwart your attempts to achieve real polymorphism, which is usually what you need when <a href=\"https:\/\/stackoverflow.com\/questions\/383947\/what-does-it-mean-to-program-to-an-interface\">programming against interfaces<\/a>.<\/p>\n<p>One common fix is to use a smart pointer, say, <code>std::unique_ptr<\/code>, to wrap our interface. 
This technique is demonstrated in <a href=\"https:\/\/vladris.com\/\">Vlad Ri\u0219cu\u021bia<\/a>&#8216;s helpful article, &#8220;<a href=\"https:\/\/vladris.com\/blog\/2016\/07\/06\/dependency-injection-in-c.html\">Dependency Injection in C++<\/a>.&#8221; The bad example above can be thus rewritten correctly as follows:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n\/\/ OK; the smart pointer gives us the ref semantics we need\r\nstd::unique_ptr&lt;ISerializer&lt;MyData&gt;&gt; ser = get_data_serializer();\r\nser-&gt;serialize(MyData{}, std::cout);\r\n<\/pre>\n<p>This approach is tried and true. It always works. Yet, it has a cost that C++ programmers especially are wary of &#8212; <a href=\"https:\/\/en.wikibooks.org\/wiki\/Optimizing_C%2B%2B\/Writing_efficient_code\/Allocations_and_deallocations\">dynamic allocation<\/a>! In many scenarios, this is not a cost worth worrying about. There are a dozen other things that tend to crop up in large C++ systems which are more expensive (e.g., <a href=\"https:\/\/www.oreilly.com\/library\/view\/optimized-c\/9781491922057\/ch04.html\">string handling<\/a>). But there is something to be said about seemingly <em>unnecessary<\/em> heap allocation. If my object graph <em>can<\/em> be fully relegated to the stack, why <em>must<\/em> I use the heap?<\/p>\n<p>Let&#8217;s look at a somewhat realistic example. 
Imagine a <code>Config<\/code> class that contains zero or more named <code>ConfigSection<\/code>s:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nclass ConfigSection\r\n{\r\npublic:\r\n    using Values = std::map&lt;std::string, std::string&gt;;\r\n\r\n    ConfigSection(std::string&amp;&amp; name);\r\n\r\n    const std::string&amp; name() const noexcept;\r\n\r\n    const std::string&amp; value(const std::string&amp; key) const;\r\n\r\n    Values::const_iterator begin() const noexcept;\r\n    Values::const_iterator end() const noexcept;\r\n\r\n    void insert(std::string&amp;&amp; key, std::string&amp;&amp; value);\r\n\r\n    void remove(const std::string&amp; key);\r\n\r\nprivate:\r\n    std::string m_name;\r\n    Values m_values;\r\n};\r\n\r\nclass Config\r\n{\r\npublic:\r\n    using Sections = std::map&lt;std::string, ConfigSection&gt;;\r\n\r\n    Config();\r\n\r\n    ConfigSection&amp; section(const std::string&amp; name);\r\n\r\n    Sections::iterator begin() noexcept;\r\n    Sections::iterator end() noexcept;\r\n\r\n    Sections::const_iterator begin() const noexcept;\r\n    Sections::const_iterator end() const noexcept;\r\n\r\n    void insert(ConfigSection&amp;&amp; section);\r\n\r\n    void remove(const std::string&amp; name);\r\n\r\nprivate:\r\n    Sections m_sections;\r\n};\r\n<\/pre>\n<p>This might be a reasonable way to represent configuration loaded from an <a href=\"https:\/\/en.wikipedia.org\/wiki\/INI_file\">INI file<\/a>. 
And in fact, maybe it would be useful to support serializing this data back out to a stream with our <code>ISerializer<\/code> interface above:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nclass ConfigSectionIniSerializer final : public ISerializer&lt;ConfigSection&gt;\r\n{\r\npublic:\r\n    void serialize(const ConfigSection&amp; value, std::ostream&amp; out) final;\r\n};\r\n\r\nclass ConfigIniSerializer final : public ISerializer&lt;Config&gt;\r\n{\r\npublic:\r\n    ConfigIniSerializer() noexcept;\r\n\r\n    void serialize(const Config&amp; value, std::ostream&amp; out) final;\r\n\r\nprivate:\r\n    ConfigSectionIniSerializer m_inner;\r\n};\r\n<\/pre>\n<p>This implementation would work fine but only if we never need to customize the serialization mechanism. This is because we have a hardcoded dependency on the concrete ConfigSectionIniSerializer:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n\/\/ default construct the m_inner serializer -- not injectable!\r\nConfigIniSerializer::ConfigIniSerializer() noexcept : m_inner{}\r\n{}\r\n\r\nvoid ConfigIniSerializer::serialize(const Config&amp; value, std::ostream&amp; out)\r\n{\r\n    for (const auto&amp; s : value)\r\n    {\r\n        out &lt;&lt; '\\n';\r\n        m_inner.serialize(s.second, out); \/\/ by-value dependency\r\n    }\r\n}\r\n<\/pre>\n<p>As I mentioned before, by-ref semantics are generally what we need for polymorphic behavior &#8212; especially in a DI context. 
So maybe we could rework the outer Config serializer like this:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nclass ConfigIniSerializer final : public ISerializer&lt;Config&gt;\r\n{\r\npublic:\r\n    ConfigIniSerializer(ISerializer&lt;ConfigSection&gt;&amp; inner) noexcept;\r\n\r\n    void serialize(const Config&amp; value, std::ostream&amp; out) final;\r\n\r\nprivate:\r\n    ISerializer&lt;ConfigSection&gt;&amp; m_inner;\r\n};\r\n<\/pre>\n<p>This is okay &#8212; we now have the ability to inject different implementations. However, we have a new requirement to ensure a strictly nested lifetime of the outer Config serializer and the inner ConfigSection serializer: the inner one must outlive the outer one, or the stored reference dangles. In a large enough program, this becomes quite hard to manage. We could always swap out the reference for a <code>std::unique_ptr<\/code>, but this gets us back to the original dilemma &#8212; maybe <em>sometimes<\/em> it&#8217;s fine to use a reference (in a small, constrained scope, say) and other times we would rather use a smart pointer.<\/p>\n<p>Normally a problem like this would call for a templated serializer. Perhaps there is a good way to make that work, but I see a much easier solution. Why not try <a href=\"https:\/\/www.cppstories.com\/2018\/06\/variant\/\">std::variant<\/a>? Essentially, we need a type that can hold a reference or a smart pointer and <code>std::variant<\/code> seems perfect for that job. There is one problem, though &#8212; a <a href=\"https:\/\/stackoverflow.com\/questions\/54218595\/why-are-references-forbidden-in-stdvariant\"><code>std::variant<\/code> cannot hold a reference<\/a>. In this case, though, that&#8217;s only a minor problem. 
We have a workaround:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\ntemplate &lt;typename T&gt;\r\nclass RefOrPtr\r\n{\r\npublic:\r\n    RefOrPtr(T&amp; ref) noexcept : m_value{ &amp;ref }\r\n    {}\r\n\r\n    RefOrPtr(std::shared_ptr&lt;T&gt; p) noexcept : m_value{ std::move(p) }\r\n    {}\r\n\r\n    RefOrPtr(std::unique_ptr&lt;T&gt; p) noexcept : m_value{ std::move(p) }\r\n    {}\r\n\r\n    T* operator-&gt;()\r\n    {\r\n        if (std::holds_alternative&lt;T*&gt;(m_value))\r\n        {\r\n            return std::get&lt;T*&gt;(m_value);\r\n        }\r\n        else if (std::holds_alternative&lt;std::shared_ptr&lt;T&gt;&gt;(m_value))\r\n        {\r\n            return std::get&lt;std::shared_ptr&lt;T&gt;&gt;(m_value).get();\r\n        }\r\n        else\r\n        {\r\n            return std::get&lt;std::unique_ptr&lt;T&gt;&gt;(m_value).get();\r\n        }\r\n    }\r\n\r\nprivate:\r\n    std::variant&lt;T*, std::shared_ptr&lt;T&gt;, std::unique_ptr&lt;T&gt;&gt; m_value;\r\n};\r\n<\/pre>\n<p>The <code>RefOrPtr<\/code> class wraps a variant over a <strong>raw pointer<\/strong>, a <code>std::shared_ptr<\/code>, or a <code>std::unique_ptr<\/code>. Note that the raw pointer value is initialized from a real reference, so we haven&#8217;t lost sight of our original goal (it&#8217;s but a minor implementation detail!). Now, instead of tying ourselves to one of these three options, we can have it all! 
And with judicious use of <a href=\"https:\/\/www.tutorialspoint.com\/cplusplus\/class_member_access_operator_overloading.htm\">arrow operator overloading<\/a>, we have a uniform access pattern to get at the internally held ref-or-pointer.<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nclass ConfigIniSerializer final : public ISerializer&lt;Config&gt;\r\n{\r\npublic:\r\n    ConfigIniSerializer(RefOrPtr&lt;ISerializer&lt;ConfigSection&gt;&gt;&amp;&amp; inner) noexcept;\r\n\r\n    void serialize(const Config&amp; value, std::ostream&amp; out) final;\r\n\r\nprivate:\r\n    RefOrPtr&lt;ISerializer&lt;ConfigSection&gt;&gt; m_inner;\r\n};\r\n\r\n\/\/ ...\r\nvoid ConfigIniSerializer::serialize(const Config&amp; value, std::ostream&amp; out)\r\n{\r\n    for (const auto&amp; s : value)\r\n    {\r\n        out &lt;&lt; '\\n';\r\n        \/\/ using `-&gt;` operator to access the interface method\r\n        m_inner-&gt;serialize(s.second, out);\r\n    }\r\n}\r\n<\/pre>\n<p>Because the RefOrPtr <em>may<\/em> contain a <code>std::unique_ptr<\/code>, it is <a href=\"https:\/\/stackoverflow.com\/questions\/33316942\/is-there-a-point-to-define-move-only-objects-in-c11\">a move-only type<\/a>. Here we take it by <a href=\"https:\/\/learn.microsoft.com\/en-us\/cpp\/cpp\/rvalue-reference-declarator-amp-amp?view=msvc-170\">r-value reference<\/a> in the serializer constructor (to be <a href=\"https:\/\/stackoverflow.com\/questions\/3413470\/what-is-stdmove-and-when-should-it-be-used\"><code>std::move<\/code>&#8216;d into place<\/a> in the implementation). RefOrPtr is also implicitly convertible from any of its <a href=\"https:\/\/en.cppreference.com\/w\/cpp\/utility\/variant\/holds_alternative\">variant alternatives<\/a> (because it defines <a href=\"https:\/\/stackoverflow.com\/questions\/121162\/what-does-the-explicit-keyword-mean\">single argument non-explicit constructors<\/a>). 
This means it&#8217;s pretty simple to use as a drop-in replacement.<\/p>\n<p>What do you think? Is RefOrPtr a DI trick or DI treat?<\/p>\n<p>For a complete implementation of all the code mentioned above, check out the <a href=\"https:\/\/github.com\/brian-dot-net\/writeasync-cpp\/tree\/main\/src\/dinject\"><code>dinject<\/code> sample on GitHub<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Dependency injection, AKA &#8220;DI&#8221;, AKA &#8220;passing arguments&#8221;, is commonly used in modern software design. You&#8217;ll see this approach quite often in C# and Java applications, mainly because these languages have specific support for interfaces and interfaces are a big part&hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[91,101],"tags":[],"class_list":["post-5894","post","type-post","status-publish","format-standard","hentry","category-design","category-native"],"_links":{"self":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts\/5894","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5894"}],"version-history":[{"count":6,"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts\/5894\/revisions"}],"predecessor-version":[{"id":5900,"href":"http:\/\/writeasync.net\/index.php?rest_route=\/wp\/v2\/posts\/5894\/revisions\/5900"}],"wp:attachment":[{"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5894"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/
writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5894"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/writeasync.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5894"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}