boost::optional and its internals

boost::optional is a very handy generalization of returning a null when there’s nothing to return. Consider a function such as strstr – either the required sub-string is found within the given string, in which case a pointer to it is returned, or a NULL value is returned to indicate that there was no such sub-string. In terms of boost::optional we would either return the same pointer to the occurrence of that sub-string (as before), or we would simply return nothing – no value at all. In this post we will demonstrate the usage of boost::optional and discuss its implementation.

Let us start with a simple use case to illustrate the exact usage of our optional idea. We propose the following implementation for strstr (which is essentially a wrapper around string.find):

#include "optional.hpp"

#include <string>
#include <iostream>

// myfind is basically the same as strstr:
optional<size_t> myfind (const std::string &str, const std::string &sub) {
    size_t ret = str.find(sub);
    if (ret != std::string::npos)
        return ret;            // builds an optional from a value
    return optional<size_t>(); // nothing to return
}

// use case:
int main () {
    if (optional<size_t> f = myfind("roman", "oma"))
        std::cout << "myfind(roman, oma) found: " << *f << 
            std::endl;
    if (!myfind("roman", "doma"))
        std::cout << "myfind(roman, doma) not found" << 
            std::endl;
    return 0;
}

We can tell that the usage of optional made the code a lot clearer – there’s no odd-looking comparison to string::npos in the calling code, and even in the implementation it is clear that we return sensible values when possible and nothing otherwise.

A discussion of possible ways to implement such a construct is in order.

The easiest solution would be to store a member of class T as part of our optional class, and just assign to it when needed. But this both poses an unnecessary requirement on type T, as it must have a default constructor, and also means that we create an object of type T even when we don’t need one (for example when no value is returned).

One other approach would be to store a pointer to type T within our optional class, and have dynamic allocation and deallocation going on when needed. This approach incurs overhead we would like to avoid, and also contradicts the by-value nature of the initial problem we were trying to solve. We want a different solution.. One that will only require stack allocation and nothing more.

Evidently, what we are looking for is an implementation which is (1) fully stack based, and (2) does not require creating the real object unless we really have to. Can you suggest such a design? The code below provides such an implementation.

#include <new>

template <typename T>
class optional {
    public:
        optional () :init_(false) {}

        optional (const T &val) :init_(true) { 
            construct(val);
        }

        optional (const optional<T> &opt) :init_(opt.init_) {
            if (init_) construct(opt.get());
        }

        optional &operator= (const optional<T> &opt) {
            destruct();
            if (init_ = opt.init_)
                construct(opt.get());
            return *this;
        }

        ~optional () {
            destruct();
        }

        operator bool () const { 
            return init_; 
        }

        T &operator *() {
            return get(); // undefined if not initialized            
        }

        const T &operator * () const {
            return get(); // undefined if not initialized
        }

    protected:
        void *storage () { return data_; }
        const void *storage () const { return data_; }

        T &get () { return *static_cast<T*>(storage()); }
        const T &get () const { return *static_cast<const T*>(storage()); };

        void destruct () { if (init_) get().~T(); init_ = false; }
        void construct (const T &data) { new (storage()) T(data); }

        bool init_;
        char data_[sizeof(T)];
};

As you can see, we create a char array of the exact size of the given type T. This array is then used to construct a new object of the given type in-place (using placement new), which means we need no heap allocation. What it also means is that we can create that object only when its needed, and not before. Hurray.

In this given implementation, the only requirement of type T is to be copy-constructible. In the real boost::optional, the requirements are essentially the same.

Another thing worth mentioning is that the real implementation of boost takes care of more fine-grained issues: it makes sure that proper memory-alignment of the given data type is used, it provides extra conversion from other close optional types, it handles reference types, and a little more. But in essence, boost’s implementation is very similar to this one.

11 thoughts on “boost::optional and its internals

  1. Great post. I’ve become so used to Boost Optional that in my current project I’m writing wrappers for common STL operations in terms of optional. For instance, I have the following function:

        template <class Map, class K>
        boost::optional<typename Map::mapped_type const &> 
        map_find_opt(Map const & m, K const & k)
        {
            typename Map::const_iterator it = m.find((Map::key_type const &)k);
            if(it == m.end())
                return boost::none;
            return it->second;
        }     
    

    And then I can write:

        map<string, int> m;
    
        if(map_find_opt(m, string("foo"))
        {
             // Key exists
        }
    

    Or;

        int value = map_find_opt(m, string("foo").get_value_or(-1));
        // -1 if key doesn't exist
    
    1. Boost as a whole is indeed pretty great, and also this very neat utility class.

      I’ve taken the liberty to slightly edit your post to use the ‘sourcecode’ tags and re-added the ‘>’ and ‘<' characters, which have been lost. I hope you don't mind and that I haven't messed anything up. Let me know if I did.

      Thanks for the kind words.

      1. Thanks for fixing the formatting of my code (I didn’t know what tags to use) but I’m afraid some of changes you’ve made are not quite right. This is how it should look like:

        http://pastie.org/814956

        Please notice that there’s no need for “V” as a template parameter.

        Best regards

  2. Which library do I need to link that includes boost::optional?
    I’ve getting: undefined reference to `boost::optional etc…

    1. boost::optional is part of the boost libraries: http://www.boost.org/

      Luckily, it is a header-only utility, which means that you don’t have to link with anything. All you have to do is:

      #include <boost/optional.hpp>

      Alternatively, you could use the implementation I have provided in this post – if it is good enough for your needs.

  3. I think the angle bracket with the T in between is unnecessary, as we know we are dealing with the T instantiation.
    I think it’s a good implementation and it somehow reminds me the famous sentence “We can solve any problem by introducing an extra level of indirection.”.

    1. If you refer to the signatures of the copy constructor and the assignment operator, it was done as a kind of documentation; In a complete implementation I would add a constructor to build an optional<T> from any related optional<U> such that there’s a conversion from U to T. And the same goes for operator=. Since I left that out, I felt it would be better to leave an explicit reminder of the matter.
      Essentially, you are correct. Thanks for the comment.

      1. Roman, now I noticed that there is a slight problem with the posted code:
        1). It’s not exception-safe: it fails to obey the strong guarantee(i.e the program is inconsistent when an exception is thrown).
        operator= (const optional &opt) is exception-unsafe:
        If the construct operation throws then the program is in inconsistent state as we stay with a destroyed object(instead of staying in the previous normal state).
        2). object lifetime playing is not always good, it is tricky, suppose T has a mutex object(which uses RAII) then we could have a potential hard to detect race condition(as the construct and destruct operation are not atomic).
        3). The more serious problem is that stack allocated buffer is not guaranteed to be aligned, thus it can trigger a hardware exception when you try to access T’s member.

        So as the buffer array solved some problems(we postpone T initialization and don’t use dynamic memory allocation) it also created new evil ones.

        Thanks!

        1. Hi Or,
          First off, thanks for the correct comments.
          It is important to keep in mind that this post only introduces a proof-of-concept for the proposed optional mechanism; As stated, it does not provide a full implementation.

          However, the issues you have raised can be addressed:
          1. Assuming T::operator= is itself exception safe, we could alter the implementation of our operator= to delegate the assignment to T::operator= if the optional is initialized. Note that by doing so we place a (new!) requirement on T: it now has to provide a public assignment operator. Another thing to think about is this: is that the behaviour you would want from an optional construct like this one? I am not sure.
          2. It is true. However, keep in mind that the nature of the problem we started with was that of an object we would like to return by value from a function. It is unlikely to use any resources in a RAII fashion.
          3. This problem is indeed a known one and is addressed at the end of the article. I would strongly advice you to have a look at how the folks at Boost have tackled the problem.

          Thanks for the insightful comments!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>