Blog: January 2015

C++ Trait Mechanics — Part 2

| Stefanos

In this post, we are going to discuss more advanced uses of type traits. This entry continues where we left off in Part 1 of C++ Trait Mechanics, so if you haven’t read that, now is a good time to do so.

The tl;dr version for those of you who don’t want to revisit the previous post: we were given a legacy embedded application that controlled a certain function of the car, in our case the seat heating. Some of the application’s requirements probably explain why the code was written the way it was. Although all the seats had to be heated, there were subtle differences from seat to seat, for example in the number of input and output signals.

To tackle these differences, the original approach used multiple switch/if/else statements scattered throughout the code. The code was therefore rather large in terms of ROM and also quite slow for the functionality it provided. It was also a nightmare to maintain: a single change for a certain seat had to be tracked down and applied in every one of those if/else/switch cases. We thought we could solve these problems better and more efficiently by using trait mechanics and templates.

Some feedback I’ve received from the previous post:

  • Everything is now in ROM, your executable grew too large.
  • Why bother with templates at all? I can just write four distinct classes, each of which with one function and it will be perfectly optimized, inlined and I won’t have this template stuff!

For the first bullet point: it is true that template programming depends on compile-time constants and type deduction, so the instantiated code does need some ROM. However, multiple if/else statements also need ROM for the code that is to be executed, plus they are not as efficient in terms of performance. As always with template programming, you need to keep an eye on the ROM footprint of your generated code.

For the second bullet point, the answer is yes: one could implement distinct classes. But what if we had 10 buttons? Or 20? Do we really need 20 distinct types? Would that be easy to maintain? What if 15 of those 20 buttons had identical functionality, while 3 had some extra functionality and the last 2 were a mix of both?

This is where type traits shine. In our previous example we had four buttons that read some signals and produced different output signals depending on their type:

void ButtonController::updateButtonLeds(unsigned id)
{
    switch (id)
    {
    case BUTTON_ID_1:
        updateOutputX(_buttonOne._signalX);
        updateOutputY(_buttonOne._signalY);
        updateOutputZ(_buttonOne._signalZ);
        return;
    case BUTTON_ID_2:
        updateOutputX(_buttonTwo._signalX);
        updateOutputY(_buttonTwo._signalY);
        updateOutputZ(_buttonTwo._signalZ + _buttonTwo._signalY);
        return;
    case BUTTON_ID_3:
        updateOutputX(_buttonThree._signalX);
        updateOutputY(_buttonThree._signalY);
        updateOutputZ(_buttonThree._signalZ + _buttonThree._signalY);
        return;
    default:
        updateOutputX(_buttonFour._signalX);
        updateOutputY(_buttonFour._signalY);
    }
}

Figure 1.

The main points here are:

  • All types output signal X and Y
  • One type outputs signal Z as it is
  • Two types output signal Z plus signal Y

One approach would have been to use distinct types for each scenario. Another would be to have a virtual output function overridden in three concrete classes. Our approach was to have a base class that always outputs X and Y:

template<class ButtonTraits>
class ButtonBase
{
public:
    void read()
    {
        _signalX = typename ButtonTraits::ReadSignalOne()();
        _signalY = typename ButtonTraits::ReadSignalTwo()();
    }

    void update()
    {
        updateOutputX(_signalX);
        updateOutputY(_signalY);
    }
protected:
    typename ButtonTraits::SignalOneType _signalX;
    typename ButtonTraits::SignalTwoType _signalY;
};

Figure 2.
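For readers who skipped Part 1, it may help to see what a traits type has to provide for ButtonBase to compile. The sketch below is an assumption, not the project's real code: ButtonOneTraits, the two reader functors, their fixed return values, and the recording versions of updateOutputX and updateOutputY are all hypothetical stand-ins for the actual hardware access.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the real output routines; they only record
// the last value written so the effect can be observed.
std::uint8_t  g_lastOutputX = 0;
std::uint16_t g_lastOutputY = 0;

void updateOutputX(std::uint8_t value)  { g_lastOutputX = value; }
void updateOutputY(std::uint16_t value) { g_lastOutputY = value; }

// Hypothetical reader functors; real ones would sample a pin or a bus signal.
struct ReadSeatSwitch
{
    std::uint8_t operator()() const { return 3; }
};

struct ReadSeatTemperature
{
    std::uint16_t operator()() const { return 250; }
};

// Assumed shape of a traits type: the signal types plus the functor types.
struct ButtonOneTraits
{
    typedef std::uint8_t        SignalOneType;
    typedef std::uint16_t       SignalTwoType;
    typedef ReadSeatSwitch      ReadSignalOne;
    typedef ReadSeatTemperature ReadSignalTwo;
};

// Same structure as Figure 2, with a constructor added to zero the members.
template<class ButtonTraits>
class ButtonBase
{
public:
    ButtonBase() : _signalX(), _signalY() {}

    void read()
    {
        // Create a temporary functor of the trait-defined type and call it.
        _signalX = typename ButtonTraits::ReadSignalOne()();
        _signalY = typename ButtonTraits::ReadSignalTwo()();
    }

    void update()
    {
        updateOutputX(_signalX);
        updateOutputY(_signalY);
    }
protected:
    typename ButtonTraits::SignalOneType _signalX;
    typename ButtonTraits::SignalTwoType _signalY;
};

// Reads both signals once, forwards them, and checks what was written.
void runButtonOne()
{
    ButtonBase<ButtonOneTraits> button;
    button.read();
    button.update();
    assert(g_lastOutputX == 3);
    assert(g_lastOutputY == 250);
}
```

Calling runButtonOne() from main is enough to exercise the sketch. Nothing in ButtonBase names a concrete signal type or reader, which is what lets one class body serve every button.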

Then there is an extended class, which takes a second template argument, “EnableSignalAddition”, that defaults to void. This may look mysterious, but this little argument helps us with our “signal addition” problem. Since it has a default, if we provide nothing for it when instantiating the class, the compiler happily puts a void in that place and gives us the “default” implementation of the class, which just outputs signal Z as it is:

template<class ButtonTraits, class EnableSignalAddition = void >
class ExtendedButton : public ButtonBase<ButtonTraits>
{
public:
    void read()
    {
        ButtonBase<ButtonTraits>::read();
        _signal = typename ButtonTraits::ReadSignalThree()();
    }

    void update()
    {
        ButtonBase<ButtonTraits>::update();
        updateOutputZ(_signal);
    }
private:
    typename ButtonTraits::SignalThreeType _signal;
};

Figure 3.
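The effect of the defaulted second parameter can be checked directly. In this sketch the class body is left empty and DemoTraits is a hypothetical placeholder; only the template header matches Figure 3.

```cpp
#include <type_traits>

// Hypothetical empty traits type, used only to name an instantiation.
struct DemoTraits {};

// Same template header as Figure 3; the body is omitted for brevity.
template<class ButtonTraits, class EnableSignalAddition = void>
class ExtendedButton
{
};

// Omitting the second argument yields exactly the same type as passing void.
static_assert(std::is_same<ExtendedButton<DemoTraits>,
                           ExtendedButton<DemoTraits, void> >::value,
              "the default argument fills in void");

// A different second argument names a different, unrelated type.
static_assert(!std::is_same<ExtendedButton<DemoTraits>,
                            ExtendedButton<DemoTraits, int> >::value,
              "void and int give distinct instantiations");
```

Both checks run at compile time, so the translation unit only builds if the claims hold.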

Now how can we solve the signal addition problem? One approach would have been to explicitly specialize the class template for the other two types:

template<>
class ExtendedButton<ButtonTwoTraits, void> : public ButtonBase<ButtonTwoTraits>
{
    // provide special update function
};

template<>
class ExtendedButton<ButtonThreeTraits, void> : public ButtonBase<ButtonThreeTraits>
{
    // provide special update function
};

Figure 4.

Would this have worked? Yes. Do we like this? No. Why? Well, we copied and pasted the same class twice for no reason other than to provide an implementation for one function. You’ll say: “But wait, isn’t this what we wanted to do?” Yes, but how about we let the compiler work for us?

template<class ButtonTraits>
class ExtendedButton<ButtonTraits,
      typename std::enable_if< 
               std::is_same< ButtonTraits, ButtonTwoTraits>::value ||  
               std::is_same< ButtonTraits, ButtonThreeTraits>::value
                             >::type > : public ButtonBase<ButtonTraits>
{
public:
    void read()
    {
        ButtonBase<ButtonTraits>::read();
        _signal = typename ButtonTraits::ReadSignalThree()();
    }

    void update()
    {
        ButtonBase<ButtonTraits>::update();
        updateOutputZ(calculateZ());
    }
private:
    typename ButtonTraits::SignalThreeType _signal;

    typename ButtonTraits::SignalThreeType calculateZ()
    {
        // _signalY lives in a dependent base class, so it must be reached through this->
        return static_cast<typename ButtonTraits::SignalThreeType>(_signal + this->_signalY);
    }
};

Figure 5.

You are probably wondering what this enable_if mumbo jumbo is all about. It’s quite easy, actually. A typical implementation of enable_if looks like this:

template<bool B, class T = void>
struct enable_if { typedef T type; };

template<class T>
struct enable_if<false, T> { };

Figure 6.

For any given type T, enable_if provides a typedef of T when B is true; when B is false, the specialization is chosen and there is no typedef at all. std::is_same yields true if its two template arguments are the same type, and false otherwise. This expression:

typename std::enable_if<
         std::is_same< ButtonTraits, ButtonTwoTraits>::value ||
         std::is_same< ButtonTraits, ButtonThreeTraits>::value
                       >::type

Figure 7.

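It will actually provide a type if and only if ButtonTraits is ButtonTwoTraits or ButtonThreeTraits. In that case the type is void, which matches the default second argument, so the partial specialization is selected; for any other traits type the expression names no type, the specialization is silently discarded, and the primary template from Figure 3 is used. So with this partial specialization the compiler will generate two “different” classes for those types, but we have only a single class to maintain. You can imagine how this scales with even more types. In this partial specialization we provide a calculateZ function which does what we want.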
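A stripped-down version of the same selection mechanism can be compiled on its own. Everything here is a simplification made for illustration: the traits types are empty placeholders and the classes expose only a hypothetical addsSignals flag, so the sketch shows which definition the compiler picks rather than any real signal handling.

```cpp
#include <type_traits>

// Empty placeholder traits types; the real ones carry signal types and readers.
struct ButtonOneTraits {};
struct ButtonTwoTraits {};
struct ButtonThreeTraits {};

// Primary template: used whenever the specialization below does not apply.
template<class ButtonTraits, class EnableSignalAddition = void>
struct ExtendedButton
{
    static const bool addsSignals = false;   // hypothetical marker
};

// Partial specialization: its second argument only names a type (void)
// when ButtonTraits is ButtonTwoTraits or ButtonThreeTraits. For any other
// type the substitution fails quietly and the primary template is used.
template<class ButtonTraits>
struct ExtendedButton<ButtonTraits,
    typename std::enable_if<
        std::is_same<ButtonTraits, ButtonTwoTraits>::value ||
        std::is_same<ButtonTraits, ButtonThreeTraits>::value
    >::type>
{
    static const bool addsSignals = true;    // hypothetical marker
};

// Compile-time checks of which definition was selected for each traits type.
static_assert(!ExtendedButton<ButtonOneTraits>::addsSignals,
              "button one falls back to the primary template");
static_assert(ExtendedButton<ButtonTwoTraits>::addsSignals,
              "button two selects the specialization");
static_assert(ExtendedButton<ButtonThreeTraits>::addsSignals,
              "button three selects the specialization");
```

Two instantiations share one specialization body here, which is the grouping effect that enable_if is being used for.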

So now, with a single partial specialization and no polymorphism or virtual functions, we have the desired behavior. In a sense, enable_if allows us to group several specializations into a single implementation. Finally, let’s imagine that, for some reason, one of the above types has an “extra function” while the other has none. Since we have already partially specialized the class to get here, that route is no longer an option. So what do we do? A virtual function and a concrete class would solve this, but then we have the performance overhead plus an extra class to maintain. We don’t like that. How about some plain old overloading on types?

template<class ButtonTraits>
class ExtendedButton<ButtonTraits,
                     typename std::enable_if
                     < 
                        std::is_same< ButtonTraits, ButtonTwoTraits>::value ||  
                        std::is_same< ButtonTraits, ButtonThreeTraits>::value
                     >::type > : public ButtonBase<ButtonTraits>
{
public:   
    void mySpecialFunc()
    {
        mySpecialFunc(ButtonTraits());
    }
private:
    void mySpecialFunc(ButtonTwoTraits)
    {
        // Do my special thingy here
    }

    void mySpecialFunc(ButtonThreeTraits) const
    {
        // Nothing special to do for this button
    }
};

Figure 8.

Same functionality without the performance cost. Chances are the optimizer is smart enough to drop the call to mySpecialFunc(ButtonThreeTraits) altogether, since it does nothing at all. Here we just exploit the fact that each traits type is a distinct type, so overload resolution picks the correct function at compile time.
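The overload trick can be tried in isolation. This is a minimal sketch under assumptions: Button is a hypothetical cut-down class without the enable_if machinery, the traits types are empty placeholders, and a counter stands in for the “special thingy”.

```cpp
#include <cassert>

// Empty placeholder traits types acting as compile-time tags.
struct ButtonTwoTraits {};
struct ButtonThreeTraits {};

// Hypothetical cut-down class; the dispatch works the same way as in Figure 8.
template<class ButtonTraits>
class Button
{
public:
    Button() : _specialCalls(0) {}

    void mySpecialFunc()
    {
        // The type of the temporary selects the overload at compile time.
        mySpecialFunc(ButtonTraits());
    }

    int specialCalls() const { return _specialCalls; }
private:
    void mySpecialFunc(ButtonTwoTraits)
    {
        ++_specialCalls;   // stands in for the extra behaviour
    }

    void mySpecialFunc(ButtonThreeTraits)
    {
        // intentionally empty: this button has no extra behaviour
    }

    int _specialCalls;
};

// Calls mySpecialFunc() once per button type and checks which overload ran.
void runDispatchDemo()
{
    Button<ButtonTwoTraits> two;
    Button<ButtonThreeTraits> three;
    two.mySpecialFunc();
    three.mySpecialFunc();
    assert(two.specialCalls() == 1);    // the counting overload was chosen
    assert(three.specialCalls() == 0);  // the empty overload was chosen
}
```

No branch on a button id appears anywhere: the choice is fixed when each Button instantiation is compiled.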

To summarize in a few points:

  • Idioms like enable_if allow us to control code generation at compile time, giving us the tools we need for type-specific implementations.

  • Having a traits type for a class allows us to provide different overloaded methods based on that type without having to resort to virtual functions for the same effect.

  • Maintaining or extending such code when a new type is introduced is not that difficult, since we need to change or add code in very few places.

  • If virtual dispatch and the associated runtime costs are a problem for you, as they were for us, then static dispatch is the way to go. Of course one always has to weigh the pros and cons, e.g. runtime vs. ROM consumption.
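On the point about extending the code for a new type: one possible way to keep the enable_if condition from growing with every button is to move the decision into a small helper trait. The following is only a sketch of that idea, not code from the project described here; AddsSignalY, ButtonFiveTraits, the addsSignals flag, and the empty traits types are all hypothetical.

```cpp
#include <type_traits>

// Empty placeholder traits types.
struct ButtonOneTraits {};
struct ButtonTwoTraits {};
struct ButtonThreeTraits {};
struct ButtonFiveTraits {};   // hypothetical newly introduced button

// Hypothetical helper trait: false unless a traits type opts in.
template<class ButtonTraits>
struct AddsSignalY : std::false_type {};

template<> struct AddsSignalY<ButtonTwoTraits>   : std::true_type {};
template<> struct AddsSignalY<ButtonThreeTraits> : std::true_type {};
// Opting in the new button takes this one line; the classes below stay untouched.
template<> struct AddsSignalY<ButtonFiveTraits>  : std::true_type {};

// Primary template: no signal addition.
template<class ButtonTraits, class EnableSignalAddition = void>
struct ExtendedButton
{
    static const bool addsSignals = false;
};

// Partial specialization selected for every traits type that opted in.
template<class ButtonTraits>
struct ExtendedButton<ButtonTraits,
    typename std::enable_if<AddsSignalY<ButtonTraits>::value>::type>
{
    static const bool addsSignals = true;
};

static_assert(!ExtendedButton<ButtonOneTraits>::addsSignals,
              "button one did not opt in");
static_assert(ExtendedButton<ButtonFiveTraits>::addsSignals,
              "the new button picks up the shared specialization");
```

The trade-off is one more small template to maintain in exchange for a specialization header that never changes.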

I hope you enjoyed this article as much as I enjoyed writing it.

Stefanos