Lambda vs Bind¶
Consider the following function:
template <typename Function>
void do_test_loop(Function func, const uint64_t upper_limit = 1000000000ULL)
{
for (uint64_t i = 0; i < upper_limit; ++i)
func(i);
}
This rather useless higher-order function calls func
some
upper_limit
amount of times (default of one billion).
There a multiple methods of giving this function a function to call repeatedly; here, we will discuss two of them:
std::bind
to generate a polymorphic std::function<void (uint64_t)>
and a lambda expression.
void test_accumulate_bind_function(uint64_t& x, uint64_t i)
{
x += i;
}
uint64_t test_accumulate_bind()
{
namespace arg = std::placeholders;
uint64_t x = 0;
std::function<void (uint64_t)> accumulator = std::bind(&test_accumulate_bind_function, std::ref(x), arg::_1);
do_test_loop(accumulator);
return x;
}
That is a pretty ugly function.
The biggest problem I have had with std::bind
and boost::bind
are that they requires you to separate your
functions (and therefore your logic), which can lead to hard-to-follow code.
For larger functions, that is not a big deal, but for small functions like this one, the context switching is more
obnoxious than anything.
The equivalent lambda expression looks like this:
uint64_t test_accumulate_lambda()
{
uint64_t x = 0;
auto accumulator = [&x] (uint64_t i) { x += i; };
do_test_loop(accumulator);
return x;
}
The lambda syntax keeps everything tight – no context switching.
Of course, we are losing the major advantage the polymorphic std::function
has, which is polymorphism.
The type of a lambda expression is a statically compiler-generated unpronouncable name, which is why you must use
auto
to name it.
The type of accumulator
is specific to the result of that expression – no other expression (lambda or otherwise)
can generate the same type.
Even if the contents of two lambda expressions are identical, they will not have the same type.
If do_test_loop
was a non-trivial function hidden within some cpp file, we would not get to pass the lambda
expression’s type around.
Luckily, some very smart people thought of this potential problem and made assignment from a lambda expression to an
std::function
not only possible, but downright trivial:
uint64_t test_accumulate_bound_lambda()
{
uint64_t x = 0;
std::function<void (uint64_t)> accumulator = [&x] (uint64_t i) { x += i; };
do_test_loop(accumulator);
return x;
}
By using lambda syntax instead of std::bind
, we get all the power of the polymorphic std::function
with the
convenience and expressiveness of C++ lambdas.
This sounds like a win-win to me.
We can do a fairly simple comparison of these three functions (using the Stopwatch timer class):
template <typename Function>
void run_test(const std::string& name, Function func)
{
std::cout << name;
timer t;
volatile_write(func());
auto duration = t.elapsed();
std::cout << '\t' << duration.count() << std::endl;
}
int main()
{
run_test("Accumulate (lambda) ", &test_accumulate_lambda);
run_test("Accumulate (bind) ", &test_accumulate_bind);
run_test("Accumulate (bound lambda)", &test_accumulate_bound_lambda);
}
Note
The volatile_write
function is a simple function to force the system to actually write data somewhere in memory.
In this case, it prevents the optimizer from seeing that we are not doing anything with the result of func
in
run_test
.
Without further ado, here are the results on my Intel Core i7 Q740 (compiled with GCC 4.6.2 -O3):
Accumulate (lambda) 7
Accumulate (bind) 4401849
Accumulate (bound lambda) 4379315
Whenever I am doing performance tests and see an absurdly small time like this, I disassemble the program to see what the compiler did.
(gdb) disassemble test_accumulate_lambda
Dump of assembler code for function _Z22test_accumulate_lambdav:
0x0000000000400e70 <+0>: movabs $0x6f05b59b5e49b00,%rax
0x0000000000400e75 <+5>: retq
End of assembler dump.
After optimization, the whole function just moves 0x6f05b59b5e49b00
into rax
and returns it.
In decimal, that number is 499999999500000000
.
Whoa!
The compiler is smart enough to realize that we are just looking for the value of \(\sum_{n=0}^{n<1000000000}n\) and
does that for us.
This is actually a rather trivial optimization.
The contents of the function are statically known the by the instantiated do_test_loop
, so the compiler unrolls the
contents to something like:
uint64_t test_accumulate_lambda()
{
uint64_t x = 0;
// do_test_loop:
for (uint64_t i = 0; i < 1000000000; ++i)
x += i;
return x;
}
Any compiler worth its salt will optimize this right out. The important thing to take away from this simple example is to know that the compiler is aware of the static-ness of lambda functions. You can use lambda functions as a great way to avoid repeating yourself without worrying about the overhead.
So what about our calls with std::function
?
Here, it is the polymorphism that kills us.
When the function do_test_loop
is instantiated with Function = std::function<void (uint64_t)>
, the compiler does
not know the behavior of func
; it could do anything (which is the entire point of std::function
).
The difference between the std::bind
and bound lambda expression are insignificant.
If you run the test a bunch of times, the bound lambda seems to always be a bit faster on my computer, but it does not
look statistically significant.
This performance will probably change in the future and on different machines, but if I had to guess, I would say it has
to do with the std::reference_wrapper
.
Just for grins, let’s look at the stack traces from the two functions.
std::bind
#0 test_accumulate_bind_function (x=@0x7fffffffe5d0, i=0) at lambda_vs_bind.cpp:106
#1 0x0000000000401111 in operator() (__args#0=0, this=<optimized out>) at /usr/local/include/gcc-4.6.2/functional:2161
#2 do_test_loop<std::function<void(long unsigned int)> > (func=<optimized out>, upper_limit=<optimized out>) at lambda_vs_bind.cpp:93
#3 test_accumulate_bind () at lambda_vs_bind.cpp:115
#4 0x0000000000401304 in run_test<unsigned long (*)()> (name=<optimized out>, func=0x401080 <test_accumulate_bind()>) at lambda_vs_bind.cpp:84
#5 0x0000000000401411 in main () at lambda_vs_bind.cpp:136
Lambda Expression
#0 std::_Function_handler<void(long unsigned int), test_accumulate_bound_lambda()::<lambda(uint64_t)> >::_M_invoke(const std::_Any_data &, unsigned long) (__functor=..., __args#0=0) at /usr/local/include/gcc-4.6.2/functional:1778
#1 0x0000000000400fa9 in operator() (__args#0=0, this=<optimized out> at /usr/local/include/gcc-4.6.2/functional:2161
#2 do_test_loop<std::function<void(long unsigned int)> > (func=<optimized out>, upper_limit=<optimized out>) at lambda_vs_bind.cpp:93
#3 test_accumulate_bound_lambda () at lambda_vs_bind.cpp:126
#4 0x0000000000401304 in run_test<unsigned long (*)()> (name=<optimized out>, func=0x400f20 <test_accumulate_bound_lambda()>) at lambda_vs_bind.cpp:84
#5 0x000000000040143e in main () at lambda_vs_bind.cpp:140
So the only real difference is the function called by std::function
‘s operator()
.
To make sense of why (and how) this happens, let’s take a quick look at the functional
implementation as of g++
version 4.6.2.
template<typename _Res, typename... _ArgTypes>
class function<_Res(_ArgTypes...)>
: public _Maybe_unary_or_binary_function<_Res, _ArgTypes...>,
private _Function_base
{
// a whole bunch of implementation details
private:
typedef _Res (*_Invoker_type)(const _Any_data&, _ArgTypes...);
_Invoker_type _M_invoker;
};
The interesting thing is that std::function
does not use virtual
; instead, it uses a function pointer to get the
job done.
There are a few advantages to doing it this way.
This allows you to use an std::function
without dealing with pointers or references – that complexity is internal
to the object (a Good Thing).
boost::bind
¶
What about the old method: boost::bind
?
To test this out, we can use the same code for std::bind
, but replace std
with boost
.
Accumulate (boost bind) 3223174
Accumulate (boost bound lambda) 4255098
Curiously, boost::bind
is consistently about 25% faster than the equivalent std::bind
.
The stack trace for calling the function looks about the same as std::bind
:
#0 test_accumulate_bind_function (x=@0x7fffffffe600, i=0) at lambda_vs_bind.cpp:114
#1 0x00000000004018a3 in operator() (a0=0, this=<optimized out>) at /usr/local/include/boost/function/function_template.hpp:1013
#2 do_test_loop<boost::function<void(long unsigned int)> > (upper_limit=<optimized out>, func=<optimized out>) at lambda_vs_bind.cpp:101
#3 test_accumulate_boost_bind () at lambda_vs_bind.cpp:144
#4 0x0000000000401f44 in run_test<unsigned long (*)()> (name=<optimized out>, func=0x401800 <test_accumulate_boost_bind()>) at lambda_vs_bind.cpp:92
#5 0x000000000040207e in main () at lambda_vs_bind.cpp:161
<functional>
¶
template<typename _Functor, typename... _ArgTypes>
inline
typename _Bind_helper<_Functor, _ArgTypes...>::type
bind(_Functor&& __f, _ArgTypes&&... __args)
{
typedef _Bind_helper<_Functor, _ArgTypes...> __helper_type;
typedef typename __helper_type::__maybe_type __maybe_type;
typedef typename __helper_type::type __result_type;
return __result_type(__maybe_type::__do_wrap(std::forward<_Functor>(__f)),
std::forward<_ArgTypes>(__args)...);
}
boost/bind/bind.hpp
¶
...with the macros expanded...
template<class F, class A1, class A2>
_bi::bind_t<_bi::unspecified, F, typename _bi::list_av_2<A1, A2>::type>
bind(F f, A1 a1, A2 a2)
{
typedef typename _bi::list_av_2<A1, A2>::type list_type;
return _bi::bind_t<_bi::unspecified, F, list_type> (f, list_type(a1, a2));
}
More Info¶
Source Code¶
Get the source code for this program lambda_vs_bind.cpp
.
It has been compiled and tested with g++ 4.6.2, but any compiler with good support for the C++0x standard should be
perfectly fine with it.
My Boost version is 1.47, but earlier and later versions will work just fine, since the boost::bind
semantics have
not changed for quite some time (and probably will not in the future).
If you wish to compile and run without Boost, then change the value of USE_BOOST
to 0
.