sebastiano.tronto.net

Source files and build scripts for my personal website
git clone https://git.tronto.net/sebastiano.tronto.net
Download | Log | Files | Refs | README

taming-cpp-motivation.md (7321B)


      1 # Taming C++, episode 1: motivation
      2 
      3 *This post is part of a [series](../../series)*
      4 
      5 C++ is pretty much the standard language in the software industry for
      6 any project where performance matters. I have used it a fair bit in
      7 the past - during my [IOI](https://ioinformatics.org) time and for
      8 a university course - but I used it only as "C with
      9 [STL](https://en.wikipedia.org/wiki/Standard_Template_Library)". I have
     10 also reviewed it for my job interview a couple of years ago, but I have
     11 not touched it since then. Given that I am at the early stages of my
     12 career as a software developer and that I am interested in performance,
     13 it's about time I learn modern C++!
     14 
     15 ## Motivation
     16 
     17 As much as I like programming and learning new stuff, this is something
     18 that requires some effort. And it turns out that keeping a copy of
     19 [A Tour of C++](https://www.stroustrup.com/tour3.html) on my desk and
     20 looking at its cover from time to time does not provide sufficient
     21 motivation to take up this new task.
     22 
     23 Starting a new project would be a good way to start playing with a new
     24 language, but at the moment I would rather keep working on my ongoing
     25 projects than start a new one. Writing a (series of) blog post(s)
     26 where I share my experience with the rest of the world seemed like a
     27 good alternative, but I needed some extra push to get started.
     28 
     29 Finally, a few weeks ago I have discovered Bert Hubert's series of posts
     30 [*Modern C++ for C programmers*](https://berthub.eu/articles/posts/c++-1).
     31 The first post in that series showed an example that got my attention:
     32 for a task as simple as sorting a list of integers, using their respective
     33 standard libraries, C++ outperforms C by a significant margin!
     34 
     35 ## Performance
     36 
     37 To get started with modern C++, I decided to repeat Bert's experiment.
     38 You can find the code I wrote in
     39 [this repository](https://git.tronto.net/taming-cpp).
     40 
     41 To sort a list of a hundred million integers, in C we can use `qsort()`
     42 from `stdlib.h` (see
     43 [sort.c](https://git.tronto.net/taming-cpp/file/sort-benchmark/integers/sort.c.html)):
     44 
     45 ```
     46 qsort(a, ARRAYSIZE, sizeof(int), compar);
     47 ```
     48 
     49 Where `compar()` is a comparison function such as
     50 
     51 ```
     52 int compar(const void *x, const void *y) {
     53 	return *(int *)x - *(int *)y;
     54 }
     55 ```
     56 
     57 In C++ we can use `sort()` from `algorithm` (see
     58 [sort.cpp](https://git.tronto.net/taming-cpp/file/sort-benchmark/integers/sort.cpp.html)):
     59 
     60 ```
     61 std::sort(a, a+ARRAYSIZE,
     62           [](const int &x, const int &y) { return x < y; });
     63 ```
     64 
     65 Here we use a
     66 [lambda expression](https://en.cppreference.com/w/cpp/language/lambda)
     67 instead of a comparison function.
     68 
     69 Modern C++ also offers parallelized
     70 version of common algorithms as part of the standard library, so if
     71 we want to completely humiliate poor C we can use this (see
     72 [sort_parallel.cpp](https://git.tronto.net/taming-cpp/file/sort-benchmark/integers/sort_parallel.cpp.html)):
     73 
     74 ```
     75 std::sort(std::execution::par, a, a+ARRAYSIZE,
     76           [](const int &x, const int &y) { return x < y; });
     77 ```
     78 
     79 Parallelizing stuff like this looks almost like cheating, but it wasn't
     80 without some struggle - see the "Gotchas" section below.
     81 
     82 So, what is the result? These are the times I get on my
     83 [Debian 12 desktop](https://sebastiano.tronto.net/blog/2023-10-15-build-time)
     84 (AMD Ryzen 7 7700):
     85 
     86 ```
     87 C time for 100000000 numbers: 9.815525s
     88 C++ time for 100000000 numbers: 4.87299s
     89 C++ time for 100000000 numbers (parallel): 0.488956s
     90 ```
     91 
     92 Even without parallelization, C++ is twice as fast as C!
     93 
     94 You might think that C++ is somehow optimizing for integer
     95 sorting. But this is not the case, as demonstrated by a similar experiment
     96 with *pairs* of integers (see
     97 [sort.c](https://git.tronto.net/taming-cpp/file/sort-benchmark/pairs/sort.c.html),
     98 [sort.cpp](https://git.tronto.net/taming-cpp/file/sort-benchmark/pairs/sort.cpp.html) and
     99 [sort_parallel.cpp](https://git.tronto.net/taming-cpp/file/sort-benchmark/pairs/sort_parallel.cpp.html)
    100 - I used a non-standard mixed lexicographic order to be reasonably sure the
    101 compiler does not come up with any ad-hoc optimization):
    102 
    103 ```
    104 C time for 100000000 pairs: 12.383896s
    105 C++ time for 100000000 pairs: 6.50033s
    106 C++ time for 100000000 pairs (parallel): 0.713616s
    107 ```
    108 
    109 ## Complexity is not for nothing
    110 
    111 As explained by Bert in his post, the reason
    112 the C++ version is faster is that by using
    113 [templates](https://en.cppreference.com/w/cpp/language/templates)
    114 the compiler can inline the call to the comparison function. C is not
    115 as sophisticated, so this is not possible: the only way a standard
    116 library function can call your custom code is via a
    117 [function pointer](https://en.wikipedia.org/wiki/Function_pointer).
    118 Regardless of possible performance issues with pointer
    119 dereferencing, calling a function without inlining it always causes
    120 some overhead. This is completely negligible for larger tasks, but for a
    121 small function that is called millions of times it makes a big difference!
    122 
    123 And all of this without even considering the elephant in the room, that is
    124 the fact that with virtually zero extra effort - but again, see the "Gotchas"
    125 section below - one can parallelize C++ STL algorithms and make them an order
    126 of magnitude faster on modern hardware! In C I would have needed to write a
    127 parallel sort on my own with something like
    128 [pthreads](https://en.wikipedia.org/wiki/Pthreads).
    129 
    130 Lesson learned for a C developer: sometimes the extra complexity of
    131 other languages does bring some benefit.
    132 
    133 ## Gotchas
    134 
    135 I mentioned above that I had some trouble compiling and running the
    136 parallel version of my code. In fact it took me almost two hours to make
    137 it work! This was for a couple of reasons.
    138 
    139 The first reason is that, at least on Linux with GCC + GLIBC, the C++
    140 standard library requires
    141 [TBB](https://en.wikipedia.org/wiki/Threading_Building_Blocks). I
    142 figured this out rather early, and it did not take me long to learn
    143 that I had to add an `-ltbb` option to my command line either. What
    144 took me an incredibly long time to understand is that `-ltbb` had to
    145 be put *at the end* of the command line options! To make it clear,
    146 something like this:
    147 
    148 ```
    149 $ g++ -O3 -std=c++20 -ltbb -o sort sort_parallel.cpp
    150 ```
    151 
    152 does not work, you have to write
    153 
    154 ```
    155 $ g++ -O3 -std=c++20 -o sort sort_parallel.cpp -ltbb
    156 ```
    157 
    158 Another problem I had was caused by my use of macros. You can see
    159 in the source files that I am using an `ARRAYSIZE` macro for the
    160 size of the array, and I provide a value for it at compile time.
    161 Originally I had called this macro `N`, but this clashed with some
    162 internal name used in TBB, and I got all sorts of walls of text
    163 of weird template errors - something C++ is infamous for.
    164 
    165 The last issue I had was that I was stupid and tried to compile my
    166 C++ code with a C compiler. Indeed, while debugging the issue with
    167 `-ltbb` I tried switching to [clang](https://clang.llvm.org), because
    168 I sometimes get better error messages with it. But instead of using
    169 `clang++` I used `clang`, which only compiles C.
    170 
    171 ## Until next time?
    172 
    173 Now that I have broken the ice with C++ I think it will be easier
    174 to continue studying it. I may or may not keep writing about this here:
    175 turning this into another blog series may be a good way to force myself
    176 to do this regularly, but on the other hand it may not be interesting
    177 for my readers. We'll see!
    178 
    179 *Next in the series: [RAII](../2024-12-26-taming-cpp-raii)*