sebastiano.tronto.net

Source files and build scripts for my personal website
git clone https://git.tronto.net/sebastiano.tronto.net
Download | Log | Files | Refs | README

taming-cpp-motivation.md (7211B)


      1 # Taming C++, episode 1: motivation
      2 
      3 C++ is pretty much the standard language in the software industry for
      4 any project where performance matters. I have used it a fair bit in
      5 the past - during my [IOI](https://ioinformatics.org) time and for
      6 a university course - but I used it only as "C with
      7 [STL](https://en.wikipedia.org/wiki/Standard_Template_Library)". I have
      8 also reviewed it for my job interview a couple of years ago, but I have
      9 not touched it since then. Given that I am at the early stages of my
     10 career as a software developer and that I am interested in performance,
     11 it's about time I learn modern C++!
     12 
     13 ## Motivation
     14 
     15 As much as I like programming and learning new stuff, this is something
     16 that requires some effort. And it turns out that keeping a copy of
     17 [A Tour of C++](https://www.stroustrup.com/tour3.html) on my desk and
     18 looking at its cover from time to time does not provide sufficient
     19 motivation to take up this new task.
     20 
     21 Starting a new project would be a good way to start playing with a new
     22 language, but at the moment I would rather keep working on my ongoing
     23 projects than start a new one. Writing a (series of) blog post(s)
     24 where I share my experience with the rest of the world seemed like a
     25 good alternative, but I needed some extra push to get started.
     26 
     27 Finally, a few weeks ago I have discovered Bert Hubert's series of posts
     28 [*Modern C++ for C programmers*](https://berthub.eu/articles/posts/c++-1).
     29 The first post in that series showed an example that got my attention:
     30 for a task as simple as sorting a list of integers, using their respective
     31 standard libraries, C++ outperforms C by a significant margin!
     32 
     33 ## Performance
     34 
     35 To get started with modern C++, I decided to repeat Bert's experiment.
     36 You can find the code I wrote in
     37 [this repository](https://git.tronto.net/taming-cpp).
     38 
     39 To sort a list of a hundred million integers, in C we can use `qsort()`
     40 from `stdlib.h` (see
     41 [sort.c](https://git.tronto.net/taming-cpp/file/sort-benchmark/integers/sort.c.html)):
     42 
     43 ```
     44 qsort(a, ARRAYSIZE, sizeof(int), compar);
     45 ```
     46 
     47 Where `compar()` is a comparison function such as
     48 
     49 ```
     50 int compar(const void *x, const void *y) {
     51 	return *(int *)x - *(int *)y;
     52 }
     53 ```
     54 
     55 In C++ we can use `sort()` from `algorithm` (see
     56 [sort.cpp](https://git.tronto.net/taming-cpp/file/sort-benchmark/integers/sort.cpp.html)):
     57 
     58 ```
     59 std::sort(a, a+ARRAYSIZE,
     60           [](const int &x, const int &y) { return x < y; });
     61 ```
     62 
     63 Here we use a
     64 [lambda expression](https://en.cppreference.com/w/cpp/language/lambda)
     65 instead of a comparison function.
     66 
     67 Modern C++ also offers parallelized
     68 version of common algorithms as part of the standard library, so if
     69 we want to completely humiliate poor C we can use this (see
     70 [sort_parallel.cpp](https://git.tronto.net/taming-cpp/file/sort-benchmark/integers/sort_parallel.cpp.html)):
     71 
     72 ```
     73 std::sort(std::execution::par, a, a+ARRAYSIZE,
     74           [](const int &x, const int &y) { return x < y; });
     75 ```
     76 
     77 Parallelizing stuff like this looks almost like cheating, but it wasn't
     78 without some struggle - see the "Gotchas" section below.
     79 
     80 So, what is the result? These are the times I get on my
     81 [Debian 12 desktop](https://sebastiano.tronto.net/blog/2023-10-15-build-time)
     82 (AMD Ryzen 7 7700):
     83 
     84 ```
     85 C time for 100000000 numbers: 9.815525s
     86 C++ time for 100000000 numbers: 4.87299s
     87 C++ time for 100000000 numbers (parallel): 0.488956s
     88 ```
     89 
     90 Even without parallelization, C++ is twice as fast as C!
     91 
     92 You might think that C++ is somehow optimizing for integer
     93 sorting. But this is not the case, as demonstrated by a similar experiment
     94 with *pairs* of integers (see
     95 [sort.c](https://git.tronto.net/taming-cpp/file/sort-benchmark/pairs/sort.c.html),
     96 [sort.cpp](https://git.tronto.net/taming-cpp/file/sort-benchmark/pairs/sort.cpp.html) and
     97 [sort_parallel.cpp](https://git.tronto.net/taming-cpp/file/sort-benchmark/pairs/sort_parallel.cpp.html)
     98 - I used a non-standard mixed lexicographic order to be reasonably sure the
     99 compiler does not come up with any ad-hoc optimization):
    100 
    101 ```
    102 C time for 100000000 pairs: 12.383896s
    103 C++ time for 100000000 pairs: 6.50033s
    104 C++ time for 100000000 pairs (parallel): 0.713616s
    105 ```
    106 
    107 ## Complexity is not for nothing
    108 
    109 As explained by Bert in his post, the reason
    110 the C++ version is faster is that by using
    111 [templates](https://en.cppreference.com/w/cpp/language/templates)
    112 the compiler can inline the call to the comparison function. C is not
    113 as sophisticated, so this is not possible: the only way a standard
    114 library function can call your custom code is via a
    115 [function pointer](https://en.wikipedia.org/wiki/Function_pointer).
    116 Regardless of possible performance issues with pointer
    117 dereferencing, calling a function without inlining it always causes
    118 some overhead. This is completely negligible for larger tasks, but for a
    119 small function that is called millions of times it makes a big difference!
    120 
    121 And all of this without even considering the elephant in the room, that is
    122 the fact that with virtually zero extra effort - but again, see the "Gotchas"
    123 section below - one can parallelize C++ STL algorithms and make them an order
    124 of magnitude faster on modern hardware! In C I would have needed to write a
    125 parallel sort on my own with something like
    126 [pthreads](https://en.wikipedia.org/wiki/Pthreads).
    127 
    128 Lesson learned for a C developer: sometimes the extra complexity of
    129 other languages does bring some benefit.
    130 
    131 ## Gotchas
    132 
    133 I mentioned above that I had some trouble compiling and running the
    134 parallel version of my code. In fact it took me almost two hours to make
    135 it work! This was for a couple of reasons.
    136 
    137 The first reason is that, at least on Linux with GCC + GLIBC, the C++
    138 standard library requires
    139 [TBB](https://en.wikipedia.org/wiki/Threading_Building_Blocks). I
    140 figured this out rather early, and it did not take me long to learn
    141 that I had to add an `-ltbb` option to my command line either. What
    142 took me an incredibly long time to understand is that `-ltbb` had to
    143 be put *at the end* of the command line options! To make it clear,
    144 something like this:
    145 
    146 ```
    147 $ g++ -O3 -std=c++20 -ltbb -o sort sort_parallel.cpp
    148 ```
    149 
    150 does not work, you have to write
    151 
    152 ```
    153 $ g++ -O3 -std=c++20 -o sort sort_parallel.cpp -ltbb
    154 ```
    155 
    156 Another problem I had was caused by my use of macros. You can see
    157 in the source files that I am using an `ARRAYSIZE` macro for the
    158 size of the array, and I provide a value for it at compile time.
    159 Originally I had called this macro `N`, but this clashed with some
    160 internal name used in TBB, and I got all sorts of walls of text
    161 of weird template errors - something C++ is infamous for.
    162 
    163 The last issue I had was that I was stupid and tried to compile my
    164 C++ code with a C compiler. Indeed, while debugging the issue with
    165 `-ltbb` I tried switching to [clang](https://clang.llvm.org), because
    166 I sometimes get better error messages with it. But instead of using
    167 `clang++` I used `clang`, which only compiles C.
    168 
    169 ## Until next time?
    170 
    171 Now that I have broken the ice with C++ I think it will be easier
    172 to continue studying it. I may or may not keep writing about this here:
    173 turning this into another blog series may be a good way to force myself
    174 to do this regularly, but on the other hand it may not be interesting
    175 for my readers. We'll see!