# Taming C++, episode 2: RAII

*This post is part of a [series](../../series)*

After publishing
[the previous post on this topic](../2024-04-30-taming-cpp-motivation),
I have not actually done much with C++, besides reading some more about
it. However, a few weeks ago I attended a three-day C++ training at work,
so I decided to pick up this blog series again (thus actually turning it
into a series).

For this episode I decided to focus on one of the defining features
of C++: resource management and RAII ("Resource Acquisition Is
Initialization"). So in this post I'll talk about constructors and
destructors and how to use them to make resource management safer than
in C. I'll also discuss copy and move operations, which are part of
the same family as constructors and destructors.

Since this post is not meant to be a C++ tutorial, I am going
to explain little to none of the syntax. I hope it will be easy
to follow along anyway, even for someone who does not know C++.
In any case, you will find throughout this page many links to
[cppreference.com](https://en.cppreference.com/w/), an incredible resource
for all things C++.

As with the previous post, I added some code in a
[git repository](https://git.tronto.net/taming-cpp) with the
examples discussed here.

Let's start with one practical example!

## Impress your friends with this simple trick

Consider the following C++ program:

```
class BadType {
    // Stuff
};

int main() {
    BadType x;
    return 0;
}
```

This program does nothing. Well, almost nothing. It allocates a variable
of a custom type called `BadType`. But then it does nothing with it,
it just exits returning a 0 value (success). So, assuming it compiles,
it can never fail, right?

Right?

No, of course not, otherwise I would not be asking.
For example, the class `BadType` may contain a *member variable*
(for C programmers: that's C++ for "struct field") too large to be
[allocated on the stack](https://en.wikipedia.org/wiki/Stack-based_memory_allocation),
such as an array of 100 million integers.

But there is a more interesting way to make this program fail by only
changing the definition of `BadType`. In fact, you can make this program
do pretty much anything you want by changing `BadType`'s *constructor*
to suit your needs (see
[surprising-error.cpp](https://git.tronto.net/taming-cpp/file/raii/surprising-error.cpp.html)):

```
class BadType {
public:
    // The constructor must be public, or main() could not use it
    BadType() {
        BadType recursion_hell;
    }
};
```

The program now compiles without errors and, at least in a non-optimized
build, it immediately crashes with a stack overflow.
And all it does is declare variables! But in C++, every time a variable
of a *class type* is declared without any explicit initialization value,
the corresponding
[default constructor](https://en.cppreference.com/w/cpp/language/default_constructor)
is called. And declaring a variable of `BadType` inside its own default
constructor produces an artistic infinite recursion.

And it does not stop here either: whenever a variable
of a class type goes out of scope, the corresponding
[destructor](https://en.cppreference.com/w/cpp/language/destructor)
is called - see
[constructor-hello-world.cpp](https://git.tronto.net/taming-cpp/file/raii/constructor-hello-world.cpp.html):

```
#include <iostream>

class BadType {
public:
    BadType() {
        std::cout << "Variable created!" << std::endl;
    }

    ~BadType() {
        std::cout << "Variable destroyed, bye bye" << std::endl;
    }
};

int main() {
    BadType x;
    return 0;
}
```

Of course, printing silly messages is not the point of constructors
and destructors. The point is managing resources such as memory,
files and network connections.

But before we get into that, let's take a moment to reflect on this
example. For me, the main point here is that C++ does a lot of stuff
under the hood. This is, like most things in C++, a double-edged sword:
on the one hand, you can implement all sorts of interesting mechanisms
of initialization and clean-up for your custom types; on the other hand,
you constantly have to keep in mind that all of this stuff exists just
to get the hang of a simple C++ program.

Ok, now let's talk about resource management!

## Resource Acquisition Is Initialization (RAII)

[RAII](https://en.cppreference.com/w/cpp/language/raii) is a resource
management technique widely used in C++, but also in some other
languages. My personal interpretation is this: ONLY allocate resources
with `malloc()`, `new`, `fopen()` and other dangerous operations *in
constructors*, and ONLY de-allocate them with the respective `free()`,
`delete`, `fclose()` and other dangerous operations in the respective
*destructors*.

Let's see a classic example. Say you have a function `f()` that, for
some reason, needs to work with a large array locally. If you allocate
it on the heap with `new` or `malloc()`, you must remember to `delete`
or `free()` it in every place where the function returns:

```
bool f(unsigned big_number) {
    // In C: int *a = malloc(big_number * sizeof(int));
    int *a = new int[big_number];

    if (/* some condition */) {
        // Remember to release the memory here!
        delete[] a;
        return false;
    }

    // Do stuff...

    // Also here!
    delete[] a;
    return true;
}
```

In C, a clean way to do this is to use the `goto` statement (yes, I know,
[considered harmful](https://en.wikipedia.org/wiki/Considered_harmful)
blah blah) more or less like this:

```
#include <stdbool.h>
#include <stdlib.h>

bool f(unsigned big_number) {
    bool return_value = true;
    int *a = malloc(big_number * sizeof(int));

    if (/* some condition */) {
        return_value = false;
        goto f_cleanup_and_return;
    }

    /* Do stuff... */

f_cleanup_and_return:
    free(a);
    return return_value;
}
```

Which is all fine and good, but wouldn't it be better if this
de-allocation happened automatically based on the scope of the pointer
`a`, just as if we had allocated the array on the stack? This can be
achieved in C++ using constructors and destructors, for example:

```
class ArrayThing {
public:
    // Constructor
    ArrayThing(unsigned n) {
        buffer = new int[n];
    }

    // Destructor
    ~ArrayThing() {
        delete[] buffer;
    }

    // You probably want something like this:
    int& operator[](unsigned i) {
        return buffer[i];
    }
private:
    int *buffer;
};

bool f(unsigned big_number) {
    ArrayThing a(big_number);

    if (/* some condition */) {
        return false; // Destructor is called, a is cleaned!
    }

    // Do stuff...

    return true; // Destructor is called, a is cleaned!
}
```

And this, as far as I understand it, is the essence of RAII. The same
concept applies not only to memory allocation, but also to other
resource-management operations, such as opening files or locking
a [mutex](https://en.wikipedia.org/wiki/Lock_(computer_science)).
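
To illustrate the mutex case, here is a minimal sketch of the same idea
applied to locking. The names `LockThing`, `counter_mutex`, `counter` and
`increment_below()` are made up for this illustration and are not part
of the repository examples; the only assumption is that the resource is
a `std::mutex` that gets locked in the constructor and unlocked in the
destructor.

```cpp
#include <mutex>

// Hypothetical wrapper: locks the mutex on construction and
// unlocks it on destruction.
class LockThing {
public:
    explicit LockThing(std::mutex& m) : mtx(m) {
        mtx.lock();
    }

    ~LockThing() {
        mtx.unlock();
    }

    // Copying a lock makes no sense (it would unlock twice), so forbid it.
    LockThing(const LockThing&) = delete;
    LockThing& operator=(const LockThing&) = delete;

private:
    std::mutex& mtx;
};

// Hypothetical shared state protected by the mutex.
std::mutex counter_mutex;
int counter = 0;

// Increments the counter unless it has already reached the limit.
bool increment_below(int limit) {
    LockThing lock(counter_mutex);

    if (counter >= limit) {
        return false; // Destructor is called, the mutex is unlocked!
    }

    ++counter;
    return true; // Destructor is called, the mutex is unlocked!
}
```

As with the array, there is no need to write this by hand in real code:
the standard library already offers
[`std::lock_guard`](https://en.cppreference.com/w/cpp/thread/lock_guard),
which works along the same lines.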

The `ArrayThing` example above is only for illustrative purposes: in
practice, if you want to achieve this result you should use a standard
library container such as
[`std::vector`](https://en.cppreference.com/w/cpp/container/vector)
(or
[`std::array`](https://en.cppreference.com/w/cpp/container/array),
if the size is known at compile time and small enough for the stack).
Under the hood, `std::vector` does pretty much the same thing as
`ArrayThing`.

## Copying and moving

So far I have only talked about constructors and destructors, but C++
offers control over two other mechanisms: *copy* and *move*. Both of
these come in two forms, a *constructor* form and an *assignment* form.

Copy and move operations can be summarized as follows:

* **Copy** is the operation that consists of creating or assigning a
  `target` object from a `source` object of the same type, copying the
  value of the source into the target. Copy operations act similarly to
  a regular constructor; a copy assignment must also take care of
  cleaning up the resources of the target object before copying the value.
* **Move** is the operation that consists of creating or assigning a
  `target` object from a `source` object of the same type **that is not
  going to be used anymore**, moving the value of the source into the
  target instead of copying it. Move operations act both as constructors
  for `target` and, in a sense, as destructors for `source`: they take
  over its resources and must leave it in a valid state, because its
  actual destructor still runs later. A move assignment must also take
  care of cleaning up the resources of the target object before moving
  the value.

Copy operations happen whenever you create or assign an object from
another one, for example with `T a(b)` or `a = b`. Move operations are
perhaps a bit harder to understand, but they also happen regularly;
returning an object from a function is a classic example (unless the
compiler skips the operation altogether), but they also come up when using
[smart pointers](https://en.cppreference.com/book/intro/smart_pointers).
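
To make this more concrete, here is a minimal sketch of a class with all
four copy and move operations. `Buffer` is a hypothetical variation of
`ArrayThing` that I made up for this illustration: unlike `ArrayThing`,
it remembers its size, which is needed to copy the array. Take it as one
possible way to write these operations, not as the only correct one.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical variation of ArrayThing that also stores its size.
class Buffer {
public:
    // Constructor: allocate n integers, all set to zero.
    explicit Buffer(std::size_t n) : size(n), data(new int[n]()) {}

    // Destructor (deleting a null pointer is fine).
    ~Buffer() { delete[] data; }

    // Copy constructor: allocate a new array and copy the values.
    Buffer(const Buffer& source)
        : size(source.size), data(new int[source.size]) {
        std::copy(source.data, source.data + size, data);
    }

    // Copy assignment: build the copy first, then release the old array.
    Buffer& operator=(const Buffer& source) {
        if (this != &source) {
            int* copy = new int[source.size];
            std::copy(source.data, source.data + source.size, copy);
            delete[] data;
            data = copy;
            size = source.size;
        }
        return *this;
    }

    // Move constructor: take over the array and leave the source empty.
    Buffer(Buffer&& source) noexcept
        : size(source.size), data(source.data) {
        source.size = 0;
        source.data = nullptr;
    }

    // Move assignment: release the old array, then take over the source's.
    Buffer& operator=(Buffer&& source) noexcept {
        if (this != &source) {
            delete[] data;
            data = source.data;
            size = source.size;
            source.data = nullptr;
            source.size = 0;
        }
        return *this;
    }

    int& operator[](std::size_t i) { return data[i]; }
    std::size_t length() const { return size; }

private:
    std::size_t size;
    int* data;
};
```

With this class, `Buffer b(a)` and `b = a` call the copy operations, while
`Buffer c(std::move(a))` and `c = std::move(a)` call the move operations.
After a move, `a` is still a valid object: it is just empty, and its
destructor has nothing left to release.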

I made a
[comprehensive example](https://git.tronto.net/taming-cpp/file/raii/all-constructors.cpp.html)
of how all of these operations work, so you can see when exactly each
of them is called. Do check it out if you are interested!

Finally, I have tried summarizing the construction, destruction, copy
and move operations in the table below:

|Operation                                                                      |Signature                |Construct target|Destroy (old) target|Destroy source|
|:------------------------------------------------------------------------------|:------------------------|:--------------:|:------------------:|:------------:|
|Constructor                                                                    |`T(...)`                 |✓               |N/A                 |N/A           |
|[Destructor](https://en.cppreference.com/w/cpp/language/destructor)            |`~T()`                   |❌              |✓                   |N/A           |
|[Copy constructor](https://en.cppreference.com/w/cpp/language/copy_constructor)|`T(const T&)`            |✓               |N/A                 |❌            |
|[Copy assignment](https://en.cppreference.com/w/cpp/language/copy_assignment)  |`T& operator=(const T&)` |✓               |✓                   |❌            |
|[Move constructor](https://en.cppreference.com/w/cpp/language/move_constructor)|`T(T&&)`                 |✓               |N/A                 |✓             |
|[Move assignment](https://en.cppreference.com/w/cpp/language/move_assignment)  |`T& operator=(T&&)`      |✓               |✓                   |✓             |

In the last column, a ✓ means that the operation takes over the resources
of the source; the destructor of the source object itself still runs
later, at the end of its lifetime.

## Conclusion

Manual resource management (in particular, memory management) and RAII
are defining features of C++, features that clearly set it apart from
other object-oriented languages like Java or C#. C++ gives you a lot
of control over the low-level details, and some powerful tools to make
use of it, in exchange for a lot of complexity that you must, at the
very least, be aware of.