sebastiano.tronto.net

# Taming C++, episode 2: RAII

*This post is part of a [series](../../series)*

After publishing
[the previous post on this topic](../2024-04-30-taming-cpp-motivation),
I have not actually done much with C++, besides reading some more about
it. However, a few weeks ago I attended a three-day C++ training at work,
so I decided to pick up this blog series again (thus actually turning
it into a series).

For this episode I decided to focus on one of the defining features
of C++: resource management and RAII ("Resource Acquisition Is
Initialization"). So in this post I'll talk about constructors and
destructors and how to use them to make resource management safer than
in C. I'll also discuss copy and move operations, which are part of
the same family as constructors and destructors.

Since this post is not meant to be a C++ tutorial, I am going
to explain little to none of the syntax. I hope it will be easy
to follow along anyway, even for someone who does not know C++.
In any case, you will find throughout this page many links to
[cppreference.com](https://en.cppreference.com/w/), an incredible resource
for all things C++.

As with the previous post, I added some code in a
[git repository](https://git.tronto.net/taming-cpp) with the
examples discussed here.

Let's start with one practical example!

## Impress your friends with this simple trick

Consider the following C++ program:

```
class BadType {
	// Stuff
};

int main() {
	BadType x;
	return 0;
}
```

This program does nothing. Well, almost nothing. It declares a variable
of a custom type called `BadType`. But then it does nothing with it,
it just exits returning a 0 value (success). So, assuming it compiles,
it can never fail, right?

Right?

No, of course not, otherwise I would not be asking. For example, the
class `BadType` may contain a *member variable* (for C programmers:
that's C++ for "struct field") too large to be
[allocated on the stack](https://en.wikipedia.org/wiki/Stack-based_memory_allocation),
such as an array of 100 million integers.

But there is a more interesting way to make this program fail by only
changing the definition of `BadType`. In fact, you can make this program
do pretty much anything you want by changing `BadType`'s *constructor*
to suit your needs (see
[surprising-error.cpp](https://git.tronto.net/taming-cpp/file/raii/surprising-error.cpp.html)):

```
class BadType {
public: // Class members are private by default; main() needs access
	BadType() {
		BadType recursion_hell;
	}
};
```

The program now compiles without errors, and it immediately crashes.
And all it does is declare variables! But in C++, every time a variable
of a *class type* is declared without any explicit initialization value,
the corresponding
[default constructor](https://en.cppreference.com/w/cpp/language/default_constructor)
is called. And declaring a variable of `BadType` inside its own default
constructor produces an artistic infinite recursion.

And it does not stop here either: whenever a variable
of a class type goes out of scope, the corresponding
[destructor](https://en.cppreference.com/w/cpp/language/destructor)
is called - see
[constructor-hello-world.cpp](https://git.tronto.net/taming-cpp/file/raii/constructor-hello-world.cpp.html):

```
#include <iostream>

class BadType {
public:
	BadType() {
		std::cout << "Variable created!" << std::endl;
	}

	~BadType() {
		std::cout << "Variable destroyed, bye bye" << std::endl;
	}
};

int main() {
	BadType x;
	return 0;
}
```
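
The rule "destructor runs when the variable goes out of scope" also applies to nested scopes, and objects are destroyed in the reverse order of their construction. The following sketch records the calls in a log instead of printing them; `Tracer` and `trace_scopes()` are hypothetical names made up for this illustration and do not come from the repository. The returned log should read: create outer, create inner, destroy inner, between scopes, destroy outer.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type: records its own construction and destruction
// in a log, so that the order of the calls can be inspected afterwards.
class Tracer {
public:
	Tracer(std::string n, std::vector<std::string>& l)
	    : name(std::move(n)), log(l) {
		log.push_back("create " + name);
	}

	~Tracer() {
		log.push_back("destroy " + name);
	}

private:
	std::string name;
	std::vector<std::string>& log;
};

// Creates two Tracer objects in nested scopes and returns the log.
std::vector<std::string> trace_scopes() {
	std::vector<std::string> log;
	{
		Tracer outer("outer", log);
		{
			Tracer inner("inner", log);
		} // inner goes out of scope: its destructor runs here
		log.push_back("between scopes");
	} // outer goes out of scope: its destructor runs here
	return log;
}
```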

Of course, printing silly messages is not the point of constructors
and destructors. The point is managing resources such as memory,
files and network connections.

But before we get into that, let's take a moment to reflect on this
example.  For me, the main point here is that C++ does a lot of stuff
under the hood. This is, as most things in C++, a double-edged sword:
on the one hand, you can implement all sorts of interesting mechanisms
of initialization and clean-up for your custom types; on the other hand,
you have to keep all of this machinery in mind just to understand
a simple C++ program.

Ok, now let's talk about resource management!

## Resource Acquisition Is Initialization (RAII)

[RAII](https://en.cppreference.com/w/cpp/language/raii) is a resource
management technique widely used in C++, but also in some other
languages.  My personal interpretation is this: ONLY allocate resources
with `malloc()`, `new`, `fopen()` and other dangerous operations *in
constructors*, and ONLY de-allocate them with the respective `free()`,
`delete`, `fclose()` and other dangerous operations in the respective
*destructors*.

Let's see a classic example. Say you have a function `f()` that, for
some reason, needs to work with a large array locally. If you allocate
it on the heap with `new` or `malloc()`, you must remember to `delete[]`
or `free()` it in every place where the function returns:

```
bool f(unsigned big_number) {
	// In C: int *a = malloc(big_number * sizeof(int));
	int *a = new int[big_number];

	if (/* some condition */) {
		// Remember to release the memory here!
		delete[] a;
		return false;
	}

	// Do stuff...

	// Also here!
	delete[] a;
	return true;
}
```

In C, a clean way to do this is to use the `goto` statement (yes, I know,
[considered harmful](https://en.wikipedia.org/wiki/Considered_harmful)
blah blah) more or less like this:

```
#include <stdbool.h>
#include <stdlib.h>

bool f(unsigned big_number) {
	bool return_value = true;
	int *a = malloc(big_number * sizeof(int));

	if (/* some condition */) {
		return_value = false;
		goto f_cleanup_and_return;
	}

	/* Do stuff... */

f_cleanup_and_return:
	free(a);
	return return_value;
}
```

Which is all fine and good, but wouldn't it be better if this
de-allocation happened automatically based on the scope of the pointer
`a`, just like if we had allocated it on the stack? This can be achieved
in C++ using constructors and destructors, for example:

```
class ArrayThing {
public:
	// Constructor
	ArrayThing(unsigned n) {
		buffer = new int[n];
	}

	// Destructor
	~ArrayThing() {
		delete[] buffer;
	}

	// You probably want something like this:
	int& operator[](unsigned i) {
		return buffer[i];
	}
private:
	int *buffer;
};

bool f(unsigned big_number) {
	ArrayThing a(big_number);

	if (/* some condition */) {
		return false; // Destructor is called, a is cleaned!
	}

	// Do stuff...

	return true; // Destructor is called, a is cleaned!
}
```

And this, as far as I understand it, is the essence of RAII. The same
concept applies not only to memory allocation, but also to other
resource-management operations, such as opening files or locking
a [mutex](https://en.wikipedia.org/wiki/Lock_(computer_science)).
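
For the mutex case the standard library already provides the wrapper: `std::lock_guard` locks the mutex in its constructor and unlocks it in its destructor. Here is a minimal sketch of the same early-return situation as above; the names `counter`, `counter_mutex` and `increment_if_below()` are invented for this example. Calling the function twice in a row does not deadlock, because the lock is released on every return path.

```cpp
#include <mutex>

std::mutex counter_mutex; // hypothetical lock protecting counter
int counter = 0;          // hypothetical shared state

// Increments counter unless it has already reached limit.
bool increment_if_below(int limit) {
	// The constructor of std::lock_guard locks the mutex...
	std::lock_guard<std::mutex> guard(counter_mutex);

	if (counter >= limit) {
		return false; // ...its destructor unlocks it here...
	}

	counter++;
	return true; // ...and here.
}
```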

The example above is only for illustrative purposes: in practice,
to achieve this result you should use a standard library container.
[`std::vector`](https://en.cppreference.com/w/cpp/container/vector)
does pretty much the same thing as `ArrayThing` under the hood, while
[`std::array`](https://en.cppreference.com/w/cpp/container/array)
is an option when the size is known at compile time and small enough
that no heap allocation is needed at all.
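
As a sketch of what that looks like, here is the function from above rewritten with `std::vector`. The name `f_with_vector` and the early-return condition (an empty array) are placeholders chosen for this example, since the original leaves the condition unspecified.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical variant of f() that lets std::vector own the memory.
bool f_with_vector(std::size_t big_number) {
	// Allocates big_number ints on the heap and sets them to zero
	std::vector<int> a(big_number);

	if (a.empty()) {
		return false; // a's destructor frees the memory
	}

	a[0] = 42; // Do stuff...

	return true; // a's destructor frees the memory here too
}
```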

## Copying and moving

So far I have only talked about constructors and destructors, but C++
offers control over two other mechanisms: *copy* and *move*. Both of
these come in two forms, a *constructor* form and an *assignment* form.

Copy and move operations can be summarized as follows:

* **Copy** is the operation that consists of creating or assigning a
  `target` object from a `source` object of the same type, copying the
  value of the source into the target. Copy operations act similarly to
  a regular constructor; a copy assignment must also take care of cleaning
  up the resources of the target object before copying the value.
* **Move** is the operation that consists of creating or assigning a
  `target` object from a `source` object of the same type, moving the
  value of the source into the target **and leaving the source without
  its resources**. Move operations act as constructors for `target` and,
  in a sense, as destructors for `source`: the source gives up what it
  owned, although its actual destructor still runs later, when it goes
  out of scope. A move assignment must also take care of cleaning up
  the resources of the target object before moving the value.
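
To make the two bullet points more concrete, here is a sketch of how the `ArrayThing` class from earlier could be extended with all four operations. This is one possible implementation and not necessarily the one in the linked repository; the `size` member, the `length()` method and the private `swap()` helper are additions made for this example. Without these operations, copying the original `ArrayThing` would copy only the pointer, and both objects would later `delete[]` the same buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

class ArrayThing {
public:
	explicit ArrayThing(std::size_t n) : size(n), buffer(new int[n]()) {}

	~ArrayThing() { delete[] buffer; }

	// Copy constructor: allocate a new buffer and duplicate the values.
	ArrayThing(const ArrayThing& other)
	    : size(other.size), buffer(new int[other.size]) {
		std::copy(other.buffer, other.buffer + size, buffer);
	}

	// Copy assignment: build the copy first, then swap it in. The old
	// buffer is released when tmp is destroyed at the end of the function.
	ArrayThing& operator=(const ArrayThing& other) {
		ArrayThing tmp(other);
		swap(tmp);
		return *this;
	}

	// Move constructor: take over the buffer and leave the source empty.
	ArrayThing(ArrayThing&& other) noexcept
	    : size(other.size), buffer(other.buffer) {
		other.size = 0;
		other.buffer = nullptr;
	}

	// Move assignment: release the old buffer, then take over the source's.
	ArrayThing& operator=(ArrayThing&& other) noexcept {
		if (this != &other) {
			delete[] buffer;
			size = other.size;
			buffer = other.buffer;
			other.size = 0;
			other.buffer = nullptr;
		}
		return *this;
	}

	int& operator[](std::size_t i) { return buffer[i]; }
	std::size_t length() const { return size; }

private:
	void swap(ArrayThing& other) noexcept {
		std::swap(size, other.size);
		std::swap(buffer, other.buffer);
	}

	std::size_t size;
	int *buffer;
};
```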

Copy operations happen whenever you create or assign an object from
another one, for example with `T a(b)` or `a = b`. Move operations are
perhaps a bit harder to understand, but they also happen regularly;
returning a local object from a function is a classic example (although
compilers can often skip the move entirely), and they also come up when using
[smart pointers](https://en.cppreference.com/book/intro/smart_pointers).
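
Both situations can be sketched in a few lines. The function names `make_numbers()` and `transfer()` are hypothetical, chosen only for this illustration. A caller would write something like `auto q = transfer(std::move(p));` for a `std::unique_ptr<int> p`, after which `p` is null and `q` owns the integer; note that `std::make_unique`, the usual way to create such a pointer, requires C++14.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Returning a local object: the result is moved into the caller's
// variable (or the move is elided altogether), never deep-copied.
std::vector<int> make_numbers() {
	std::vector<int> v = {1, 2, 3};
	return v;
}

// std::unique_ptr cannot be copied, only moved: ownership of the
// heap-allocated int passes from the argument to the return value.
std::unique_ptr<int> transfer(std::unique_ptr<int> p) {
	return p;
}
```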

I made a
[comprehensive example](https://git.tronto.net/taming-cpp/file/raii/all-constructors.cpp.html)
of how all of these operations work, so you can see when exactly each
of them is called. Do check it out if you are interested!

Finally, I have tried summarizing the construction, destruction, copy
and move operations in the table below:

|Operation                                                                      |Signature                |Construct target|Destroy (old) target|Empty source|
|:------------------------------------------------------------------------------|:------------------------|:--------------:|:------------------:|:----------:|
|Constructor                                                                    |`T(...)`                 |✓               |N/A                 |N/A         |
|[Destructor](https://en.cppreference.com/w/cpp/language/destructor)            |`~T()`                   |❌              |✓                   |N/A         |
|[Copy constructor](https://en.cppreference.com/w/cpp/language/copy_constructor)|`T(const T&)`            |✓               |N/A                 |❌          |
|[Copy assignment](https://en.cppreference.com/w/cpp/language/copy_assignment)  |`T& operator=(const T&)` |✓               |✓                   |❌          |
|[Move constructor](https://en.cppreference.com/w/cpp/language/move_constructor)|`T(T&&)`                 |✓               |N/A                 |✓           |
|[Move assignment](https://en.cppreference.com/w/cpp/language/move_assignment)  |`T& operator=(T&&)`      |✓               |✓                   |✓           |

## Conclusion

Manual resource management (in particular, memory management) and RAII
are defining features of C++, features that clearly set it apart from
other object-oriented languages like Java or C#.  C++ gives you a lot
of control over the low-level details, and some powerful tools to make
use of it, in exchange for a lot of complexity that you must, at the
very least, be aware of.