# Making functions public for tests only... with C macros!

*Note from the future (2024-09-05): in the first version of this post,
I used the identifier `_static` instead of `STATIC` for a macro.
I have recently found out that one should almost never use identifiers
starting with an underscore, so I have edited this post accordingly. See
[this StackOverflow question](https://stackoverflow.com/questions/69084726/what-are-the-rules-about-using-an-underscore-in-a-c-identifier)
for an explanation.*

As a programmer, I often face this dilemma: should I make this
function private to improve encapsulation, or should I make it
public so that I can write tests for it? I believe this problem is
especially felt in scientific computing, or when implementing big,
complex algorithms whose many small substeps have no place in a
public interface, but should be unit-tested anyway.

Until recently, I essentially had two ways to deal with this (with
a strong preference for the first one):

* Make the function public: tests are important. Who cares about visibility?
* Make the function private and skip the tests. Errors will be caught when
  testing the higher-level routine that calls this smaller function.

But a few days ago I thought of a cool trick (that realistically
has been known for at least 45 years, I just was not aware of it
before) to solve this problem for my C projects, using conditional
compilation. Let's dive in!

## Function visibility in C

By default, functions in C are "public", by which I mean visible to any other
*[translation unit](https://en.wikipedia.org/wiki/Translation_unit_%28programming%29)*
(file).
For example, say you have the following files:

`foo.c`:

```
int foo(int x, int y) {
    return 42*x - 69*y;
}
```

`main.c`:

```
#include <stdio.h>

int foo(int, int); // Function prototype

int main() {
    int z = foo(10, 1);
    printf("%d\n", z);
    return 0;
}
```

You can build them with `gcc foo.c main.c`, and the program will
run correctly and output `351`. Usually, the function prototype is
put in a separate `foo.h` file and it is included in `main.c` with
`#include "foo.h"`.

This works because a C program is built into an executable in two
steps: *[compiling](https://en.wikipedia.org/wiki/Compiler)* and
*[linking](https://en.wikipedia.org/wiki/Linker_(computing))*.
During the first of these two, each file is translated into
*[object code](https://en.wikipedia.org/wiki/Object_file)*; if the
compiler finds a reference to a function whose body is not present in
the same file (like our `foo()` in `main.c`), it does not complain,
but it trusts the programmer that this function is implemented somewhere
else. Then it is the turn of the linker, whose job is exactly to
put together the object files and resolve these function calls;
the linker *does* complain if the body of `foo()` is nowhere to be found.

All of this is different for functions marked as `static`. These are
only visible inside the file where they are defined.

## Why make functions `static`?

There are a couple of reasons why one should make (some) functions
`static`:

* As a hint to other programmers: similarly to the `private` modifier
  in object-oriented languages, `static` immediately communicates that
  this function is only used locally, and will not be called from other
  modules. It also prevents someone from calling it from another file
  by mistake.
* As a hint to the compiler: if a compiler sees a `static` function, it
  knows all the places where this function is called, and it can
  choose to optimize out all the
  [assembly boilerplate](https://en.wikipedia.org/wiki/Calling_convention)
  related to function calls and
  [inline it](https://en.wikipedia.org/wiki/Inline_expansion).

To illustrate the second point, I have put all the code of the
previous example in the same file [`main2.c`](./main2.c). You can
compile it with `gcc -O1 -S main2.c` to enable optimizations and
generate the assembly code instead of an executable. I have uploaded
the output here: [`main2.s`](./main2.s). Then you can do the same with
[`main3.c`](./main3.c), whose only difference is that `foo()` is now
static, and check the resulting [`main3.s`](./main3.s).

As you can see, the section labelled `foo:` has disappeared. This
is because the compiler knows that the function will not be needed
anywhere else; it inlined it everywhere it saw a reference to it and
called it a day.

You may also see that `foo` was actually inlined in *both* examples,
and the call to it replaced by the constant `351`. Oh well, at least
the compiler got rid of some useless code in the second case, and
the binary will be smaller.

## The trick

The trick I came up with is the following:

```
#ifdef TEST
#define STATIC
#else
#define STATIC static
#endif
```

Now put the snippet above at the top of the C file where the functions
you want to test are implemented and declare your functions as
`STATIC`. When you compile your code normally,
these functions will be compiled as `static`, but if you use the
`-DTEST` option, `STATIC` will expand to nothing and the functions
will be visible outside the file.

Here is a complete example.

[`foo4.c`](./foo4.c):

```
#include <stdio.h>

#ifdef TEST
#define STATIC
#else
#define STATIC static
#endif

STATIC int foo(int x, int y)
{
    return 42*x - 69*y;
}
```

[`test4.c`](./test4.c):

```
#include <stdio.h>

int foo(int, int);

int main() {
    int result = foo(1, 1);

    if (result == -27) {
        fprintf(stderr, "Test passed\n");
        return 0;
    } else {
        fprintf(stderr, "Test failed: expected -27, got %d\n", result);
        return 1;
    }
}
```

You can download the source files (links above) and try for yourself:
build with `gcc foo4.c test4.c` and you'll get a linker error like
`undefined symbol: foo` (the exact wording depends on your linker);
build with `gcc -DTEST foo4.c test4.c` and run `./a.out` to see the
test pass!

## Related tricks

A few days before coming up with this trick, I had learned about a
similar use of C macros that is useful for debugging. I wanted some
extra logging that is enabled only when I choose, for example with
a `-DDEBUG` option. What I used to do was throw `#ifdef`s all over
my codebase, like this:

```
if (flob < 0) {
#ifdef DEBUG
    fprintf(stderr, "Invalid value for flob: %d\n", flob);
#endif
    return -1;
}
```

But what I have found (on the
[Wikipedia page on the C preprocessor](https://en.wikipedia.org/wiki/C_preprocessor))
is that you can use a single `#ifdef` at the top of your file:

```
#ifdef DEBUG
#define DBG_LOG(...) fprintf(stderr, __VA_ARGS__)
#else
#define DBG_LOG(...)
#endif

/* More code ... */

if (flob < 0) {
    DBG_LOG("Invalid value for flob: %d\n", flob);
    return -1;
}
```

Here I am using a *variadic macro*, which was introduced in C99 and
is not part of C89. If your compiler defaults to an older standard,
you'll have to build with `-std=c99` or a similar option to try this out.
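
To see how the pieces fit together in a whole file, here is a minimal sketch. The function `check_flob()` is a made-up name for illustration, it does not come from any real codebase; the macro is the same `DBG_LOG` as above.

```c
#include <stdio.h>

/* Logging is compiled in only when building with -DDEBUG;
 * otherwise DBG_LOG(...) expands to nothing. */
#ifdef DEBUG
#define DBG_LOG(...) fprintf(stderr, __VA_ARGS__)
#else
#define DBG_LOG(...)
#endif

/* Hypothetical helper: returns -1 for a negative flob, 0 otherwise. */
int check_flob(int flob)
{
    if (flob < 0) {
        DBG_LOG("Invalid value for flob: %d\n", flob);
        return -1;
    }
    return 0;
}
```

Calling `check_flob(-3)` returns `-1` in both builds; only the build with `-DDEBUG` also prints the message to standard error.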

Sometimes the part I want to conditionally compile is not just the
logging, but the whole conditional statement. To do this,
I actually use something like this in my code:

```
#ifdef DEBUG
#define DBG_ASSERT(condition, value, ...) \
    if (!(condition)) {                   \
        fprintf(stderr, __VA_ARGS__);     \
        return value;                     \
    }
#else
#define DBG_ASSERT(...)
#endif

/* More code ... */

DBG_ASSERT(flob >= 0, -1, "Invalid value for flob: %d\n", flob);
```

Here `condition` can be any C expression. Macros are powerful!

## Conclusion

Depending on your taste, you may find this a clean way to write
C code, or a disgusting hack that should never be used.

If you are working on a project where you can choose your own coding
style, I encourage you to try out tricks like this and see for
yourself if you like them or not. In the worst case, you'll make
mistakes and learn what *not* to do next time!