# Making functions public for tests only... with C macros!

*Note from the future (2024-09-05): in the first version of this post,
I used the identifier `_static` instead of `STATIC` for a macro.
I have recently found out that one should almost never use identifiers
starting with an underscore, so I have edited this post accordingly. See
[this StackOverflow question](https://stackoverflow.com/questions/69084726/what-are-the-rules-about-using-an-underscore-in-a-c-identifier)
for an explanation.*

As a programmer, I often face this dilemma: should I make this
function private to improve encapsulation, or should I make it
public so that I can write tests for it? I believe this problem is
especially felt in scientific computing, or when implementing big,
complex algorithms whose many small substeps have no place in a
public interface, but should be unit-tested anyway.

Until recently, I essentially had two ways to deal with this (with
a strong preference for the first one):

* Make the function public: tests are important. Who cares about visibility?
* Make the function private and skip the tests. Errors will be caught when
  testing the higher-level routine that calls this smaller function.

But a few days ago I thought of a cool trick (that realistically
has been known for at least 45 years, I just was not aware of it
before) to solve this problem for my C projects, using conditional
compilation.  Let's dive in!

## Function visibility in C

By default, functions in C are "public", by which I mean visible to any other
*[translation unit](https://en.wikipedia.org/wiki/Translation_unit_%28programming%29)*
(file). For example, say you have the following files:

`foo.c`:

```
int foo(int x, int y) {
	return 42*x - 69*y;
}
```

`main.c`:

```
#include <stdio.h>

int foo(int, int); // Function prototype

int main() {
	int z = foo(10, 1);
	printf("%d\n", z);
	return 0;
}
```

You can build them with `gcc foo.c main.c`, and the program will
run correctly and output `351`. Usually, the function prototype is
put in a separate `foo.h` file and it is included in `main.c` with
`#include "foo.h"`.

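As a rough sketch of that layout: the include-guard name `FOO_H` is just a
placeholder, and the header and the source file are shown back to back in
a single listing only so that it compiles on its own.

```c
/* foo.h (sketch): the public interface of foo.c */
#ifndef FOO_H
#define FOO_H

int foo(int x, int y);

#endif /* FOO_H */

/* foo.c (sketch): in a real project this file would start with
 * #include "foo.h", so that the compiler can check that the
 * definition below matches the prototype above. */
int foo(int x, int y)
{
	return 42*x - 69*y;
}
```
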
This works because a C program is built into an executable in two
steps: *[compiling](https://en.wikipedia.org/wiki/Compiler)* and
*[linking](https://en.wikipedia.org/wiki/Linker_(computing))*.
During the first of these two, each file is translated into
*[object code](https://en.wikipedia.org/wiki/Object_file)*; if the
compiler finds a reference to a function whose body is not present in
the same file - like our `foo()` in `main.c` - it does not complain,
but it trusts the programmer that this function is implemented somewhere
else. Then it is the turn of the linker, whose job is exactly to
put together the object files and resolve these function calls;
the linker *does* complain if the body of `foo()` is nowhere to be found.

All of this is different for functions marked as `static`. These are
only visible inside the file where they are defined.

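To make the difference concrete, here is a minimal sketch of a file that
mixes the two kinds of functions. The name `weighted_diff` is made up for
this illustration; `foo()` computes the same value as before, but now
delegates to a file-local helper.

```c
/* Internal linkage: this name is visible only inside this file. */
static int weighted_diff(int x, int y)
{
	return 42*x - 69*y;
}

/* External linkage: other files can declare and call foo(). */
int foo(int x, int y)
{
	return weighted_diff(x, y);
}
```

If another file declared `int weighted_diff(int, int);` and called it,
that file would still compile, but the linker would not find the function.
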
## Why make functions `static`?

There are a couple of reasons why one should make (some) functions
`static`:

* As a hint to other programmers: similarly to the `private` modifier
  in object-oriented languages, `static` immediately communicates that
  this function is only used locally, and will not be called from other
  modules. It also prevents someone from calling it from another file
  by mistake.
* As a hint to the compiler: if a compiler sees a `static` function, it
  knows all the places where this function is called, and it can
  choose to optimize out all the
  [assembly boilerplate](https://en.wikipedia.org/wiki/Calling_convention)
  related to function calls and
  [inline it](https://en.wikipedia.org/wiki/Inline_expansion).

To illustrate the second point, I have put all the code of the
previous example in the same file [`main2.c`](./main2.c). You can
compile it with `gcc -O1 -S main2.c` to enable optimizations and
generate the assembly code instead of an executable. I have uploaded
the output here: [`main2.s`](./main2.s). Then you can do the same with
[`main3.c`](./main3.c), whose only difference is that `foo()` is now
static, and check the resulting [`main3.s`](./main3.s).

As you can see, the section labelled `foo:` has disappeared. This
is because the compiler knows that it will not be needed anywhere
else; it inlined the function at every call site and called
it a day.

You may also see that `foo` was actually inlined in *both* examples,
and the call to it replaced by the constant `351`. Oh well, at least
the compiler got rid of some useless code in the second case, and
the binary will be smaller.

## The trick

The trick I came up with is the following:

```
#ifdef TEST
#define STATIC
#else
#define STATIC static
#endif
```

Now put the snippet above at the top of the C file where the functions
you want to test are implemented and declare your functions as
`STATIC`. When you compile your code normally, these functions will
be compiled as `static`, but if you use the `-DTEST` option, `STATIC`
will expand to nothing and the functions will be visible outside the
file.

Here is a complete example.

[`foo4.c`](./foo4.c):

```
#include <stdio.h>

#ifdef TEST
#define STATIC
#else
#define STATIC static
#endif

STATIC int foo(int x, int y)
{
	return 42*x - 69*y;
}
```

[`test4.c`](./test4.c):

```
#include <stdio.h>

int foo(int, int);

int main() {
	int result = foo(1, 1);

	if (result == -27) {
		fprintf(stderr, "Test passed\n");
		return 0;
	} else {
		fprintf(stderr, "Test failed: expected -27, got %d\n", result);
		return 1;
	}
}
```

You can download the source files (links above) and try for yourself:
build with `gcc foo4.c test4.c` and you'll get a linker error along the
lines of `undefined symbol: foo` (the exact wording depends on your
linker); build with `gcc -DTEST foo4.c test4.c` and run `./a.out` to
see the test pass!

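In a larger project, the prototypes needed by the tests could live in a
header that only exposes them in test builds. The sketch below assumes a
hypothetical header, called `foo_test.h` here, and shows everything in one
listing; `TEST` is defined by hand only so that the listing compiles on
its own, whereas normally it would come from `-DTEST`.

```c
/* Normally passed as -DTEST on the command line; defined here only
 * to keep the sketch self-contained. */
#define TEST

#ifdef TEST
#define STATIC
#else
#define STATIC static
#endif

/* foo_test.h (hypothetical): prototypes exposed in test builds only */
#ifdef TEST
int foo(int x, int y);
#endif

/* foo.c: static in normal builds, externally visible in test builds */
STATIC int foo(int x, int y)
{
	return 42*x - 69*y;
}
```

This way the test files include one header instead of repeating each
prototype by hand, and a normal build never sees those declarations.
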
## Related tricks

A few days before coming up with this trick, I had learned about a
similar use of C macros for debugging purposes. I wanted to
have some extra logging to be enabled only when I chose so, for
example when using a `-DDEBUG` option. What I used to do was throw
`#ifdef`s all over my codebase, like this:

```
	if (flob < 0) {
#ifdef DEBUG
		fprintf(stderr, "Invalid value for flob: %d\n", flob);
#endif
		return -1;
	}
```

But what I have found (on the
[Wikipedia page on the C preprocessor](https://en.wikipedia.org/wiki/C_preprocessor))
is that you can use a single `#ifdef` at the top of your file:

```
#ifdef DEBUG
#define DBG_LOG(...) fprintf(stderr, __VA_ARGS__)
#else
#define DBG_LOG(...)
#endif

/* More code ... */

	if (flob < 0) {
		DBG_LOG("Invalid value for flob: %d\n", flob);
		return -1;
	}
```

Here I am using a *variadic macro*, which is supported in C99 but not,
as far as I know, in C89. If you want to try this out and your compiler
defaults to an older standard, you'll have to build with `-std=c99` or
a similar option.

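For reference, a self-contained version of this pattern might look like
the following. The function `check_flob` is hypothetical, and expanding
the disabled macro to `((void)0)` instead of nothing is a small variation
on the snippet above: it keeps `DBG_LOG(...)` a valid expression wherever
it appears.

```c
#include <stdio.h>

#ifdef DEBUG
#define DBG_LOG(...) fprintf(stderr, __VA_ARGS__)
#else
/* Variation: a no-op expression rather than an empty expansion */
#define DBG_LOG(...) ((void)0)
#endif

/* Hypothetical function: returns -1 for negative values, 0 otherwise.
 * The message is only printed when building with -DDEBUG. */
int check_flob(int flob)
{
	if (flob < 0) {
		DBG_LOG("Invalid value for flob: %d\n", flob);
		return -1;
	}

	return 0;
}
```
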
Sometimes the part I want to conditionally compile is not just the
information logging, but the whole conditional expression. To do this,
I actually use something like this in my code:

```
#ifdef DEBUG
#define DBG_ASSERT(condition, value, ...)     \
	if (!(condition)) {                   \
		fprintf(stderr, __VA_ARGS__); \
		return value;                 \
	}
#else
#define DBG_ASSERT(...)
#endif

/* More code ... */

	DBG_ASSERT(flob >= 0, -1, "Invalid value for flob: %d\n", flob);
```

Here `condition` can be any C expression. Macros are powerful!

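One caveat: a macro that expands to a bare `if` does not behave like a
single statement, for example when it is used as the body of an unbraced
`if` that has an `else` branch. A common idiom to avoid this is to wrap
the body in `do { ... } while (0)`. The sketch below is a variation on the
macro above, not the version shown there; `flob_parity` is a hypothetical
function, and `DEBUG` is defined by hand only to keep the listing
self-contained.

```c
#include <stdio.h>

/* Normally passed as -DDEBUG; defined here only for the sketch. */
#define DEBUG

#ifdef DEBUG
#define DBG_ASSERT(condition, value, ...) \
	do { \
		if (!(condition)) { \
			fprintf(stderr, __VA_ARGS__); \
			return value; \
		} \
	} while (0)
#else
#define DBG_ASSERT(...) do { } while (0)
#endif

/* Hypothetical function: returns 1 for odd values, 0 for valid even
 * values and, in debug builds, -1 for negative even values. */
int flob_parity(int flob)
{
	/* With a bare-if macro, the ';' after the macro call would end
	 * the outer if, and the else below would be a syntax error. */
	if (flob % 2 == 0)
		DBG_ASSERT(flob >= 0, -1, "Invalid value for flob: %d\n", flob);
	else
		return 1;

	return 0;
}
```

With `DEBUG` undefined, the call expands to an empty loop that the
compiler can drop, and `flob_parity(-2)` would simply return `0`.
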
## Conclusion

Depending on your taste, you may find this a clean way to write
C code, or a disgusting hack that should never be used.

If you are working on a project where you can choose your own coding
style, I encourage you to try out tricks like this and see for
yourself if you like them or not. In the worst case, you'll make
mistakes and learn what *not* to do next time!