dc.md (11855B)
1 # The man page reading club: dc(1) 2 3 *This post is part of a [series](../../series)* 4 5 For this episode I have decided to go back to the basics, in multiple 6 ways. Indeed `dc`, the *desk calculator*, is: 7 8 * A calculator, the most basic functionality for a computer to be called 9 so - "computer" *literaly* means "calculator". 10 * A [stack-machine](https://en.wikipedia.org/wiki/Stack_machine), one of 11 the most basic 12 [Turing-complete](https://en.wikipedia.org/wiki/Turing-complete) 13 computational models. 14 * One of the oldest UNIX utilities, predating even the C language - in fact, 15 it was originally written in 16 [B](https://en.wikipedia.org/wiki/B_programming_language). 17 18 But is it also a practical tool to use? Let's find out! 19 20 ## dc(1) 21 22 *Follow along at [man.openbsd.org](http://man.openbsd.org/OpenBSD-7.2/dc)* 23 24 There are a few features marked as non-portable in the manual page, most 25 of them relevant to OpenBSD's [bc](http://man.openbsd.org/OpenBSD-7.2/bc) 26 implementation. To make the post a bit shorter, I have decided to skip 27 all of them. 28 29 The first few lines of the manual page explain that `dc` uses 30 [reverse Polish notation](https://en.wikipedia.org/wiki/Reverse_polish_notation): 31 numbers can be pushed onto a stack, and operations are performed on the top 32 (or top two) numbers on the stack, their result being pushed back onto the 33 stack to replace the operands. 34 35 `dc` allows to set an arbitrary precision (here called *scale*), as well 36 as different bases for input and output - for example, you may want to 37 input your numbers in binary and read the output in hexadecimal. The 38 output base can be any number greater than 1, but the input base must 39 be between 2 and 16. 40 41 The most basic operation you can perform is simply pushing a number 42 onto the stack. Letters A to F can be used to input numbers in bases 43 higher than 10, and negative numbers are written with an underscore `_` 44 instead of dash `-`. 45 46 The commands are listed in alphabetic order in the manual page, but I will 47 instead separate them in more logical sections. 48 49 ### Basic operations 50 51 The most basic operations are `+` (sum), `-` (subtraction), `*` 52 (multiplication), `/` (division), `%` (remainder or modulus) and 53 `^` (exponentiation). There is also `v` (square root). 54 55 For example the command `4 7-` results in `-3`. You can input it 56 like that, all on one line and without any whitespace between the 57 `7` and the `-`. But if you do, you won't get any output. Why? 58 59 ### Stack manipulation 60 61 Operations remove one or more numbers from the stack and push back the 62 result. So the answer to the previous question is: the result was pushed 63 onto the stack, but no instruction was given to print it. 64 65 This can be done with the command `p`, which prints the top number in 66 the stack. The command `f` prints the whole stack. Both of them leave 67 the stack unchanged. So for example: 68 69 ``` 70 $ echo '4 7-p' | dc 71 -3 72 ``` 73 74 *(Notice how we redirected the output of `echo` to be read by `dc` - 75 I'll never get tired of 76 [this](https://en.wikipedia.org/wiki/Pipeline_(Unix)))* 77 78 Commands that manipulate the stack are `c` to clear the whole stack 79 and `d` to duplicate the top element. The command `z` pushes onto the 80 stack the number of elements currently on the stack. 81 82 ### Scale and bases 83 84 As mentioned at the beginning, some global parameters can be set: 85 input base, output base and scale. This can be done with the commands 86 `i`, `o`, and `k`, respectively: each of them pops the top element 87 of the stack and uses it as value to set the respective global parameter. 88 89 The capitalized version of these commands, `I`, `O` and `K`, read 90 the value of the input base, the output base or scale respectively 91 and push it onto the stack 92 93 Each number on the stack has its own scale, too. This value is derived 94 from the global scale and the scales of the operands used to compute it. 95 More precisely: 96 97 ``` 98 For addition and subtraction, the scale of the result is the maximum 99 of scales of the operands. For division the scale of the result 100 is defined by the scale set by the k operation. For multiplication, 101 the scale is defined by the expression min(a+b,max(a,b,scale)), 102 where a and b are the scales of the operands, and scale is the scale 103 defined by the k operation. For exponentiation with a non-negative 104 exponent, the scale of the result is min(a*b,max(scale,a)), where 105 a is the scale of the base, and b is the value of the exponent. If 106 the exponent is negative, the scale of the result is the scale 107 defined by the k operation. 108 ``` 109 110 The command `X` can be used to replace the top number with its 111 scale. Similarly, the command `Z` replaces the top number with its 112 length, i.e. its number of digits (not counting eventual decimal point 113 or negative sign). 114 115 ### Registers and arrays 116 117 So far we have seen that `dc` can do everything that a rather basic 118 RPN calculator can do. Things are going to get 119 much more interesting in the next two sections. 120 121 `dc` allows the use of 256 *registers* to store data. Each register 122 is labelled by a single byte - in practice, an ASCII character. 123 This character can be anything, even a whitespace or a non-printable 124 character, so make sure not to put unneeded whitespace before a 125 register name. 126 127 The actual structure of registers was not very clear to me from the 128 manual page. I had to read the relevant section and command 129 descriptions a few times, and in the end I resorted to the ultimate 130 technique: try it out. (I skipped the "read the source code" step, 131 please forgive my impurity.) 132 133 It turns out that each register is a stack, each level of which 134 contains both a single number and an unbounded array of numbers. 135 The single number and the array can be manipulated separately. All 136 the values default to 0 if unset. 137 138 The command `sr` can be used to pop the top element of stack and 139 save it as the "single" value of register `r`. You can replace `r` 140 by any other ASCII character to manipulate other registers. To load 141 the "single" value from register `r` onto the main stack, you can 142 use `lr`; this command does not alter the state of the register. 143 144 To manipulate a register's array, you can use `;r` and `:r`: 145 146 ``` 147 :r Pop two values from the stack. The second value on the stack is 148 stored into the array r indexed by the top of stack. 149 150 ;r Pop a value from the stack. The value is used as an index into 151 register r. The value in this register is pushed onto the stack. 152 ``` 153 154 So for example `42 3:r` stores the number 42 in the third position of 155 the array of register `r`, and `3;r` retrieves this value. 156 157 So far so good. But I said that each register is actually a stack. What 158 did I mean by that? 159 160 The commands `Sr` and `Lr` (capital S and L) can be used for this: `Sr` 161 creates a new stack level on register `r`, pops the top value of the 162 main stack, and saves that value as the "single" value. In doing so, 163 a new level of the register's array is also created. Conversely, `Lr` 164 pops a level of register `r` and pushes its single value onto the main 165 stack, deleting the whole array saved on the level that was popped. 166 167 Let's work out an example to help us understand this. First, we push 168 some numbers on register `a`: 169 170 ``` 171 1sa 172 100 0:a 101 1:a 102 2:a 173 ``` 174 175 Now register `a` looks something like this: 176 177 ``` 178 Level 1 --- single value: 1 --- array: 100 101 102 0 0 ... 179 ``` 180 181 You can confirm this by running the commands `la 0;a 1;a 2;a f`, 182 which should output the numbers 102, 101, 100 and 1, one per line. 183 184 Now let's push another level onto the register with `2Sa`. The 185 register now looks something like this: 186 187 ``` 188 Level 2 --- single value: 2 --- array: 0 0 0 0 0 ... 189 Level 1 --- single value: 1 --- array: 100 101 102 0 0 ... 190 ``` 191 192 Running the same command as before (`la 0;a 1;a 2;a f`) should 193 now yield 0, 0, 0, 2. 194 195 Lastly, let's pop the top level of this register with `La`. Now 196 it should look like this again: 197 198 ``` 199 Level 1 --- single value: 1 --- array: 100 101 102 0 0 ... 200 ``` 201 202 And you can check this with the usual command. If you do, you'll 203 notice that the number `2` has also been pushed on the main stack 204 by the `La` command. 205 206 Phew, this was a long one! And we have not reached the most 207 interesting part yet... 208 209 ### Strings and macros 210 211 In `dc` you can work not only with numbers, but also with strings. 212 You can input a string by enclosing it in square brackets, like 213 this: `[Hello, World!]`. Square brackets can appear in a string 214 if they are either balanced or escaped by a backslash. 215 216 Strings can be pushed onto the main stack or saved in any register 217 like numbers. But what can you do with them? One thing you can do 218 is print them with the `P` command: 219 220 ``` 221 [Hello, World! 222 ]P 223 Hello, World! 224 ``` 225 226 As you can see, it is very easy to include a newline in a string. 227 228 But much more interesting is the fact that you can *execute* strings 229 with the `x` command. This allows you to create macros. For example, 230 say you want to evaluate the function `p(x)=x^2+2x-1`. Since we are 231 working in RPN, it is probably easier to rewrite `p(x)` as 232 `x(x+2)-1`. If your number `x` is on the stack, you can compute `p(x)` 233 with the commands `d2+*1-`. But what if you want to do this 234 multiple times? Here macros can help: 235 236 ``` 237 [d2+*1-]sp 238 ``` 239 240 Now we have saved the macro "evaluate p(x)" on the register `p`. We 241 can execute it any time we want by loading it with `lp` and then 242 executing it with `x`: 243 244 ``` 245 3 lpx 246 _2 lpx 247 1 lpx 248 f 249 ``` 250 251 Should give 2, -1, 14. 252 253 ### Conditionals 254 255 Lastly, we can control the flow of macro execution using conditionals: 256 257 ``` 258 <x >x =x !<x !>x !=x 259 The top two elements of the stack are popped and compared. 260 Register x is executed if they obey the stated relation. 261 ``` 262 263 Let's see a simple example: computing the average of all numbers 264 on the stack. 265 266 First we need to save the number of elements somewhere, say in the 267 register `n`. We can do this with `zsn`. Then we need to sum the 268 whole stack. We can do this by calling `+` until the stack is only 269 one element left... this sounds like a loop, but we can use recursion 270 instead: 271 272 ``` 273 [+z1<a]sa 274 ``` 275 276 This saves the the macro `[+z1<a]` in register `a`, achieving recursion: 277 the macro starts by summing the top two numbers, then pushes the number 278 of elements left onto the stack with `z`, followed by one. It then pops 279 these two numbers and calls itself if the top one is less then the second. 280 281 Putting this all together, we can compute the average of a bunch of 282 numbers, say to two decimal digits, like this: 283 284 ``` 285 10 12 11 9 8 10 11 10 10 286 2k 287 [+z1<a]sa 288 zsnlaxln/p 289 ``` 290 291 Not the most legible code, but quite short! 292 293 ## Conclusion 294 295 In the end I managed to write a rather lengthy post about something as 296 simple as a desk calculator. And I have even skipped some things, like 297 recursion levels and the `?` command! 298 299 Initially I wanted to write about 300 [bc(1)](http://man.openbsd.org/OpenBSD-7.2/bc), the other standard UNIX 301 calculator. It works with the more familiar infix notation and has 302 for loops, if / else statements and functions. I even wrote a 303 [small library of mathematical functions](https://git.tronto.net/bclibrary) 304 to show off! But in the end I thought it would be boring, so I decided 305 to learn and write about `dc` instead. In practice I am likely going to 306 use bc and my hand-written math library for most purposes - except 307 maybe computing averages, that was one example where the terseness of 308 `dc` can come in handy. 309 310 Fun fact (from the bc manual page): 311 312 ``` 313 bc is actually a preprocessor for dc(1), which it invokes automatically, 314 unless the -c (compile only) option is present. In this case the 315 generated dc(1) instructions are sent to the standard output, instead 316 of being interpreted by a running dc(1) process. 317 ``` 318 319 I think it would be a fun excercise to try and re-implement `dc`, and 320 then bc as a compiler to `dc` code. I could learn a few things about 321 compilers with this project! But for now I'll have to put it in the 322 ever-growing list of "one day, maybe" ideas.