sebastiano.tronto.net

Source files and build scripts for my personal website
git clone https://git.tronto.net/sebastiano.tronto.net
Download | Log | Files | Refs | README

dc.md (11855B)


      1 # The man page reading club: dc(1)
      2 
      3 *This post is part of a [series](../../series)*
      4 
      5 For this episode I have decided to go back to the basics, in multiple
      6 ways. Indeed `dc`, the *desk calculator*, is:
      7 
      8 * A calculator, the most basic functionality for a computer to be called
      9   so - "computer" *literaly* means "calculator".
     10 * A [stack-machine](https://en.wikipedia.org/wiki/Stack_machine), one of
     11   the most basic
     12   [Turing-complete](https://en.wikipedia.org/wiki/Turing-complete) 
     13   computational models.
     14 * One of the oldest UNIX utilities, predating even the C language - in fact,
     15   it was originally written in
     16   [B](https://en.wikipedia.org/wiki/B_programming_language).
     17 
     18 But is it also a practical tool to use? Let's find out!
     19 
     20 ## dc(1)
     21 
     22 *Follow along at [man.openbsd.org](http://man.openbsd.org/OpenBSD-7.2/dc)*
     23 
     24 There are a few features marked as non-portable in the manual page, most
     25 of them relevant to OpenBSD's [bc](http://man.openbsd.org/OpenBSD-7.2/bc)
     26 implementation. To make the post a bit shorter, I have decided to skip
     27 all of them.
     28 
     29 The first few lines of the manual page explain that `dc` uses
     30 [reverse Polish notation](https://en.wikipedia.org/wiki/Reverse_polish_notation):
     31 numbers can be pushed onto a stack, and operations are performed on the top
     32 (or top two) numbers on the stack, their result being pushed back onto the
     33 stack to replace the operands.
     34 
     35 `dc` allows to set an arbitrary precision (here called *scale*), as well
     36 as different bases for input and output - for example, you may want to
     37 input your numbers in binary and read the output in hexadecimal. The
     38 output base can be any number greater than 1, but the input base must
     39 be between 2 and 16.
     40 
     41 The most basic operation you can perform is simply pushing a number
     42 onto the stack. Letters A to F can be used to input numbers in bases
     43 higher than 10, and negative numbers are written with an underscore `_`
     44 instead of dash `-`.
     45 
     46 The commands are listed in alphabetic order in the manual page, but I will
     47 instead separate them in more logical sections.
     48 
     49 ### Basic operations
     50 
     51 The most basic operations are `+` (sum), `-` (subtraction), `*`
     52 (multiplication), `/` (division), `%` (remainder or modulus) and
     53 `^` (exponentiation). There is also `v` (square root).
     54 
     55 For example the command `4 7-` results in `-3`.  You can input it
     56 like that, all on one line and without any whitespace between the
     57 `7` and the `-`. But if you do, you won't get any output.  Why?
     58 
     59 ### Stack manipulation
     60 
     61 Operations remove one or more numbers from the stack and push back the
     62 result. So the answer to the previous question is: the result was pushed
     63 onto the stack, but no instruction was given to print it.
     64 
     65 This can be done with the command `p`, which prints the top number in
     66 the stack. The command `f` prints the whole stack.  Both of them leave
     67 the stack unchanged. So for example:
     68 
     69 ```
     70 $ echo '4 7-p' | dc
     71 -3
     72 ```
     73 
     74 *(Notice how we redirected the output of `echo` to be read by `dc` -
     75 I'll never get tired of
     76 [this](https://en.wikipedia.org/wiki/Pipeline_(Unix)))*
     77 
     78 Commands that manipulate the stack are `c` to clear the whole stack
     79 and `d` to duplicate the top element. The command `z` pushes onto the
     80 stack the number of elements currently on the stack.
     81 
     82 ### Scale and bases
     83 
     84 As mentioned at the beginning, some global parameters can be set:
     85 input base, output base and scale. This can be done with the commands
     86 `i`, `o`, and `k`, respectively: each of them pops the top element
     87 of the stack and uses it as value to set the respective global parameter.
     88 
     89 The capitalized version of these commands, `I`, `O` and `K`, read
     90 the value of the input base, the output base or scale respectively
     91 and push it onto the stack
     92 
     93 Each number on the stack has its own scale, too. This value is derived
     94 from the global scale and the scales of the operands used to compute it.
     95 More precisely:
     96 
     97 ```
     98 For addition and subtraction, the scale of the result is the maximum
     99 of scales of the operands.  For division the scale of the result
    100 is defined by the scale set by the k operation.  For multiplication,
    101 the scale is defined by the expression min(a+b,max(a,b,scale)),
    102 where a and b are the scales of the operands, and scale is the scale
    103 defined by the k operation.  For exponentiation with a non-negative
    104 exponent, the scale of the result is min(a*b,max(scale,a)), where
    105 a is the scale of the base, and b is the value of the exponent.  If
    106 the exponent is negative, the scale of the result is the scale
    107 defined by the k operation.
    108 ```
    109 
    110 The command `X` can be used to replace the top number with its
    111 scale. Similarly, the command `Z` replaces the top number with its
    112 length, i.e. its number of digits (not counting eventual decimal point
    113 or negative sign).
    114 
    115 ### Registers and arrays
    116 
    117 So far we have seen that `dc` can do everything that a rather basic
    118 RPN calculator can do. Things are going to get
    119 much more interesting in the next two sections.
    120 
    121 `dc` allows the use of 256 *registers* to store data. Each register
    122 is labelled by a single byte - in practice, an ASCII character.
    123 This character can be anything, even a whitespace or a non-printable
    124 character, so make sure not to put unneeded whitespace before a
    125 register name.
    126 
    127 The actual structure of registers was not very clear to me from the
    128 manual page. I had to read the relevant section and command
    129 descriptions a few times, and in the end I resorted to the ultimate
    130 technique: try it out.  (I skipped the "read the source code" step,
    131 please forgive my impurity.)
    132 
    133 It turns out that each register is a stack, each level of which
    134 contains both a single number and an unbounded array of numbers.
    135 The single number and the array can be manipulated separately. All
    136 the values default to 0 if unset.
    137 
    138 The command `sr` can be used to pop the top element of stack and
    139 save it as the "single" value of register `r`. You can replace `r`
    140 by any other ASCII character to manipulate other registers. To load
    141 the "single" value from register `r` onto the main stack, you can
    142 use `lr`; this command does not alter the state of the register.
    143 
    144 To manipulate a register's array, you can use `;r` and `:r`:
    145 
    146 ```
    147 :r   Pop two values from the stack.  The second value on the stack is
    148      stored into the array r indexed by the top of stack.
    149 
    150 ;r   Pop a value from the stack.  The value is used as an index into
    151      register r.  The value in this register is pushed onto the stack.
    152 ```
    153 
    154 So for example `42 3:r` stores the number 42 in the third position of
    155 the array of register `r`, and `3;r` retrieves this value.
    156 
    157 So far so good. But I said that each register is actually a stack. What
    158 did I mean by that?
    159 
    160 The commands `Sr` and `Lr` (capital S and L) can be used for this: `Sr`
    161 creates a new stack level on register `r`, pops the top value of the
    162 main stack, and saves that value as the "single" value. In doing so,
    163 a new level of the register's array is also created. Conversely, `Lr`
    164 pops a level of register `r` and pushes its single value onto the main
    165 stack, deleting the whole array saved on the level that was popped.
    166 
    167 Let's work out an example to help us understand this. First, we push
    168 some numbers on register `a`:
    169 
    170 ```
    171 1sa
    172 100 0:a 101 1:a 102 2:a
    173 ```
    174 
    175 Now register `a` looks something like this:
    176 
    177 ```
    178 Level 1 --- single value: 1 --- array: 100 101 102   0   0 ...
    179 ```
    180 
    181 You can confirm this by running the commands `la 0;a 1;a 2;a f`,
    182 which should output the numbers 102, 101, 100 and 1, one per line.
    183 
    184 Now let's push another level onto the register with `2Sa`. The
    185 register now looks something like this:
    186 
    187 ```
    188 Level 2 --- single value: 2 --- array:   0   0   0   0   0 ...
    189 Level 1 --- single value: 1 --- array: 100 101 102   0   0 ...
    190 ```
    191 
    192 Running the same command as before (`la 0;a 1;a 2;a f`) should
    193 now yield  0, 0, 0, 2.
    194 
    195 Lastly, let's pop the top level of this register with `La`. Now
    196 it should look like this again:
    197 
    198 ```
    199 Level 1 --- single value: 1 --- array: 100 101 102   0   0 ...
    200 ```
    201 
    202 And you can check this with the usual command. If you do, you'll
    203 notice that the number `2` has also been pushed on the main stack
    204 by the `La` command.
    205 
    206 Phew, this was a long one! And we have not reached the most
    207 interesting part yet...
    208 
    209 ### Strings and macros
    210 
    211 In `dc` you can work not only with numbers, but also with strings.
    212 You can input a string by enclosing it in square brackets, like
    213 this: `[Hello, World!]`. Square brackets can appear in a string
    214 if they are either balanced or escaped by a backslash.
    215 
    216 Strings can be pushed onto the main stack or saved in any register
    217 like numbers. But what can you do with them? One thing you can do
    218 is print them with the `P` command:
    219 
    220 ```
    221 [Hello, World!
    222 ]P
    223 Hello, World!
    224 ```
    225 
    226 As you can see, it is very easy to include a newline in a string.
    227 
    228 But much more interesting is the fact that you can *execute* strings
    229 with the `x` command. This allows you to create macros. For example,
    230 say you want to evaluate the function `p(x)=x^2+2x-1`. Since we are
    231 working in RPN, it is probably easier to rewrite `p(x)` as
    232 `x(x+2)-1`. If your number `x` is on the stack, you can compute `p(x)`
    233 with the commands `d2+*1-`. But what if you want to do this
    234 multiple times? Here macros can help:
    235 
    236 ```
    237 [d2+*1-]sp
    238 ```
    239 
    240 Now we have saved the macro "evaluate p(x)" on the register `p`. We
    241 can execute it any time we want by loading it with `lp` and then
    242 executing it with `x`:
    243 
    244 ```
    245 3 lpx
    246 _2 lpx
    247 1 lpx
    248 f
    249 ```
    250 
    251 Should give 2, -1, 14.
    252 
    253 ### Conditionals
    254 
    255 Lastly, we can control the flow of macro execution using conditionals:
    256 
    257 ```
    258 <x >x =x !<x !>x !=x
    259     The top two elements of the stack are popped and compared.
    260     Register x is executed if they obey the stated relation.
    261 ```
    262 
    263 Let's see a simple example: computing the average of all numbers
    264 on the stack.
    265 
    266 First we need to save the number of elements somewhere, say in the
    267 register `n`. We can do this with `zsn`. Then we need to sum the
    268 whole stack. We can do this by calling `+` until the stack is only
    269 one element left...  this sounds like a loop, but we can use recursion
    270 instead:
    271 
    272 ```
    273 [+z1<a]sa
    274 ```
    275 
    276 This saves the the macro `[+z1<a]` in register `a`, achieving recursion:
    277 the macro starts by summing the top two numbers, then pushes the number
    278 of elements left onto the stack with `z`, followed by one. It then pops
    279 these two numbers and calls itself if the top one is less then the second.
    280 
    281 Putting this all together, we can compute the average of a bunch of
    282 numbers, say to two decimal digits, like this:
    283 
    284 ```
    285 10 12 11 9 8 10 11 10 10
    286 2k
    287 [+z1<a]sa
    288 zsnlaxln/p
    289 ```
    290 
    291 Not the most legible code, but quite short!
    292 
    293 ## Conclusion
    294 
    295 In the end I managed to write a rather lengthy post about something as
    296 simple as a desk calculator. And I have even skipped some things, like
    297 recursion levels and the `?` command!
    298 
    299 Initially I wanted to write about
    300 [bc(1)](http://man.openbsd.org/OpenBSD-7.2/bc), the other standard UNIX
    301 calculator. It works with the more familiar infix notation and has
    302 for loops, if / else statements and functions. I even wrote a
    303 [small library of mathematical functions](https://git.tronto.net/bclibrary)
    304 to show off! But in the end I thought it would be boring, so I decided
    305 to learn and write about `dc` instead. In practice I am likely going to
    306 use bc and my hand-written math library for most purposes - except
    307 maybe computing averages, that was one example where the terseness of
    308 `dc` can come in handy.
    309 
    310 Fun fact (from the bc manual page):
    311 
    312 ```
    313 bc is actually a preprocessor for dc(1), which it invokes automatically,
    314 unless the -c (compile only) option is present.  In this case the
    315 generated dc(1) instructions are sent to the standard output, instead
    316 of being interpreted by a running dc(1) process.
    317 ```
    318 
    319 I think it would be a fun excercise to try and re-implement `dc`, and
    320 then bc as a compiler to `dc` code. I could learn a few things about
    321 compilers with this project! But for now I'll have to put it in the
    322 ever-growing list of "one day, maybe" ideas.