sebastiano.tronto.net

Source files and build scripts for my personal website

sh-2.md (14149B)


      1 # The man page reading club: sh(1) - part 2: commands and builtins
      2 
      3 *This post is part of a [series](../../series)*
      4 
      5 This is the second and last part of our exciting sh(1) manual page
      6 read.  This time we are going to learn about *commands* and *builtins*.
      7 In case you have missed it, check out the [first part](../2022-09-13-sh-1)
      8 where we dealt with the shell's grammar.
      9 
     10 I'll spare you the fan fiction this time - let's go straight to the
     11 technical part!
     12 
     13 As usual, you can follow along at
     14 [man.openbsd.org](https://man.openbsd.org/OpenBSD-7.1/sh).
     15 
     16 ## Commands
     17 
     18 The Commands section of the manual page starts like this:
     19 
     20 ```
     21      The shell first expands any words that are not variable assignments or
     22      redirections, with the first field being the command name and any
     23      successive fields arguments to that command.  It sets up redirections, if
     24      any, and then expands variable assignments, if any.  It then attempts to
     25      run the command.
     26 ```
     27 
     28 The next few paragraphs describe how the name of a command is
     29 interpreted.  There are two distinct cases: if the name contains
     30 any slashes, it is considered as a path to a file; if it does not,
     31 the shell tries to interpret it as a special builtin, as a shell
     32 function, as a non-special builtin (the difference between these
     33 two types of builtins will be explained later) or finally as the
     34 name of an executable file (binary or script) to be looked for in
     35 `$PATH`.
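
To see this lookup order in action, here is a minimal sketch. It assumes
a POSIX-like shell and uses the `command` builtin, which this post does
not cover: it runs a name while skipping shell functions. The function
below is made up for the example and shadows the regular builtin `pwd`.

```sh
# A made-up function with the same name as the regular builtin pwd.
pwd() {
	echo "shadowed by a function"
}

from_function=$(pwd)           # functions are found before regular builtins
from_builtin=$(command pwd)    # command skips the function lookup

echo "plain pwd:   $from_function"
echo "command pwd: $from_builtin"

unset -f pwd                   # remove the function again
```

The same trick does not work for special builtins, since those are
looked up before functions.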
     36 
     37 The meaning of this variable is explained in the `ENVIRONMENT`
     38 section:
     39 
     40 ```
     41 PATH    Pathname to a colon separated list of directories used to search for
     42         the location of executable files.  A pathname of `.' represents the
     43         current working directory.  The default value of PATH on OpenBSD is:
     44 
     45             /usr/bin:/bin:/usr/sbin:/sbin:/usr/X11R6/bin:/usr/local/bin
     46 ```
     47 
     48 ### Grouping commands
     49 
     50 The manual page continues by explaining how to group commands
     51 together to create more complex commands. There are five ways to
     52 create a list of commands, and their syntax is always of the form
     53 
     54 ```
     55 	command SEP command SEP ...
     56 ```
     57 
     58 where `SEP` is one of the separators described below.
     59 
     60 * *Sequential lists*: One or more commands separated by a semicolon `;`
     61   are executed in order one after the other.
     62 * *Asynchronous lists*: One or more commands separated by an ampersand `&`
     63   are executed in parallel, each in a different subshell.
     64 * *Pipelines*: Two or more commands separated by a pipe `|` are connected
     65   so that the output of each command becomes the input of the next one.
     66   Together with I/O redirection, which we saw last time, pipelines are
     67   one of the "killer features" of UNIX that make its shell such a powerful
     68   language that it is still widely appreciated more than fifty years after
     69   its introduction.
     70 * *AND lists*: Two or more commands separated by a double ampersand `&&`
     71   are executed in order, but a command is only run if the exit status of
     72   the previous command was zero.
     73 * *OR lists*: Two or more commands separated by a double pipe `||`
     74   are executed in order, but a command is only run if the exit status of
     75   the previous command was different from zero.
     76 
     77 The AND and OR lists can be combined by using a mix of `&&` and
     78 `||`.  The two operators have the same precedence.
     79 
     80 The exit status of a list of commands is equal to the exit status
     81 of the last command executed, except for asynchronous lists where
     82 the exit status is always zero. For pipelines, the exit status can
     83 be inverted by putting an exclamation mark `!` at the beginning of
     84 the list.
     85 
     86 Now that I think about it, I have mentioned the exit status of a
     87 command a few times here and in the last episode, but I have never
     88 explained what it is.  Basically, every command concludes its
     89 execution by returning a number (exit status), which may be zero
     90 to indicate a successful execution or anything different from zero
     91 to indicate a failure. This will become even more relevant soon.
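
Here is a small sketch of how exit statuses drive AND lists, OR lists
and the `!` operator. It relies on `true` and `false`, two builtins that
do nothing except succeed and fail respectively, and on the special
parameter `$?`, which holds the exit status of the last command. The
variable names are arbitrary.

```sh
true  && echo "true succeeded with status $?"
false || echo "false failed with status $?"

# AND list: the assignment runs only if the first command succeeds.
true && and_result="ran"

# OR list: the assignment runs only if the first command fails.
false || or_result="ran"

# Same precedence, evaluated left to right: the && part is skipped
# because false failed, then || sees that failure and runs its part.
false && mixed="and" || mixed="or"

# ! inverts the exit status of a pipeline.
! false
inverted=$?

echo "$and_result $or_result $mixed $inverted"    # prints: ran ran or 0
```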
     92 
     93 Finally, a list of commands can be treated as a single command by
     94 enclosing it in parentheses or in braces:
     95 
     96 ```
     97      Command lists, as described above, can be enclosed within `()' to have
     98      them executed in a subshell, or within `{}' to have them executed in the
     99      current environment:
    100 
    101 	   (command ...)
    102 	   { command ...; }
    103 
    104      Any redirections specified after the closing bracket apply to all
    105      commands within the brackets.  An operator such as `;' or a newline are
    106      needed to terminate a command list within curly braces.
    107 ```
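
The difference between the two forms is easiest to see with a variable
assignment. This is only a sketch with arbitrary variable names:

```sh
x=outer

# Subshell: the assignment is lost when the parentheses close.
( x=set_in_parens; echo "inside (): $x" )
after_parens=$x

# Braces: the assignment happens in the current environment.
{ x=set_in_braces; echo "inside {}: $x"; }
after_braces=$x

echo "after (): $after_parens"    # still "outer"
echo "after {}: $after_braces"    # now "set_in_braces"

# A redirection after the closing brace applies to the whole group.
{ echo one; echo two; } > /dev/null
```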
    108 
    109 ### Flow control
    110 
    111 Much like any imperative programming language, the shell has some
    112 constructs that allow controlling the flow of the execution. The
    113 *for loop* is perhaps the most peculiar one. Its format is:
    114 
    115 ```
    116 	for name [in [word ...]]
    117 	do
    118 		command
    119 		...
    120 	done
    121 ```
    122 
    123 The commands are executed once for every item in the expansion of
    124 `[word ...]`, and each time the variable `name` is set to the current
    125 item (check [the last episode](../2022-09-13-sh-1) for an explanation
    126 of text expansion).
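
A quick sketch of both forms of the loop. When the `in word ...` part is
left out, the loop runs over the positional parameters instead; the
function `print_args` is a made-up name used only to provide some.

```sh
# Explicit word list: sum the numbers 1 to 4.
total=0
for n in 1 2 3 4
do
	total=$((total + n))
	echo "added $n, total is now $total"
done

# No "in": loop over the positional parameters of the function.
print_args() {
	for arg
	do
		echo "argument: $arg"
	done
}

print_args one "two words"
```

The second call prints two lines, not three, because the quoted
`"two words"` is a single parameter.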
    127 
    128 *While loops* are perhaps more familiar to regular programmers: a
    129 command called *condition* is run, and if its exit code is zero the
    130 body of the while loop is executed, and so on. The format is
    131 
    132 ```
    133 	while condition
    134 	do
    135 		command
    136 		...
    137 	done
    138 ```
    139 
    140 There is an opposite construct with `until` in place of `while`
    141 which executes the body as long as `condition` exits with non-zero
    142 status.
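
As an illustration, the following sketch counts up with `while` and back
down with `until`. The condition is written with `[`, also known as
test(1), which is not covered in this post: it exits with status zero
when the comparison holds.

```sh
i=0
passes=0

# Runs as long as the condition succeeds.
while [ "$i" -lt 3 ]
do
	echo "while: i is $i"
	i=$((i + 1))
	passes=$((passes + 1))
done

# Runs as long as the condition fails.
until [ "$i" -eq 0 ]
do
	i=$((i - 1))
	echo "until: i is $i"
	passes=$((passes + 1))
done

echo "the two loops ran $passes times in total"
```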
    143 
    144 A *case conditional* can be used to run commands depending on
    145 something matching a pattern. The format is
    146 
    147 ```
    148 	case word in
    149 		(pattern [| pattern ...]) command;;
    150 		...
    151 	esac
    152 ```
    153 
    154 Where `pattern` can be expressed using the usual filename globbing
    155 syntax that we briefly covered last time - see
    156 [glob(7)](https://man.openbsd.org/OpenBSD-7.1/glob.7) for more
    157 details.
    158 
    159 As an example, this short code snippet tries to determine the type
    160 of the file given as first argument from its extension:
    161 
    162 ```
    163 case "$1" in
    164 	(*.txt) echo "Text file";;
    165 	(*.wav | *.mp3 | *.ogg) echo "Music file";;
    166 	(*) echo "Something else";;
    167 esac
    168 ```
    169 
    170 Note the double quotes around `$1`: they prevent file names with
    171 spaces from being treated as multiple words.
    172 
    173 The *if conditional* is also a classic construct that programmers
    174 are very familiar with. Its general format is
    175 
    176 ```
    177 	if conditional
    178 	then
    179 		command
    180 		...
    181 	elif conditional
    182 	then
    183 		command
    184 		...
    185 	else
    186 		command
    187 		...
    188 	fi
    189 ```
    190 
    191 As with the `while` construct, `conditional` is a command that is
    192 run and its exit status is evaluated. `elif` is just short for
    193 "else, if...".
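
A minimal sketch with all three branches. The conditions use `[`, also
known as test(1), as the command to run, but any command would do; the
variable names are arbitrary.

```sh
n=0

if [ "$n" -lt 0 ]
then
	kind="negative"
elif [ "$n" -eq 0 ]
then
	kind="zero"
else
	kind="positive"
fi

echo "$n is $kind"
```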
    194 
    195 Finally, the shell also has functions, which are basically groups
    196 of commands that can be given a name and executed when using that
    197 name as a command. Their syntax may be simpler than you expect:
    198 
    199 ```
    200 	function() command-list
    201 ```
    202 
    203 When defining functions it is common to write `command-list` in the
    204 `{ command ; command ; ... ; }` format. Replacing the semicolons
    205 with newlines we get the more familiar-looking structure
    206 
    207 ```
    208 	function() {
    209 		command
    210 		command
    211 		...
    212 	}
    213 ```
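
For instance, here is a sketch of a function called `greet` (a made-up
name). Inside the body, the arguments given to the function are
available as `$1`, `$2` and so on, just like the arguments of a script.

```sh
greet() {
	echo "Hello, $1!"
}

# A function is called like any other command.
greet world

# Its output can be captured with command substitution.
greeting=$(greet "reading club")
echo "$greeting"
```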
    214 
    215 ## Builtins
    216 
    217 The builtins are listed in alphabetic order in the manual page,
    218 which is very convenient when consulting it for reference, but it
    219 is not the best choice for a top-to-bottom read. So I'll shuffle
    220 them around and divide them into a few groups. I'll skip some stuff,
    221 but I'll try to cover what is important for regular use.
    222 
    223 But first, as promised at the beginning of the previous section,
    224 we need to explain the difference between "special" and regular
    225 builtins.
    226 
    227 ```
    228      A number of built-ins are special in that a syntax error can cause a
    229      running shell to abort, and, after the built-in completes, variable
    230      assignments remain in the current environment.  The following built-ins
    231      are special: ., :, break, continue, eval, exec, exit, export, readonly,
    232      return, set, shift, times, trap, and unset.
    233 ```
    234 
    235 ### More programming features
    236 
    237 As we have seen, the shell language includes some classical programming
    238 constructs, like `if` and `while`. There are more builtins that can be
    239 helpful with these constructs: for example `true` and `false` are builtins
    240 that do nothing and return a zero and a non-zero value respectively,
    241 thus acting as sort of "boolean variables".
    242 
    243 The builtins `break` and `continue`, used inside a loop of any kind,
    244 behave exactly as in C.  The builtin `return` is used to exit the current
    245 function. An exit code may be specified as a parameter, to indicate
    246 success (0) or failure (any other number).
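
The sketch below puts these builtins to work. The function `first_even`
is hypothetical: it prints the first even number among its arguments and
reports through its exit code whether it found one.

```sh
first_even() {
	for n
	do
		if [ $((n % 2)) -ne 0 ]
		then
			continue    # odd: skip to the next argument
		fi
		echo "$n"
		return 0        # found one: leave the function
	done
	return 1            # no even number among the arguments
}

first_even 1 3 4 5 && echo "found an even number"
first_even 1 3 || echo "no even number here"

# "while true" loops forever, so break is the only way out.
count=0
while true
do
	count=$((count + 1))
	if [ "$count" -ge 3 ]
	then
		break
	fi
done
echo "left the loop after $count iterations"
```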
    247 
    248 ### Variables
    249 
    250 The builtin `read` can be used to get input from the user - or
    251 indeed from anywhere else, thanks to redirection:
    252 
    253 ```
    254 read [-r] name ...
    255 	Read a line from standard input.  The line is split into fields, with
    256 	each field assigned to a variable, name, in turn (first field
    257 	assigned to first variable, and so on).  If there are more fields
    258 	than variables, the last variable will contain all the remaining
    259 	fields.  If there are more variables than fields, the remaining
    260 	variables are set to empty strings.  A backslash in the input line
    261 	causes the shell to prompt for further input.
    262 
    263 	The options to the read command are as follows:
    264 
    265 	   -r	    Ignore backslash sequences.
    266 ```
    267 
    268 As an example of reading from something other than standard input,
    269 this short script takes a filename as an argument and prints each
    270 line of the file preceded by its line number:
    271 
    272 ```
    273 i=0
    274 while IFS= read -r line
    275 do
    276 	i=$((i+1))
    277 	echo "$i: $line"
    278 done < "$1"
    279 ```
    280 
    281 Notice that the input redirection is placed at the end of the `while`
    282 command, after the closing `done`.
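
The field splitting described in the manual page excerpt can be tried
out with a sketch like this one. To keep it self-contained, the input
comes from a here-document instead of the keyboard; the variable names
are arbitrary.

```sh
# More fields than variables: the last variable takes the remainder.
read -r first second rest <<EOF
alpha beta gamma delta
EOF
echo "first=$first second=$second rest=$rest"

# More variables than fields: the extra variable is set to empty.
read -r only extra <<EOF
single
EOF
echo "only=$only extra=[$extra]"
```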
    283 
    284 The builtins `export` and `readonly` deal with permissions: the
    285 first is used to make a variable visible to all subsequently run
    286 commands (by default it is not), while the latter is used to make
    287 a variable unchangeable. The syntax is the same for both:
    288 
    289 ```
    290 	command [-p] name[=value]
    291 ```
    292 
    293 If `=value` is given, the value is assigned to the variable before
    294 changing the permissions. The option `-p` is used to list out all
    295 the variables that are currently exported or set as read-only.
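
Here is a sketch of both builtins, with variable names invented for the
example. It starts a child `sh` to check what is visible from outside
the current shell, and tries the forbidden assignment in a subshell so
that the error cannot stop the script itself.

```sh
club_greeting="hello"

# Not exported yet: a child process does not see the variable.
before=$(sh -c 'echo "${club_greeting:-unset}"')

export club_greeting
after=$(sh -c 'echo "${club_greeting:-unset}"')

echo "before export: $before"
echo "after export:  $after"

# Assign a value and make the variable read-only in one step.
readonly club_answer=42

# Changing a read-only variable is an error.
if (club_answer=43) 2> /dev/null
then
	changed="yes"
else
	changed="no"
fi
echo "club_answer is $club_answer, changed: $changed"
```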
    296 
    297 ### Running commands
    298 
    299 If you want to run the commands contained in `file`, you can do so
    300 by using `. file` (the single dot is a builtin). For example you
    301 can list some commands that you want to run at the beginning of
    302 each shell session (e.g. aliases, see the next section) and run
    303 them with just one command.  Many other shells, such as ksh, run
    304 certain files like `.profile` at startup, but sh does not.
    305 
    306 If the commands you want to run are saved in variables or other
    307 parameters you can use `eval`. For example, the following script
    308 takes a command and its arguments as parameters, runs them and
    309 returns a different message depending on the exit code:
    310 
    311 ```
    312 if eval "$@"
    313 then
    314 	echo "The command $* ran happily"
    315 else
    316 	echo "Oh no! Something went wrong!"
    317 fi
    318 ```
    319 
    320 ### Aliases
    321 
    322 Aliases provide a nice shortcut sometimes, for example for shortening
    323 a long command name or for adding a certain set of options by
    324 default.
    325 
    326 Using `alias name=value` makes it so every time `name` is read by
    327 the shell as a command (i.e. not when it is an argument) it is
    328 replaced by `value`. For example, `alias off='shutdown -p now'`
    329 lets you easily call the `shutdown` command with the common
    330 arguments `-p now` - check out [an older blog entry](../2022-07-07-shutdown)
    331 to learn about this surprisingly feature-rich command!
    332 
    333 Using just `alias name` tells you the value of the corresponding alias,
    334 if it is set. Using `alias` with no argument returns a list of all
    335 currently set aliases.  Aliases are visible in subshells, but not in
    336 utilities or scripts that the shell starts as separate processes.
    337 
    338 Finally, `unalias name` can be used to unset the corresponding
    339 alias; `unalias -a` unsets all currently set aliases.
    340 
    341 ### Moving around directories
    342 
    343 Next (a meaningless word, since we are going in our own completely
    344 arbitrary order) we have `cd` and `pwd`, which can be used to move around
    345 in the directory tree.
    346 
    347 `pwd` simply prints the current path - it is short for "Print Working
    348 Directory". The working directory is where files are looked for by
    349 the shell, for example when used as arguments for commands. If a
    350 file is not in the current working directory, its full path has to
    351 be specified in order to refer to it.
    352 
    353 The working directory can be changed with `cd path/to/new/directory`.
    354 If the path is not specified, it defaults to `$HOME`, the home
    355 directory of the current user. The path can also be a single dash
    356 `-`, meaning "return to the previous working directory". Finally,
    357 if the path does not start with a slash and is not found relative
    358 to the current working directory, the variable `CDPATH`, which
    359 should contain a colon-separated list of directories, is read to
    360 try and find the new directory starting from there.
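
The following sketch walks through these cases. It assumes that the
mktemp(1) utility is available to create a scratch directory, and the
directory name `club_demo_dir` is made up for the example.

```sh
start=$(pwd)

cd /                    # an absolute path
in_root=$(pwd)

cd - > /dev/null        # back to the previous directory (cd - prints it)
back=$(pwd)

# CDPATH: find a directory that is not under the current one.
base=$(mktemp -d)
mkdir "$base/club_demo_dir"
CDPATH=$base
cd club_demo_dir > /dev/null    # cd prints the path it found via CDPATH
via_cdpath=$(pwd)

echo "root:       $in_root"
echo "back:       $back"
echo "via CDPATH: $via_cdpath"

# Clean up.
cd "$start"
unset CDPATH
rm -r "$base"
```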
    361 
    362 ### Jobs
    363 
    364 The builtins `jobs`, `kill`, `bg` and `fg` can be used to manage multiple
    365 jobs running in the same shell. For example you can run a command in
    366 the background with `command &`, and later kill it with `kill [id]` or
    367 bring it to the foreground with `fg [id]` (the `id` of the command will
    368 be printed by the shell when you run `command &`).
    369 
    370 I wanted to write something more about this, but I found the man
    371 page for sh a bit lacking. I had to rely on other resources, such
    372 as the manual page of [ksh(1)](https://man.openbsd.org/OpenBSD-7.1/ksh).
    373 I think I'll postpone *job control* to another entry. Stay tuned!
    374 
    375 *Update: [here](../2023-02-25-job-control) is the post on job control.*
    376 
    377 ### And finally...
    378 
    379 ```
    380 exit [n]
    381     Exit the shell with exit status n, or that of the last command executed.
    382 ```
    383 
    384 ## Conclusion
    385 
    386 I have skipped a few sections of the man page and many of the
    387 builtins, but I am happy with the result and I think we can end it
    388 here. After all, if I did not make any selection at all for these
    389 "reading club" entries, you could just read the manual page yourself,
    390 so what would the point be?
    391 
    392 I am not sure what I am going to cover in the next episode. On the one
    393 hand I should alternate between shorter pages and longer ones, mainly
    394 to avoid burning out by taking on too many huge projects. But on the
    395 other hand long pages are often more interesting.
    396 
    397 Anyway, I hope you enjoyed this long double-post and that you may have
    398 learnt something new. See you next time!
    399 
    400 *Next in the series: [tetris(6)](../2022-10-01-tetris)*