sebastiano.tronto.net

Source files and build scripts for my personal website
git clone https://git.tronto.net/sebastiano.tronto.net

commit d328796d7faf4ec7ad671f0996c4ea7d3cabfe45
parent 1e4e1bb0ec89f63bca7180fdb0e37031915dea15
Author: Sebastiano Tronto <sebastiano@tronto.net>
Date:   Mon, 26 Sep 2022 11:45:12 +0200

Added blog post

Diffstat:
Asrc/blog/2022-09-20-sh-2/sh-2.md | 394+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Msrc/blog/blog.md | 1+
Msrc/blog/feed.xml | 7+++++++
3 files changed, 402 insertions(+), 0 deletions(-)

diff --git a/src/blog/2022-09-20-sh-2/sh-2.md b/src/blog/2022-09-20-sh-2/sh-2.md
@@ -0,0 +1,394 @@
+# The man page reading club: sh(1) - part 2: commands and builtins
+
+This is the second and last part of our exciting sh(1) manual page
+read. This time we are going to learn about *commands* and *builtins*.
+In case you have missed it, check out the [first part](../2022-09-13-sh-1)
+where we dealt with the shell's grammar.
+
+I'll spare you the fan fiction this time - let's go straight to the
+technical part!
+
+As usual, you can follow along at
+[man.openbsd.org](https://man.openbsd.org/OpenBSD-7.1/sh)
+
+## Commands
+
+The Commands section of the manual page starts like this:
+
+```
+     The shell first expands any words that are not variable assignments or
+     redirections, with the first field being the command name and any
+     successive fields arguments to that command. It sets up redirections, if
+     any, and then expands variable assignments, if any. It then attempts to
+     run the command.
+```
+
+The next few paragraphs describe how the name of a command is
+interpreted. There are two distinct cases: if the name contains
+any slashes, it is treated as a path to a file; if it does not,
+the shell tries to interpret it as a special builtin, as a shell
+function, as a non-special builtin (the difference between these
+two types of builtins will be explained later) or finally as the
+name of an executable file (binary or script) to be looked for in
+`$PATH`.
+
+The meaning of this variable is explained in the `ENVIRONMENT`
+section:
+
+```
+PATH    Pathname to a colon separated list of directories used to search for
+        the location of executable files. A pathname of `.' represents the
+        current working directory. The default value of PATH on OpenBSD is:
+
+              /usr/bin:/bin:/usr/sbin:/sbin:/usr/X11R6/bin:/usr/local/bin
+```
+
+### Grouping commands
+
+The manual page continues by explaining how to group commands
+together to create more complex commands.
+There are five ways to
+create a list of commands, and their syntax is always of the form
+
+```
+    command SEP command SEP ...
+```
+
+where `SEP` is one of the separators described below.
+
+* *Sequential lists*: One or more commands separated by a semicolon `;`
+  are executed in order, one after the other.
+* *Asynchronous lists*: One or more commands separated by an ampersand `&`
+  are executed in parallel, each in a different subshell.
+* *Pipelines*: Two or more commands separated by a pipe `|` are executed
+  in order, using the output of each command as input for the next one.
+  Together with I/O redirection, which we saw last time, pipelines are
+  one of the "killer features" of UNIX that make its shell such a powerful
+  language that it is still widely appreciated more than fifty years after
+  its introduction.
+* *AND lists*: Two or more commands separated by a double ampersand `&&`
+  are executed in order, but a command is only run if the exit status of
+  the previous command was zero.
+* *OR lists*: Two or more commands separated by a double pipe `||`
+  are executed in order, but a command is only run if the exit status of
+  the previous command was different from zero.
+
+The AND and OR lists can be combined by using a mix of `&&` and
+`||`. The two operators have the same precedence and are evaluated
+from left to right.
+
+The exit status of a list of commands is equal to the exit status
+of the last command executed, except for asynchronous lists, where
+the exit status is always zero. For pipelines, the exit status can
+be inverted by putting an exclamation mark `!` at the beginning of
+the list.
+
+Now that I think about it, I have mentioned the exit status of a
+command a few times here and in the last episode, but I have never
+explained what it is. Basically, every command concludes its
+execution by returning a number (exit status), which may be zero
+to indicate a successful execution or anything different from zero
+to indicate a failure.
+This will become even more relevant soon.
+
+Finally, a list of commands can be treated as a single command by
+enclosing it in parentheses or in braces:
+
+```
+     Command lists, as described above, can be enclosed within `()' to have
+     them executed in a subshell, or within `{}' to have them executed in the
+     current environment:
+
+           (command ...)
+           { command ...; }
+
+     Any redirections specified after the closing bracket apply to all
+     commands within the brackets. An operator such as `;' or a newline are
+     needed to terminate a command list within curly braces.
+```
+
+### Flow control
+
+Much like any imperative programming language, the shell has some
+constructs that allow controlling the flow of the execution. The
+*for loop* is perhaps the most peculiar one. Its format is:
+
+```
+    for name [in [word ...]]
+    do
+        command
+        ...
+    done
+```
+
+The commands are executed once for every item in the expansion of
+`[word ...]`, each time with the variable `name` set to the current
+item. (Check [the last episode](../2022-09-13-sh-1)
+for an explanation of text expansion.)
+
+*While loops* are perhaps more familiar to regular programmers: a
+command called *condition* is run, and if its exit code is zero the
+body of the while loop is executed; then the condition is run again,
+and so on until it exits with a non-zero status. The format is
+
+```
+    while condition
+    do
+        command
+        ...
+    done
+```
+
+There is an opposite construct with `until` in place of `while`
+which executes the body as long as `condition` exits with non-zero
+status.
+
+A *case conditional* can be used to run commands depending on
+something matching a pattern. The format is
+
+```
+    case word in
+        (pattern [| pattern ...]) command;;
+        ...
+    esac
+```
+
+Here `pattern` can be expressed using the usual filename globbing
+syntax that we briefly covered last time - see
+[glob(7)](https://man.openbsd.org/OpenBSD-7.1/glob.7) for more
+details.
+
+As an example, this short code snippet tries to determine the type
+of the file given as first argument from its extension:
+
+```
+case "$1" in
+    (*.txt) echo "Text file";;
+    (*.wav | *.mp3 | *.ogg) echo "Music file";;
+    (*) echo "Something else";;
+esac
+```
+
+Note the double quotes around `$1`. They are not strictly required
+here, because the word after `case` is not subject to field splitting,
+but quoting is a good habit: in most other contexts it keeps file
+names with spaces from being split into multiple words.
+
+The *if conditional* is also a classic construct that programmers
+are very familiar with. Its general format is
+
+```
+    if conditional
+    then
+        command
+        ...
+    elif conditional
+    then
+        command
+        ...
+    else
+        command
+        ...
+    fi
+```
+
+As with the `while` construct, `conditional` is a command that is
+run and its exit status is evaluated. `elif` is just short for
+"else, if...".
+
+Finally, the shell also has functions, which are basically groups
+of commands that can be given a name and executed when using that
+name as a command. Their syntax may be simpler than you expect:
+
+```
+    function() command-list
+```
+
+When defining functions it is common to write `command-list` in the
+`{ command ; command ; ... ; }` format. Replacing the semicolons
+with newlines we get the more familiar-looking structure
+
+```
+    function() {
+        command
+        command
+        ...
+    }
+```
+
+## Builtins
+
+The builtins are listed in alphabetical order in the manual page,
+which is very convenient when consulting it for reference, but it
+is not the best choice for a top-to-bottom read. So I'll shuffle
+them around and divide them into a few groups. I'll skip some stuff,
+but I'll try to cover what is important for regular use.
+
+But first, as promised at the beginning of the previous section,
+we need to explain the difference between "special" and regular
+builtins.
+
+```
+     A number of built-ins are special in that a syntax error can cause a
+     running shell to abort, and, after the built-in completes, variable
+     assignments remain in the current environment.
+     The following built-ins
+     are special: ., :, break, continue, eval, exec, exit, export, readonly,
+     return, set, shift, times, trap, and unset.
+```
+
+### More programming features
+
+As we have seen, the shell language includes some classical programming
+constructs, like `if` and `while`. There are more builtins that can be
+helpful with these constructs: for example `true` and `false` are builtins
+that do nothing and return a zero and a non-zero value respectively,
+thus acting as sort of "boolean variables".
+
+The builtins `break` and `continue`, used inside a loop of any kind,
+behave exactly as in C. The builtin `return` is used to exit the current
+function. An exit code may be specified as a parameter, to indicate
+success (0) or failure (any other number).
+
+### Variables
+
+The builtin `read` can be used to get input from the user - or
+indeed from anywhere else, thanks to redirection:
+
+```
+read [-r] name ...
+     Read a line from standard input. The line is split into fields, with
+     each field assigned to a variable, name, in turn (first field
+     assigned to first variable, and so on). If there are more fields
+     than variables, the last variable will contain all the remaining
+     fields. If there are more variables than fields, the remaining
+     variables are set to empty strings. A backslash in the input line
+     causes the shell to prompt for further input.
+
+     The options to the read command are as follows:
+
+     -r      Ignore backslash sequences.
+```
+
+As an example of reading from something other than standard input,
+this short script takes a filename as an argument and prints each
+line of the file preceded by its line number:
+
+```
+i=0
+# Read the file named by the first argument, one line at a time
+while read line
+do
+    i=$((i+1))
+    # Quote the expansions so that spaces and glob characters in the
+    # line are not reinterpreted by the shell
+    echo "$i: $line"
+done < "$1"
+```
+
+Notice that the redirection `< "$1"` is placed at the end of the `while`
+command, after the closing `done`.
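The field-splitting rule from the `read` excerpt above can be checked with a small sketch. The sample input line and the variable names `first`, `second` and `rest` are made up for illustration; only `read` and its `-r` option come from the manual page.

```
# Sketch: feed one line to read through a here-document, so that read
# runs in the current shell and the variables stay set afterwards.
# The input text and the variable names are illustrative assumptions.
read -r first second rest <<EOF
alpha beta gamma delta
EOF

# There are more fields than variables, so the last variable
# receives all the remaining fields.
printf '%s\n' "first=$first" "second=$second" "rest=$rest"
```

With this input, `rest` should end up holding `gamma delta`, matching the "last variable will contain all the remaining fields" rule.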
+
+The builtins `export` and `readonly` deal with permissions: the
+first is used to make a variable visible to all subsequently run
+commands (by default it is not), while the latter is used to make
+a variable unchangeable. The syntax is the same for both:
+
+```
+    command [-p] name[=value]
+```
+
+If `=value` is given, the value is assigned to the variable before
+changing the permissions. The option `-p` is used to list all
+the variables that are currently exported or set as read-only.
+
+### Running commands
+
+If you want to run the commands contained in `file`, you can do so
+by using `. file` (the single dot is a builtin). For example you
+can list some commands that you want to run at the beginning of
+each shell session (e.g. aliases, see the next section) and run
+them with just one command. Many other shells, such as ksh, run
+certain files like `.profile` at startup, but sh does not.
+
+If the commands you want to run are saved in variables or other
+parameters you can use `eval`. For example, the following script
+takes a command and its arguments as parameters, runs them and
+prints a different message depending on the exit code:
+
+```
+# Quote "$@" so that each argument is passed to eval as given
+if eval "$@"
+then
+    echo "The command $* ran happily"
+else
+    echo "Oh no! Something went wrong!"
+fi
+```
+
+### Aliases
+
+Aliases provide a nice shortcut sometimes, for example for shortening
+a long command name or for adding a certain set of options by
+default.
+
+Using `alias name=value` makes it so every time `name` is read by
+the shell as a command (i.e. not when it is an argument) it is
+replaced by `value`. For example, `alias off='shutdown -p now'`
+can be used to easily call the `shutdown` command with the common
+option `-p now` - check out [an older blog entry](../2022-07-07-shutdown)
+to learn about this surprisingly feature-rich command!
+
+Using just `alias name` tells you the value of the corresponding alias,
+if it is set.
+Using `alias` with no argument prints a list of all
+currently set aliases. Contrary to variables, aliases are visible in
+every subshell.
+
+Finally, `unalias name` can be used to unset the corresponding
+alias; `unalias -a` unsets all currently set aliases.
+
+### Moving around directories
+
+Next (a meaningless word, since we are going in our own completely
+arbitrary order) we have `cd` and `pwd`, which can be used to move around
+in the directory tree.
+
+`pwd` simply prints the current path - it is short for "Print Working
+Directory". The working directory is where files are looked for by
+the shell, for example when used as arguments for commands. If a
+file is not in the current working directory, its full path has to
+be specified in order to refer to it.
+
+The working directory can be changed with `cd path/to/new/directory`.
+If the path is not specified, it defaults to `$HOME`, the home
+directory of the current user. The path can also be a single dash
+`-`, meaning "return to the previous working directory". Finally,
+if the path does not start with a slash and is not found relative
+to the current working directory, the variable `CDPATH`, which
+should contain a colon-separated list of directories, is read to
+try and find the new directory starting from there.
+
+### Jobs
+
+The builtins `jobs`, `kill`, `bg` and `fg` can be used to manage multiple
+jobs running in the same shell. For example you can run a command in
+the background with `command &`, and later kill it with `kill [id]` or
+bring it to the foreground with `fg [id]` (the `id` of the command will
+be printed by the shell when you run `command &`).
+
+I wanted to write something more about this, but I found the man
+page for sh a bit lacking. I had to rely on other resources, such
+as the manual page of [ksh(1)](https://man.openbsd.org/OpenBSD-7.1/ksh).
+I think I'll postpone *job control* to another entry. Stay tuned!
+
+### And finally...
+
+```
+exit [n]
+     Exit the shell with exit status n, or that of the last command executed.
+```
+
+## Conclusion
+
+I have skipped a few sections of the man page and many of the
+builtins, but I am happy with the result and I think we can end it
+here. After all, if I did not make any selection at all for these
+"reading club" entries, you could just read the manual page yourself,
+so what would the point be?
+
+I am not sure what I am going to cover in the next episode. On the one
+hand I should alternate between shorter pages and longer ones, mainly
+to avoid burning out by taking on too many huge projects. But on the
+other hand long pages are often more interesting.
+
+Anyway, I hope you enjoyed this long double-post and that you may have
+learnt something new. See you next time!
diff --git a/src/blog/blog.md b/src/blog/blog.md
@@ -2,6 +2,7 @@
 [RSS Feed](feed.xml)

+* 2022-09-20 [The man page reading club: sh(1) - part 2: commands and builtins](2022-09-20-sh-2)
 * 2022-09-13 [The man page reading club: sh(1) - part 1: shell grammar](2022-09-13-sh-1)
 * 2022-09-10 [Long live netbooks!](2022-09-10-netbooks)
 * 2022-09-05 [Pipe man into col -b to get rid of \^H](2022-09-05-man-col)
diff --git a/src/blog/feed.xml b/src/blog/feed.xml
@@ -9,6 +9,13 @@
 Thoughts about software, computers and whatever I feel like sharing
 </description>
 <item>
+<title>The man page reading club: sh(1) - part 2: commands and builtins</title>
+<link>https://sebastiano.tronto.net/blog/2022-09-20-sh-2</link>
+<description>The man page reading club: sh(1) - part 2: commands and builtins</description>
+<pubDate>2022-09-20</pubDate>
+</item>
+
+<item>
 <title>The man page reading club: sh(1) - part 1: shell grammar</title>
 <link>https://sebastiano.tronto.net/blog/2022-09-13-sh-1</link>
 <description>The man page reading club: sh(1) - part 1: shell grammar</description>