sebastiano.tronto.net

Source files and build scripts for my personal website
git clone https://git.tronto.net/sebastiano.tronto.net

commit 7dac2c3bc3ea70ed52b25a08566b760af41e6f1a
parent 57ef2d54ef65043cc3fcc18b474dad140bfe70aa
Author: Sebastiano Tronto <sebastiano@tronto.net>
Date:   Mon, 15 Jan 2024 12:01:24 +0100

Merge branch 'master' of tronto.net:sebastiano.tronto.net

Diffstat:
M src/blog/2023-12-03-sed/sed.md | 4 +++-
A src/blog/2024-01-13-tr/tr.md | 109 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M src/series/series.md | 1 +
3 files changed, 113 insertions(+), 1 deletion(-)

diff --git a/src/blog/2023-12-03-sed/sed.md b/src/blog/2023-12-03-sed/sed.md
@@ -399,4 +399,6 @@ to take a small detour and talk about some other simple, special-purpose
 text filtering commands, such as `tr`, `head`, `fmt` and so on. Expect
 some short posts in this series before part 3 - after all, there are
 [uncountably many](https://en.wikipedia.org/wiki/Uncountable_set)
-numbers between two and 3!
+numbers between 2 and 3!
+
+*Next in the series: [tr](../2024-01-13-tr)*
diff --git a/src/blog/2024-01-13-tr/tr.md b/src/blog/2024-01-13-tr/tr.md
@@ -0,0 +1,109 @@
+# UNIX text filters, part 2.1 of 3: tr
+
+*This post is part of a [series](../../series)*
+
+In the [post about sed](../2023-12-03-sed) I have not discussed the
+`y` command at all. This is because I realized it is just an
+underpowered version of the `tr` command, which we are going to
+explore in this post.
+
+The `tr` command is a simple utility that can perform
+character-by-character substitutions and a couple of other things.
+Like most UNIX utilities, it operates on standard input and standard
+output by default.
+
+## Replacing
+
+The most basic form of a `tr` command is
+
+```
+$ tr string1 string2
+```
+
+which replaces every occurrence of a character present in `string1`
+with its corresponding character in `string2` - the first with the
+first, the second with the second and so on. If `string2` is shorter
+than `string1`, the last character is repeated as needed.
+
+For example
+
+```
+$ echo 'Hello!' | tr le 13
+H311o!
+```
+
+An equivalent `sed` command would be `sed 'y/le/13/'`.
+
+Like sed and [grep](../2023-08-20-grep), `tr` also supports the
+standard character sets like `[:upper:]`, `[:alpha:]` and so on.
+For example, the following command capitalizes every letter in the
+input string:
+
+```
+$ echo 'Hello!' | tr [:lower:] [:upper:]
+HELLO!
+```
+
+## Deleting
+
+With the `-d` option one can delete characters:
+
+```
+$ echo 'R42emo3vin0g all n66umber3s!' | tr -d 0-9
+Removing all numbers!
+```
+
+Here I have used the *character range* `0-9` instead of `[:digit:]`.
+Other examples of valid character ranges are `A-Z`, `0-8` and `a-f`.
+
+The `-d` option can be combined with the `-c` option, which takes
+the *complement* of a given set of characters:
+
+```
+$ echo 'R42emo3vin0g all non-n66umber3s!' | tr -cd '0-9\n'
+4230663
+```
+
+Notice that I have also added `\n` to our list of
+characters, so that the newline at the end of the text is kept.
+
+A more complex example involving `tr -cd` is the following, which
+I use to generate random passwords:
+
+```
+$ cat /dev/random | tr -cd 'a-z0-9' | fold -w 12 | head -n 1
+ft82mtfsy5ps
+```
+
+Here `/dev/random` spits out random data, while the commands
+`fold -w 12` and `head -n 1` are used to break the input text into
+lines of 12 characters and take the first line of the input,
+respectively. We'll talk about them in future posts.
+
+## Squeezing
+
+One more thing `tr` can do is *squeezing* consecutive identical
+characters. For example:
+
+```
+$ echo Helllllo | tr -s l
+Helo
+```
+
+The `-s` option can be combined with the `-c` or `-d` option, and
+in this case the squeezing is performed last, squeezing all the
+characters contained in the last given string:
+
+```
+$ echo 'Hellllo! 112233' | tr -s 'l_e' '123'
+H31o! 123
+$ echo 'Hello!' | tr -ds '!' 'l'
+Helo
+```
+
+## Conclusions
+
+This is pretty much all there is to say about `tr`. All of this can
+probably be done with a sufficiently complicated `sed` or `awk`
+script, but it is definitely nice to have a simpler utility to perform
+easy changes.
diff --git a/src/series/series.md b/src/series/series.md
@@ -29,6 +29,7 @@ of complexity: `grep`, `sed` and `awk`. Work in progress.
 * Part 0: [Regular expressions](../blog/2023-06-16-regex)
 * Part 1: [grep](../blog/2023-08-20-grep)
 * Part 2: [sed](../blog/2023-12-03-sed)
+* Part 2.1: [tr](../blog/2024-01-13-tr)
 * Part 3: awk (coming "soon")
 
 ## The UNIX shell as an IDE
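
Note on the added tr.md (not part of the commit): the examples in the new post can be re-run with a short script. What follows is a minimal sketch under a few assumptions. The function names `tr_examples` and `random_password` are hypothetical. The quotes around `[:lower:]` and `[:upper:]` are an addition, because an unquoted `[:lower:]` is also a shell glob and would be expanded if the current directory contained a file named `l`, `o`, `w`, `e`, `r` or `:`. The `LC_ALL=C` prefix is likewise an addition, on the assumption that some `tr` implementations reject random bytes that are not valid in a multibyte locale.

```shell
#!/bin/sh
# Sketch only: re-runs the commands shown in tr.md, with quoting added.

tr_examples() {
    echo 'Hello!' | tr le 13                                  # replace l with 1, e with 3
    echo 'Hello!' | tr '[:lower:]' '[:upper:]'                # classes quoted against globbing
    echo 'R42emo3vin0g all n66umber3s!' | tr -d '0-9'         # delete digits
    echo 'R42emo3vin0g all non-n66umber3s!' | tr -cd '0-9\n'  # keep only digits and newline
    echo 'Helllllo' | tr -s 'l'                               # squeeze runs of l
    echo 'Hellllo! 112233' | tr -s 'l_e' '123'                # translate, then squeeze 1, 2, 3
    echo 'Hello!' | tr -ds '!' 'l'                            # delete !, then squeeze l
}

# Hypothetical helper, not in the post: 12 random characters from a-z0-9.
# LC_ALL=C is an assumption so that tr treats its input as plain bytes;
# the redirect replaces the post's `cat /dev/random |`.
random_password() {
    LC_ALL=C tr -cd 'a-z0-9' < /dev/random | fold -w 12 | head -n 1
}

tr_examples
random_password
```

The first seven lines printed should match the outputs quoted in the post; the last line is random and differs on every run.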