sed: Understanding pattern space and hold space

Pattern space and hold space are the two buffers in which sed stores data. sed processes its input one line at a time, and the line(s) currently being processed are stored in the pattern space.

I wrote "line(s)" even though sed reads one line at a time because certain commands, such as N, append the next input line to the pattern space, so it can hold more than one line. Consider the following command:

my-linux:~$ cat myfile.txt | sed -n '2p'

What is happening is:

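-n - suppresses the automatic printing of the pattern space at the end of each cycle
2p - prints the pattern space, but only when line 2 is being processed

The above command therefore prints the pattern space for line 2 only. The pattern space can hold more than one line if commands such as N or G are used.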
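As a minimal sketch of a multi-line pattern space, the snippet below uses N on the first line only and then shows the result with l (the raw-display command). The variable name joined is just an illustrative choice, and printf is used instead of echo -e purely for portability.

```shell
# N appends the next input line to the pattern space, separated by a newline.
# The address 1 limits the block to the first line; l shows the buffer in raw form.
joined=$(printf 'linux\nubuntu\nsed\n' | sed -n '1{N;l;}')

# Print the captured result without letting the shell interpret the backslash.
printf '%s\n' "$joined"
# prints: linux\nubuntu$
```

The \n between linux and ubuntu shows that both lines sit in the pattern space at the same time.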

If you want to see the raw pattern space, use the l command as below:

my-linux:~$ echo -e "linux\nubuntu\nsed" | sed -n '2l'

The raw pattern space for the 2nd line is displayed as ubuntu$, where the trailing $ marks the end of the pattern space.

The hold space can be assumed to be empty until we explicitly add something to it. Now consider the sed commands G and h.
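A small sketch of why the raw form is useful: l also makes otherwise invisible characters visible. The input here (a line containing a tab) and the variable name visible are made up for illustration.

```shell
# l escapes non-printing characters, so a tab is shown as \t
# and the end of the pattern space is marked with $.
visible=$(printf 'a\tb\n' | sed -n 'l')

printf '%s\n' "$visible"
# prints: a\tb$
```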

G - Append a newline to the contents of the pattern space, and then append the contents of the hold space to that of the pattern space.

h - (hold) Replace the contents of the hold space with the contents of the pattern space.
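The two definitions above can be checked with a short sketch. It assumes one extra command that this article does not otherwise cover: x, which exchanges the pattern space and the hold space, and is used here only to peek at the hold space. The variable names are illustrative.

```shell
# x swaps the two buffers, so l then displays what the hold space held.
# On the first line nothing has been stored yet, so only the end marker appears.
initial_hold=$(printf 'linux\n' | sed -n 'x;l')
printf '%s\n' "$initial_hold"
# prints: $

# h copies the pattern space to the hold space; G then appends a newline
# plus the hold space back onto the pattern space, duplicating the line.
doubled=$(printf 'linux\n' | sed -n 'h;G;l')
printf '%s\n' "$doubled"
# prints: linux\nlinux$
```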

Let's see how these two work:

my-linux:~$ echo -e "linux\nubuntu\nsed" | sed -n "G;h;l"

Let’s see what is happening line by line:

Line 1: sed reads linux into the pattern space. G appends a newline (\n) followed by the contents of the hold space, which is still empty, so the pattern space becomes linux\n. Then h replaces the contents of the hold space with the contents of the pattern space. At the end of this cycle, l prints the pattern space as linux\n$ (the trailing $ is only the end marker added by l, not part of the data), and the hold space also contains linux\n.

Line 2: ubuntu is read into the pattern space. G appends a newline and then the contents of the hold space (linux\n), and h copies the result into the hold space. At this point both buffers contain ubuntu\nlinux\n, which l prints as ubuntu\nlinux\n$.

Line 3: in the same way, the pattern space ends up as sed\nubuntu\nlinux\n, which l prints as sed\nubuntu\nlinux\n$.
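To follow the cycles one by one, a sketch like the following adds the = command, which prints the current line number, in front of l. The variable name trace is an arbitrary choice.

```shell
# = prints the current input line number, then l prints the raw pattern space,
# giving one "line number / buffer contents" pair per cycle.
trace=$(printf 'linux\nubuntu\nsed\n' | sed -n 'G;h;=;l')

printf '%s\n' "$trace"
# prints:
# 1
# linux\n$
# 2
# ubuntu\nlinux\n$
# 3
# sed\nubuntu\nlinux\n$
```

Each buffer ends in \n because G ran on line 1 while the hold space was still empty.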

If we use p in place of l in the above command, this is what we get:

my-linux:~$ echo -e "linux\nubuntu\nsed" | sed -n "G;h;p"
linux

ubuntu
linux

sed
ubuntu
linux

Notice that every newline character (\n) that l displayed as an escape sequence is now printed as a real line break, which is why a blank line follows each group of lines.

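We can slightly modify the above command to reverse the lines of input. Note the single quotes around the script: inside double quotes the shell would expand $p as a variable (and an interactive bash may treat ! as history expansion) before sed ever sees it.

my-linux:~$ echo -e "linux\nubuntu\nsed" | sed -n '1!G;h;$p'

1!G means execute G on every line except line number 1.

$p means print the pattern space only when the last line of input is being processed.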
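The reversal can be wrapped in a small shell function as a sketch. The name reverse_lines is hypothetical, not a standard utility, and the script is single-quoted so that the shell passes ! and $p to sed unchanged.

```shell
# reverse_lines: hypothetical helper that reverses its standard input.
# 1!G  - on every line except the first, append the hold space to the pattern space
# h    - save the accumulated (reversed so far) text in the hold space
# $p   - on the last line only, print the pattern space
reverse_lines() {
    sed -n '1!G;h;$p'
}

reversed=$(printf 'linux\nubuntu\nsed\n' | reverse_lines)
printf '%s\n' "$reversed"
# prints:
# sed
# ubuntu
# linux
```

Because the whole input accumulates in the hold space, this approach suits small inputs better than very large files.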
