Difference Between grep, sed, and awk

Last modified: November 17, 2020

by Chin Ming Jun


1. Overview
In this article, we'll go through the command-line tools grep, sed, and awk.
In particular, we'll study the differences in functionality among them.

When it comes to text processing in Linux, the three tools that come in pretty
handy are grep, sed, and awk. Although they are totally different tools, their
functionality seems to overlap in simple scenarios. For example, to find a
pattern in a file and print the matches to the standard output, we’ll find that all
of them can do that.
However, if we stretch beyond this simple exercise, we’ll find that grep is only
good for simple text matching and printing. On the other hand, in addition to
matching and printing text, sed offers additional text transformation commands
like substitution. Finally, awk is a scripting language that offers a multitude of
features that do not exist in the former two.
Before we begin, it is important to know that the purpose of this article is to
make the distinction between these three tools clearer. Therefore, the
examples we are covering are just a small subset of what is possible with each
tool, especially in the case of sed and awk.

3. Text File
To facilitate our discussion, let’s define a text file log.txt:

Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out

4. grep
The grep command searches for lines matching a regex pattern and prints
those matching lines to the standard output. It is useful when we need a
quick way to find out whether a particular pattern exists or not in the given
input.

4.1. Basic Syntax

The syntax for grep is as follows:

grep [OPTIONS] PATTERN [FILE...]

PATTERN is a regex pattern defining what we want to find in the content of
the files specified by the FILE argument. The OPTIONS optional parameters
are flags that modify the behavior of grep.

4.2. Searching for Lines That Match a Regex Pattern

Let’s say we want to extract the ERROR events from log.txt. We can do that
with grep:

$ grep "ERROR" log.txt

1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863901 ERROR Requested resource not found

What happens here is that grep will scan through the lines in log.txt and print
those lines containing the word ERROR to the standard output.
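
A plain ERROR pattern also matches lines that merely mention the word in the message text. As a minimal sketch, assuming the column layout of log.txt shown earlier, we can anchor the pattern to the second column with an extended regex (the -E flag) and count matches with -c. The sample lines and variable names below are illustrative, not part of the original log:

```shell
# Build a small sample log (hypothetical data modeled on log.txt).
log=$(mktemp)
cat > "$log" <<'EOF'
Timestamp Category Message
1598843202 INFO Booting up system
1598863901 ERROR Requested resource not found
1598864000 INFO Cleared ERROR counter
EOF

# Plain match: counts every line containing ERROR, including the INFO line.
loose=$(grep -c "ERROR" "$log")

# Anchored match: ERROR must be the second space-separated column.
strict=$(grep -cE '^[0-9]+ ERROR ' "$log")

echo "loose=$loose strict=$strict"
rm -f "$log"
```

Whether the stricter pattern is worth it depends on how likely the category names are to appear inside messages.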

4.3. Inverting the Match

We can invert the match using the -v flag:

grep -v "INFO" log.txt

When we execute the command above, grep will print every line in
log.txt, except those lines matching the pattern INFO.

Sometimes, we may want to print the lines preceding or following the
matches. To print the five lines after a match, we can use the flag -A:

grep -A 5 ERROR log.txt

On the other hand, to print the five lines before a match, we can use the flag -B:

grep -B 5 ERROR log.txt

Finally, the flag -C allows us to print both the five lines before and the five lines
after a match:

grep -C 5 ERROR log.txt
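
To make the effect of the three context flags easier to see, here is a small sketch that uses a context of one line on a five-line input. The input text and variable names are made up for illustration:

```shell
# Five lines with a single match on line 3 (illustrative input).
sample=$(printf 'line1\nline2\nERROR here\nline4\nline5\n')

after=$(printf '%s\n' "$sample" | grep -A 1 "ERROR")    # match plus 1 line after
before=$(printf '%s\n' "$sample" | grep -B 1 "ERROR")   # 1 line before plus match
around=$(printf '%s\n' "$sample" | grep -C 1 "ERROR")   # 1 line on each side

printf '%s\n---\n' "$after" "$before" "$around"
```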

5. sed
The sed command is a stream editor that works on streams of characters. It's a
more powerful tool than grep as it offers more options for text processing
purposes, including the substitute command, which sed is most commonly
known for.

5.1. Basic Syntax

The sed command has the following general syntax:

sed [OPTIONS] SCRIPT [FILE...]

The OPTIONS are optional flags that can be applied on sed to modify its
behavior. Next, the SCRIPT argument is the sed script that will be executed on
every line for the files that are specified by the FILE argument.

5.2. Script Structure

The sed script has the following structure:

[addr]X[options]

Where addr is the condition applied to the lines of the text file. It can be a fixed
number or a regex pattern that is tested against the content of a line before
processing it.
Next, the X character represents the sed command to execute. For example,
the substitute command, which is denoted with the single character s.
Finally, additional options can be passed to the sed command to specify its
behavior.

5.3. Using sed as grep

As a starter, let’s see how we can duplicate the functionality of grep using sed:

sed -n '/ERROR/ p' log.txt

By default, sed will print every line it is scanning to the standard output
stream. To disable this automatic printing, we can use the flag -n.
Next, it will run the script that comes after the flag -n and look for the regex
pattern ERROR on every line in log.txt. If there is a match, sed will print the line
to standard output because we’re using the p command in the script. Finally,
we pass log.txt as the name of the file we want sed to work on as the final
argument.
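
As a quick check of the explanation above, the following sketch compares the output of grep and sed on a three-line sample, and also shows what happens when -n is left out. The sample file and variable names are illustrative:

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out
EOF

with_grep=$(grep "ERROR" "$log")
with_sed=$(sed -n '/ERROR/p' "$log")

# Without -n, the matching line appears twice: once from p, once from autoprint.
without_n=$(sed '/ERROR/p' "$log" | wc -l | tr -d ' ')

echo "$with_sed"
echo "lines printed without -n: $without_n"
rm -f "$log"
```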

5.4. Substituting Matched String With Replacement

The sed substitute command has the following structure:

s/pattern/replacement/

When there is a match on a line for pattern, sed will substitute the first match
with replacement. For example, if we want to substitute the word ERROR in our
log.txt with the word CRITICAL, we can run:

sed 's/ERROR/CRITICAL/' log.txt
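
One detail worth knowing is that the substitute command replaces only the first match on each line unless the g option is appended. The sketch below illustrates the difference on a made-up line that contains the word twice (the line and the variable names are assumptions for the example):

```shell
line='ERROR: retry after ERROR'

# Without g: only the first occurrence on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/ERROR/CRITICAL/')

# With g: every occurrence on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/ERROR/CRITICAL/g')

echo "$first_only"
echo "$all"
```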

5.5. Modifying Files in Place

If we want sed to persist the change on the file it is operating on, we can use
the flag -i along with a suffix. Before making changes in place, sed will create a
backup of the file and append the suffix to this backup filename. For instance,
when we run:

sed -ibackup 's/ERROR/CRITICAL/' log.txt

log.txt will be copied to log.txtbackup before sed applies the
changes in place.
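
Since in-place edits overwrite the file, it can be safer to try them on a scratch copy first. The sketch below does that in a temporary directory and uses a suffix with a leading dot so the backup gets a conventional name. Note that -i is an extension rather than part of POSIX sed, so its behavior may differ between implementations; the attached-suffix form used here is the one we assume to be most widely accepted. The directory and variable names are illustrative:

```shell
workdir=$(mktemp -d)
printf '1598863901 ERROR Requested resource not found\n' > "$workdir/log.txt"

# Edit in place, keeping the untouched original as log.txt.bak.
sed -i.bak 's/ERROR/CRITICAL/' "$workdir/log.txt"

edited=$(cat "$workdir/log.txt")
backup=$(cat "$workdir/log.txt.bak")

echo "edited: $edited"
echo "backup: $backup"
rm -rf "$workdir"
```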

5.6. Restricting to a Specific Line Number

We can limit the sed command so it only operates on a specific line number
using the addr slot in the script:

sed '3 s/ERROR/CRITICAL/' log.txt

This will run the script only on line 3 of log.txt.

Furthermore, we can specify a range of line numbers:

sed '3,5 s/ERROR/CRITICAL/' log.txt

In this case, sed will run the script on lines 3 to 5 of log.txt.

In addition, we can specify the bound with a regex pattern:

sed -n '3,/ERROR/ p' log.txt

Here, sed will print the lines of log.txt starting from line number 3, and ending
when it finds the first line that matches the pattern /ERROR/.
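
To see which lines a mixed address such as 3,/ERROR/ selects, here is a minimal sketch run against the first seven lines of the sample log. The helper variables first, last, and count are only there to summarize the result:

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
EOF

# Print from line 3 up to and including the first later line matching ERROR.
range=$(sed -n '3,/ERROR/p' "$log")

first=$(printf '%s\n' "$range" | head -n 1 | cut -d ' ' -f 1)
last=$(printf '%s\n' "$range" | tail -n 1 | cut -d ' ' -f 1)
count=$(printf '%s\n' "$range" | wc -l | tr -d ' ')

echo "first=$first last=$last count=$count"
rm -f "$log"
```

The range covers lines 3 through 6 of this sample, so the line after the ERROR entry is not printed.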

6. awk

awk is a full-fledged programming language that is comparable to Perl.
It not only offers a multitude of built-in
functions for string, arithmetic, and time manipulation but also allows
users to define their own functions just like any regular scripting language.
Let’s take a look at some examples of how it works.

6.1. Basic Syntax

The awk syntax is of the following form:

awk [options] script file

It will execute the script against every line in the file. Let’s now expand the
structure of the script:

/pattern/ { action }

The pattern is a regex pattern that will be tested against every input line. If a
line matches the pattern, awk will then execute the script defined in action on
that line. If the pattern condition is absent, the action will be executed on
every line.

6.2. Replicating grep with awk

As we did with sed, let’s take a look at how we can emulate grep’s
functionality using awk:

awk '/ERROR/{print $0}' log.txt

The code above will find the regex pattern ERROR in the log.txt file and print
the matching lines to the standard output.
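
Because awk splits each line into fields, the pattern does not have to be a regex over the whole line. As a sketch, assuming the category always sits in the second field, we can compare that field directly. The second sample line below is hypothetical and was added to show the difference:

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
1598863901 ERROR Requested resource not found
1598864000 INFO Cleared ERROR counter
EOF

# Regex over the whole line: both lines mention ERROR somewhere.
by_regex=$(awk '/ERROR/ {print $0}' "$log" | wc -l | tr -d ' ')

# Field comparison: only lines whose second field is exactly ERROR.
by_field=$(awk '$2 == "ERROR" {print $0}' "$log" | wc -l | tr -d ' ')

echo "by_regex=$by_regex by_field=$by_field"
rm -f "$log"
```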

6.3. Substituting the Matching String

Similarly, we can use awk's built-in function gsub to substitute all ERROR
occurrences with CRITICAL, just like in the sed example:

awk '{gsub(/ERROR/, "CRITICAL")}{print}' log.txt

The function gsub takes as arguments a regex pattern and the replacement
string. Then, awk prints the line to the standard output.
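
In addition to rewriting the line, gsub returns the number of substitutions it made, which can be handy for reporting. The following sketch, on made-up input, rewrites one line and separately totals the replacements over three lines (the variable names are illustrative):

```shell
# gsub modifies the current line ($0) when no target is given.
rewritten=$(printf 'ERROR a ERROR\n' |
  awk '{ gsub(/ERROR/, "CRITICAL"); print }')

# The return value of gsub is the number of replacements on that line.
total=$(printf 'ERROR a ERROR\nINFO b\nERROR c\n' |
  awk '{ n += gsub(/ERROR/, "CRITICAL") } END { print n }')

echo "$rewritten"
echo "replacements: $total"
```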

6.4. Adding Header and Footer to the Document

In awk, there’s a BEGIN block that will execute before it starts processing any
line of the file. On the other hand, there is also an END block that allows us to
define what should be run after all the lines have been processed.
Let’s use BEGIN and END blocks to add a header and a footer to our text:

$ awk 'BEGIN {print "LOG SUMMARY\n--------------"} {print} END {print "--------------\nEND OF LOG SUMMARY"}' log.txt
LOG SUMMARY
--------------
Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out
--------------
END OF LOG SUMMARY

6.5. Column Manipulation

Processing documents having a rows and columns structure (CSV style) is
where awk really shines. For instance, we can easily print the first and second
columns of our log.txt, and skip the third one:

awk '{print $1, $2}' log.txt
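
Two built-in variables are often useful alongside the numbered fields: NR holds the current line number and NF the number of fields on the line. The sketch below, on a shortened sample of the log, uses NR to skip the header row and $NF to print the last field of each line. The variable names are illustrative:

```shell
data=$(printf '%s\n' \
  'Timestamp Category Message' \
  '1598843202 INFO Booting up system' \
  '1598863901 ERROR Requested resource not found')

# NR > 1 skips the first line, which is the header.
no_header=$(printf '%s\n' "$data" | awk 'NR > 1 {print $1, $2}')

# $NF refers to the last field, whatever the number of fields is.
last_words=$(printf '%s\n' "$data" | awk 'NR > 1 {print $NF}')

echo "$no_header"
echo "$last_words"
```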

6.6. Custom Field Separator

By default, awk treats whitespace as the delimiter. If the text being processed
uses a delimiter that is not whitespace (a comma, for example), we can
specify it with the flag -F:

awk -F "," '{print $1, $2}' log.txt
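
Since our log.txt is space-separated, the effect of -F is easier to see on comma-separated input. The sketch below uses a hypothetical two-line CSV version of the log, and also sets the OFS variable so that the output keeps the comma as its separator:

```shell
# Hypothetical CSV rendering of two log lines.
csv=$(printf '%s\n' \
  'timestamp,category,message' \
  '1598863901,ERROR,Requested resource not found')

# Split on commas; print joins fields with a space by default.
cols=$(printf '%s\n' "$csv" | awk -F "," '{print $1, $2}')

# OFS controls the separator that print puts between fields.
kept=$(printf '%s\n' "$csv" | awk -F "," -v OFS="," '{print $1, $2}')

echo "$cols"
echo "$kept"
```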

6.7. Arithmetic Operation

The ability of awk to carry out arithmetic operations makes it easy to gather
numerical info about a text file. For example, let’s calculate the number of
ERROR event occurrences in log.txt:

awk '{count[$2]++} END {print count["ERROR"]}' log.txt

In the script above, awk stores the count of each distinct value in the Category
column in the array count, keyed by that value. Then the script prints the count
for ERROR at the end.
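
Because count holds one entry per category, the same script can report every category instead of a single one. The sketch below loops over the array in the END block; the iteration order of for-in is unspecified in awk, so the output is piped through sort. The sample lines are shortened, and the "Access denied" message is made up for the example:

```shell
summary=$(printf '%s\n' \
  '1598843202 INFO Booting up system' \
  '1598863888 ERROR Access denied' \
  '1598863891 INFO Health check passed' \
  '1598863901 ERROR Requested resource not found' \
  '1598864411 INFO User admin logged out' |
  awk '{count[$2]++} END {for (c in count) print c, count[c]}' | sort)

echo "$summary"
```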

6.8. Numeric Comparison

Being a full-fledged scripting language, awk readily understands numeric
values. This makes text processing easy when we need our script to interpret
values as a number rather than as a simple string.
For example, let’s say we want to get all the log entries more recent than the
timestamp 1598863888. We can use a greater-than comparison, skipping the
header line with NR > 1 because a non-numeric field such as Timestamp would
otherwise be compared as a string:

$ awk 'NR > 1 { if ($1 > 1598863888 ) {print $0} }' log.txt

1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out

From the output, we can see that the command only prints log lines that are
recorded later than the specified timestamp.
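
A caveat with comparisons like this one is that awk only compares numerically when the field looks like a number. Otherwise, both sides are compared as strings, which matters for a header row. The sketch below shows this on made-up sample data and one possible way around it, forcing a numeric comparison by adding 0 to the field. LC_ALL=C is set on the first command only to keep the string ordering predictable:

```shell
data=$(printf '%s\n' \
  'Timestamp Category Message' \
  '1598863888 ERROR first' \
  '1598863901 ERROR second')

# String comparison kicks in for the header, so "Timestamp" passes the test.
naive=$(printf '%s\n' "$data" | LC_ALL=C awk '$1 > 1598863888 {print $1}')

# Adding 0 converts the field to a number; non-numeric text becomes 0.
forced=$(printf '%s\n' "$data" | awk '($1 + 0) > 1598863888 {print $1}')

echo "naive:  $naive"
echo "forced: $forced"
```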

7. Conclusion
In this article, we started off with a basic introduction to grep, sed, and awk.
Then, we showed the usage of grep on simple text scanning and matching.
Next, we saw how sed is more useful than grep when we want to transform
our text.
Finally, we’ve demonstrated how awk is capable of replicating grep and sed
functionality while additionally providing more features for advanced text
processing.

