
1/25/2021 Difference Between grep, sed, and awk | Baeldung on Linux


Difference Between grep, sed, and awk
Last modified: November 17, 2020

by Chin Ming Jun (https://www.baeldung.com/linux/author/chinmingjun)


1. Overview
In this article, we’ll go through the command-line tools grep
(https://man7.org/linux/man-pages/man1/grep.1.html), sed
(https://man7.org/linux/man-pages/man1/sed.1.html), and awk
(https://man7.org/linux/man-pages/man1/awk.1p.html). In particular, we’ll
study the differences in functionality among them.

2. Background


When it comes to text processing in Linux, three tools that come in handy
are grep, sed, and awk. Although they are entirely different tools, their
functionality seems to overlap in simple scenarios. For example, all of them
can find a pattern in a file and print the matches to the standard output.
However, if we stretch beyond this simple exercise, we’ll find that grep
(/linux/common-text-search) is only good for simple text matching and
printing.
On the other hand, in addition to matching and printing text, sed (/linux/sed-editor)
offers additional text transformation commands like substitution.
Finally, awk (/linux/awk-guide), being the most powerful of these tools, is a
scripting language that offers a multitude of features that do not exist in
the former two.
Before we begin, it is important to know that the purpose of this article is to
make the distinction between these three tools clearer. Therefore, the
examples we are covering are just a small subset of what is possible with each
tool, especially in the case of sed and awk.

3. Text File
To facilitate our discussion, let’s define a text file log.txt:

Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out

4. grep


The grep command searches for lines matching a regex pattern and prints
those matching lines to the standard output. It is useful when we need a
quick way to find out whether a particular pattern exists or not in the given
input.

4.1. Basic Syntax


The syntax for grep is as follows:

grep [OPTIONS] PATTERN [FILE...]

PATTERN is a regex pattern defining what we want to find in the content of
the files specified by the FILE argument. The optional OPTIONS parameters
are flags that modify the behavior of grep.

4.2. Searching for Lines That Match a Regex Pattern


Let’s say we want to extract the ERROR events from log.txt. We can do that
with grep:

$ grep "ERROR" log.txt
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863901 ERROR Requested resource not found

What happens here is that grep will scan through the lines in log.txt and print
those lines containing the word ERROR to the standard output.

4.3. Inverting the Match


We can invert the match using the -v flag:

grep -v "INFO" log.txt

When we execute the command above, grep will print every line in log.txt
except those matching the pattern INFO.
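
As a quick check, here is a minimal sketch of the inverted match, assuming log.txt holds exactly the sample content from section 3 (recreated here in a scratch directory so the snippet is self-contained). Note that the header row also lacks the word INFO, so it is printed along with the two ERROR lines:

```shell
#!/bin/sh
# Recreate the sample log.txt from section 3 in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > log.txt <<'EOF'
Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out
EOF

# Keep every line that does NOT contain INFO (header row plus ERROR lines).
inverted=$(grep -v "INFO" log.txt)
inverted_count=$(printf '%s\n' "$inverted" | wc -l | tr -d ' ')
printf '%s\n' "$inverted"
echo "lines printed: $inverted_count"
```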

4.4. Printing Preceding or Succeeding Lines

Sometimes, we may want to print the lines preceding or succeeding each
match. To print the five lines after a match, we can use the flag -A:

grep -A 5 ERROR log.txt

On the other hand, to print the five lines before a match, we can use the flag -B:

grep -B 5 ERROR log.txt

Finally, the flag -C allows us to print both the five lines before and the five lines
after a match:

grep -C 5 ERROR log.txt
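
To see the context flags on the sample file, the sketch below uses a context of one line instead of five, which is an arbitrary choice made to keep the output short. It assumes a grep that supports -A, such as GNU or BSD grep, since the flag is not part of POSIX:

```shell
#!/bin/sh
# Recreate the sample log.txt from section 3 in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > log.txt <<'EOF'
Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out
EOF

# Print each ERROR line together with the single line that follows it.
after_context=$(grep -A 1 "ERROR" log.txt)
after_count=$(printf '%s\n' "$after_context" | wc -l | tr -d ' ')
printf '%s\n' "$after_context"
```

With this sample, each ERROR line is immediately followed by an INFO line, so the two matches and their trailing context form one contiguous run of four lines.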

5. sed
The sed command is a stream editor that works on streams of characters. It’s a
more powerful tool than grep as it offers more options for text processing
purposes, including the substitute command, which sed is most commonly
known for.

5.1. Basic Syntax


The sed command has the following general syntax:

sed [OPTIONS] SCRIPT FILE...

The OPTIONS are optional flags that can be applied to sed to modify its
behavior. Next, the SCRIPT argument is the sed script that will be executed on
every line of the files that are specified by the FILE argument.

5.2. Script Structure



The sed script has the following structure:

[addr]X[options]

Here, addr is the condition applied to the lines of the text file. It can be a fixed
line number or a regex pattern that is tested against the content of a line before
processing it.
Next, the X character represents the sed command to execute. For example,
the substitute command is denoted by the single character s.
Finally, additional options can be passed to the sed command to specify its
behavior.

5.3. Using sed as grep


As a starter, let’s see how we can duplicate the functionality of grep using sed:

sed -n '/ERROR/ p' log.txt

By default, sed will print every line it is scanning to the standard output
stream. To disable this automatic printing, we can use the flag -n.
Next, it will run the script that comes after the flag -n and look for the regex
pattern ERROR on every line in log.txt. If there is a match, sed will print the line
to standard output because we’re using the p command in the script. Finally,
we pass log.txt as the name of the file we want sed to work on as the final
argument.
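
As a sanity check, the following sketch runs both commands against the sample log.txt from section 3 and compares their output byte for byte. The file names grep_out.txt and sed_out.txt are arbitrary names chosen for this illustration:

```shell
#!/bin/sh
# Recreate the sample log.txt from section 3 in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > log.txt <<'EOF'
Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out
EOF

# Run the grep command and its sed equivalent, saving each result.
grep "ERROR" log.txt > grep_out.txt
sed -n '/ERROR/ p' log.txt > sed_out.txt

# cmp -s exits with status 0 only when both files are identical.
if cmp -s grep_out.txt sed_out.txt; then
    same=yes
else
    same=no
fi
echo "identical output: $same"
```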

5.4. Substituting Matched String With Replacement


sed’s substitute command has the following structure:

's/pattern/replacement/'

When a line matches pattern, sed will substitute the first match on that line
with replacement.
For example, if we want to substitute the word ERROR in our log.txt with the
word CRITICAL we can run:

sed 's/ERROR/CRITICAL/' log.txt
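
One detail worth illustrating: without further options, the substitute command replaces only the first match on each line. The sketch below contrasts it with the g option, which occupies the options slot of the script structure and replaces every match. The input line is a made-up example, not taken from log.txt:

```shell
#!/bin/sh
# Hypothetical input line containing the pattern twice.
line="ERROR: retry after ERROR"

# Without options, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/ERROR/CRITICAL/')

# With the g option, every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/ERROR/CRITICAL/g')

echo "$first_only"    # CRITICAL: retry after ERROR
echo "$all_matches"   # CRITICAL: retry after CRITICAL
```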

5.5. Modifying Files in Place

If we want sed to persist the change on the file it is operating on, we can use
the flag -i along with a suffix. Before making changes in place, sed will create a
backup of the file and append the suffix to this backup filename. For instance,
when we run:

sed -ibackup 's/ERROR/CRITICAL/' log.txt

log.txt will be copied to log.txtbackup before sed applies the changes in place.
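
The effect can be verified with a short sketch that runs the command in a scratch directory and then inspects both files. It assumes a sed that accepts the suffix attached directly to -i, as GNU sed does:

```shell
#!/bin/sh
# Recreate the sample log.txt from section 3 in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > log.txt <<'EOF'
Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out
EOF

# Edit log.txt in place, keeping the original as log.txtbackup.
sed -ibackup 's/ERROR/CRITICAL/' log.txt

# Count matching lines in each file (grep -c prints 0 when nothing matches).
backup_errors=$(grep -c "ERROR" log.txtbackup)
edited_errors=$(grep -c "ERROR" log.txt)
edited_critical=$(grep -c "CRITICAL" log.txt)
echo "backup ERROR lines: $backup_errors"
echo "edited ERROR lines: $edited_errors"
echo "edited CRITICAL lines: $edited_critical"
```

The backup keeps its two ERROR lines, while the edited log.txt contains CRITICAL in their place.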

5.6. Restricting to a Specific Line Number


We can limit the sed command so it only operates on a specific line number
using the addr slot in the script:

sed '3 s/ERROR/CRITICAL/' log.txt

This will run the script only on line 3 of log.txt.


Furthermore, we can specify a range of line numbers:

sed '3,5 s/ERROR/CRITICAL/' log.txt

In this case, sed will run the script on lines 3 to 5 of log.txt.


In addition, we can specify the bound with a regex pattern:

sed -n '3,/ERROR/ p' log.txt

Here, sed will print the lines of log.txt starting from line number 3, and ending
when it finds the first line that matches the pattern /ERROR/.
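
To make the range concrete, the sketch below runs the command on the sample log.txt from section 3, assuming the header row is line 1 of the file. Under that assumption the range starts at the second INFO entry (line 3) and closes at the first ERROR entry (line 6):

```shell
#!/bin/sh
# Recreate the sample log.txt from section 3 in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > log.txt <<'EOF'
Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out
EOF

# Print from line 3 up to and including the first line matching ERROR.
ranged=$(sed -n '3,/ERROR/ p' log.txt)
ranged_count=$(printf '%s\n' "$ranged" | wc -l | tr -d ' ')
printf '%s\n' "$ranged"
echo "lines printed: $ranged_count"
```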


6. awk

awk is a full-fledged programming language that is comparable to Perl
(https://perldoc.perl.org/perl.html). It not only offers a multitude of built-in
functions for string, arithmetic, and time manipulation but also allows
users to define their own functions just like any regular scripting language.
Let’s take a look at some examples of how it works.

6.1. Basic Syntax


The awk syntax is of the following form:

awk [options] script file

It will execute the script against every line in the file. Let’s now expand the
structure of the script:

'(pattern){action}'

The pattern is a regex pattern that will be tested against every input line. If a
line matches the pattern, awk will then execute the script defined in action on
that line. If the pattern condition is absent, the action will be executed on
every line.

6.2. Replicating grep with awk


As we did with sed, let’s take a look at how we can emulate grep’s
functionality using awk:

awk '/ERROR/{print $0}' log.txt

The code above will find the regex pattern ERROR in the log.txt file and print
the matching lines to the standard output.

6.3. Substituting the Matching String



Similarly, we can use awk’s built-in method
(https://www.gnu.org/software/gawk/manual/html_node/Built_002din.html)
gsub to substitute all ERROR occurrences with CRITICAL just like in the sed
example:

awk '{gsub(/ERROR/, "CRITICAL")}{print}' log.txt

The method gsub takes as arguments a regex pattern and the replacement
string. Then, awk prints the line to the standard output.
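
gsub also accepts an optional third argument naming the target to modify; when it is omitted, the whole line ($0) is used. The sketch below restricts the substitution to the Category column. The second input line is a hypothetical entry, invented here so that ERROR also appears inside a message:

```shell
#!/bin/sh
# Two sample lines; the second one is hypothetical and not part of log.txt.
sample='1598863901 ERROR Requested resource not found
1598864500 INFO Cleared ERROR flag'

# Passing $2 as the third argument limits gsub to the Category column,
# so the ERROR inside the message of the second line is left alone.
result=$(printf '%s\n' "$sample" | awk '{ gsub(/ERROR/, "CRITICAL", $2) } { print }')
printf '%s\n' "$result"
```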

6.4. Adding Header and Footer to the Document


In awk, there’s a BEGIN block that will execute before it starts processing any
line of the file. On the other hand, there is also an END block that allows us to
define what should be run after all the lines have been processed.
Let’s use BEGIN and END blocks to add a header and a footer to our text
document:

$ awk 'BEGIN {print "LOG SUMMARY\n--------------"} {print} END {print "--------------\nEND OF LOG SUMMARY"}' log.txt
LOG SUMMARY
--------------
Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out
--------------
END OF LOG SUMMARY

6.5. Column Manipulation


Processing documents with a row-and-column structure (CSV style) is
where awk really shines. For instance, we can easily print the first and second
columns of our log.txt and skip the third one:

awk '{print $1, $2}' log.txt

6.6. Custom Field Separator


By default, awk treats white space as the field delimiter. If the text being
processed uses a delimiter that is not white space (a comma, for example), we can
specify it with the flag -F:

awk -F "," '{print $1, $2}' log.txt
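
Since the sample log.txt is space-separated, the effect of -F is easier to see on comma-separated input. The sketch below uses a hypothetical file log.csv, created here purely for illustration, that holds a few of the same entries with commas between the columns:

```shell
#!/bin/sh
# Work in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# log.csv is a hypothetical comma-separated variant of the sample log.
cat > log.csv <<'EOF'
Timestamp,Category,Message
1598863901,ERROR,Requested resource not found
1598864411,INFO,User admin logged out
EOF

# Split on commas, so the multi-word message stays in a single field ($3).
columns=$(awk -F "," '{print $1, $2}' log.csv)
printf '%s\n' "$columns"
```

Because the comma is now the delimiter, the spaces inside the Message column no longer split it into several fields.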

6.7. Arithmetic Operation


The ability of awk to carry out arithmetic operations makes gathering
numerical information about a text file easy. For example, let’s calculate the number of
ERROR event occurrences in log.txt:

awk '{count[$2]++} END {print count["ERROR"]}' log.txt

In the script above, awk stores the count of each distinct value in the Category
column in the associative array count. Then the script prints the count for
ERROR at the end.
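
Because count holds an entry for every category, the same idea extends to a per-category summary. The following sketch, which assumes the sample log.txt from section 3, skips the header row with NR > 1 and pipes the result through sort, since awk does not guarantee the iteration order of a for-in loop:

```shell
#!/bin/sh
# Recreate the sample log.txt from section 3 in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > log.txt <<'EOF'
Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out
EOF

# Tally the Category column for every line after the header,
# then print one "category count" pair per line in sorted order.
summary=$(awk 'NR > 1 {count[$2]++} END {for (c in count) print c, count[c]}' log.txt | sort)
printf '%s\n' "$summary"
```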

6.8. Numeric Comparison


Being a full-fledged scripting language, awk readily understands decimal
values. This makes text processing easy when we need our script to interpret
values as numbers rather than as simple strings.
For example, let’s say we want to get all the log entries newer than the
timestamp 1598863888. We can use a greater-than comparison:


$ awk '{ if ($1 > 1598863888 ) {print $0} }' log.txt
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out

From the output, we can see that the command only prints log lines that are
recorded later than the specified timestamp.
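
One caveat, assuming log.txt still contains the header row shown in section 3: the word Timestamp is not a number, so awk compares that field as a string, and the header may be printed as well. A minimal sketch that sidesteps this by skipping the first line with NR > 1, and that writes the condition as a pattern instead of an if statement:

```shell
#!/bin/sh
# Recreate the sample log.txt from section 3 in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > log.txt <<'EOF'
Timestamp Category Message
1598843202 INFO Booting up system
1598843402 INFO Booting up critical service: Authorization
1598843502 INFO System booted successfully
1598853502 INFO User admin requested access for userlist
1598863888 ERROR User annonymous attempt to access protected resource without credentials
1598863891 INFO System health check status: passed
1598863901 ERROR Requested resource not found
1598864411 INFO User admin logged out
EOF

# Skip the header (NR > 1), then keep entries whose timestamp is
# numerically greater than the threshold.
later=$(awk 'NR > 1 && $1 > 1598863888 {print $0}' log.txt)
later_count=$(printf '%s\n' "$later" | wc -l | tr -d ' ')
printf '%s\n' "$later"
```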

7. Conclusion
In this article, we started off with a basic introduction to grep, sed, and awk.
Then, we showed the usage of grep on simple text scanning and matching.
Next, we saw how sed is more useful than grep when we want to transform
our text.
Finally, we’ve demonstrated how awk is capable of replicating grep and sed
functionality while additionally providing more features for advanced text
processing.
