
Advanced Scripting in Unix

SED, AWK, Makefile & GDB


Advanced Filters
• egrep, fgrep
• sed
• awk
egrep, fgrep
• egrep
– Extended grep, used to search for more than one pattern at a
time (equivalent to grep -E)
– used as egrep 'pattern1|pattern2' filename
– Options used
• -n - prints the line numbers
• -c - prints a count of the lines matching the pattern
• -v - inverts the search, printing the lines that do not match
• fgrep - searches for fixed strings instead of regular expressions (equivalent to grep -F)
– Alternate patterns are separated by a newline character.
– Usage:
• $ fgrep 'a
• > b' filename
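A minimal sketch of the options above, assuming a made-up sample file. It uses grep -E and grep -F, the POSIX spellings of egrep and fgrep, because some recent grep releases warn about the older names.

```shell
# Hypothetical sample input, not taken from the slides.
tmp=$(mktemp)
printf 'apple\nbanana\ncherry\nabc\na.c\n' > "$tmp"

# Extended regex: match either alternative, with line numbers (-n).
either=$(grep -En 'apple|cherry' "$tmp")

# Fixed string: the dot is literal, so "abc" is not matched.
fixed=$(grep -F 'a.c' "$tmp")

# -v inverts the match and -c counts: lines that do NOT contain "an".
not_an=$(grep -vc 'an' "$tmp")

printf '%s\n' "$either" "$fixed" "$not_an"
rm -f "$tmp"
```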
SED
• sed is a stream editor

• Reading lines using sed


Examples:
sed '4q' file        print the first 4 lines of the file and quit
sed -n '1,4p' file   print lines 1 to 4
sed -n '1,2!p' file  print every line except the first two
sed 'G' file         add a blank line after each line
sed '4G' file        add a blank line after the 4th line
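The line-addressing commands above can be tried on a small file. This is a sketch under the assumption of a five-line sample file created just for the example.

```shell
# Hypothetical five-line input file.
tmp=$(mktemp)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmp"

first4=$(sed '4q' "$tmp")                      # quit after printing line 4
range=$(sed -n '2,3p' "$tmp")                  # print only lines 2 to 3
rest=$(sed -n '1,2!p' "$tmp")                  # print all but lines 1 to 2
spaced=$(sed 'G' "$tmp" | wc -l | tr -d ' ')   # line count once blank lines are added

printf '%s\n' "$first4" "---" "$range" "---" "$rest" "---" "$spaced"
rm -f "$tmp"
```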
"sed - substitution"

• "s" for substitution

Syntax:
– sed 's/old/new/' < file_ip > file_op

– There are four parts to this substitute command:


• s        Substitute command
• /../../  Delimiter
• old      Regular expression search pattern
• new      Replacement string

"sed - substitution"
• If you want to change a pathname that contains a slash, you
can quote each slash with a backslash
– sed 's/\/usr\/local\/bin/\/common\/bin/' < file_ip > file_op

• or, more readably, choose a different delimiter such as an underscore
– sed 's_/usr/local/bin_/common/bin_' < file_ip > file_op

# substitute (find & replace) "foo" with "bar" on each line


sed 's/foo/bar/' # replaces only 1st instance in a line
sed 's/foo/bar/4' # replaces only 4th instance in a line
sed 's/foo/bar/g' # replaces ALL instances in a line

# substitute "foo" with "bar" ONLY for lines which contain "baz"
sed '/baz/s/foo/bar/g'

# substitute "foo" with "bar" EXCEPT for lines which contain "baz"
sed '/baz/!s/foo/bar/g'
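The substitution variants above can be compared side by side. The sketch below feeds short made-up lines through each command, so the input strings are assumptions rather than data from the slides.

```shell
line='foo foo foo'

s_first=$(printf '%s\n' "$line" | sed 's/foo/bar/')     # first instance only
s_second=$(printf '%s\n' "$line" | sed 's/foo/bar/2')   # second instance only
s_all=$(printf '%s\n' "$line" | sed 's/foo/bar/g')      # every instance

# An alternative delimiter avoids escaping the slashes in a path.
path_out=$(printf '/usr/local/bin/tool\n' | sed 's_/usr/local/bin_/common/bin_')

# Restrict the substitution to lines that do (or do not) contain "baz".
only_baz=$(printf 'foo baz\nfoo qux\n' | sed '/baz/s/foo/bar/g')
not_baz=$(printf 'foo baz\nfoo qux\n' | sed '/baz/!s/foo/bar/g')

printf '%s\n' "$s_first" "$s_second" "$s_all" "$path_out" "$only_baz" "$not_baz"
```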

"sed multiple commands"
• The commands below replace "red" with "RED" and "white" with "WHITE"
• sed 's/red/RED/g' test | sed 's/white/WHITE/g'

• sed -e 's/red/RED/g' -e 's/white/WHITE/g' test

• sed -f scriptname:
Syntax: sed -f sedscript < ip_file

• If you have a large number of sed commands, you can put them in a
sed command file and execute it with the -f option
Example: Create a file "sedcmd" containing the lines below
s/red/RED/g
s/white/WHITE/g
and execute it as follows
sed -f sedcmd < test_ip > test_op
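To check that the -e form and the -f form behave the same way, the sketch below runs both on a made-up input file. The temporary file names stand in for "test" and "sedcmd" and are assumptions of the example.

```shell
# Hypothetical input and script files.
tmp=$(mktemp)
script=$(mktemp)
printf 'red rose\nwhite lily\nred and white\n' > "$tmp"

# Several commands on one command line.
multi_e=$(sed -e 's/red/RED/g' -e 's/white/WHITE/g' "$tmp")

# The same commands stored in a script file and run with -f.
printf 's/red/RED/g\ns/white/WHITE/g\n' > "$script"
multi_f=$(sed -f "$script" < "$tmp")

printf '%s\n' "$multi_e"
rm -f "$tmp" "$script"
```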

"sed adding records"
• Records can be appended (a) or inserted (i) as below
Example1: append two lines after every line
sed 'a\
LINE 1 ADDED\
LINE 2 ADDED' test

Example2: append after each line containing "orange"; insert before line 3
sed '/orange/a\
ABOVE LINE HAS orange IN IT' test

sed '3i\
Insert a line at line 3' test
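A small runnable sketch of the append and insert commands, assuming a three-line sample file invented for the example. The backslash-newline form is used because it is the one POSIX sed specifies.

```shell
# Hypothetical three-line input file.
tmp=$(mktemp)
printf 'apple\norange\npear\n' > "$tmp"

# Append a line after every line that matches /orange/.
appended=$(sed '/orange/a\
ABOVE LINE HAS orange IN IT' "$tmp")

# Insert a line before line 3.
inserted=$(sed '3i\
Insert a line at line 3' "$tmp")

printf '%s\n' "$appended" "---" "$inserted"
rm -f "$tmp"
```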

"sed deleting records"
Example:
sed '2d' file: delete the 2nd line of the file
sed '1,4d' file: delete the first 4 lines of the file
sed '1,4!d' file: delete all lines except the first four
sed 's/^[ \t]*//': delete leading spaces and tabs from each line
sed 's/[ \t]*$//': delete trailing spaces and tabs from each line
(\t inside brackets is a GNU sed extension; [[:blank:]] is the portable form)
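The delete commands can be checked on a short file. This sketch assumes a five-line sample file and uses the POSIX class [[:blank:]] for the whitespace trimming instead of \t.

```shell
# Hypothetical five-line input file.
tmp=$(mktemp)
printf 'l1\nl2\nl3\nl4\nl5\n' > "$tmp"

del2=$(sed '2d' "$tmp")             # drop line 2
del_first4=$(sed '1,4d' "$tmp")     # drop lines 1 to 4
keep_first4=$(sed '1,4!d' "$tmp")   # drop everything except lines 1 to 4

# Strip leading and trailing spaces and tabs.
trimmed=$(printf '  \tpadded \t\n' |
    sed -e 's/^[[:blank:]]*//' -e 's/[[:blank:]]*$//')

printf '%s\n' "$del2" "---" "$del_first4" "---" "$keep_first4" "---" "[$trimmed]"
rm -f "$tmp"
```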

AWK
awk
– Dynamic regular expressions
– Text substitution and pattern matching functions
– Additional built-in functions and variables
– New operators and statements
– Input from more than one file
– Access to command line arguments
Running an AWK Program
• There are several ways to run an Awk program
– awk 'program' input_file(s)
• program and input files are provided as command-line
arguments
– awk 'program'
• program is a command-line argument; input is
taken from standard input (yes, awk is a filter!)
– awk -f program_file_name input_files
• program is read from a file
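The three invocation styles can be sketched as follows. The records (name, hourly rate, hours worked) are made-up stand-ins for an emp.data file, and the one-line program is only an illustration.

```shell
# Hypothetical emp.data-style records: name, hourly rate, hours worked.
data=$(mktemp)
prog=$(mktemp)
printf 'Beth 4.00 0\nDan 3.75 0\nKathy 4.00 10\n' > "$data"

# 1. Program and input file both given on the command line.
from_file=$(awk '$3 > 0 { print $1 }' "$data")

# 2. Program on the command line, input taken from standard input.
from_stdin=$(awk '$3 > 0 { print $1 }' < "$data")

# 3. Program read from a file with -f.
printf '%s\n' '$3 > 0 { print $1 }' > "$prog"
from_prog=$(awk -f "$prog" "$data")

printf '%s\n' "$from_file" "$from_stdin" "$from_prog"
rm -f "$data" "$prog"
```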
Awk as a Filter
• Since Awk is a filter, you can also use pipes with other
filters to massage its output even further
• Suppose you want to print the data for each employee
along with their pay and have it sorted in order of
increasing pay

awk '{ printf("%6.2f %s\n", $2 * $3, $0) }' emp.data | sort -n
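A runnable sketch of the pay report, assuming made-up employee records in place of emp.data. It sorts with sort -n so the ordering is numeric regardless of locale.

```shell
# Hypothetical records: name, hourly rate, hours worked.
data=$(mktemp)
printf 'Beth 4.00 0\nKathy 4.00 10\nMark 5.00 20\nSusie 4.25 18\n' > "$data"

# Prefix each record with its pay, then let sort order the lines by pay.
sorted=$(awk '{ printf("%6.2f %s\n", $2 * $3, $0) }' "$data" | sort -n)

lowest=$(printf '%s\n' "$sorted" | head -n 1)
highest=$(printf '%s\n' "$sorted" | tail -n 1)

printf '%s\n' "$sorted"
rm -f "$data"
```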


Structure of an AWK Program
• An Awk program consists of:
– An optional BEGIN segment
• For processing to execute prior to reading input
– pattern - action pairs
• Processing for input data
• For each pattern matched, the corresponding action is
taken
– An optional END segment
• Processing after end of input data

BEGIN {action}
pattern {action}
pattern {action}
.
.
.
pattern {action}
END {action}
BEGIN and END
• Special pattern BEGIN matches before the first input line
is read; END matches after the last input line has been
read
• This allows for initial and wrap-up processing
BEGIN { print "NAME RATE HOURS"; print "" }
{ print }
END { print "total number of employees is", NR }
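The BEGIN/END program above can be run against a small file. The three records here are hypothetical sample data.

```shell
# Hypothetical records: name, hourly rate, hours worked.
data=$(mktemp)
printf 'Beth 4.00 0\nDan 3.75 0\nKathy 4.00 10\n' > "$data"

report=$(awk '
BEGIN { print "NAME RATE HOURS"; print "" }   # header, runs before any input
      { print }                               # runs once per record
END   { print "total number of employees is", NR }
' "$data")

header=$(printf '%s\n' "$report" | head -n 1)
footer=$(printf '%s\n' "$report" | tail -n 1)

printf '%s\n' "$report"
rm -f "$data"
```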
Pattern-Action Pairs
• Both are optional, but one or the other is required
– Default pattern is match every record
– Default action is print record
• Patterns
– BEGIN and END
– expressions
• $3 < 100
• $4 == "Asia"
– string-matching
• /regex/ - /^.*$/
• string - abc
– matches the first occurrence of regex or string in the
record
Selection
• Awk patterns are good for selecting specific lines from
the input for further processing
• Selection by Comparison
– $2 >= 5 { print }
• Selection by Computation
– $2 * $3 > 50 { printf("%6.2f for %s\n", $2 * $3, $1) }
• Selection by Text Content
– $1 == "Susie"
– /Susie/
• Combinations of Patterns
– $2 >= 4 || $3 >= 20
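Each kind of selection can be tried on the same input. The sketch assumes six made-up employee records (name, hourly rate, hours worked).

```shell
# Hypothetical records: name, hourly rate, hours worked.
data=$(mktemp)
printf 'Beth 4.00 0\nDan 3.75 0\nKathy 4.00 10\nMark 5.00 20\nMary 5.50 22\nSusie 4.25 18\n' > "$data"

by_comparison=$(awk '$2 >= 5 { print $1 }' "$data")
by_computation=$(awk '$2 * $3 > 50 { printf("%6.2f for %s\n", $2 * $3, $1) }' "$data")
by_text=$(awk '$1 == "Susie"' "$data")
combined=$(awk '$2 >= 4 || $3 >= 20 { print $1 }' "$data")

printf '%s\n' "$by_comparison" "---" "$by_computation" "---" "$by_text" "---" "$combined"
rm -f "$data"
```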
Data Validation
• Validating data is a common operation
• Awk is excellent at data validation
– NF != 3 { print $0, "number of fields not equal to 3" }
– $2 < 3.35 { print $0, "rate is below minimum wage" }
– $2 > 10 { print $0, "rate exceeds $10 per hour" }
– $3 < 0 { print $0, "negative hours worked" }
– $3 > 60 { print $0, "too many hours worked" }
Regular Expressions in Awk
• Awk uses the same extended regular expressions as egrep
– ^ $ beginning of/end of the string being matched (the record, or a field when used with ~)
– . any character
– [abcd] character class
– [^abcd] negated character class
– [a-z] range of characters
– (regex1|regex2) alternation
– * zero or more occurrences of preceding expression
– + one or more occurrences of preceding expression
– ? zero or one occurrence of preceding expression
Awk Variables
• $0, $1, $2, … ,$NF
• NR - Number of records read
• FNR - Number of records read from current file
• NF - Number of fields in current record
• FILENAME - name of current input file
• FS - Field separator, runs of spaces and TABs by default
• OFS - Output field separator, space by default
• ARGC/ARGV - Argument Count, Argument Value array
– Used to get arguments from the command line
awk '$2 ~ /^[s]/' emps

It returns the lines whose second field starts with the letter s.

awk '$1 ~ /199[789]/ && $4 > 2000' persons

It returns the lines whose first field contains 1997, 1998, or 1999
and whose fourth field is greater than 2000.
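A short sketch of field matching and the built-in variables. The inputs are invented for the example; the "emps" and "persons" files themselves are not available.

```shell
# Second field starts with "s" (hypothetical id/name records).
s_names=$(printf '101 smith\n102 jones\n103 stone\n' | awk '$2 ~ /^[s]/')

# Year in field 1 is 1997 to 1999 and field 4 exceeds 2000.
recent=$(printf '1997 a b 2500\n1996 a b 3000\n1999 a b 1500\n' |
    awk '$1 ~ /199[789]/ && $4 > 2000')

# NR, NF and $NF with a colon as the field separator (-F sets FS).
vars=$(printf 'a:b:c\nd:e\n' | awk -F: '{ print NR, NF, $NF }')

printf '%s\n' "$s_names" "---" "$recent" "---" "$vars"
```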
Operators
• = assignment operator; sets a variable equal to a value
or string
• == equality operator; returns TRUE if both sides are
equal
• != inequality operator; returns TRUE if the two sides differ
• && logical AND
• || logical OR
• ! logical NOT
• <, >, <=, >= relational operators
Computing with AWK
• Counting is easy to do with Awk
$3 > 15 { emp = emp + 1 }
END { print emp, "employees worked more than 15 hrs" }
• Computing Sums and Averages is also simple
{ pay = pay + $2 * $3 }
END { print NR, "employees"
print "total pay is", pay
print "average pay is", pay/NR
}
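Both computations can be run on a small data set. The six records are made-up sample data (name, hourly rate, hours worked).

```shell
# Hypothetical records: name, hourly rate, hours worked.
data=$(mktemp)
printf 'Beth 4.00 0\nDan 3.75 0\nKathy 4.00 10\nMark 5.00 20\nMary 5.50 22\nSusie 4.25 18\n' > "$data"

# Count the records whose hours exceed 15.
count=$(awk '
$3 > 15 { emp = emp + 1 }
END     { print emp, "employees worked more than 15 hrs" }
' "$data")

# Accumulate total pay, then report the total and the average.
summary=$(awk '
    { pay = pay + $2 * $3 }
END { print NR, "employees"
      print "total pay is", pay
      print "average pay is", pay/NR
    }
' "$data")

printf '%s\n' "$count" "$summary"
rm -f "$data"
```

On empty input NR is 0, so a real script would guard the division.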
Handling Text
• One major advantage of Awk is its ability to handle
strings as easily as many languages handle numbers
• Awk variables can hold strings of characters as well as
numbers, and Awk conveniently translates back and forth
as needed
• This program finds the employee who is paid the most
per hour
$2 > maxrate { maxrate = $2; maxemp = $1 }
END { print "highest hourly rate:", maxrate, "for", maxemp }
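The maximum-rate program mixes a numeric comparison with string output. A sketch on made-up records (name, hourly rate, hours worked):

```shell
# Hypothetical records: name, hourly rate, hours worked.
data=$(mktemp)
printf 'Beth 4.00 0\nMark 5.00 20\nMary 5.50 22\nSusie 4.25 18\n' > "$data"

# maxrate starts out empty, which compares as 0 against the numeric field.
highest=$(awk '
$2 > maxrate { maxrate = $2; maxemp = $1 }
END          { print "highest hourly rate:", maxrate, "for", maxemp }
' "$data")

printf '%s\n' "$highest"
rm -f "$data"
```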
String Concatenation
• String Concatenation
– New strings can be created by combining old ones
{ names = names $1 " " }
END { print names }
• Printing the Last Input Line
– NR retains its value after the last input line has been
read, but older awk implementations do not keep $0, so save it
{ last = $0 }
END { print last }
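Both idioms can be tried on a few made-up records. Note that the concatenation leaves a trailing space after the last name.

```shell
# Hypothetical records: name, hourly rate, hours worked.
data=$(mktemp)
printf 'Beth 4.00 0\nDan 3.75 0\nSusie 4.25 18\n' > "$data"

# Build one string from the first field of every record.
names=$(awk '{ names = names $1 " " } END { print names }' "$data")

# Remember each record so the last one is still available in END.
last=$(awk '{ last = $0 } END { print last }' "$data")

printf '[%s]\n' "$names" "$last"
rm -f "$data"
```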
Command Line Arguments
• Accessed via built-ins ARGC and ARGV
• ARGC is set to the number of command line arguments
• ARGV[ ] contains each of the arguments
– For the command line
– awk 'script' filename
• ARGC == 2
• ARGV[0] == "awk"
• ARGV[1] == "filename"
• the script is not considered an argument
ARGC/ARGV in Action

#argv.awk – get a cmd line argument and display


BEGIN {
    if (ARGC != 2) {
        print "Not enough arguments!"
    } else {
        print "Good evening,", ARGV[1]
    }
}

# second example: take the field separator from the command line
BEGIN {
    if (ARGC != 3) {
        print "Not enough arguments!"
        print "Usage is awk -f script in_file field_separator"
        exit
    } else {
        FS = ARGV[2]
        delete ARGV[2]    # so awk does not try to open the separator as a file
    }
}
$1 ~ /..3/ { print $1 "'s name in real life is", $5; ++nr }
END { print ""; print "There are", nr, "students registered in your class." }
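A hedged, runnable sketch of the two ideas above: reading ARGV in BEGIN, and passing a field separator as an extra argument that is then deleted from ARGV. The argument "Alice" and the colon-separated data are assumptions made for the example.

```shell
prog=$(mktemp)
data=$(mktemp)

# argv.awk from above, written to a temporary file.
cat > "$prog" <<'EOF'
BEGIN {
    if (ARGC != 2) {
        print "Not enough arguments!"
    } else {
        print "Good evening,", ARGV[1]
    }
}
EOF

# A BEGIN-only program never opens its arguments as input files.
greeting=$(awk -f "$prog" Alice)
missing=$(awk -f "$prog")

# Take FS from ARGV[2], then delete it so it is not read as a file.
printf 'a:b\nc:d\n' > "$data"
second=$(awk 'BEGIN { FS = ARGV[2]; delete ARGV[2] } { print $2 }' "$data" :)

printf '%s\n' "$greeting" "$missing" "$second"
rm -f "$prog" "$data"
```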
getline
• How do you get input into your awk script other than on
the command line?
• The getline function provides input capabilities
• getline is used to read input from either the current input
or from a file or pipe
• getline returns 1 if a record was present, 0 if end-of-file
was encountered, and -1 if some error occurred
getline from stdin
#getline.awk - demonstrate the getline function
BEGIN {print "What is your first name and major? "
while (getline > 0)
print "Hi", $1 ", your major is", $2 "."
}
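The script above waits for typed input. Piping made-up lines into the same loop makes it easy to try non-interactively; the names and majors here are hypothetical.

```shell
# With no file arguments, plain getline in BEGIN reads standard input.
greet=$(printf 'Ann Math\nBob Physics\n' | awk '
BEGIN {
    while (getline > 0)
        print "Hi", $1 ", your major is", $2 "."
}')

printf '%s\n' "$greet"
```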
getline From a File
#getline1.awk - demo getline with a file
BEGIN {while ((getline < "emp.data") > 0)
print $0}
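A sketch of reading a file with getline, with the call parenthesized and compared against 0 so that a missing file (return value -1) ends the loop instead of spinning forever. The temporary file and the variable names `file` and `line` are assumptions of the example.

```shell
# Hypothetical stand-in for emp.data.
data=$(mktemp)
printf 'Beth 4.00 0\nDan 3.75 0\n' > "$data"

# Read the named file line by line into the variable "line".
copied=$(awk -v file="$data" '
BEGIN {
    while ((getline line < file) > 0)
        print line
    close(file)
}')

# For a file that cannot be opened, getline returns -1.
rc=$(awk 'BEGIN { print (getline line < "/nonexistent/emp.data") }')

printf '%s\n' "$copied" "$rc"
rm -f "$data"
```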
getline From a Pipe
#getline2.awk - show using getline with a pipe
BEGIN {while (("who" | getline) > 0)
nr++
print "There are", nr, "people logged on clyde right now."}
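Because the output of who varies from machine to machine, this sketch substitutes a fixed command (an assumption of the example) to show the same counting loop over a pipe.

```shell
# Count the lines produced by a command read through a pipe.
count=$(awk '
BEGIN {
    cmd = "echo one; echo two; echo three"   # stand-in for "who"
    while ((cmd | getline line) > 0)
        n++
    close(cmd)
    print n
}')

printf '%s\n' "$count"
```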
Simple Output From AWK
• Printing Every Line
– If an action has no pattern, the action is performed for all input
lines
• { print } will print all input lines on stdout
• { print $0 } will do the same thing
• Printing Certain Fields
– Multiple items can be printed on the same output line with a
single print statement
– { print $1, $3 }
– Expressions separated by a comma are, by default, separated
by a single space when output
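A quick sketch of print on one made-up record. It also sets OFS to show where the separator between comma-separated expressions comes from.

```shell
rec='Beth 4.00 0'   # hypothetical record

whole=$(printf '%s\n' "$rec" | awk '{ print }')            # same as print $0
fields=$(printf '%s\n' "$rec" | awk '{ print $1, $3 }')    # joined by OFS, a space by default
custom=$(printf '%s\n' "$rec" | awk 'BEGIN { OFS = "-" } { print $1, $3 }')

printf '%s\n' "$whole" "$fields" "$custom"
```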
Formatted Output
• printf provides formatted output
• Syntax is printf("format string", var1, var2, ...)
• Format specifiers
– %c - single character
– %d - decimal integer
– %f - floating point number
– %s - string
• Escape sequences
– \n - NEWLINE
– \t - TAB
• Format modifiers
– - left justify in column
– n column width
– .n number of decimal places to print
printf Examples
• printf("I have %d %s\n", how_many, animal_type)
– format a number (%d) followed by a string (%s)
• printf("%-10s has $%6.2f in their account\n", name, amount)
– prints a left justified string in a 10 character wide field and a float
with 2 decimal places in a six character wide field
• printf("%10s %-4.2f %-6d\n", name, interest_rate, account_number) > "account_rates"
– prints a right justified string in a 10 character wide field, a left
justified float with 2 decimal places in a 4 character wide field and a
left justified decimal number in a 6 character wide field to a file
– the redirection goes after the closing parenthesis; inside it, > would
be read as a comparison
• printf("\t%d\t%d\t%6.2f\t%s\n", id_no, age, balance, name) >> "account"
– appends a TAB separated number, number, 6.2 float and a
string to a file
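The format strings above can be exercised from a BEGIN block. The values and the temporary output file are made up for the example, and the redirections are written outside the printf parentheses.

```shell
line1=$(awk 'BEGIN { printf("I have %d %s\n", 3, "cats") }')

# Left-justified 10-wide string, then a 6-wide float with 2 decimals.
line2=$(awk 'BEGIN { printf("%-10s has $%6.2f in their account\n", "Ann", 12.5) }')

# Write one line to a file with >, close it, then append a second with >>.
out=$(mktemp)
awk -v out="$out" '
BEGIN {
    printf("%10s %-4.2f %-6d\n", "Ann", 3.5, 42) > out
    close(out)
    printf("\t%d\t%6.2f\n", 7, 1.5) >> out
}'
saved=$(cat "$out")

printf '%s\n' "$line1" "$line2" "$saved"
rm -f "$out"
```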
