
Jagannath Institute of Management Sciences

Lajpat Nagar

BCA
Sem III

Linux
LINUX ENVIRONMENT
BCA III (JU)

UNIT I

UNIX & LINUX:- The Operating System: Linux history, Linux features, Linux
distributions, Linux’s relationship to Unix, Overview of Linux architecture,
Installation, Booting, Login and Shutdown Process, Start up scripts, controlling
processes, system processes (an overview).
Linux Internals - System Calls, Process Management, Memory Management, Disk
and filesystems ,Networking ,Security, Graphical User Interface, Device Drivers
Getting help in Linux with --help, whatis, the man command, the info command; simple commands like date, whoami, who, w, cal, bc, hostname, uname; the concept of aliases, etc. Linux filesystem types ext2, ext3, ext4; basic Linux directory structure and the functions of different directories; basic directory navigation commands like cd, mv, cp, rm, the cat command, the less command; runlevels (importance of
/etc/inittab)

Unix Programming

• Introduction to Operating Systems
• Types of Operating Systems
• The UNIX Operating System
• Concepts of the UNIX Operating System
• History of UNIX

What is an Operating System?


An operating system (OS) is a resource manager. It takes the form of a set of
software routines that allow users and application programs to access system
resources (e.g. the CPU, memory, disks, modems, printers, network cards etc.) in a
safe, efficient and abstract way.
For example, an OS ensures safe access to a printer by allowing only one
application program to send data directly to the printer at any one time. An OS
encourages efficient use of the CPU by suspending programs that are waiting for
I/O operations to complete to make way for programs that can use the CPU more
productively. An OS also provides convenient abstractions (such as files rather
than disk locations) which isolate application programmers and users from the
details of the underlying hardware.

Fig. 1.1: General operating system architecture

Fig. 1.1 presents the architecture of a typical operating system and shows how an
OS succeeds in presenting users and application programs with a uniform
interface without regard to the details of the underlying hardware. We see that:
• The operating system kernel is in direct control of the underlying hardware. The kernel provides low-level device, memory and processor management functions (e.g. dealing with interrupts from hardware devices, sharing the processor among multiple programs, allocating memory for programs etc.).
• Basic hardware-independent kernel services are exposed to higher-level programs through a library of system calls (e.g. services to create a file, begin execution of a program, or open a logical network connection to another computer).
• Application programs (e.g. word processors, spreadsheets) and system utility programs (simple but useful application programs that come with the operating system, e.g. programs which find text inside a group of files) make use of system calls. Applications and system utilities are launched using a shell (a textual command line interface) or a graphical user interface that provides direct user interaction.

Operating systems (and different flavours of the same operating system) can be
distinguished from one another by the system calls, system utilities and user
interface they provide, as well as by the resource scheduling policies
implemented by the kernel.

OPERATING SYSTEM

• An Operating System is a software program designed to act as an interface between a user and a computer.
• It controls the computer hardware, manages system resources, and supervises the interaction between the system and its users.
• The operating system forms a base on which application software is developed and executed.
Functions of Operating System

• Command Interpretation: The OS translates commands entered by the user through input devices into a language understandable by the CPU.
• Peripheral Management: The OS takes care of all attached peripheral devices; communication between these devices and the CPU is overseen by the OS.
• Memory Management: The OS handles the extremely important job of allocating memory for the various processes running on the system. It also has to clean up after a process and dispose of left-over data in memory.
• Process Management: This is required if several programs have to run concurrently. CPU time is then rationed out by the OS to ensure that no program gets more than its fair share of CPU time.

Types Of Operating System:

There are many different types of operating system, but here we will categorize them into two types:

• Single-User Operating System
• Multi-User Operating System

Single-User Operating System:

• The most popular example is the Personal Computer (PC).
• It is a general-purpose system that can execute programs to perform a wide variety of tasks, designed for use by one person at a time.
• E.g. MS-DOS.
• Very popular because of the low cost of hardware and the wide range of software available for it.

Multi-User Operating System:


• Used when a number of programs have to be run simultaneously, or
• Common resources like printers and disks are to be shared by a number of users.
• It can be used by more than one person at a time.

Multi-user is a term that defines an operating system that allows concurrent access by multiple users of a computer.

Time-sharing systems are multi-user systems. Most batch processing systems for mainframe computers may also be considered "multi-user", to avoid leaving the CPU idle while it waits for I/O operations to complete. However, the term "multi-tasking" is more common in this context.

An example is a Unix server where multiple remote users have access (via Telnet) to the Unix shell prompt at the same time. Another example uses multiple X sessions spread across multiple monitors powered by a single machine.

In Short we can say :

Operating systems can be classified as follows:

• Multi-User: Allows two or more users to run programs at the same time. Some operating systems permit hundreds or even thousands of concurrent users.
• Multiprocessing: Supports running a program on more than one CPU.
• Multitasking: Allows more than one program to run concurrently.
• Multithreading: Allows different parts of a single program to run concurrently.
• Real Time: Responds to input instantly. General-purpose operating systems, such as DOS and UNIX, are not real-time.

UNIX:


UNIX is a giant operating system developed by Ken Thompson and Dennis Ritchie at Bell Laboratories, USA, in 1969. It has practically everything an operating system should have, and several features which other operating systems never had.


It has a number of profound and diverse concepts developed and perfected
over a period of time.


Its elegance and richness go beyond the commands and tools that constitute it, while simplicity permeates the entire system.


It runs on practically every hardware platform and today champions the cause of the Open Source Movement.


Unix today is a mature operating system, and is used heavily in a large variety of
scientific, engineering, and mission critical applications. Interest
in Unix has grown substantially in recent years because of the
proliferation of the Linux (a Unix look-alike) operating system.


Recent developments are graphical interfaces: MOTIF, X Windows, and OpenView.

Major suppliers today:

• Sun Microsystems (SPARC)
• Data General (AViiON)
• IBM (RS/6000, AIX)
• Hewlett-Packard
History of UNIX

1960s: MULTICS project (MIT, GE, AT&T)
1970s: AT&T Bell Labs
1970s/80s: UC Berkeley
1980s: DOS imitated many UNIX ideas; commercial UNIX fragmentation; the GNU Project
1990s: Linux
Now: UNIX is widespread and available from many sources, both free and commercial

UNIX Systems

SunOS/Solaris                   Sun Microsystems
Digital Unix (Tru64)            Digital/Compaq
HP-UX                           Hewlett-Packard
Irix                            SGI
UNICOS                          Cray
SCO OpenServer                  Santa Cruz Operation
SCO UnixWare                    Santa Cruz Operation
NetBSD, FreeBSD                 UC Berkeley / the Net
Linux (RedHat, SuSE, Caldera)   Linus Torvalds / the Net

A Brief History of UNIX


UNIX has been a popular OS for more than two decades because of its multi-user,
multi-tasking environment, stability, portability and powerful networking
capabilities. What follows here is a simplified history of how UNIX has developed.
In the late 1960s, researchers from General Electric, MIT and Bell Labs launched a
joint project to develop an ambitious multi-user, multi-tasking OS for mainframe
computers known as MULTICS (Multiplexed Information and Computing System).
MULTICS failed (for some MULTICS enthusiasts "failed" is perhaps too strong a
word to use here), but it did inspire Ken Thompson, who was a researcher at Bell
Labs, to have a go at writing a simpler operating system himself.
He wrote a simpler version of MULTICS on a PDP-7 in assembler and called his
attempt UNICS (Uniplexed Information and Computing System).
Because memory and CPU power were at a premium in those days, UNICS
(eventually shortened to UNIX) used short commands to minimize the space
needed to store them and the time needed to decode them - hence the tradition
of short UNIX commands we use today, e.g. ls, cp, rm, mv etc.
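Those terse names are still the everyday vocabulary of the shell. A minimal session using them (the file names are made up for illustration):

```shell
# Work in a scratch directory so nothing important is touched
mkdir demo && cd demo

echo "hello unix" > notes.txt   # create a small file
ls                              # list directory contents
cp notes.txt backup.txt         # copy a file
mv backup.txt old.txt           # rename (move) a file
rm old.txt                      # remove a file
```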
Ken Thompson then teamed up with Dennis Ritchie, the author of the first C compiler, in 1973. They rewrote the UNIX kernel in C - this was a big step forward in terms of the system's portability - and released the Fifth Edition of UNIX to universities in 1974.

The Seventh Edition, released in 1978, marked a split in UNIX development into
two main branches: SYSV (System 5) and BSD (Berkeley Software Distribution).
BSD arose from the University of California at Berkeley where Ken Thompson
spent a sabbatical year. Its development was continued by students at Berkeley
and other research institutions.

SYSV was developed by AT&T and other commercial companies. UNIX flavours
based on SYSV have traditionally been more conservative, but better supported
than BSD-based flavours.

The latest incarnations of SYSV (SVR4 or System 5 Release 4) and BSD Unix are
actually very similar.

Linux is a free open source UNIX OS for PCs that was originally developed in 1991
by Linus Torvalds, a Finnish undergraduate student.

Linux is neither pure SYSV nor pure BSD. Instead, it incorporates some features from each (e.g. SYSV-style startup files but BSD-style file system layout) and aims to
conform with a set of IEEE standards called POSIX (Portable Operating System
Interface). To maximise code portability, it typically supports SYSV, BSD and POSIX
system calls (e.g. poll, select, memset, memcpy, bzero and bcopy are all
supported).

The open source nature of Linux means that the source code for the Linux kernel
is freely available so that anyone can add features and correct deficiencies.
This approach has been very successful and what started as one person's project
has now turned into a collaboration of hundreds of volunteer developers from
around the globe. The open source approach has not just successfully been
applied to kernel code, but also to application programs for Linux.

As Linux has become more popular, several different development streams or distributions have emerged, e.g. RedHat, Slackware, Mandrake, Debian, and Caldera. A distribution comprises a prepackaged kernel, system utilities, GUI interfaces and application programs.

Redhat is the most popular distribution because it has been ported to a large
number of hardware platforms (including Intel, Alpha, and SPARC), it is easy to
use and install and it comes with a comprehensive set of utilities and applications
including the X Windows graphics system, GNOME and KDE GUI environments,
and the StarOffice suite (an open source MS-Office clone for Linux).

Architecture of the Linux Operating System


Linux has all of the components of a typical OS (at this point you might like to
refer back to Fig 1.1):

Kernel
The Linux kernel includes device driver support for a large number of PC hardware
devices (graphics cards, network cards, hard disks etc.), advanced processor and
memory management features, and support for many different types of
filesystems (including DOS floppies and the ISO9660 standard for CDROMs). In
terms of the services that it provides to application programs and system utilities,
the kernel implements most BSD and SYSV system calls, as well as the system calls
described in the POSIX.1 specification.

The kernel (in raw binary form that is loaded directly into memory at system
startup time) is typically found in the file /boot/vmlinuz, while the source files can usually be found in /usr/src/linux. The latest version of the Linux kernel sources can be downloaded from http://www.kernel.org/.
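You can check the running kernel version and look for these files yourself; the exact paths and version numbers vary between distributions:

```shell
uname -r                 # print the release of the running kernel
ls /boot 2>/dev/null     # the compressed kernel image, vmlinuz-<version>, usually lives here
ls /usr/src 2>/dev/null  # kernel sources, if they are installed
```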
Shells and GUIs

Linux supports two forms of command input: through textual command line shells
similar to those found on most UNIX systems (e.g. sh - the Bourne shell, bash - the
Bourne again shell and csh - the C shell) and through graphical interfaces (GUIs)
such as the KDE and GNOME window managers. If you are connecting remotely to
a server your access will typically be through a command line shell.
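A quick way to see which shells are available and which one you are using (output differs from system to system):

```shell
cat /etc/shells 2>/dev/null   # login shells known to this system, if the file is present
echo "$0"                     # the shell (or script) name you are currently running
```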

System Utilities

Virtually every system utility that you would expect to find on standard
implementations of UNIX (including every system utility described in the POSIX.2
specification) has been ported to Linux. This includes commands such as ls, cp,
grep, awk, sed, bc, wc, more, and so on. These system utilities are designed to be
powerful tools that do a single task extremely well (e.g. grep finds text inside files
while wc counts the number of words, lines and bytes inside a file). Users can
often solve problems by interconnecting these tools instead of writing a large
monolithic application program.
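That idea of interconnecting small tools is what the pipe (|) is for. A self-contained sketch (the data file is invented for the demonstration):

```shell
# Build a small sample file
printf 'apple\nbanana\napple\ncherry\n' > fruits.txt

grep -c apple fruits.txt    # how many lines contain "apple"? (2)
wc -l fruits.txt            # how many lines in total? (4)
sort fruits.txt | uniq -c   # pipe: count occurrences of each distinct line
```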

Like other UNIX flavours, Linux's system utilities also include server programs
called daemons which provide remote network and administration services (e.g.
telnetd and sshd provide remote login facilities, lpd provides printing services,
httpd serves web pages, crond runs regular system administration tasks
automatically). A daemon (probably derived from the Latin word which refers to a
beneficient spirit who watches over someone, or perhaps short for "Disk And
Execution MONitor") is usually spawned automatically at system startup and
spends most of its time lying dormant (lurking?) waiting for some event to occur.
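You can usually spot daemons in the process table; their names conventionally end in "d". Which daemons appear depends entirely on the machine:

```shell
ps -e                       # list every process on the system
ps -e | grep 'd$' || true   # names ending in "d" are usually daemons (there may be none in a minimal container)
```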

Application programs

Linux distributions typically come with several useful application programs as


standard. Examples include the emacs editor, xv (an image viewer), gcc (a C
compiler), g++ (a C++ compiler), xfig (a drawing package), latex (a powerful
typesetting language) and soffice (StarOffice, which is an MS-Office style clone
that can read and write Word, Excel and PowerPoint files).
Redhat Linux also comes with rpm, the Redhat Package Manager which makes it
easy to install and uninstall application programs.
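Typical rpm invocations look like the following sketch; the package name is invented, and installing or removing packages requires root privileges:

```shell
rpm -ivh example-1.0-1.i386.rpm   # install, verbose, with hash-mark progress
rpm -Uvh example-1.1-1.i386.rpm   # upgrade (installs if not already present)
rpm -qa | grep example            # query all installed packages for a name
rpm -e example                    # erase (uninstall) a package
```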

1.4. Properties of Linux

1.4.1. Linux Pros

A lot of the advantages of Linux are a consequence of Linux's origins, deeply rooted in UNIX, except for the first advantage, of course:

• Linux is free:

As in free beer, they say. If you want to spend absolutely nothing, you don't even
have to pay the price of a CD. Linux can be downloaded in its entirety from the
Internet completely for free. No registration fees, no costs per user, free updates,
and freely available source code in case you want to change the behavior of your
system.

• Most of all, Linux is free as in free speech:

The license commonly used is the GNU General Public License (GPL). The license says that
anybody who may want to do so, has the right to change Linux and eventually to
redistribute a changed version, on the one condition that the code is still available
after redistribution. In practice, you are free to grab a kernel image, for instance
to add support for teletransportation machines or time travel and sell your new
code, as long as your customers can still have a copy of that code.

• Linux is portable to any hardware platform:

A vendor who wants to sell a new type of computer and who doesn't know what kind of OS his new machine will run (say the CPU in your car or washing machine) can take a Linux kernel and make it work on his hardware, because documentation related to this activity is freely available.

• Linux was made to keep on running:

As with UNIX, a Linux system expects to run without rebooting all the time. That is why a lot of tasks are executed at night or scheduled automatically for other calm moments, resulting in higher availability during busier periods and a more balanced use of the hardware. This property allows Linux to be used also in environments where people don't have the time or the possibility to control their systems night and day.

• Linux is secure and versatile:

The security model used in Linux is based on the UNIX idea of security, which is known to be robust and of proven quality. But Linux is not only fit for use as a fort against enemy attacks from the Internet: it will adapt equally to other situations, utilizing the same high standards for security. Your development machine or control station will be as secure as your firewall.

• Linux is scalable:

From a Palmtop with 2 MB of memory to a petabyte storage cluster with
hundreds of nodes: add or remove the appropriate packages and Linux fits
all. You don't need a supercomputer anymore, because you can use Linux
to do big things using the building blocks provided with the system. If you
want to do little things, such as making an operating system for an
embedded processor or just recycling your old 486, Linux will do that as
well. 

• The Linux OS and most Linux applications have very short debug-times:

Because Linux has been developed and tested by thousands of people,
both errors and people to fix them are usually found rather quickly. It
sometimes happens that there are only a couple of hours between
discovery and fixing of a bug. 

1.4.2. Linux Cons

• There are far too many different distributions:
"Quot capites, tot rationes", as the Romans already said: the more people,
the more opinions. At first glance, the amount of Linux distributions can be
frightening, or ridiculous, depending on your point of view. But it also
means that everyone will find what he or she needs. You don't need to be
an expert to find a suitable release. 
When asked, generally every Linux user will say that the best distribution is
the specific version he is using. So which one should you choose? Don't
worry too much about that: all releases contain more or less the same set
of basic packages. On top of the basics, special third party software is
added making, for example, TurboLinux more suitable for the small and
medium enterprise, RedHat for servers and SuSE for workstations.
However, the differences are likely to be very superficial. The best strategy
is to test a couple of distributions; unfortunately not everybody has the
time for this. Luckily, there is plenty of advice on the subject of choosing
your Linux. A quick search on Google, using the keywords "choosing your
distribution" brings up tens of links to good advice. The Installation HOWTO
also discusses choosing your distribution.

• Linux is not very user-friendly and can be confusing for beginners:
It must be said that Linux, at least the core system, is less user-friendly to use than MS Windows and certainly more difficult than MacOS, but... In
light of its popularity, considerable effort has been made to make Linux
even easier to use, especially for new users. More information is being
released daily, such as this guide, to help fill the gap for documentation
available to users at all levels. 

• Is an Open Source product trustworthy?

How can something that is free also be reliable? Linux users have the choice
whether to use Linux or not, which gives them an enormous advantage
compared to users of proprietary software, who don't have that kind of
freedom. After long periods of testing, most Linux users come to the
conclusion that Linux is not only as good, but in many cases better and
faster than the traditional solutions. If Linux were not trustworthy, it would
have been long gone, never knowing the popularity it has now, with
millions of users. Now users can influence their systems and share their
remarks with the community, so the system gets better and better every
day. It is a project that is never finished, that is true, but in an ever changing
environment, Linux is also a project that continues to strive for perfection. 
1.5. Linux Flavors

1.5.1. Linux and GNU

Although there are a large number of Linux implementations, you will find a lot of
similarities in the different distributions, if only because every Linux machine is a
box with building blocks that you may put together following your own needs and
views.

Linux may appear different depending on the distribution, your hardware and
personal taste, but the fundamentals on which all graphical and other interfaces
are built, remain the same. The Linux system is based on GNU tools (Gnu's Not
UNIX), which provide a set of standard ways to handle and use the system. All
GNU tools are open source, so they can be installed on any system. Most
distributions offer pre-compiled packages of most common tools, such as RPM
packages on RedHat and Debian packages (also called deb or dpkg) on Debian, so
you needn't be a programmer to install a package on your system.

A list of common GNU software:

• Bash: the GNU shell
• GCC: the GNU C Compiler
• GDB: the GNU Debugger
• Coreutils: a set of basic UNIX-style utilities, such as ls, cat and chmod
• Findutils: to search and find files
• Fontutils: to convert fonts from one format to another or make new fonts
• The Gimp: the GNU Image Manipulation Program
• Gnome: the GNU desktop environment
• Emacs: a very powerful editor
• Ghostscript and Ghostview: interpreter and graphical frontend for PostScript files
• GNU Photo: software for interaction with digital cameras
• Octave: a programming language, primarily intended to perform numerical computations and image processing
• GNU SQL: a relational database system
• Radius: a remote authentication and accounting server
To install missing or new packages, you will need some form of software
management. The most common implementations include RPM and dpkg. RPM is the RedHat Package Manager, which is used on a variety of Linux systems, even though the name does not suggest this. Dpkg is the Debian package management system, which uses an interface called apt-get that can manage RPM packages as well. Other third party software vendors may have their own installation procedures, sometimes resembling InstallShield and the like, as known on MS Windows and other platforms.
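On a Debian-style system the corresponding commands look roughly like this; the package names are examples, and installing requires root:

```shell
dpkg -l | grep bash    # list installed packages whose entry mentions "bash"
dpkg -L bash           # list the files that the bash package installed
apt-get update         # refresh the package index (as root)
apt-get install lynx   # fetch and install a package plus its dependencies (as root)
```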

1.5.2. GNU/Linux

The Linux kernel is not part of the GNU project but uses the same license as GNU
software. A great majority of utilities and development tools (the meat of your
system), which are not Linux-specific, are taken from the GNU project. Because
any usable system must contain both the kernel and at least a minimal set of utilities, some people argue that such a system should be called a GNU/Linux system.

In order to obtain the highest possible degree of independence between distributions, this is the sort of Linux that we will discuss throughout this course. If we are not talking about a GNU/Linux system, the specific distribution, version or program name will be mentioned.

The following are very good choices for novices:

• Fedora Core
• Debian
• SuSE Linux
• Mandriva (formerly MandrakeSoft)
• Knoppix: an operating system that runs from your CD-ROM; you don't need to install anything.

Boot process, Init and shutdown

Introduction

A lot of Linux systems use LILO, the LInux LOader, for booting operating systems. GRUB is easier to use and more flexible.
The boot process

When an x86 computer is booted, the processor looks at the end of the system
memory for the BIOS (Basic Input/Output System) and runs it. The BIOS program
is written into permanent read-only memory and is always available for use. The
BIOS provides the lowest level interface to peripheral devices and controls the
first step of the boot process.

The BIOS tests the system, looks for and checks peripherals, and then looks for a
drive to use to boot the system. Usually it checks the floppy drive (or CD-ROM
drive on many newer systems) for bootable media, if present, and then it looks to
the hard drive. The order of the drives used for booting is usually controlled by a
particular BIOS setting on the system. Once Linux is installed on the hard drive of
a system, the BIOS looks for a Master Boot Record (MBR) starting at the first
sector on the first hard drive, loads its contents into memory, then passes control
to it.

This MBR contains instructions on how to load the GRUB (or LILO) boot-loader,
using a pre-selected operating system. The MBR then loads the boot-loader,
which takes over the process (if the boot-loader is installed in the MBR). In the
default Red Hat Linux configuration, GRUB uses the settings in the MBR to display
boot options in a menu. Once GRUB has received the correct instructions for the
operating system to start, either from its command line or configuration file, it
finds the necessary boot file and hands off control of the machine to that
operating system.

4.2.3. GRUB features

This boot method is called direct loading because instructions are used to directly
load the operating system, with no intermediary code between the boot-loaders
and the operating system's main files (such as the kernel). The boot process used
by other operating systems may differ slightly from the above, however. For
example, Microsoft's DOS and Windows operating systems completely overwrite
anything on the MBR when they are installed without incorporating any of the
current MBR's configuration. This destroys any other information stored in the
MBR by other operating systems, such as Linux. The Microsoft operating systems,
as well as various other proprietary operating systems, are loaded using a chain
loading boot method. With this method, the MBR points to the first sector of the
partition holding the operating system, where it finds the special files necessary
to actually boot that operating system.

GRUB supports both boot methods, allowing you to use it with almost any
operating system, most popular file systems, and almost any hard disk your BIOS
can recognize.

GRUB contains a number of other features; the most important include:

• GRUB provides a true command-based, pre-OS environment on x86 machines to allow maximum flexibility in loading operating systems with certain options or gathering information about the system.
• GRUB supports Logical Block Addressing (LBA) mode, needed to access many IDE and all SCSI hard disks. Before LBA, hard drives could encounter a 1024-cylinder limit, where the BIOS could not find a file after that point.
• GRUB's configuration file is read from the disk every time the system boots, preventing you from having to write over the MBR every time you change the boot options.
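For illustration, a GRUB (legacy) configuration file, typically /boot/grub/grub.conf on Red Hat systems, might contain stanzas like the following; the device names, kernel version and labels are invented, and the two entries show direct loading (Linux) and chain loading (DOS):

```
# /boot/grub/grub.conf -- illustrative sketch only
default=0       # boot the first "title" below unless the user chooses otherwise
timeout=10      # seconds to show the menu before booting the default

title Red Hat Linux
    root (hd0,0)                              # first partition of the first disk
    kernel /vmlinuz-2.4.18-14 ro root=/dev/hda2
    initrd /initrd-2.4.18-14.img

title DOS
    rootnoverify (hd0,1)    # chain loading: don't try to mount the partition
    chainloader +1          # hand control to that partition's own boot sector
```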

4.2.4. Init

The kernel, once it is loaded, finds init in /sbin and executes it.

When init starts, it becomes the parent or grandparent of all of the processes that
start up automatically on your Linux system. The first thing init does is read its initialization file, /etc/inittab. This instructs init to read an initial configuration
script for the environment, which sets the path, starts swapping, checks the file
systems, and so on. Basically, this step takes care of everything that your system
needs to have done at system initialization: setting the clock, initializing serial
ports and so forth.

Then init continues to read the /etc/inittab file, which describes how the system
should be set up in each run level and sets the default run level. A run level is a
configuration of processes. All UNIX-like systems can be run in different process
configurations, such as the single user mode, which is referred to as run level 1 or
run level S (or s). In this mode, only the system administrator can connect to the
system. It is used to perform maintenance tasks without risks of damaging the
system or user data. Naturally, in this configuration we don't need to offer user
services, so they will all be disabled. Another run level is the reboot run level, or
run level 6, which shuts down all running services according to the appropriate
procedures and then restarts the system.
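On a running SysV-style system you can inspect the current run level and, as root, switch to another one; a brief sketch:

```shell
runlevel      # prints the previous and current run level, e.g. "N 3"
telinit 1     # as root: drop to single user mode for maintenance
telinit 5     # as root: go to full multiuser mode with X11
telinit 6     # as root: reboot
```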

Init run levels

The idea behind operating different services at different run levels essentially
revolves around the fact that different systems can be used in different ways.
Some services cannot be used until the system is in a particular state, or mode,
such as being ready for more than one user or having networking available.

Available run levels are generally described in /etc/inittab, which is partially shown below:

#
# inittab This file describes how the INIT process should set up
# the system in a certain run-level.

# Default run level. The run levels are:


# 0 - halt (Do NOT set initdefault to this)
# 1 - Single user mode
# 2 - Multiuser, without NFS
# (The same as 3, if you do not have networking)
# 3 - Full multiuser mode
# 4 - unused
# 5 - X11
# 6 - reboot (Do NOT set initdefault to this)
#
id:5:initdefault:
<--cut-->

Feel free to configure unused run levels (commonly run level 4) as you see fit.
Many users configure those run levels in a way that makes the most sense for
them while leaving the standard run levels as they are by default. This allows
them to quickly move in and out of their custom configuration without disturbing
the normal set of features at the standard run levels.

Tools

The chkconfig or update-rc.d utilities, when installed on your system, provide a simple command-line tool for maintaining the /etc/init.d directory hierarchy.
These relieve system administrators from having to directly manipulate the
numerous symbolic links in the directories under /etc/rc[x].d.
In addition, some systems offer the ntsysv tool, which provides a text-based
interface; you may find this easier to use than chkconfig's command-line
interface. On SuSE Linux, you will find the yast and insserv tools. For Mandrake
easy configuration, you may want to try DrakConf, which allows among other
features switching between run levels 3 and 5. In Mandriva this became the
Mandriva Linux Control Center.
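Typical chkconfig usage on a Red Hat-style system looks like this; crond and apache2 are just example service names, and changes require root:

```shell
chkconfig --list crond           # show whether crond starts in each run level
chkconfig --level 35 crond on    # start crond automatically in run levels 3 and 5
chkconfig crond off              # disable it in all run levels
update-rc.d apache2 defaults     # Debian-style equivalent: install standard start/stop links
```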

Shutdown

UNIX was not made to be shut down, but if you really must, use the shutdown
command. After completing the shutdown procedure, the -h option will halt the
system, while -r will reboot it.

The reboot and halt commands are now able to invoke shutdown if run when the system is in run levels 1-5, and thus ensure proper shutdown of the system, but it is a bad habit to get into, as not all UNIX/Linux versions have this feature.

If your computer does not power itself down, you should not turn off the
computer until you see a message indicating that the system is halted or finished
shutting down, in order to give the system the time to unmount all partitions.
Being impatient may cause data loss.
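Common shutdown invocations, all of which require root:

```shell
shutdown -h now                          # halt the system immediately
shutdown -r now                          # reboot immediately
shutdown -h +10 "Down for maintenance"   # halt in ten minutes, warning logged-in users
shutdown -c                              # cancel a pending shutdown
```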

Boot sequence summary

• BIOS
• Master Boot Record (MBR)
• Kernel
• init

BIOS

Load boot sector from one of:

• Floppy
• CDROM
• SCSI drive
• IDE drive
Master Boot Record

MBR (loaded from /dev/hda or /dev/sda) contains:

• lilo
  o load kernel (image=), or
  o load partition boot sector (other=)
• DOS
  o load "bootable" partition boot sector (set with fdisk)

The partition boot sector (e.g. /dev/hda2) contains:

• DOS
  o loadlin
• lilo
  o kernel

LILO

One minute guide to installing a new kernel:

• edit /etc/lilo.conf
  o duplicate the image= section, e.g.:
    image=/bzImage-2.2.12
    label=12
    read-only
  o see man lilo.conf for details
• run /sbin/lilo
• (copy modules)
• reboot to test

Kernel

- initialises devices
- (optionally loads initrd, see below)
- mounts the root FS
  - specified by lilo or loadlin
  - kernel prints: VFS: Mounted root (ext2 filesystem) readonly.
- runs /sbin/init, PID 1
  - can be changed with the init= boot parameter
  - init prints: INIT: version 2.76 booting

initrd

Allows setup to be performed before the root FS is mounted:

- lilo or loadlin loads the ram disk image
- the kernel runs /linuxrc
  - loads modules
  - initialises devices
  - /linuxrc exits
- the "real" root is mounted
- the kernel runs /sbin/init

Details in /usr/src/linux/Documentation/initrd.txt

/sbin/init

- reads /etc/inittab
- runs the script defined by this line:
  - si::sysinit:/etc/init.d/rcS
- switches to the runlevel defined by:
  - id:3:initdefault:

sysinit

- debian: /etc/init.d/rcS, which runs
  - /etc/rcS.d/S* scripts (symlinks to /etc/init.d/*)
  - /etc/rc.boot/* (deprecated)
- redhat: /etc/rc.d/rc.sysinit script, which
  - loads modules
  - checks the root FS and mounts it read-write
  - mounts local filesystems
  - sets up the network
  - mounts remote filesystems

Run Levels

- 0: halt
- 1: single user
- 2-4: user defined
- 5: X11
- 6: reboot
- Default in /etc/inittab, e.g.
  - id:3:initdefault:
- Change using /sbin/telinit

Run Level programs

- Run programs for the specified run level
- /etc/inittab lines:
  - 1:2345:respawn:/sbin/getty 9600 tty1
    - Always running in runlevels 2, 3, 4, or 5
    - Displays login on console (tty1)
  - 2:234:respawn:/sbin/getty 9600 tty2
    - Always running in runlevels 2, 3, or 4
    - Displays login on console (tty2)
  - l3:3:wait:/etc/init.d/rc 3
    - Run once when switching to runlevel 3
    - Uses scripts stored in /etc/rc3.d/
  - ca:12345:ctrlaltdel:/sbin/shutdown -t1 -a -r now
    - Run when control-alt-delete is pressed

Boot Summary

- lilo
  - /etc/lilo.conf
- debian runs
  - /etc/rcS.d/S* and /etc/rc.boot/* scripts
  - /etc/rc3.d/S* scripts
- redhat runs
  - /etc/rc.d/rc.sysinit
  - /etc/rc.d/rc3.d/S* scripts

Software Installation

Introduction

Applications shall either be packaged in the RPM packaging format as defined in
this specification, or supply an installer which is LSB conforming (for example,
calls LSB commands and utilities).

Note: Supplying an RPM format package is encouraged because it makes systems
easier to manage. This specification does not require the implementation to use
RPM as the package manager; it only specifies the format of the package file.

Applications are also encouraged to uninstall cleanly.

A package in RPM format may include a dependency on the LSB Core and other
LSB specifications. Packages that are not in RPM format may test for the presence
of a conforming implementation by means of the lsb_release utility.

Implementations shall provide a mechanism for installing applications in this
packaging format, with some restrictions listed below.

Note: The implementation itself may use a different packaging format for its own
packages, and of course it may use any available mechanism for installing the LSB-
conformant packages.

Package File Format

An RPM format file consists of 4 sections: the Lead, Signature, Header, and the
Payload. All values are stored in network byte order.

RPM File Format

- Lead
- Signature
- Header
- Payload

These 4 sections shall exist in the order specified.

The lead section is used to identify the package file.

The signature section is used to verify the integrity, and optionally, the
authenticity of the majority of the package file.

The header section contains all available information about the package. Entries
such as the package's name, version, and file list, are contained in the header.

The payload section holds the files to be installed.

UNIX COMMANDS

Starting A UNIX Session – Logging In

A user of a Unix based system works at a user terminal. After the boot procedure
is completed, that is, the Operating System is loaded in memory, the following
message appears at each user's terminal:

Login:

Each user has an identification called the user name, which has to be entered
when the login: message appears.

The user is then asked to enter password.

Unix keeps track of all user names and identity information in a special file. If
the login name entered does not match any of these user names, it displays the
login message again. This ensures that only authorized people use the system.

When a valid user name is entered at the terminal, the '$' symbol is displayed on
the screen. This is the Unix prompt.

When a user logs in, he/she is taken to his/her home directory.


Ending a Unix Session – Logging Out

Once the user has logged into the system, the user’s work session continues until
the user instructs the shell to terminate the session.

This is done by:

- pressing the Ctrl+D keys together, OR
- typing exit at the $ prompt, OR
- typing logout (this may fail in some shells)

The system then displays the login: prompt on the screen.

In order to maintain security of the files, the user should NEVER leave the
terminal without logging out.

Shell as a Command Interpreter:

The Shell is an intermediary program which interprets the commands that are
typed at the terminal and translates them into commands that the kernel
understands. The shell thus acts as a blanket around the kernel and eliminates the
need for the programmer to communicate directly with the kernel.

A unique feature of UNIX is that all UNIX commands exist as utility programs.
These programs are located in individual files in one of the system directories
such as /bin, /etc or /usr/bin. The shell can be considered a master utility
program which enables a user to gain access to all the other utilities and
resources of the computer.

E.g. when, after logging in, the user types ls <directory name>, the shell:

1. Reads the command
2. Searches for and locates the file with that name in the directories
   containing commands
3. Loads it into memory
4. Executes it

After the execution is completed, the shell once again displays the '$' prompt,
conveying that it is ready for another command. If the shell cannot locate the
file corresponding to the command, it issues an error message and then displays
the '$' sign.
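The shell's search for a command's file can be observed directly. As a small illustration (the path printed will vary from system to system):

```shell
# Ask the shell where it finds the ls command (the path shown varies by system)
type ls
# which searches the directories listed in the PATH variable
which ls
```

If you instead type a name the shell cannot locate, such as a misspelled command, you get the error message described above.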

Types of Shells:

The shell runs like any other program under the UNIX system. Hence, one shell
program can replace the other, or call another shell program. Due to this feature,
a number of shells have been developed in response to different needs of the
user. Some of the popular shells are:

Bourne Shell:

This is the original command processor, developed at AT&T
and named after its developer, Stephen R. Bourne. This is the official shell that is
distributed with UNIX systems. The Bourne shell is the fastest UNIX command
processor available and can be used on all UNIX systems. The Bourne shell is the
most widely used shell at present. The executable file name is sh.

C Shell:

This is another command processor, developed by William Joy and
others at the University of California at Berkeley, and it gets its name from its
programming language, which resembles the C programming language in syntax.
The executable file name is csh.

Korn Shell:

Developed by David Korn, this combines the best features of both the
above shells. Although very few systems currently offer this shell, it is slowly
gaining popularity. The executable filename is ksh.
Restricted Shell:

This is the restricted version of the Bourne shell. It is typically used for
guest logins- users who are not part of the system and in secure installations
where users must be restricted to work only in their own limited environments.
The executable file name is rsh.

Directory Commands:

ALL UNIX COMMANDS HAVE TO BE ENTERED IN LOWER CASE.

The HOME Variable:

When you log in to the system, UNIX automatically places you in a directory called
the home directory. It is created by the system when a user account is opened.
The home directory can be changed as well.

To know the home directory, the variable used is HOME.

$ echo $HOME

It will display the absolute pathname, which is simply a sequence of directory
names separated by slashes. It shows the file's location with reference to the top,
i.e. root. This pathname is stored in the file /etc/passwd.

To identify current directory:

The pwd (print working directory) command is used to display the path name of
the current directory.

$ pwd

To Change the current directory

The cd (change directory) command is used to change the current directory to the
directory specified.

$ cd <directory_name>

$ cd /user
/user – the name of the directory you want to change to

To go to the parent directory:

The two dots are used with the cd command to go to the parent directory.

$ cd ..

Note: The cd command without any pathname takes the user back to the HOME
Directory.
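A short session tying pwd and cd together (the directories are standard ones, but the output of the final pwd depends on your own home directory):

```shell
cd /usr && pwd     # change to /usr; pwd confirms the move
cd .. && pwd       # .. moves to the parent directory, here the root /
cd && pwd          # cd without a pathname returns to the HOME directory
```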

To create a directory:

The mkdir (make directory) command is used to create directories.

$ mkdir <dir_name>

To remove a directory:

The rmdir (remove directory) command removes the directory specified.

$ rmdir <dir_name>

A directory can be deleted only if it is:

- empty (does not contain files or subdirectories)
- not the current directory

Listing the contents of a directory:

The ls command is used to display the names of the files and subdirectories in a
directory.

$ ls <directory_name>
ls Options:

ls has a very large number of options. A few are:

- -x (multiple columns): Produces multicolumnar output. It is convenient for
  viewing the filenames.
- -F (identifying directories and executables): To identify which of the files are
  directories and which are executable files, the -F option is used.
- -Fx (multicolumnar output along with directories and executables): In the
  output, two symbols will be found tagging the filenames: the *, which
  indicates that a file contains executable code, and the / symbol, which
  indicates a directory.
- -a (to show hidden files): Normally the ls command doesn't display hidden
  files. With -a it displays all hidden files too, all beginning with a dot (.)
- .profile: contains a set of instructions that are performed when a user logs in;
  conceptually similar to the AUTOEXEC.BAT file of DOS
- .exrc: contains a sequence of startup instructions for the vi editor

Option      Description

-x          Multicolumnar output
-F          Marks executables with *, directories with / and
            symbolic links with @
-a          Shows all filenames beginning with a dot, including . and ..
-R          Recursive list
-r          Sorts filenames in reverse order
-l          Long listing in ASCII collating sequence showing seven
            attributes of a file
-1          One filename on each line
-d dirname  Lists only dirname if dirname is a directory
-t          Sorts filenames by last modification time
-lt         Sorts listing by last modification time
-u          Sorts filenames by last access time
-lu         Sorts by ASCII collating sequence, but the listing shows last
            access time
-lut        As above, but sorted by access time
-i          Displays the inode number
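The effect of -a is easy to demonstrate by creating a hidden file and listing the directory with and without the option (the directory and filenames here are invented for the demonstration):

```shell
mkdir lsdemo && cd lsdemo
touch visible .hidden    # one ordinary file and one hidden file
ls                       # shows only: visible
ls -a                    # also shows ., .. and .hidden
cd .. && rm -r lsdemo    # clean up
```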
Listing Directory Contents (-x):

$ ls -x firstfile secondfile

Here firstfile and secondfile are two directories, and the above command will
display the contents of both directories.

Recursive Listing (-R):

It lists all files and subdirectories in a directory tree. Similar to the DIR /S
command of DOS, this traversal of the directory tree is done recursively till there
are no subdirectories left.

$ ls -xR

Few File Commands

Displaying Contents Of A File:

cat is used to display the contents of a file. cat also accepts more than one
filename as arguments:

e.g. cat file1 file2

Here file1 and file2 are two files, and their contents are displayed one after the
other, sequentially.

Options for the cat command:

To display nonprinting characters (-v): cat is normally used for displaying text
files only. Executables, when seen with cat, simply display junk. To display the
nonprinting ASCII characters in the input, this option is used.

Numbering lines (-n): When debugging a program, displaying lines along with
their line numbers makes error debugging easy. Note that the vi editor can
already display line numbers; the pr command can also be used to number lines.
To Create a File:

The cat command is also used to create a file. To do this, enter the command cat,
followed by the > (right chevron) character and the filename:

e.g. cat > foo

It means that the output goes to the filename following the > symbol. After
writing the contents, use Ctrl+d to close the file.

Ctrl+d signifies the end of the input to the system. This is the eof character used
by UNIX systems.

cat is a versatile command. It can be used to create, display, concatenate and
append to files. More important, it doesn't restrict itself to handling files only; it
also acts on a stream. You can supply the input to cat not only by specifying a
filename, but also from the output of another command.
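The following commands illustrate these uses of cat; the filename notes.txt is just an example created for the demonstration:

```shell
printf 'first line\nsecond line\n' > notes.txt   # create a two-line file
cat notes.txt       # display its contents
cat -n notes.txt    # display with line numbers
ls | cat            # cat reading a stream piped from another command
rm notes.txt        # clean up
```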

Copying a File(cp):

The cp (copy) command copies a file or a group of files. It creates an exact image
of the file on disk with a different name. The syntax requires at least two
filenames to be specified in the command line. When both are ordinary files, the
first is copied to the second.

cp chap01 unit1

If the destination file doesn't exist, it will first be created before copying takes place.

A few examples:

cp chap01 unit1          contents of the first file are copied to the second
cp chap01 progs/unit1    chap01 copied to unit1 under progs
cp chap01 progs          chap01 retains its name under progs


The dot (.) signifies the current directory as the destination. To copy the file
.profile from /home/sharma to your current directory, you can use either of the
two commands:

cp /home/sharma/.profile .profile    Destination is a file
cp /home/sharma/.profile .           Destination is the current directory

cp can also be used to copy more than one file with a single invocation of the
command, but in this case the last filename must be a directory.

To copy many files to the one directory i.e. progs

cp chap01 chap02 chap03 progs

cp chap* progs

cp Options

Interactive copying (-i)

The -i (interactive) option warns the user before overwriting the destination
file. If unit1 exists, cp prompts for a response:

$ cp -i chap01 unit1

Copying Directory Structure (-R)

To copy an entire directory structure, the -R option is used. The following
command copies all files and subdirectories:

cp -R progs newprogs

If the newprogs directory doesn't exist, the cp command creates it along with its
subdirectories; if it exists, progs becomes a subdirectory under newprogs.
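A sketch of -R in action, using made-up directory and file names:

```shell
mkdir -p progs/sub                   # build a small tree to copy
touch progs/chap01 progs/sub/chap02
cp -R progs newprogs                 # newprogs did not exist, so cp creates it
ls newprogs/sub                      # the subdirectory and its file came across
rm -r progs newprogs                 # clean up
```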
Deleting Files(rm):

Files can be deleted using the rm command. It can delete more than one file with
a single invocation:

rm chap01 chap02 chap03

A file once deleted can't be recovered.

rm progs/chap01 progs/chap02

rm *     removes all files in the current directory

rm Options

Interactive deletion (-i): This option makes the command ask the user
for confirmation before removing each file.

$ rm -i chap01 chap02 chap03

Recursive deletion (-r or -R): It performs a tree walk - a thorough recursive
search for all subdirectories and files within these subdirectories.

$ rm -r *

Forcing removal (-f): rm will prompt you before removing a file that is write-protected.
The -f option overrides this minor protection; it forces removal even if
the files are write-protected. When you combine the -r option with it, it could
be the most risky thing to do:

$ rm -rf *

Renaming Files(mv) :

To rename a file, the mv command is used. It has two functions:

- It renames a file (or directory).
- It moves a group of files to a different directory.

mv doesn't create a copy of a file; it merely renames it.

$ mv chap01 man01

If a file man01 already exists, its contents are overwritten; otherwise chap01 is
simply renamed to man01.

$ mv chap01 chap02 chap03 progs

Moves all the files to the progs directory.

$ mv pis perdir

It will rename a directory from pis to perdir.

mv Options

Interactive move (-i): Makes the command ask for user confirmation before
overwriting the destination file.

$ mv -i chap01 man01

Paging Output (more):

The man command displays its output a page at a time, as it sends its output to a
pager program. Unix offers the more pager; Linux offers more too, but less is its
standard pager.

To view a file one page at a time, the more command is used.

more chap01

q is used to exit.

To Count Words, Lines, Characters:

UNIX has a universal word counting program. The command used for it is wc,
which can count words, lines and characters, all depending upon the options
used. It takes one or more filenames as its arguments and displays a
four-columnar output.

$ wc file1
Options:

-l : To count lines
-w : To count words
-c : To count characters

Multiple Filenames:

When used with multiple filenames, wc produces a line for each file as well as a
total count.

$ wc chap01 chap02 chap03
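The options are easy to verify against a file whose contents are known exactly (sample.txt is created here just for the demonstration):

```shell
printf 'one two\nthree\n' > sample.txt   # 2 lines, 3 words, 14 characters
wc -l sample.txt    # line count: 2
wc -w sample.txt    # word count: 3
wc -c sample.txt    # character (byte) count: 14
rm sample.txt       # clean up
```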

Displaying Data In Octal (od):

The od command is used to display the nonprinting characters contained in a
file. It makes this possible by displaying the ASCII octal value of its input. The
-b option is used to display this value for each character.

$ od -b file1

The -b and -c (character) options are used together to display both the character
and octal output:

$ od -bc file1

Comparing two files (cmp):

To compare two files, the cmp command is used. It compares the files byte by
byte, and the location of the first mismatch is echoed on the screen.

$ cmp file1 file2

Option used:

-l (list): This option is used to give a detailed list of the byte numbers and the
differing bytes in octal for each character that differs in the two files.

$ cmp -l file1 file2
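With two small files differing at a known position, cmp's behaviour can be seen directly (f1 and f2 are throwaway names made up for this run):

```shell
printf 'abcd\n' > f1
printf 'abXd\n' > f2    # differs from f1 at the third byte
cmp f1 f2               # reports the first mismatch: byte 3, line 1
cmp -l f1 f2            # lists each differing byte in octal
rm f1 f2                # clean up
```

cmp also sets its exit status: 0 when the files are identical, 1 when they differ.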

To Know What Is Common (comm):

To display the lines common to two files, the comm command is used. Both files
must be sorted. When we run comm, it displays a three-columnar output.

The first column contains lines unique to the first file, the second column shows
lines unique to the second file, and the third column contains lines common to
both files.

$ comm file[12]          comparing file1 and file2
$ comm -3 file1 file2
$ comm -13 file1 file2
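Because comm expects sorted input, its three columns are easiest to see with two small pre-sorted files (the names and contents are invented for the demonstration):

```shell
printf 'apple\nbanana\n' > f1     # already sorted
printf 'banana\ncherry\n' > f2    # already sorted
comm f1 f2       # column 1: only in f1, column 2: only in f2, column 3: in both
comm -13 f1 f2   # suppress columns 1 and 3: lines unique to f2
rm f1 f2         # clean up
```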

Converting One File To Another (diff):

diff is used to display file differences. It also tells you which lines in one file
have to be changed to make the two files identical.

$ diff file1 file2

or

$ diff file[12]

diff uses certain special symbols and instructions to indicate the changes that are
required to make two files identical. These instructions are used by the sed
command, one of the most powerful commands on the system.

0a1,2 means append two lines after line 0
2c4 means change line 2, which is line 4 in the second file
4d5 means delete line 4
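The instruction codes are easiest to read from a concrete run (f1 and f2 are sample files created just for this):

```shell
printf 'apple\nbanana\n' > f1
printf 'apple\ncherry\n' > f2
diff f1 f2     # 2c2: change line 2 of f1 into line 2 of f2
# → 2c2
# → < banana
# → ---
# → > cherry
rm f1 f2       # clean up
```

Lines prefixed with < come from the first file and lines prefixed with > from the second.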


COMPRESSING AND ARCHIVING FILES:

To conserve disk space you may need to compress large and infrequently used
files. Unix provides a few compression and decompression utilities:

- compress and uncompress (.Z)
- gzip and gunzip (.gz)
- bzip2 and bunzip2 (.bz2)
- zip and unzip (.zip)

E.g.:

$ wc -c libc.html
$ gzip libc.html
$ wc -c libc.html.gz

$ wc -c User_Guide.ps; gzip User_Guide.ps; wc -c User_Guide.ps.gz

$ gzip -l libc.html.gz User_Guide.ps.gz

gzip Options:

-d: To restore the original uncompressed file, you have two options:

- gzip -d
- gunzip

$ gunzip libc.html.gz
$ gzip -d libc.html
$ gunzip libc.html.gz User_Guide.ps.gz

-r (recursive compression): It compresses all files found in subdirectories. This
option is used for decompression also.

$ gzip -r progs
$ gunzip -r progs
Or
$ gzip -dr progs

-c (writing to the terminal): When used with gzip, this option doesn't create a
compressed file, but sends the compressed output to the terminal.

The same option used with gunzip shows the decompressed contents of
a compressed file on the terminal. The original file remains unchanged.
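A round trip with gzip and gunzip on a throwaway file shows the renaming convention (.gz is appended on compression and removed on restoration):

```shell
printf 'hello gzip\n' > note.txt
gzip note.txt         # note.txt is replaced by note.txt.gz
ls note.txt.gz
gunzip note.txt.gz    # note.txt.gz is replaced by the original note.txt
cat note.txt
rm note.txt           # clean up
```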

The Archival Program (tar):

tar is used to create a disk archive that contains a group of files or an entire
directory structure. Options used with the tar command are:

-c        Create an archive
-x        Extract files from an archive
-t        Display files in an archive
-f arch   Use arch as the name of the archive

e.g.:

$ tar -cvf archive.tar libc.html Us_guide.ps

-v (verbose) displays the progress while tar works.
-f specifies the name of the archive file.

To compress an archived tar file, gzip can be used:

$ gzip archive.tar

To extract files:

$ gunzip archive.tar.gz
$ tar -xvf archive.tar

To view the archive:

$ tar -tvf archive.tar
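The three basic tar operations (-c, -t, -x) can be tried end to end on a small made-up directory:

```shell
mkdir -p demo
printf 'some data\n' > demo/a.txt
tar -cvf demo.tar demo    # create an archive of the demo directory
tar -tvf demo.tar         # list the archive's contents
rm -r demo                # delete the original tree...
tar -xvf demo.tar         # ...and restore it from the archive
cat demo/a.txt
rm -r demo demo.tar       # clean up
```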
Compressing and archiving together (zip and unzip):

zip doesn't compress as much as gzip, but it combines the compressing function
of gzip with the archival function of tar. Instead of using two commands you can
use just one, i.e. zip.

$ zip archive.zip libc.html User_Guide.ps

This command never overwrites an existing archive; it updates it or appends the
information.

Recursive compression (-r): It descends the tree structure, except that it also
compresses files.
$ zip -r sumit_home.zip

Using unzip, files can be restored. During restoration, if a file already exists,
unzip asks for the user's permission.

Viewing the archive (-v):

To view the compressed archive, the -v option is used. The list shows both the
compressed and uncompressed size of each file in the archive, along with the
percentage of compression achieved.

$ unzip -v archive.zip
UNIT – II

Files: File Concept, File System Structure, Inodes, File Attributes, File types, The
Linux File System: Basic Princples, Pathnames, Mounting and Unmounting File
Systems, Different File Types, File Permissions, Disk Usage Limits, Directory
Structure Standard Input and Output, Redirecting input and Output, Using Pipes
to connect processes, tee command, Linux File Security, permission types,
examining permissions, changing permissions(symbolic method numeric
method),default permissions and umask Vi editor basics, three modes of vi
editor,concept of inodes,inodes and directories,cp and inodes ,mv and inodes rm
and inodes,symbolic links and hard links,mount and umount command, creating
archives, tar,gzip,gunzip,bzip2,bunzip2(basic usage of these commands )

2.1 File
A file is a collection of information, which can be data, an application, or
documents; in fact, under UNIX, a file can contain anything. When a file is created,
UNIX assigns the file a unique internal number (called an inode number).

Unlike old DOS files, a UNIX file doesn't contain an eof (end of file) mark. A
file's size is not stored in the file itself; it is kept, along with its other
attributes, in a separate area of the hard disk. This area is not directly
accessible to humans, only to the kernel.

UNIX treats directories and devices as files as well. A directory is simply a folder
where you store files and other directories. All physical devices like the hard disk,
memory, CD-ROM, printer and modem are also treated as files, and even the
shell, the kernel and the main memory of the system are treated as files.

2.2 The UNIX Filesystem

The UNIX operating system is built around the concept of a filesystem which is
used to store all of the information that constitutes the long-term state of the
system. This state includes the operating system kernel itself, the executable files
for the commands supported by the operating system, configuration information,
temporary workfiles, user data, and various special files that are used to give
controlled access to system hardware and operating system functions.
Every item stored in a UNIX filesystem belongs to one of four types:

Ordinary files

Ordinary files can contain text, data, or program information. Files cannot contain
other files or directories. Unlike other operating systems, UNIX filenames are not
broken into a name part and an extension part (although extensions are still
frequently used as a means to classify files). Instead they can contain any
keyboard character except for '/' and be up to 256 characters long (note however
that characters such as *,?,# and & have special meaning in most shells and
should not therefore be used in filenames). Putting spaces in filenames also
makes them difficult to manipulate - rather use the underscore '_'.
This is the most common file type, also known as a regular file.

It contains only data as a stream of characters. All programs you write belong to
this file type.

Ordinary files are of two types:

- Text file
- Binary file

Text file: contains only printable characters, and you can often view the contents
and make sense of them.

All Java and C program sources, and shell and Perl scripts, are text files.

A text file contains lines of characters where every line is terminated by the
newline character (Line Feed, LF; viewable with the od command).

Binary File:

Contains both printable and unprintable characters that cover the entire ASCII
range.
Most UNIX commands are binary files. Object code and executables produced by
compiling a program are also binary files.

Picture, sound and video files are also binary files.

Using the cat command on binary files gives unreadable output and may even
disturb the terminal settings.

Directories

Directories are containers or folders that hold files and other directories.

A directory contains no data of its own. It keeps details of the files and
subdirectories it contains.

The UNIX file system is organized with a number of directories and
subdirectories, and you can create them as and when you need them.

It is often required to group a set of files related to a specific application. This
also makes it possible to have the same file names under different directories.

A directory file contains the entry for every file and subdirectory that it houses.
Each entry has two components:

- The filename
- A unique identification for the file or directory (called the inode number)

A directory contains the filename, not the file's content.

You cannot write to a directory file directly, but you can perform actions that
make the kernel write to a directory. E.g. when you create or remove a file, the
kernel automatically updates the corresponding directory by adding or removing
the entry (inode number and filename) associated with the file.

NOTE: The name of a file can only be found in its directory; the file itself doesn’t
contain its own name or any other attributes, like its size or access rights.
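The inode number behind a directory entry can be seen with the -i option of ls (the filename here is an example, and the number printed will differ on every system):

```shell
touch myfile
ls -i myfile    # prints the inode number followed by the filename
rm myfile       # clean up
```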
Devices

To provide applications with easy access to hardware devices, UNIX allows them
to be used in much the same way as ordinary files. There are two types of devices
in UNIX - block-oriented devices which transfer data in blocks (e.g. hard disks) and
character-oriented devices that transfer data on a byte-by-byte basis (e.g.
modems and dumb terminals).
A device file is a special file which does not contain a stream of characters; in
fact, it doesn't contain anything at all. Every file has some attributes that are not
stored in the file but elsewhere on the disk. It is the attributes of a device file
that entirely govern the operation of the device. The kernel identifies a device
from its attributes and then uses them to operate the device.

You usually print a file, install software from CD-ROM, or back up a file to tape;
all these activities are performed by reading or writing the file representing the
device. E.g. you print a file by writing to the file representing the printer. When
you restore files from tape, you read the file associated with the tape drive.

The kernel takes care of this "reflection" by mapping these files to their
respective devices.

It is advantageous to treat devices as files, as some of the commands used to
access ordinary files also work with device files.

Device filenames are generally found inside a single directory structure, /dev.

Links
A link is a pointer to another file. There are two types of links - a hard link to a file
is indistinguishable from the file itself. A soft link (or symbolic link) provides an
indirect pointer or shortcut to a file. A soft link is implemented as a directory file
entry containing a pathname.
2.3 Typical UNIX Directory Structure
The UNIX filesystem is laid out as a hierarchical tree structure which is anchored
at a special top-level directory known as the root (designated by a slash '/').
Because of the tree structure, a directory can have many child directories, but
only one parent directory. Fig. 2.1 illustrates this layout.

Fig. 2.1: Part of a typical UNIX filesystem tree

To specify a location in the directory hierarchy, we must specify a path through


the tree. The path to a location can be defined by an absolute path from the root
/, or as a relative path from the current working directory. To specify a path, each
directory along the route from the source to the destination must be included in
the path, with each directory in the sequence being separated by a slash. To help
with the specification of relative paths, UNIX provides the shorthand "." for the
current directory and ".." for the parent directory. For example, the absolute path
to the directory "play" is /home/will/play, while the relative path to this directory
from "zeb" is ../will/play.

Fig. 2.2 shows some typical directories you will find on UNIX systems and briefly
describes their contents. Note that although these subdirectories appear as
part of a seamless logical filesystem, they do not need to be present on the same
hard disk device; some may even be located on a remote machine and
accessed across a network.

Directory        Typical Contents

/                The "root" directory
/bin             Essential low-level system utilities
/usr/bin         Higher-level system utilities and application programs
/sbin            Superuser system utilities (for performing system
                 administration tasks)
/lib             Program libraries (collections of system calls that can be
                 included in programs by a compiler) for low-level system
                 utilities
/usr/lib         Program libraries for higher-level user programs
/tmp             Temporary file storage space (can be used by any user)
/home or /homes  User home directories containing personal file space for
                 each user. Each directory is named after the login of the
                 user.
/etc             UNIX system configuration and information files
/dev             Hardware devices
/proc            A pseudo-filesystem which is used as an interface to the
                 kernel. Includes a sub-directory for each active program (or
                 process).

Fig. 2.2: Typical UNIX directories

When you log into UNIX, your current working directory is your user home
directory. You can refer to your home directory at any time as "~" and the
home directory of other users as "~<login>". So ~will/play is another way for
user jane to specify an absolute path to the directory /homes/will/play.
User will may refer to the directory as ~/play.
The Parent-Child Relationship:

All files in UNIX are “related” to one another.

The File System in UNIX is a collection of all these related files organized in a
hierarchical structure.

The implicit feature of every UNIX file system is that there is a top, which serves
as a reference point for all files. This top is called root and is represented by
a forward slash (/).

Root is actually a directory. It is conceptually different from the user-id root used
by the system administrator to log in.

The root directory has a number of sub directories under it. These
subdirectories can have number of subdirectories and other files under them.

Every file, apart from root, must have a parent, and it should be possible to
trace the ultimate parentage of a file to root.

From the administrative point of view, the entire file system comprises two
groups of files.

The first group contains the files that are made available during
system installation.

The second group consists of files for the users to edit. The contents of these
directories change as more software and utilities are added to the system. Users
can have their own files to write programs, send and receive mail and create
temporary files.

First Group:

/bin and /usr/bin : directories where all commonly used UNIX commands are
found (binaries, hence bin). The PATH variable includes these directories in its list.

/sbin and /usr/sbin : contain commands that can only be executed by the system
administrator. The system administrator's PATH includes these directories.
/etc : contains configuration files of the system. Your login name and
password are stored in files like /etc/passwd and /etc/shadow.

You can change a very important aspect of system functioning by editing a text
file in this directory.

/dev : Contains all device files. These files don’t occupy space on disk. There could
be more subdirectories like pts, dsk and rdsk in this directory.

/lib and /usr/lib : contain all library files in binary form. You will need
to link your C programs with files in these directories.

/usr/include : contains the standard header files used by C programs. The
statement #include&lt;stdio.h&gt; used in most C programs refers to the file
stdio.h in this directory.

/usr/share/man : This is where the man pages are stored. There are
separate subdirectories here that contain the pages for each section.

e.g. the man page of ls is found in /usr/share/man/man1, where the 1 in man1
represents section 1 of the UNIX manual. These directories may have different
names on your system.

SECOND GROUP:

/tmp : The directory where users are allowed to create temporary
files. These files are wiped away regularly by the system.

/var : The variable part of the file system. Contains all your print jobs
and your outgoing and incoming mail.

/home : Users are housed here. Kumar would have his home
directory in /home/Kumar. Your system may use a different location for
home directory.

2.4 Directory and File Handling Commands


This section describes some of the more important directory and file handling
commands.
pwd (print [current] working directory)

pwd displays the full absolute path to the your current location in the
filesystem. So

$ pwd
/usr/bin

implies that /usr/bin is the current working directory.

ls (list directory)

ls lists the contents of a directory. If no target directory is given, then the contents
of the current working directory are displayed. So, if the current working directory
is /,

$ ls
bin dev home mnt share usr var
boot etc lib proc sbin tmp vol

Actually, ls doesn't show you all the entries in a directory - files and directories
that begin with a dot (.) are hidden (this includes the directories '.' and '..' which
are always present). The reason for this is that files that begin with a . usually
contain important configuration information and should not be changed under
normal circumstances. If you want to see all files, ls supports the -a option:

$ ls -a

Even this listing is not that helpful - there are no hints to properties such as the
size, type and ownership of files, just their names. To see more detailed
information, use the -l option (long listing), which can be combined with the -
a option as follows:

$ ls -a -l
(or, equivalently,)
$ ls -al

Each line of the output looks like this:

type permissions links owner group size date name

where:

type is a single character which is either 'd' (directory), '-' (ordinary file), 'l'
(symbolic link), 'b' (block-oriented device) or 'c' (character-oriented device).

permissions is a set of characters describing access rights. There are 9 permission


characters, describing 3 access types given to 3 user categories. The three access
types are read ('r'), write ('w') and execute ('x'), and the three users categories
are the user who owns the file, users in the group that the file belongs to and
other users (the general public). An 'r', 'w' or 'x' character means the
corresponding permission is present; a '-' means it is absent.

links refers to the number of filesystem links pointing to the file/directory (see
the discussion on hard/soft links in the next section).

owner is usually the user who created the file or directory.

group denotes a collection of users who are allowed to access the file according
to the group access rights specified in the permissions field.

size is the length of a file, or the number of bytes used by the operating system
to store the list of files in a directory.

date is the date when the file or directory was last modified (written to). The
-u option displays the time when the file was last accessed (read).

name is the name of the file or directory.
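To connect these fields to real output, the following sketch (run in a throwaway temporary directory; the owner, group and date shown in the comment are made-up examples) produces and annotates a single long-listing line:

```shell
# Create a small file and examine its long listing
cd "$(mktemp -d)"        # work in a fresh temporary directory
echo "hi" > notes.txt    # a 3-byte file ("hi" plus a newline)
ls -l notes.txt
# The output is a line of the form (owner/group/date are illustrative):
# -rw-r--r-- 1 will finance 3 Mar  4 10:02 notes.txt
# ^type+permissions  ^links ^owner ^group ^size ^date ^name
```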

ls supports more options. To find out what they are, type:

$ man ls

man is the online UNIX user manual, and you can use it to get help with
commands and find out about what options are supported. It has quite a terse
style which is often not that helpful, so some users prefer to use the
(non-standard) info utility if it is installed:

$ info ls

more and less (catenate with pause)

$ more target-file(s)

displays the contents of target-file(s) on the screen, pausing at the end of each
screenful and asking the user to press a key (useful for long files). It also
incorporates a searching facility (press '/' and then type a phrase that you want
to look for).

You can also use more to break up the output of commands that produce
more than one screenful of output as follows (| is the pipe operator, which will
be discussed in the next chapter):

$ ls -l | more

less is just like more, except that it has a few extra features (such as allowing
users to scroll backwards and forwards through the displayed file). less is not a
standard utility, however, and may not be present on all UNIX systems.

2.5 Making Hard and Soft (Symbolic) Links


Direct (hard) and indirect (soft or symbolic) links from one file or directory
to another can be created using the ln command.

$ ln filename linkname

creates another directory entry for filename called linkname (i.e. linkname is a
hard link). Both directory entries appear identical (and both now have a link
count of 2). If either filename or linkname is modified, the change will be
reflected in the other file (since they are in fact just two different directory
entries pointing to the same file).
$ ln -s filename linkname

creates a shortcut called linkname (i.e. linkname is a soft link). The shortcut
appears as an entry with a special type ('l'):

$ ln -s hello.txt bye.txt
$ ls -l bye.txt
lrwxrwxrwx 1 will finance 13 bye.txt -> hello.txt
$

The link count of the source file remains unaffected. Notice that the permission
bits on a symbolic link are not used (always appearing as rwxrwxrwx). Instead
the permissions on the link are determined by the permissions on the target
(hello.txt in this case).

Note that you can create a symbolic link to a file that doesn't exist, but not a
hard link. Another difference between the two is that you can create symbolic
links across different physical disk devices or partitions, but hard links are
restricted to the same disk partition. Finally, most current UNIX implementations
do not allow hard links to point to directories.
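The difference between the two kinds of link is easy to demonstrate. This sketch (in a temporary directory, with made-up filenames) removes the original file and shows that the hard link still works while the soft link is left dangling:

```shell
cd "$(mktemp -d)"
echo "hello" > hello.txt
ln hello.txt hard.txt          # hard link: second directory entry, link count 2
ln -s hello.txt soft.txt       # soft link: an entry of type 'l' pointing at the name
rm hello.txt                   # remove the original directory entry
cat hard.txt                   # the data survives via the hard link: prints "hello"
cat soft.txt 2>/dev/null \
  || echo "soft link dangles"  # the soft link now points at a non-existent name
```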

2.6 Specifying multiple filenames


Multiple filenames can be specified using special pattern-matching characters.
The rules are:

1. '?' matches any single character in that position in the filename.


2. '*' matches zero or more characters in the filename. A '*' on its own
will match all files. '*.*' matches all files containing a '.'.
3. Characters enclosed in square brackets ('[' and ']') will match any filename
that has one of those characters in that position.
4. A list of comma separated strings enclosed in curly braces ("{" and "}") will
be expanded as a Cartesian product with the surrounding characters.
For example:

 ??? matches all three-character filenames.
 ?ell? matches any five-character filename with 'ell' in the middle.
 he* matches any filename beginning with 'he'.
 [m-z]*[a-l] matches any filename that begins with a letter from 'm' to 'z'
  and ends in a letter from 'a' to 'l'.
 {/usr,}{/bin,/lib}/file expands to /usr/bin/file, /usr/lib/file,
  /bin/file and /lib/file.

Note that the UNIX shell performs these expansions (including any filename
matching) on a command's arguments before the command is executed.
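These rules can be verified directly in the shell. The sketch below creates a few files (names chosen purely for illustration) in an empty temporary directory and lets echo show what each pattern expands to:

```shell
cd "$(mktemp -d)"
touch hello help bell data.txt
echo he*       # matches filenames beginning with 'he': hello help
echo ?ell      # any single character followed by 'ell': bell
echo [bh]ell   # first character from the set {b,h}: bell ("hell" doesn't exist)
echo *.txt     # zero or more characters followed by '.txt': data.txt
```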

2.7 Quotes
As we have seen certain special characters (e.g. '*', '-','{' etc.) are interpreted in a
special way by the shell. In order to pass arguments that use these characters to
commands directly (i.e. without filename expansion etc.), we need to use special
quoting characters. There are three levels of quoting that you can try:

Try inserting a '\' in front of the special character.

Use double quotes (") around arguments to prevent most expansions.

Use single forward quotes (') around arguments to prevent all expansions.

There is a fourth type of quoting in UNIX. Single backward quotes (`) are used to
pass the output of some command as an input argument to another. For example:

$ hostname
rose
$ echo this machine is called `hostname`
this machine is called rose
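The levels of quoting can be compared side by side. In this sketch the directory contains only a.txt and b.txt (made-up names), so the unquoted pattern expands while each quoted form passes through literally:

```shell
cd "$(mktemp -d)"
touch a.txt b.txt
echo *.txt       # expanded by the shell: a.txt b.txt
echo \*.txt      # backslash quotes the next character: *.txt
echo "*.txt"     # double quotes prevent filename expansion: *.txt
echo '*.txt'     # single quotes prevent all expansion: *.txt
echo files: `ls` # backquotes substitute the output of ls: files: a.txt b.txt
```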
3.2 File and Directory Permissions

Permission  File                               Directory

read        User can look at the contents      User can list the files in the
            of the file                        directory

write       User can modify the contents       User can create new files and remove
            of the file                        existing files in the directory

execute     User can use the filename as       User can change into the directory, but
            a UNIX command                     cannot list the files unless (s)he has
                                               read permission. User can read files if
                                               (s)he has read permission on them.

Fig 3.1: Interpretation of permissions for files and directories

FILE ATTRIBUTES

A file also has a number of attributes that are changeable by well-defined
rules. The UNIX file system lets users access files not belonging to them
without infringing on security. A file has many attributes, but here we
consider only the basic ones.

Basic File Attributes: There are two basic file attributes:

 Permission
 Ownership
To List File Attributes (ls -l):

The -l (long) option is used to display most attributes of a file, like its
permissions, size and ownership details. ls looks up the file's inode to fetch
its attributes. ls -l lists seven attributes of all files in the current
directory.

$ ls –l

The list is always preceded by the words total n, where n denotes the number of
blocks occupied by these files on the disk, each block consisting of 512 bytes
(1024 in Linux).

File type and permissions:

The First Column shows the type and permissions associated with each file.

When the first character in this column is -, the file is an ordinary one. When
it is d, it indicates a directory.

After this you can see a series of characters that can take the values r, w, x
and -. In UNIX you can have three types of file permissions, i.e. read, write
and execute.

The second column indicates the number of links associated with the file.

Ownership: When you create a file, you automatically become its owner.

The third column shows the owner name of the file. The owner has full authority
to tamper with the file's contents and permissions.

Group Ownership: When opening a user account, the system administrator also
assigns the user to some group. The fourth column represents the group owner of
the file. The group has a set of privileges distinct from those of others as
well as the owner.

File size: The fifth column shows the size of the file in bytes, i.e. the amount
of data it contains. It is only a character count of the file and not a measure
of the disk space that it occupies.

Last Modification Time: The sixth, seventh and eighth columns indicate the last
modification time of the file, which is stored to the nearest second. If the
file is less than a year old since its last modification, the year won't be
displayed.

Changing File Permission (chmod):

User means owner. A file or directory is created with a default set of
permissions. The default settings write-protect a file from all except the
user, though all users may have read access. To know your system's defaults,
create a file and then use ls -l. The chmod command is used to set the
permissions of one or more files for all three categories of users (user, group
and others). It can be run only by the owner of the file and the superuser. It
can be used in two ways:

In a relative manner by specifying the changes to the current permissions

In an absolute manner by specifying the final permission.

Relative Permission:

When changing permissions in a relative manner, chmod only changes the
permissions specified in the command line and leaves the other permissions
unchanged. In this mode it uses the following syntax:

$ chmod category operation permission filename(s)

chmod takes as its argument an expression comprising some letters and symbols
that completely describe the user category and the type of permission being
assigned or removed. The expression contains three components:

User category (user, group, others)

The operation to be performed (assign or remove a permission)

The type of permission (read, write, execute)


E.g. :

$ chmod u+x file1

$ chmod ugo+x file1

$ chmod go-r file1

$ chmod u+x file1 file2 file3

$ chmod a-x,go+r file1

$ chmod o+wx file1

Abbreviations Used by chmod

Category        Operation                         Permission
u - user        + - assigns permission            r - read permission
g - group       - - removes permission            w - write permission
o - others      = - assigns absolute permission   x - execute permission
a - all (ugo)

Absolute Permissions

To set all nine permission bits explicitly, absolute permissions are used. The
expression used by chmod here is a string of three octal (base-8) digits. Each
type of permission is assigned a number as shown:

Read permission-4

Write Permission-2

Execute permission-1
For each category we add the numbers that represent the assigned permissions.
For instance, 6 represents read and write permissions, and 7 represents all
three.

E.g. :

$ chmod 666 file1

$ chmod 644 file1

$ chmod 761 file1
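The octal notation can be checked by reading the permissions back with ls. The following sketch (on a scratch file in a temporary directory) alternates between absolute and relative modes; cut -c1-10 keeps only the type and permission characters:

```shell
cd "$(mktemp -d)"
touch file1
chmod 644 file1           # 6 = 4+2 (rw-) for user; 4 (r--) for group and others
ls -l file1 | cut -c1-10  # prints: -rw-r--r--
chmod u+x file1           # relative change: add execute for the user only
ls -l file1 | cut -c1-10  # prints: -rwxr--r--
chmod 761 file1           # 7 = rwx for user, 6 = rw- for group, 1 = --x for others
ls -l file1 | cut -c1-10  # prints: -rwxrw---x
```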

chmod can be used recursively also.

It descends a directory hierarchy and applies the expression to every file
and subdirectory it finds. This is done with the -R (recursive) option:

$ chmod -R a+x shellscripts

To apply it to the home directory:

$ chmod -R 755 .     (works on hidden files also)

$ chmod -R a+x *     (leaves out hidden files)

chgrp (change group)

$ chgrp group files

can be used to change the group that a file or directory belongs to. It
also supports a -R option.
3.3 Inspecting File Content
Besides cat there are several other useful utilities for investigating the contents of
files:

file filename(s)

file analyzes a file's contents for you and reports a high-level description of what
type of file it appears to be:

$ file myprog.c letter.txt webpage.html
myprog.c:     C program text
letter.txt:   English text
webpage.html: HTML document text

file can identify a wide range of files but sometimes gets understandably
confused (e.g. when trying to automatically detect the difference between C++
and Java code).

head, tail filename

head and tail display the first and last few lines in a file respectively. You can
specify the number of lines as an option, e.g.

$ tail -20 messages.txt


$ head -5 messages.txt

tail includes a useful -f option that can be used to continuously monitor the last
few lines of a (possibly changing) file. This can be used to monitor log files, for
example:

$ tail -f /var/log/messages

continuously outputs the latest additions to the system log file.
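A quick way to see head and tail at work is on a file of numbered lines, generated here with seq (the filename is arbitrary):

```shell
cd "$(mktemp -d)"
seq 1 100 > messages.txt  # one hundred numbered lines
head -5 messages.txt      # first five lines: 1 2 3 4 5
tail -3 messages.txt      # last three lines: 98 99 100
```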


objdump options binaryfile

objdump can be used to disassemble binary files - that is it can show the machine
language instructions which make up compiled application programs and system
utilities.

od options filename (octal dump): od can be used to display the contents of
a binary or text file in a variety of formats, e.g.

$ cat hello.txt
hello world
$ od -c hello.txt
0000000 h e l l o w o r l d \n
0000014
$ od -x hello.txt
0000000 6865 6c6c 6f20 776f 726c 640a
0000014

There are also several other useful content inspectors that are non-standard (in
terms of availability on UNIX systems) but are nevertheless in widespread use.
They are summarised in Fig. 3.2.

File type Typical extension Content viewer


Portable Document Format .pdf acroread
Postscript Document .ps ghostview
DVI Document .dvi xdvi
JPEG Image .jpg xv
GIF Image .gif xv
MPEG movie .mpg mpeg_play
WAV sound file .wav realplayer
HTML document .html netscape
Fig 3.2: Other file types and appropriate content viewers.
3.4 Finding Files
There are at least three ways to find files when you don't know their
exact location:

find

If you have a rough idea of the directory tree the file might be in (or even if you
don't and you're prepared to wait a while) you can use find:

$ find directory -name targetfile -print

find will look for a file called targetfile in any part of the directory tree
rooted at directory. targetfile can include wildcard characters. For example:

$ find /home -name "*.txt" -print 2>/dev/null

will search all user directories for any file ending in ".txt" and output any matching
files (with a full absolute or relative path). Here the quotes (") are necessary to
avoid filename expansion, while the 2>/dev/null suppresses error messages
(arising from errors such as not being able to read the contents of directories for
which the user does not have the right permissions).

find can in fact do a lot more than just find files by name. It can find files by type
(e.g. -type f for files, -type d for directories), by permissions (e.g. -perm o=r for all
files and directories that can be read by others), by size (-size) etc. You can also
execute commands on the files you find. For example,

$ find . -name "*.txt" -exec wc -l '{}' ';'

counts the number of lines in every text file in and below the current directory.
The '{}' is replaced by the name of each file found and the ';' ends the -exec
clause.

For more information about find and its abilities, use man find and/or info find.
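The options above can be tried on a small throwaway tree. The directory and file names below are invented for the sketch:

```shell
cd "$(mktemp -d)"
mkdir docs src
printf 'one\ntwo\n' > docs/a.txt
printf 'three\n'    > src/b.txt
touch src/prog.c
find . -name "*.txt" -print | sort        # ./docs/a.txt and ./src/b.txt
find . -type d -print | sort              # . ./docs ./src
find . -name "*.txt" -exec wc -l '{}' ';' # line count of each matching file
```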
which (sometimes also called whence) command

If you can execute an application program or system utility by typing its name
at the shell prompt, you can use which to find out where it is stored on disk. For
example:

$ which ls
/bin/ls

locate string

find can take a long time to execute if you are searching a large filespace (e.g.
searching from / downwards). The locate command provides a much faster way of
locating all files whose names match a particular search string. For example:

$ locate ".txt"

will find all filenames in the filesystem that contain ".txt" anywhere in their full
paths.

One disadvantage of locate is that it stores all filenames on the system in an
index that is usually updated only once a day. This means locate will not find
files that have been created very recently. It may also report filenames as
being present even though the file has just been deleted. Unlike find, locate
cannot track down files on the basis of their permissions, size and so on.

3.5 Finding Text in Files


grep (General Regular Expression Print)

$ grep options pattern files

grep searches the named files (or standard input if no files are named) for
lines that match a given pattern. The default behaviour of grep is to print out
the matching lines. For example:

$ grep hello *.txt


searches all text files in the current directory for lines containing "hello". Some of
the more useful options that grep provides are:
-c (print a count of the number of lines that match), -i (ignore case), -v
(print out the lines that don't match the pattern) and -n (print out the line
number before printing the matching line). So

$ grep -vi hello *.txt

searches all text files in the current directory for lines that do not contain
any form of the word hello (e.g. Hello, HELLO, or hELlO).

If you want to search all files in an entire directory tree for a particular
pattern, you can combine grep with find using backward single quotes to pass
the output from find into grep. So

$ grep hello `find . -name "*.txt" -print`

will search all text files in the directory tree rooted at the current directory for
lines containing the word "hello".

The patterns that grep uses are actually a special type of pattern known as
regular expressions. Just like arithmetic expressions, regular expressions are
made up of basic subexpressions combined by operators.

The most fundamental expression is a regular expression that matches a single


character. Most characters, including all letters and digits, are regular expressions
that match themselves. Any other character with special meaning may be quoted
by preceding it with a backslash (\). A list of characters enclosed by '[' and ']'
matches any single character in that list; if the first character of the list is the caret
`^', then it matches any character not in the list. A range of characters can be
specified using a dash (-) between the first and last items in the list. So
[0-9] matches any digit and [^a-z] matches any character that is not a
lowercase letter.

The caret `^' and the dollar sign `$' are special characters that
match the beginning and end of a line respectively. The dot '.' matches
any character. So

$ grep ^..[l-z]$ hello.txt


matches any line in hello.txt that consists of exactly three characters, the
last of which is a lowercase letter from l to z.
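Trying the same pattern against a small sample file makes the anchors concrete (the sample lines are made up):

```shell
cd "$(mktemp -d)"
printf 'cat\ncab\ncow\nhi\n' > hello.txt
grep '^..[l-z]$' hello.txt     # cat and cow match; cab ends in 'b', hi has 2 chars
grep -c '^..[l-z]$' hello.txt  # -c: just the count of matching lines, 2
grep -vn '^..[l-z]$' hello.txt # -v -n: the non-matching lines, with line numbers
```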

egrep (extended grep) is a variant of grep that supports more sophisticated


regular expressions. Here two regular expressions may be joined by the
operator `|'; the resulting regular expression matches any string matching either
subexpression. Brackets '(' and ')' may be used for grouping regular expressions.
In addition, a regular expression may be followed by one of several repetition
operators:

`?' means the preceding item is optional (matched at most once).


`*' means the preceding item will be matched zero or more times.
`+' means the preceding item will be matched one or more times.
`{N}' means the preceding item is matched exactly N times.
`{N,}' means the preceding item is matched N or more times.
`{N,M}' means the preceding item is matched at least N times, but not more
than M times.

For example, if egrep was given the regular expression

'(^[0-9]{1,5}[a-zA-Z ]+$)|none'

it would match any line that either:

begins with a number up to five digits long, followed by a sequence of one or


more letters or spaces, or

contains the word none
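This pattern can be exercised with a few invented sample lines. Modern systems spell egrep as grep -E, which is used below:

```shell
cd "$(mktemp -d)"
printf '123 main street\nnone given\nabcdef\n' > sample.txt
grep -E '(^[0-9]{1,5}[a-zA-Z ]+$)|none' sample.txt
# "123 main street" matches the first alternative (digits, then letters/spaces);
# "none given" matches the second; "abcdef" matches neither
```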

You can read more about regular expressions on the grep and egrep manual
pages.

Note that UNIX systems also usually support another grep variant
called fgrep (fixed grep) which simply looks for a fixed string inside a file (but this
facility is largely redundant).
3.6 Sorting files
There are two facilities that are useful for sorting files in UNIX:

sort filenames

sort sorts lines contained in a group of files alphabetically (or, if the -n
option is specified, numerically). The sorted output is displayed on the
screen, and may be stored in another file by redirecting the output. So

$ sort input1.txt input2.txt > output.txt

outputs the sorted concatenation of files input1.txt and input2.txt to
the file output.txt.

uniq filename

uniq removes duplicate adjacent lines from a file. This facility is most useful when
combined with sort:

$ sort input.txt | uniq > output.txt
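Because uniq only removes adjacent duplicates, sorting first is what makes the deduplication complete. A small sketch with invented data:

```shell
cd "$(mktemp -d)"
printf 'pear\napple\npear\nbanana\napple\n' > input.txt
sort input.txt | uniq > output.txt
cat output.txt            # apple, banana, pear: one line each
sort input.txt | uniq -c  # -c prefixes each line with its repeat count
```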

3.7 File Compression and Backup


UNIX systems usually support a number of utilities for backing up
and compressing files. The most useful are:

tar (tape archiver)

tar backs up entire directories and files onto a tape device or (more commonly)
into a single disk file known as an archive. An archive is a file that contains other
files plus information about them, such as their filename, owner, timestamps, and
access permissions. tar does not perform any compression by default.

To create a disk file tar archive, use

$ tar -cvf archivename filenames

where archivename will usually have a .tar extension. Here the c option
means create, v means verbose (output filenames as they are archived), and f
means file. To list the contents of a tar archive, use

$ tar -tvf archivename

To restore files from a tar archive, use

$ tar -xvf archivename
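A complete round trip with tar looks like this (the archive and directory names are arbitrary):

```shell
cd "$(mktemp -d)"
mkdir project
echo "data" > project/notes.txt
tar -cvf backup.tar project  # c = create, v = verbose, f = archive file
tar -tvf backup.tar          # t = list the archive's contents
mkdir restore && cd restore
tar -xvf ../backup.tar       # x = extract into the current directory
cat project/notes.txt        # prints: data
```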

cpio

cpio is another facility for creating and reading archives. Unlike tar, cpio
doesn't automatically archive the contents of directories, so it's common to
combine cpio with find when creating an archive:

$ find . -print -depth | cpio -ov -Htar > archivename

This will take all the files in the current directory and the directories below
and place them in an archive called archivename. The -depth option controls the
order in which the filenames are produced and is recommended to prevent
problems with directory permissions when doing a restore. The -o option creates
the archive, the -v option prints the names of the files archived as they are
added and the -H option specifies an archive format type (in this case it
creates a tar archive). Another common archive type is crc, a portable format
with a checksum for error control.

To list the contents of a cpio archive, use

$ cpio -tv < archivename

To restore files, use:

$ cpio -idv < archivename

Here the -d option will create directories as necessary. To force cpio to extract
files on top of files of the same name that already exist (and have the same or
later modification time), use the -u option.
compress, gzip

compress and gzip are utilities for compressing and decompressing individual
files (which may be or may not be archive files). To compress files, use:

$ compress filename
or
$ gzip filename

In each case, filename will be deleted and replaced by a compressed file
called filename.Z or filename.gz. To reverse the compression process, use:

$ compress -d filename
or
$ gzip -d filename
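A gzip round trip on a scratch file shows the replacement behaviour described above:

```shell
cd "$(mktemp -d)"
echo "some report text" > report.txt
gzip report.txt           # report.txt is deleted and replaced by report.txt.gz
ls                        # report.txt.gz
gzip -d report.txt.gz     # decompress: report.txt.gz is replaced by report.txt
cat report.txt            # prints: some report text
```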

3.8 Handling Removable Media (e.g. floppy disks)


Mountable File System

UNIX supports tools for accessing removable media such as CDROMs and
floppy disks.

All UNIX systems have at least one permanent non-removable hard disk. The root
directory and the directories below it are stored on this disk. Its structure
is known as the root file system. If an additional hard disk is added, UNIX
creates a separate new file system on this disk. Before it can be used, it must
be attached to the root file system. This is called mounting an additional file
system.
An empty directory on the root file system is selected as the mount point,
and using the UNIX mount command, the new file system will appear under
this directory.

mount, umount

The mount command serves to attach the filesystem found on some device to
the filesystem tree. Conversely, the umount command will detach it again (it is
very important to remember to do this when removing the floppy or CDROM).
The file /etc/fstab contains a list of devices and the points at which they will be
attached to the main filesystem:

$ cat /etc/fstab
/dev/fd0 /mnt/floppy auto rw,user,noauto 0 0
/dev/hdc /mnt/cdrom iso9660 ro,user,noauto 0 0

In this case, the mount point for the floppy drive is /mnt/floppy and the mount
point for the CDROM is /mnt/cdrom. To access a floppy we can use:

$ mount /mnt/floppy
$ cd /mnt/floppy
$ ls (etc...)

To force all changed data to be written back to the floppy and to detach the
floppy disk from the filesystem, we use:

$ umount /mnt/floppy

mtools

If they are installed, the (non-standard) mtools utilities provide a convenient way
of accessing DOS-formatted floppies without having to mount and unmount
filesystems. You can use DOS-type commands like "mdir a:", "mcopy a:*.* .",
"mformat a:", etc. (see the mtools manual pages for more details).
7.1 Objectives
This lecture covers basic system administration concepts and tasks, namely:

 The superuser root.
 Shutdown and system start-up.
 Adding users.
 Controlling user groups.
 Reconfiguring and recompiling the Linux kernel.
 Cron jobs.
 Keeping essential processes alive.

Note that you will not be given administrator access on the lab machines. However,
you might like to try some basic administration tasks on your home PC.

7.2 The Superuser root


The superuser is a privileged user who has unrestricted access to all commands
and files on a system regardless of their permissions. The superuser's login is
usually root. Access to the root account is restricted by a password (the root
password). Because the root account has huge potential for destruction, the root
password should be chosen carefully, only given to those who need it, and
changed regularly.

One way to become root is to log in as usual using the username root and the
root password (usually security measures are in place so that this is only possible
if you are using a "secure" console and not connecting over a network). Using root
as your default login in this way is not recommended, however, because normal
safeguards that apply to other user accounts do not apply to root. Consequently
using root for mundane tasks often results in a memory lapse or misplaced
keystrokes having catastrophic effects (e.g. forgetting for a moment which
directory you are in and accidentally deleting another user's files, or accidentally
typing "rm -rf * .txt" instead of "rm -rf *.txt" ).

A better way to become root is to use the su utility. su (switch user) lets you
become another user (at least as far as the computer is concerned). If you don't
specify the name of the user you wish to become, the system will assume you
want to become root. Using su does not usually change your current directory,
unless you specify a "-" option which will run the target user's startup scripts and
change into their home directory (provided you can supply the right password of
course). So:

$ su -
Password: xxxxxxxx
#

Note that the root account often displays a different prompt (usually a #). To
return to your old self, simply type "exit" at the shell prompt.

You should avoid leaving a root window open while you are not at your machine.

7.3 Shutdown and System Start-up


Shutdown: shutdown, halt, reboot (in /sbin)

/sbin/shutdown allows a UNIX system to shut down gracefully and securely. All
logged-in users are notified that the system is going down, and new logins are
blocked. It is possible to shut the system down immediately or after a specified
delay and to specify what should happen after the system has been shut down:

# /sbin/shutdown -r now (shut down now and reboot)


# /sbin/shutdown -h +5 (shut down in 5 minutes & halt)
# /sbin/shutdown -k 17:00 (fake a shutdown at 5pm)

halt and reboot are equivalent to shutdown -h and shutdown -r respectively.

If you have to shut a system down extremely urgently or for some reason cannot
use shutdown, it is at least a good idea to first run the command:

# sync

which forces the state of the file system to be brought up to date.


System startup:

At system startup, the operating system performs various low-level tasks, such as
initialising the memory system, loading up device drivers to communicate with
hardware devices, mounting filesystems and creating the init process (the parent
of all processes). init's primary responsibility is to start up the system services as
specified in /etc/inittab. Typically these services include gettys (i.e.
virtual terminals where users can login), and the scripts in the directory
/etc/rc.d/init.d which usually spawn high-level daemons such as httpd (the web
server). On most UNIX systems you can type dmesg to see system startup
messages, or look in /var/log/messages.

If a mounted filesystem is not "clean" (e.g. the machine was turned off without
shutting down properly), a system utility fsck is automatically run to repair it.
Automatic running can only fix certain errors, however, and you may have to run
it manually:

# fsck filesys

where filesys is the name of a device (e.g. /dev/hda1) or a mount point (like /).
"Lost" files recovered during this process end up in the lost+found directory.
Some more modern filesystems called "journaling" file systems don't require fsck,
since they keep extensive logs of filesystem events and are able to recover in a
similar way to a transactional database.

7.4 Adding Users


useradd (in /usr/sbin):

useradd is a utility for adding new users to a UNIX system. It adds new user
information to the /etc/passwd file and creates a new home directory for the
user. When you add a new user, you should also set their password (using the -p
option on useradd, or using the passwd utility):

# useradd bob
# passwd bob
7.5 Controlling User Groups
groupadd (in /usr/sbin):

groupadd creates a new user group and adds the new information to /etc/group:

# groupadd groupname

usermod (in /usr/sbin):

Every user belongs to a primary group and possibly also to a set of supplementary
groups. To modify the group permissions of an existing user, use

# usermod -g initialgroup username -G othergroups

where othergroups is a list of supplementary group names separated by commas


(with no intervening whitespace).

groups

You can find out which groups a user belongs to by typing:

# groups username

7.6 Reconfiguring and Recompiling the Linux Kernel


Linux has a modular, customisable kernel with several switchable options (e.g.
support for multiple processors and device drivers for various hardware devices).
It may happen that some new hardware is added to a Linux machine which
requires you to recompile the kernel so that it includes device driver support (and
possibly new system calls) for the new hardware. To do this, you will need to
rebuild the Linux kernel from scratch as follows:

Look in /usr/src/linux for the kernel source code. If it isn't there (or if there is just
a message saying that only kernel binaries have been installed), get hold of a
copy of the latest kernel source code from http://www.kernel.org and untar it
into /usr/src/linux.
 Change directory to /usr/src/linux.

To configure the kernel type either

# make config (simple text mode configuration), or


# make menuconfig (menu-driven text configuration), or
# make xconfig (graphical configuration for X)

You will be asked to select which modules (device drivers, multiprocessor support
etc.) you wish to include. For each module, you can choose to include it in the
kernel code (y), incorporate it as an optional module that will be loaded if needed
(m) or exclude it from the kernel code (n). To find out which optional modules
have actually been loaded you can run lsmod when the system reboots.

Now type:

# make dep (to build source code dependencies)


# make clean (to delete all stale object files)
# make bzImage (to build the new kernel)
# make modules (to build the new optional modules)
# make modules_install (to install the modules)

 The file /usr/src/linux/arch/i386/boot/bzImage now contains your new
kernel image. It remains only to install it.
 Change directory to /usr/src/linux/arch/i386/boot. In the same directory should be
a script called install.sh which will copy your kernel image into /boot/vmlinuz:

# install.sh version bzImage /boot/System.map /boot

where version is the kernel version number (of form 2.2.xx).

 Finally, you may need to update the /etc/lilo.conf file so that lilo (the Linux
boot loader) includes an entry for your new kernel. Then run 

# lilo
to update the changes. When you reboot your machine, you should be able
to select your new kernel image from the lilo boot loader.

7.7 Cron Jobs


crond is a daemon that executes commands that need to be run regularly
according to some schedule. The schedule and corresponding commands
are stored in the file /etc/crontab.

Each entry in the /etc/crontab file contains six fields separated by spaces or
tabs in the following form:

minute hour day_of_month month weekday command

These fields accept the following values:

minute        0 through 59
hour          0 through 23
day_of_month  1 through 31
month         1 through 12
weekday       0 (Sun) through 6 (Sat)
command       a shell command

You must specify a value for each field. Except for the command field, these
fields can contain the following:

 A number in the specified range, e.g. to run a command in May, specify 5
in the month field.
 Two numbers separated by a dash to indicate an inclusive range, e.g. to
run a cron job on Tuesday through Friday, place 2-5 in the weekday field.
 A list of numbers separated by commas, e.g. to run a command on the first
and last day of January, you would specify 1,31 in the day_of_month field.
 * (asterisk), meaning all allowed values, e.g. to run a job every hour,
specify an asterisk in the hour field.

You can also specify some execution environment options at the top of the
/etc/crontab file:
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root

To run the calendar command at 6:30am every Mon, Wed, and Fri, a suitable
/etc/crontab entry would be:

30 6 * * 1,3,5 /usr/bin/calendar

The output of the command will be mailed to the user specified in the MAILTO
environment option.

You don't need to restart the cron daemon crond after changing /etc/crontab - it
automatically detects changes.
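A few more entries illustrating ranges, lists and the asterisk. The command paths here are hypothetical; note also that on many Linux systems the system-wide /etc/crontab takes an extra user-name field just before the command, while per-user crontabs (edited with crontab -e) use exactly the six fields shown above.

```
# every hour, on the hour
0  *  *    *  *    /usr/local/bin/hourly-check
# 2:15am, Tuesday through Friday
15 2  *    *  2-5  /usr/local/bin/backup
# 6:30am on the 1st and 15th of each month
30 6  1,15 *  *    /usr/local/bin/report
```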

7.8 Keeping Essential Processes Alive


It is important that daemons related to mission-critical services are immediately
respawned if they fail for some reason. You can do this by adding your own
entries to the /etc/inittab file. For example:

rs:2345:respawn:/home/sms/server/RingToneServer

Here rs is a 2-character code identifying the service, and 2345 are the runlevels (to
find out about runlevels, type man runlevel) for which the process should be created.
The init process will create the RingToneServer process at system startup, and
respawn it should it die for any reason.
The VI Editor

The vi editor is used to edit a text file. It uses a number of internal commands to
navigate to any point in a text file and edit the text there. It also allows you to
copy and move text within a file, and also from one file to another.

vi Basics:

To invoke a file:

$ vi <filename>

You will see a ~ sign and you are in Command Mode.

This is the mode where you can pass commands to act on text, using most of the
keys of the keyboard. Pressing a key doesn't show it on the screen but performs a
function like moving the cursor, deleting a line etc.

To enter text you have to move to the Input Mode. This can be done by pressing
the key marked i.

To erase a letter written before the current cursor position use backspace.

To erase the entire word press [ctrl+w]

Press [Esc] key to revert to command mode

To save the text written, move to the ex Mode or Last Line Mode.

To invoke the ex Mode from the Command Mode, enter a colon ( : ), which shows up
in the last line. Enter x and press [Enter].

The file is saved and the editor quits, taking you back to the shell prompt.
Modes of vi Editor: There are three modes in the vi editor. These are:

 Command Mode 
 Input Mode 
 Ex Mode or Last Line Mode 

Command Mode:

This is the default mode of the editor, where every key
pressed is interpreted as a command to run on text. You will be in this mode to
copy or delete text. An unnecessary press of [Esc] sounds a beep.

Input Mode:

Every key pressed after switching to this mode actually shows
up as text. This mode is invoked by pressing keys such as i, a or o. Pressing [Esc] in
this mode takes vi back to the Command Mode.

Ex Mode or Last Line Mode:

This mode is used to handle files and perform
substitution. Pressing a colon ( : ) in the Command Mode takes you to the ex Mode.

You can run an ex Mode command followed by [Enter]. It will take you back to
the Command Mode.

To Remember:

To Clear Screen:

In the Command Mode use [Ctrl+l].

Don't use [Caps Lock].

Avoid Using PC Navigation: Don't use standard navigation keys like
Up, Down, PageUp etc.
Input Mode:

The key that takes vi to the Input Mode doesn't itself appear on the screen. This
mode is used for entering, appending and replacing text. The commands that
invoke it are:

Insert and append   (i, a, I, A)
Replace             (r, R, s, S)
Open a line         (o, O)

i   Inserts text to the left of the cursor; existing text is shifted to the
    right without being overwritten
a   Appends text to the right of the cursor position
I   Inserts text at the beginning of the line
A   Appends text at the end of the line
o   Opens a new line below the current line
O   Opens a line above the current line
r   Replaces a single character
R   Replaces all characters on the right of the cursor position
s   Replaces one character with many
S   Replaces the entire line irrespective of the cursor position
Saving Text and Quitting The ex Mode:

Command Action

:w Saves file and remain in editing mode

:x Saves file and quits editing mode

:wq as above

:w new.pl Like Save as

:w! n2w.pl As above, but overwrites existing file

:q Quits editing mode when no changes are made to the file

:q! Quits editing mode but after abandoning changes

:n1,n2w build.sql Writes lines n1 to n2 to file build.sql

:.w build.sql Writes current line to file build.sql

:$w build.sql Writes last line to file build.sql

:sh Escapes to the UNIX shell

:recover Recovers file from a crash

:10,50w n2.pl Writes lines 10 to 50 (41 lines) to another file

:5w n2.pl Writes the 5th line to another file

:.,$w tempfile Saves from the current line through the end of the file


Navigation:

Movement in the four directions

k Moves cursor up

j Moves cursor down

h Moves cursor left

l Moves cursor right

The repeat factor can also be used as a command prefix with all these four
commands

e.g.

4k Moves the cursor 4 lines up

20h Moves the cursor 20 characters left.

Word Navigation:

b, B Moves back to the beginning of the word

w, W Moves forward to the beginning of the next word

e, E Moves forward to the end of the word

Here, in the case of b, w and e a word is simply a string of alphanumeric
characters and underscores, whereas in the case of B, W and E a word is a string
of non-whitespace characters, i.e. delimited only by spaces or tabs.

A repeat factor speeds up the cursor movement along a line.

E.g. 5b Takes the cursor 5 words back

Moving to line extremes

0 or | Moves to the first character of a line

30| Moves the cursor to column 30

$ In the ex Mode - the last line of the file

In the Command Mode - moves to the end of the current line.

Scrolling

Ctrl+f Scrolls forward

Ctrl+b Scrolls backward

Ctrl+d Scrolls half a page forward

Ctrl+u Scrolls half a page backward

The repeat factor is also used here.

Absolute Movement:

Ctrl+g Displays the current line number

40G Moves to line number 40

G Moves to the end of the file

1G Goes to line number 1

Editing Text

Deleting text:

x Deletes a single character

dd Deletes the entire current line

The repeat factor can also be used, as in:

4x Deletes the current character as well as three characters from the right

6dd Deletes the current line and five lines below

4X Deletes four characters to the left of the cursor

Moving Text

p and P are used to move text (the "put" operation), following delete or
copy operations.

E.g. suppose you have typed nuix instead of unix. You have to place the n after
the u. For this, with the cursor on the n, use:

x    nuix becomes uix, with the cursor now on u

p    the deleted n is put on the right of u - uix becomes unix

p places the deleted text on the right of the cursor, while P places it on the left.

The same commands are used to put deleted lines at another location:

p puts the deleted line below the current line

P puts the deleted line above the current line

Copying Text

yy Yanks (copies) the current line

10yy Yanks the current line and 9 lines below it

p or P can then be used, according to the requirement, to put the copied lines
at another location.

Joining Lines:

J Joins the current line with the following line.

4J Joins the following 3 lines with the current one.


Undoing Last Editing Instructions

u Undoes the last editing instruction. Must be used in Command Mode; press
[Esc] first if necessary.

U Undoes all changes made to the current line. Don't move away from the
current line before using it.

Repeating The Last Command:

. (dot) is used for repeating both Input and Command Mode commands.

Use the actual command once and repeat it any number of times Use u to undo it.

Searching For A Pattern(/ and ?)

Searching can be made in both forward and reverse directions and can be
repeated.

/printf Searches forward for the string printf

?pattern[Enter] Searches backward for the most previous instance of the
pattern.

Repeating the Last Pattern Search:

n Repeats the search in the same direction as the original search

N Repeats the search in the opposite direction.

Search and Replace:

In the ex Mode, the s command is used to substitute text. It lets you replace a
pattern in the file with something else.

Syntax:

:address s/source_pattern/target_pattern/flags

The address can be a single line number or a pair of numbers separated by a
comma.

The commonly used flag is g, which carries out the substitution for all
occurrences of the pattern in a line.

For interactive substitution the gc flag is used.

If the g or gc flag is not used, the substitution will be carried out only for the
first occurrence in each addressed line.

:1,$s/director/member/g

Here director is replaced with member globally throughout the file.

UNIX Standard Devices

There are THREE standard devices supported by the
UNIX shell: stdin, stdout and stderr. Each program
that runs has a stdin, stdout and stderr
device allocated to it.

 these devices may be redirected to another device or file
 standard output (stdout) is associated with the user's terminal display
 standard input (stdin) is associated with the user's terminal keyboard
 standard error (stderr) is associated with the user's terminal display

When a program is executed, it has
associated with it each of the three
standard UNIX devices.
Device Redirection

The input or output devices associated with


a program may be redirected to another
device.

For example, a program that normally reads from stdin (the keyboard) can be
redirected to read from a file.

In a similar manner, a program that
normally writes to stdout (the
screen) can redirect its output to a
file or printer. For example, the ls
command can redirect its stdout to a file.

 the standard devices may be redirected to a device or file
 other devices can be printers, terminals
 devices appear as files in the /dev subdirectory
 the symbol > changes the standard output device, creating a new file
 the symbol < changes the standard input device
 >> redirects stdout, appending to an existing file

Shell Device Redirection

It is also possible to redirect any device using the file descriptor number assigned
to that device.
By default, the C compiler sends its error output to
the screen. To redirect the error output to a file, it is
necessary to use the file descriptor.

 stdin has a file descriptor of 0
 stdout has a file descriptor of 1
 stderr has a file descriptor of 2
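As a sketch of descriptor-based redirection, using ls on a nonexistent name as a stand-in for any command (such as the C compiler) that writes errors to stderr:

```shell
# "2>" redirects only file descriptor 2 (stderr) to a file;
# "|| true" just keeps the failing command from stopping the script
ls /nonexistent 2> errors.log || true
echo "captured $(wc -l < errors.log) line(s) of error output in errors.log"

# "> file 2>&1" sends stdout and stderr to the same file:
# "2>&1" means "make descriptor 2 go wherever descriptor 1 goes"
ls /nonexistent > all.log 2>&1 || true
```

The order matters: `2>&1 > file` would instead point stderr at the old stdout before the redirection takes effect.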

Pipes

Sometimes, the use of intermediate files is undesirable. Consider an example
where you are required to generate a report, but this involves a number of steps.
First you have to concatenate all the log files. The next step after that is to strip
out comments, and then to sort the information. Once it is sorted, the results
must be printed.

A typical approach is to do each operation as a separate step, writing the results
to a file that then becomes an input to the next step.

Pipelining allows you to connect two


programs together so that the output of one
program becomes the input to the next
program.

 allow commands to be combined in a sequential order
 connects stdout of one program to the stdin of the next program
 the symbol | (vertical bar) represents a pipe
 any number of commands can be connected in sequence, forming a
pipeline
 all programs in a pipeline execute at the same time
 complex operations are easily performed by piping commands
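The report example above can be written as one pipeline. In this sketch, two tiny log files (with invented names) stand in for the real ones, and the final printing step is replaced by a redirection so no printer is needed:

```shell
#!/bin/sh
# stand-in log files for the sketch
printf 'beta entry\n# a comment\nalpha entry\n' > app1.log
printf 'gamma entry\n' > app2.log

# concatenate the logs, strip comment lines, sort, and save the report;
# no intermediate files are written between the steps
cat app1.log app2.log | grep -v '^#' | sort > report.txt

cat report.txt
# alpha entry
# beta entry
# gamma entry
```

To actually print the report, the last stage would be `| lpr` instead of `> report.txt`.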
UNIT – III

Environment variables (HOME, LANG, SHELL, USER, DISPLAY, VISUAL), Local variables,


concept of /etc/passwd, /etc/shadow, /etc/group,and su- command, special
permissions(suid for an executable,sgid for an executable,sgid for a
directory,sticky bit for a directory) tail, wc, sort, uniq, cut, tr, diff, aspell, basic
shell scripts grep, sed,
The awk Filter: Selection criteria and action. Splitting a line into fields and using
printf. Computation using decimal numbers. The BEGIN and END sections. Using
arrays with both numeric and nonnumeric subscripts. String handling using built-in
functions. Programming constructs: if, for, while. Using awk in pipelines.

A variable is a character string to which we assign a value. The value assigned


could be a number, text, filename, device, or any other type of data.
A variable is nothing more than a pointer to the actual data. The shell enables you
to create, assign, and delete variables.
Variable Names:

The name of a variable can contain only letters ( a to z or A to Z), numbers ( 0 to 9)


or the underscore character ( _).

By convention, Unix Shell variables would have their names in UPPERCASE.

The following examples are valid variable names:

_ALI
TOKEN_A
VAR_1
VAR_2
Following are the examples of invalid variable names:

2_VAR
-VARIABLE
VAR1-VAR2
VAR_A!

The reason you cannot use other characters such as !,*, or - is that these
characters have a special meaning for the shell.

Defining Variables:
Variables are defined as follows:

variable_name=variable_value

For example:

NAME="Zara Ali"

Above example defines the variable NAME and assigns it the value "Zara Ali".
Variables of this type are called scalar variables. A scalar variable can hold only
one value at a time.

The shell enables you to store any value you want in a variable. For example:

VAR1="Zara Ali"
VAR2=100

Accessing Values:
To access the value stored in a variable, prefix its name with the dollar sign ( $):

For example, following script would access the value of defined variable NAME
and would print it on STDOUT:

#!/bin/sh
NAME="Zara Ali"
echo $NAME
This would produce the following output:

Zara Ali

Read-only Variables:

The shell provides a way to mark variables as read-only by using the readonly
command. After a variable is marked read-only, its value cannot be changed.

For example, following script would give error while trying to change the value of
NAME:

#!/bin/sh
NAME="Zara Ali"
readonly NAME
NAME="Qadiri"
This would produce the following result:

/bin/sh: NAME: This variable is read only.

Unsetting Variables:

Unsetting or deleting a variable tells the shell to remove the variable from the list
of variables that it tracks. Once you unset a variable, you would not be able to
access stored value in the variable.

Following is the syntax to unset a defined variable using the unset command:

unset variable_name

Above command would unset the value of a defined variable. Here is a simple
example:

#!/bin/sh
NAME="Zara Ali"
unset NAME
echo $NAME

Above example would not print anything. You cannot use the unset command to
unset variables that are marked readonly.

Variable Types:

When a shell is running, three main types of variables are present:

 Local Variables: A local variable is a variable that is present within the
current instance of the shell. It is not available to programs that are started
by the shell. They are set at the command prompt.
 Environment Variables: An environment variable is a variable that is
available to any child process of the shell. Some programs need
environment variables in order to function correctly. Usually a shell script
defines only those environment variables that are needed by the programs
that it runs.
 Shell Variables: A shell variable is a special variable that is set by the shell
and is required by the shell in order to function correctly. Some of these
variables are environment variables whereas others are local variables.
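The distinction between the first two kinds can be demonstrated by starting a child shell; the variable names here are invented for illustration:

```shell
#!/bin/sh
LOCALVAR="only here"     # a local variable: visible in this shell only
EXPVAR="passed down"     # an environment variable: exported to children
export EXPVAR

# the child process receives a copy of the environment,
# so it sees EXPVAR but not LOCALVAR
sh -c 'echo "child sees LOCALVAR=[$LOCALVAR] EXPVAR=[$EXPVAR]"'
# prints: child sees LOCALVAR=[] EXPVAR=[passed down]
```

Note that the child only receives a copy: changes it makes to EXPVAR do not flow back to the parent shell.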

Shell Variables and Environment Variables

Every UNIX process runs in a specific environment. An environment consists of a


table of environment variables, each with an assigned value. When you log in
certain login files are executed. They initialize the table holding the environment
variables for the process. When this file passes the process to the shell, the table
becomes accessible to the shell. When a (parent) process starts up a child
process, the child process is given a copy of the parent process' table.
Environment variable names are generally given in upper case by convention.

The shell also maintains a set of internal variables known as shell variables. These
variables cause the shell to work in a particular way. Shell variables are local to
the shell in which they are defined; they are not available to the parent or child
shells. By convention, shell variable names are generally given in lower case in the
C shell family and upper case in the Bourne shell family.

C Shell Family

The C shell family explicitly distinguishes between shell variables and environment variables.

Shell Variables

A shell variable is defined by the set command and deleted by the unset
command. The main purpose of your .cshrc file (discussed later in this chapter) is
to define such variables for each process. To define a new variable or change the
value of one that is already defined, enter:

% set name=value

where name is the variable name, and value is a character string that is the value
of the variable. If value is a list of text strings, use parentheses around the list
when defining the variable, e.g.,

% set name=(value1 value2 value3)

The set command issued without arguments will display all your shell variables.
You cannot check the value of a particular variable by using set name,
omitting =value in the command; this will effectively assign an empty value to
the variable.

To delete, or unset, a shell variable, enter:

% unset name

To use a shell variable in a command, preface it with a dollar sign ($), for example
$name. This tells the command interpreter that you want the variable's value, not
its name, to be used. You can also use ${name}, which avoids confusion when
concatenated with text.

To see the value of a single variable, use the echo command:

% echo $name

If the value is a list, to see the value of the nth string in the list enter:

% echo $name[n]

The square brackets are required, and there is no space between the name and
the opening bracket.

To prepend or append a value to an existing shell variable, use the following


syntax:

% set name=prepend_value${name}

or

% set name=${name}append_value

Note that when a shell is started up, four important shell variables are
automatically initialized to contain the same values as the corresponding
environment variables. These are user, term, home and path. If any of these are
changed, the corresponding environment variables will also be changed.

Environment Variables

An important Unix concept is the environment, which is defined by environment


variables. Some are set by the system, others by you, yet others by the shell, or
any program that loads another program.
A variable is a character string to which we assign a value. The value assigned
could be a number, text, filename, device, or any other type of data.

For example, first we set a variables TEST and then we access its value using echo
command:

$ TEST="Unix Programming"
$ echo $TEST
Unix Programming

Note that variables are set without using the $ sign, but while accessing
them we use the $ sign as a prefix. These variables retain their values until we
exit the shell.

When you login to the system, the shell undergoes a phase called initialization to
set up various environment. This is usually a two step process that involves the
shell reading the following files:

 /etc/profile
 .profile

The process is as follows:

1. The shell checks to see whether the file /etc/profile exists.

2. If it exists, the shell reads it. Otherwise, this file is skipped. No error
message is displayed.

3. The shell checks to see whether the file .profile exists in your home
directory. Your home directory is the directory that you start out in after
you log in.

4. If it exists, the shell reads it; otherwise, the shell skips it. No error message
is displayed.

5. As soon as both of these files have been read, the shell displays a prompt:

This is the prompt where you can enter commands in order to have them
execute.
Note - The shell initialization process detailed here applies to all Bourne type
shells, but some additional files are used by bash and ksh.

The .profile File:

The file /etc/profile is maintained by the system administrator of your UNIX


machine and contains shell initialization information required by all users on a
system.

The file .profile is under your control. You can add as much shell customization
information as you want to this file. The minimum set of information that you
need to configure includes

 The type of terminal you are using
 A list of directories in which to locate commands
 A list of variables affecting the look and feel of your terminal.

You can check your .profile available in your home directory. Open it using vi
editor and check all the variables set for your environment.

Setting the Terminal Type:

Usually the type of terminal you are using is automatically configured by either
the login or getty programs. Sometimes, the autoconfiguration process guesses
your terminal incorrectly.

If your terminal is set incorrectly, the output of commands might look strange, or
you might not be able to interact with the shell properly.

To make sure that this is not the case, most users set their terminal to the lowest
common denominator as follows:

$TERM=vt100
$
Setting the PATH:

When you type any command on command prompt, the shell has to locate the
command before it can be executed.

The PATH variable specifies the locations in which the shell should look for
commands. Usually it is set as follows:
$PATH=/bin:/usr/bin
$

Here each of the individual entries separated by the colon character, :, are
directories. If you request the shell to execute a command and it cannot find it in
any of the directories given in the PATH variable, a message similar to the
following appears:

$hello
hello: not found
$
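A short sketch of extending PATH so a new command is found; the directory and the tiny script are created on the fly purely for illustration:

```shell
# make a throwaway directory holding a small executable script
dir=$(mktemp -d)
printf '#!/bin/sh\necho hello from mybin\n' > "$dir/hello"
chmod +x "$dir/hello"

# append the new directory to the search path, then run the command by name
PATH=$PATH:$dir
hello    # prints: hello from mybin
```

Without the PATH change, the shell would report `hello: not found` just as in the example above.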
There are variables like PS1 and PS2 which are discussed in the next section.

PS1 and PS2 Variables:


The characters that the shell displays as your command prompt are stored in the
variable PS1. You can change this variable to be anything you want. As soon as
you change it, it'll be used by the shell from that point on.

For example, if you issued the command:

$ PS1='=>'
=>
=>
=>

Your prompt would become =>. To set the value of PS1 so that it shows the
working directory, issue the command:
=>PS1="[\u@\h \w]\$"
[root@ip-72-167-112-17 /var/www/tutorialspoint/unix]$
[root@ip-72-167-112-17 /var/www/tutorialspoint/unix]$

The result of this command is that the prompt displays the user's username, the
machine's name (hostname), and the working directory.

There are quite a few escape sequences that can be used as value arguments for
PS1; try to limit yourself to the most critical so that the prompt does not
overwhelm you with information.
Escape Sequence Description

\t Current time, expressed as HH:MM:SS.

\d Current date, expressed as Weekday Month Date.

\n Newline.

\s Current shell environment.

\W Working directory.

\w Full path of the working directory.

\u Current user's username.

\h Hostname of the current machine.

\# Command number of the current command. Increases with each new command
entered.

\$ If the effective UID is 0 (that is, if you are logged in as root), end the prompt
with the # character; otherwise, use the $.

You can make the change yourself every time you log in, or you can have the
change made automatically in PS1 by adding it to your .profile file.

When you issue a command that is incomplete, the shell will display a secondary
prompt and wait for you to complete the command and hit Enter again.

The default secondary prompt is > (the greater-than sign), but it can be changed
by re-defining the PS2 shell variable:

Following is the example which uses the default secondary prompt:

$ echo "this is a
> test"
this is a
test
$

Following is the example which re-define PS2 with a customized prompt:

$ PS2="secondary prompt->"
$ echo "this is a
secondary prompt->test"
this is a
test
$

Environment Variables:

Following is the partial list of important environment variables. These variables


would be set and accessed as mentioned above:

Variable Description

DISPLAY Contains the identifier for the display that X11 programs should use by
default.

HOME Indicates the home directory of the current user: the default argument for
the cd built-in command.

IFS Indicates the Internal Field Separator that is used by the parser for word
splitting after expansion.

LANG Expands to the default system locale; LC_ALL can be used to override this.
For example, if its value is pt_BR, then the language is set to (Brazilian)
Portuguese and the locale to Brazil.

LD_LIBRARY_PATH On many Unix systems with a dynamic linker, contains a colon-separated
list of directories that the dynamic linker should search for shared objects when
building a process image after exec, before searching in any other directories.

PATH Indicates the search path for commands. It is a colon-separated list of
directories in which the shell looks for commands.

PWD Indicates the current working directory as set by the cd command.

RANDOM Generates a random integer between 0 and 32,767 each time it is
referenced.

SHLVL Increments by one each time an instance of bash is started. This variable is
useful for determining whether the built-in exit command ends the current
session.

TERM Refers to the display type.

TZ Refers to the time zone. It can take values like GMT, AST, etc.

UID Expands to the numeric user ID of the current user, initialized at shell
startup.

Following is a sample example showing a few environment variables:

$ echo $HOME
/root
$ echo $DISPLAY

$ echo $TERM
xterm
$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/home/amrood/bin:/usr/local/bin
$
Display Environment Variable

Open the terminal and type the following commands to display all
environment variables and their values under UNIX-like operating systems:

$ set
OR
$ printenv
OR
$ env

Sample outputs:

Fig.01: Displaying all environment variables and their values

To display the search path, enter:

echo $PATH

To display prompt settings, enter:

echo $PS1
A few more examples:

echo $USER
echo $PWD
echo $MAIL
echo $JAVA_PATH
echo $DB2INSTANCE

Change or Set Environment Variable

You can use the following command to change the environment variable for the
current session as per your shell.

For Korn shell (KSH)

The syntax is as follows:

var=value
export var

To set JAVA_PATH, enter:

JAVA_PATH=/opt/jdk/bin
export JAVA_PATH

For Bourne shell (sh and bash)

The syntax is as follows:

export var=value

To set PATH, enter:

export PATH=$PATH:/opt/bin:/usr/local/bin:$HOME/bin
For C shell (csh or tcsh)

The syntax is as follows:

setenv var value

Set EDITOR to vim, enter:

setenv EDITOR vim


Unix - Special Variables

The previous tutorial warned about using certain nonalphanumeric characters in
your variable names. This is because those characters are used in the names of
special Unix variables. These variables are reserved for specific functions.

For example, $$ represents the process ID number, or PID, of the
current shell:

$echo $$

The above command writes the PID of the current shell:

29949

The following table shows a number of special variables that you can use in your
shell scripts:

Variable Description

$0 The filename of the current script.

$n These variables correspond to the arguments with which a script was invoked.
Here n is a positive decimal number corresponding to the position of an argument
(the first argument is $1, the second argument is $2, and so on).

$# The number of arguments supplied to a script.

$* All the arguments are double quoted. If a script receives two arguments, $* is
equivalent to $1 $2.

$@ All the arguments are individually double quoted. If a script receives two
arguments, $@ is equivalent to $1 $2.

$? The exit status of the last command executed.

$$ The process number of the current shell. For shell scripts, this is the process ID
under which they are executing.

$! The process number of the last background command.
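The last two variables can be seen with a short background job; here sleep stands in for any real background command:

```shell
#!/bin/sh
sleep 1 &                    # start a command in the background
echo "background PID: $!"    # $! holds the process number of that job
wait $!                      # wait for the background job to finish
echo "wait returned: $?"     # $? holds the exit status of the last command
```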

Command-Line Arguments:

The command-line arguments $1, $2, $3,...$9 are positional parameters, with $0
pointing to the actual command, program, shell script, or function and $1, $2, $3,
...$9 as the arguments to the command.

Following script uses various special variables related to command line:

#!/bin/sh

echo "File Name: $0"
echo "First Parameter : $1"
echo "Second Parameter : $2"
echo "Quoted Values: $@"
echo "Quoted Values: $*"
echo "Total Number of Parameters : $#"

Here is a sample run for the above script:

$./test.sh Zara Ali


File Name : ./test.sh
First Parameter : Zara
Second Parameter : Ali
Quoted Values: Zara Ali
Quoted Values: Zara Ali
Total Number of Parameters : 2

Special Parameters $* and $@:

There are special parameters that allow accessing all of the command-line
arguments at once. $* and $@ both will act the same unless they are enclosed in
double quotes, "".
Both the parameter specifies all command-line arguments but the "$*" special
parameter takes the entire list as one argument with spaces between and the
"$@" special parameter takes the entire list and separates it into separate
arguments.
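The difference only appears when the parameters are enclosed in double quotes. Below is a minimal sketch; the use of set -- to simulate arguments (rather than a separate script file) is an illustrative choice, not part of the notes above.

```shell
#!/bin/sh
# Hypothetical demo: compare "$*" (one word) with "$@" (one word per argument).

set -- "Zara Ali" "10 Years"    # simulate two command-line arguments

echo 'With "$*":'
for token in "$*"; do           # the whole list becomes a single word
    echo "[$token]"
done

echo 'With "$@":'
for token in "$@"; do           # each argument stays a separate word
    echo "[$token]"
done
```

Running it prints [Zara Ali 10 Years] once for "$*", but [Zara Ali] and [10 Years] as separate lines for "$@".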

We can write the shell script shown below to process an unknown number of
command-line arguments with either the $* or $@ special parameters:

#!/bin/sh

for TOKEN in $*
do
echo $TOKEN
done
There is one sample run for the above script:

$./test.sh Zara Ali 10 Years Old

Zara
Ali
10
Years
Old

Note: Here do...done is a kind of loop, which is covered in a later section.

Exit Status:

The $? variable represents the exit status of the previous command.

Exit status is a numerical value returned by every command upon its completion.
As a rule, most commands return an exit status of 0 if they were successful, and 1
if they were unsuccessful.

Some commands return additional exit statuses for particular reasons. For
example, some commands differentiate between kinds of errors and will return
various exit values depending on the specific type of failure.
Following is the example of successful command:

$./test.sh Zara Ali


File Name : ./test.sh
First Parameter : Zara
Second Parameter : Ali
Quoted Values: Zara Ali
Quoted Values: Zara Ali
Total Number of Parameters : 2
$echo $?
0
$
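In practice, $? is usually tested in a conditional right after the command of interest, because every command (even echo) overwrites it. A small sketch (the strings here are arbitrary sample data):

```shell
#!/bin/sh
# Hypothetical sketch: read $? immediately after the command you care about.

echo "hello world" | grep "hello" > /dev/null
if [ $? -eq 0 ]; then
    echo "match found"       # grep returned 0: success
fi

echo "hello world" | grep "absent" > /dev/null
if [ $? -ne 0 ]; then
    echo "no match"          # grep returned non-zero: failure
fi
```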
Simple Filters

PAGINATING FILES (pr)

The pr command prepares a file for printing by adding suitable headers,
footers and formatted text.

$pr <filename>
$ pr dept.lst

pr adds five lines of margin at the top and five at the bottom. The lower
portion of the page has not been shown in the examples for reasons of
economy. The header shows the date and time of last modification of the
file, along with the filename and page number.

pr Options

pr's -k option (where k is an integer) prints in k columns. If a program
outputs a series of 20 numbers, one on each line, this option can make good
use of the screen's empty spaces. As pr is a filter, it can obtain its input
from the standard output of another program.
The -t option is used to suppress the headers and footers.

$ a.out | pr –t -5

$pr –t –n –d –o 10 dept.lst

To specify specific page number

$pr +10 chap01

-l option is used to sets the page length

$pr –l 54 chap01

Because pr formats its input by adding margins and a header, it is often used
as a pre-processor before printing with the lp command.

$pr -h "Department list" | lp

Since pr output often ends up as hard copy, pr and lp form a common pipeline
sequence.

Few more options are:

-d     Double-spaces input, reducing clutter
-n     Numbers lines, which helps in debugging code
-o n   Offsets lines by n spaces; increases the left margin of the page

Displaying the beginning of a file: head

The head command, as the name implies, displays the top of the file. When
used without an option it displays the first ten lines of the file.

$head emp.lst

To display a fixed number of lines, the -n option is used:

$head -n 3 emp.lst

Displaying The end of the file(tail):

It displays the end of the file. It provides an additional method of addressing lines.

To display last three lines:

$tail -3 emp.lst

+count option is used to represent the line number from where the
selection should begin.

$tail +11 emp.lst

displays lines from the 11th line onward.
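head and tail are often combined to pick out a range of lines from the middle of a file. A self-contained sketch using generated sample data (numbers.txt is a made-up file name):

```shell
# Create sample data: the numbers 1 to 20, one per line.
seq 1 20 > numbers.txt

# Show lines 11 to 15: keep the first 15 lines, then the last 5 of those.
head -n 15 numbers.txt | tail -n 5

rm -f numbers.txt    # clean up the sample file
```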

Options:

Monitoring File Growth(-f):

$tail –f /oracle/app/oracle/product/8.1/install.log

Splitting A File Vertically( cut):

A file can be split vertically according to columns or fields. (tee, which
appears in an example below, is a facility that saves output in a file while
also displaying it on the terminal.)

Cutting Columns (-c):

To extract specific columns, the -c option is used with a list of column
numbers delimited by commas. Ranges can also be given using a hyphen.

$ cut –c 6-22,24-32 abc

Cutting Fields (-f):
cut uses the tab as its default field delimiter; -f specifies the fields to
extract and -d specifies a different field delimiter.

To cut the 2nd and 3rd fields of a file:

$ cut –d \| -f 2,3 abc | tee xyz

Extracting user list from who output:

$who | cut –d “ “ –f 1
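The -c, -f and -d options can be tried out on a single colon-delimited record. A sketch on invented passwd-style sample data:

```shell
# One ':'-delimited record, passwd style (invented sample data).
line="idallen:x:500:500:Ian D. Allen:/home/idallen:/bin/bash"

echo "$line" | cut -d ":" -f 1      # field 1 only: idallen
echo "$line" | cut -d ":" -f 1,7    # fields 1 and 7: idallen:/bin/bash
echo "$line" | cut -c 1-7           # character positions 1-7: idallen
```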

Pasting Files(paste):

To paste back the information obtained with the cut command, the paste
command is used. It pastes the information vertically rather than
horizontally.

$ cut -d \| -f 1,4- sortlist | paste -d "|" cutlist -

$ cut -d \| -f 1,4- sortlist | paste -d "|" - cutlist

$ paste -d "|" cutlist1 cutlist2

$ paste –d “|” cutlist1 cutlist2

Joining Lines (-s): It is used to join lines. With this option, -d can also
be used to supply delimiters, which are applied in a circular manner.
Suppose three different pieces of information about a candidate are given in
three lines and we want them all together in one line; the command can be:

$ paste -s -d "||\n" addressbook


$paste –s addressbook
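The circular delimiter list can be demonstrated with three sample lines. The file name addressbook matches the example above, but the contents here are invented for illustration:

```shell
# Three lines that make up one 'record' (invented sample data).
printf 'name\nstreet\ncity\n' > addressbook

# -s joins the lines; -d "||\n" cycles through the delimiters |, |, newline.
paste -s -d "||\n" addressbook    # prints: name|street|city

rm -f addressbook    # clean up the sample file
```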

Ordering a file(sort):

sort orders a file. It can sort on a specified field.

$ sort sortfile

sort reorders lines in the ASCII collating sequence: whitespace first, then
numerals, then uppercase letters and finally lowercase letters.
Sort options:

Sorting on primary key (-k):

$ sort -t "|" -k 2 sortlist

It will sort on the second field.

$sort -t "|" -r -k 2 sortlist

It will sort in reverse order. Or the command can be issued as:

$sort -t "|" -k 2r sortlist

Sorting on secondary key: if the PK is the third field and SK is the 2nd field then

$sort –t “|” –k 3,3 -k 2,2 sortlist

Sorting on columns: Within a field, a character position can also be
specified as the beginning of the sort. To sort the file according to the
date of birth, i.e. on the 7th and 8th character positions within the fifth
field:

$ sort -t "|" -k 5.7,5.8 sortlist

Numeric sort(-n)

$sort numfile
$sort –n numfile
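The difference between ASCII ordering and numeric ordering is easy to demonstrate. A sketch in which numfile is generated sample data:

```shell
# Sample numbers, one per line (invented data).
printf '10\n2\n33\n4\n' > numfile

sort numfile       # ASCII order: 10 2 33 4  (compares '1' < '2' < '3' < '4')
sort -n numfile    # numeric order: 2 4 10 33

rm -f numfile      # clean up the sample file
```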

USERS

 user account files: /etc/passwd and /etc/shadow 


  useradd - add a user account 
  userdel - remove a user account 
 usermod - modify userid info, e.g. userid, UID, GID, etc. 
  chsh - change shell 
 passwd - change password 
  su - start a subshell: log in as a new userid 
 sudo - execute a single command as another userid 
GROUPS
  group account files: /etc/group and /etc/gshadow 
  groupadd - create a new group 
  groupdel - delete a group 
 groupmod - modify group name, GID, password 
 gpasswd - manage groups: set group administrator, add/delete members 
  groups - display all groups 
  id - display user UID and group GID and groups 
 newgrp - start a subshell: log in to a new group with a password 

Users: The Password File - /etc/passwd

o the /etc directory is where “Host-Specific Configuration” files are


stored
o Almost everything a user can or can’t do in a Linux system
is determined by:
 what user they log in as (or become with su or sudo) 
 what group(s) that user belongs to 

Password File Format - /etc/passwd

When a user is created on the system, the following information is stored in


seven colon-separated fields in /etc/passwd:

username:x:UID:GID:comment:home_directory:login_shell
    1    2  3   4     5          6             7

root:x:0:0:Super User:/root:/bin/bash
idallen:x:500:500:Ian! D. Allen:/home/idallen:/bin/bash

1. login userid (stored in variables $USER or $LOGNAME in the shell)


2. encrypted password (or an x marker indicating use of /etc/shadow)
3. User ID number (UID)
4. Group ID number (GID) - but users can be in more groups, too
5. Comments: any text information; often the user’s full name and/or office
6. Home directory (absolute path): usually /home/$USER
7. Login shell to give the user at login; usually /bin/bash
  The above information about each user is kept in /etc/passwd 
  The file requires root access for modifications (writing) 

 Its content can be viewed (read) by anyone 
 Using privileged commands, users can modify content related

to their own account info, e.g. passwd, chsh 
 Encrypted passwords are usually stored in /etc/shadow, accessible
only by root 
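The seven fields can be pulled apart with the shell itself. A sketch on a literal passwd-format line (invented account data, deliberately not read from the real /etc/passwd):

```shell
#!/bin/sh
# Split one passwd-format record into its seven fields using IFS.
line="idallen:x:500:500:Ian D. Allen:/home/idallen:/bin/bash"

IFS=: read -r user passwd uid gid comment home shell <<EOF
$line
EOF

echo "user=$user uid=$uid home=$home shell=$shell"
```

This prints user=idallen uid=500 home=/home/idallen shell=/bin/bash.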

Shadow Passwords - /etc/shadow

 When a system has shadow passwords enabled (the default), the password
field in /etc/passwd is replaced by an “x” and the user’s real encrypted

password is stored in /etc/shadow. 
 /etc/shadow is only readable by the root user, so even the encrypted


password is hidden and can’t be used in a password-cracking program 
 Each line in /etc/shadow contains the user’s login userid, their encrypted
password, and fields relating to password expiration. 
 Special passwords (see “man shadow”): 
1. a leading ! means the password (and thus account) is locked
2. an asterisk (star) * indicates the account has been disabled
useradd

 Used to create a new login account. 




Also creates a group with the same name. 
 Usually the defaults are correct, but options let you change any of
the information to be stored in the passwd and group files. 
 Sometimes called “adduser”, but sometimes “adduser” is 
a different program (e.g. Ubuntu).

userdel

 Remove an account from the password and group files. 


 To actually remove the home directory, you must use the “-r” option! 

 if you forget -r, you will leave a home directory with no owner! 

 Will not remove an account that has active processes running (e.g. a shell) 
usermod

 Change any of the information about a user account. 


 Changing the home directory with "-d" changes only the field 
in /etc/passwd; it does not actually move the directory unless you also give "-m".

 RTFM: the -d option must be followed by the new home directory



name 
 RTFM: do not put the -m option in between the -d and the home

directory 
 RTFM: the last argument on the command line must always be
the login name of the existing account you want modified 
 If you have already used -d without using -m, you can't do the
command a second time with -m - it will say "nothing changed".
You have to put things back the way they were by using -d (without
-m) to undo the change you made, then use -d with -m to redo the
change. 

 Can lock/unlock an account by inserting “!” in front of the password field. 
 Will not modify an account that has active processes running (e.g. a shell) 

chsh

 “CHange SHell” 
 Changes the login shell in /etc/passwd - does not affect current shell 
 Only root can change shells of other accounts 
 If a shell isn’t specified on the command line, it will prompt for one 
 Usually only allows setting a shell from a small system-defined list 

passwd

 Changes the login password in /etc/passwd (or /etc/shadow) 


 Only root can change passwords of other accounts 

su

 Set userid or substitute user 



sudo
 Execute a single command with other (usually root) privileges 

Groups: The Group File - /etc/group


 Groups allow a set of permissions to be assigned to group of users 
 Every file system object has “group” permissions; if you are not the owner
of the object but are in that group, group permissions apply to you. 

 File system objects have only one owner and can be in only one group. 
 Logged in users can be "in" (members of) multiple groups. 
 Most group information is maintained in /etc/group and /etc/gshadow 
 BUT: At login, every user is given an initial group GID from the passwd file. 

 A user will belong to other groups (supplementary groups), if the user is
a member of those groups in the /etc/group file. 

Group File Format - /etc/group

When a group is created on the system, the following information is stored in


four colon-separated fields in /etc/group:

groupname:x:GID:userid1,userid2,userid3
    1     2  3          4

root:x:0:
cdrom:x:500:idallen,alleni

 group name 
 encrypted password (or an x marker indicating use of /etc/gshadow) 
  Group ID number (GID) 
 Optional list of userids that are members of that group 

1. The above information about groups is kept in /etc/group
2. Modifications can be done by root or by the Group Administrator for a group
3. Its content can be viewed by anyone
4. Encrypted passwords are usually stored in /etc/gshadow, accessible only
by root
Group Shadow Passwords - /etc/gshadow

 When a system has shadow passwords enabled (the default), the password field
in /etc/group is replaced by an "x" and the group's real encrypted password is
stored in /etc/gshadow. 
 /etc/gshadow is only readable by the root user, so even the encrypted password

is hidden and can’t be used in a password-cracking program 
 Each line in /etc/gshadow contains the group name, the group encrypted
password, an optional list of Group Administrators, and an optional list of
 Group Members (which should be the same in /etc/group) 
 Special passwords (see “man gshadow”): 

o a leading ! means the group password is locked


o * indicates the group cannot be logged into by non-members

Group Commands - groupadd, groupdel, groupmod, gpasswd, groups, id, newgrp

 groupadd - create a new group in /etc/group 


 groupdel - remove a group from /etc/group 
 groupmod - modify the name or GID of a group in /etc/group 
 gpasswd - administer the /etc/group and /etc/gshadow files 

o can be used by the Group Administrator as well as root


o add and delete group members, or set the member list
o root can set the list of Group Administrators for a group

 groups - list all the groups a user belongs to 
 id - more detailed version of "groups" showing numeric values 
 newgrp - (rarely used) use the group password to start a new shell with additional
group privileges 

Changing Privilege - su, sudo, and newgrp

su - substitute user or set userid

 Example: su --login 
 Opens up a subshell as the new user, with that user’s privileges 
 Exiting the subshell goes back to the previous user 
 Ordinary (non-root) users need to enter the password for the other account 
 A dash - or --login option (options must be surrounded by spaces) means use a
full login shell that clears the environment, sets groups and goes to the user’s

home directory as if the user had just logged in. 
 Without the full login, the command will set privileges but will leave most of the
existing environment unchanged, including an unchanged current directory (that
 may not grant the new user any permissions!). 
 If you don’t give a userid, it assumes you want to become the root user 


[idallen@localhost]$ whoami

idallen 
[idallen@localhost]$ su
password: XXX
[root@localhost]# whoami

root 
[root@localhost]# exit
[idallen@localhost]$
[idallen@localhost]$ whoami
idallen 

sudo - do as if su

 Example: sudo passwd idallen 


 Execute a single command with other (usually root) privileges 
 Safer way to do root tasks (avoids running a whole shell as root) 
 The root account can update /etc/sudoers with the list of who can do what 
[idallen@localhost]$ whoami
idallen
[idallen@localhost]$ sudo passwd alleni
[sudo] password for idallen: XXXXXXXXXX
Changing password for user alleni.
New password: XXX
Retype new password: XXX
passwd: all authentication tokens updated
successfully. [idallen@localhost]$ whoami
idallen
[idallen@localhost]$

newgrp - log in to a new group

  Opens up a subshell as the new group, with that group’s privileges 


 Exiting the subshell goes back to the previous group 
 rarely used - needs a group password set 
Author:
| Ian! D. Allen - idallen@idallen.ca - Ottawa, Ontario, Canada
| Home Page: http://idallen.com/ Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom: http://eff.org/ and have fun: http://fools.ca/
http://teaching.idallen.com/cst8207/12f/notes/600_users_and_groups.html
http://debian-handbook.info/browse/stable/sect.user-group-databases.html
http://teaching.idallen.com/cst8207/12f/notes/indexcgi.cgi

What is SUID, SGID and Sticky bit?

There are 3 special permission that are available for executable files
and directories. These are:

1. SUID permission
2. SGID permission
3. Sticky bit

Set-user Identification (SUID)

Have you ever wondered how a non-root user can change his own password when
he does not have write permission to the /etc/shadow file? To understand the
trick, check the permissions of the /usr/bin/passwd command:

# ls -lrt /usr/bin/passwd
-r-sr-sr-x 1 root sys 31396 Jan 20 2014
/usr/bin/passwd

– If you check carefully, you will find two s's in the permission field. The
first s stands for SUID and the second one stands for SGID.

– When a command or script with SUID bit set is run, its effective UID becomes
that of the owner of the file, rather than of the user who is running it.

– Another good example of SUID is the su command :

# ls -l /bin/su
-rwsr-xr-x 1 root user 16384 Jan 12 2014 /bin/su

– The setuid permission displayed as an “s” in the owner’s execute field.


How to set SUID on a file?

# chmod 4555 [path_to_file]

Note :
If a capital “S” appears in the owner’s execute field, it indicates that the setuid bit
is on, and the execute bit “x” for the owner of the file is off or denied.
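The octal form can be tried safely on a scratch file rather than a system binary. In this sketch, demo_file is a made-up name and stat -c is the GNU coreutils syntax:

```shell
# Set and inspect the SUID bit on a harmless scratch file.
touch demo_file
chmod 4755 demo_file      # leading 4 = SUID; 755 = rwxr-xr-x

ls -l demo_file           # owner execute column now shows 's': -rwsr-xr-x
stat -c '%a' demo_file    # octal mode: 4755 (GNU stat)

rm -f demo_file           # clean up
```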

Set-group identification (SGID)

SGID permission on executable file

– SGID permission is similar to the SUID permission, only difference is – when the
script or command with SGID on is run, it runs as if it were a member of the same
group in which the file is a member.

# ls -l /usr/bin/write

-r-xr-sr-x 1 root tty 11484 Jan 15 17:55 /usr/bin/write

– The setgid permission displays as an “s” in the group’s execute field.

Note :
– If a lowercase letter “l” appears in the group’s execute field, it indicates that
the setgid bit is on, and the execute bit for the group is off or denied.

How to set SGID on a file?

# chmod 2555 [path_to_file]

SGID on a directory
– When SGID permission is set on a directory, files created in the directory belong
to the group of which the directory is a member.

– For example if a user having write permission in the directory creates a file
there, that file is a member of the same group as the directory and not the user’s
group.

– This is very useful in creating shared directories.

How to set SGID on a directory

# chmod g+s [path_to_directory]

Sticky Bit

– The sticky bit is primarily used on shared directories.

– It is useful for shared directories such as /var/tmp and /tmp because users can
create files, read and execute files owned by other users, but are not allowed to
remove files owned by other users.

– For example if user bob creates a file named /tmp/bob, other user tom can not
delete this file even when the /tmp directory has permission of 777. If sticky bit is
not set then tom can delete /tmp/bob, as the /tmp/bob file inherits the parent
directory permissions.

– The root user (of course!) and the owner of a file can remove their own files.

Example of sticky bit :

# ls -ld /var/tmp

drwxrwxrwt 2 sys sys 512 Jan 26 11:02 /var/tmp

- t appears when both the sticky bit and the execute permission for others are on.
- A capital T appears when the sticky bit is on but the execute permission for others is off.

How to set sticky bit permission?

# chmod +t [path_to_directory]
or

# chmod 1777 [path_to_directory]
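The effect can be checked on a scratch directory of your own. In this sketch, shared_tmp is a made-up name and stat -c is the GNU coreutils syntax:

```shell
# Create a world-writable directory with the sticky bit, /tmp style.
mkdir shared_tmp
chmod 1777 shared_tmp      # leading 1 = sticky bit; 777 = rwxrwxrwx

ls -ld shared_tmp          # mode ends in 't': drwxrwxrwt
stat -c '%a' shared_tmp    # octal mode: 1777 (GNU stat)

rmdir shared_tmp           # clean up
```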

http://thegeekdiary.com/what-is-suid-sgid-and-sticky-bit/

AWK FILTER

awk is a programming language designed to search for, match patterns, and


perform actions on files. awk programs are generally quite small, and are
interpreted. This makes it a good language for prototyping.

THE STRUCTURE OF AN AWK PROGRAM

awk scans input lines one after the other, searching each line to see if it matches a
set of patterns or conditions specified in the awk program.

For each pattern, an action is specified. The action is performed when the pattern
matches that of the input line.

Thus, an awk program consists of a number of patterns and associated actions.


Actions are enclosed using curly braces, and separated using semi-colons.

pattern { action }
pattern { action }

INPUT LINES TO awk

When awk scans an input line, it breaks it down into a number of fields. Fields are
separated by a space or tab character. Fields are numbered beginning at one, and
the dollar symbol ($) is used to represent a field.
For instance, the following line in a file

I like money.

has three fields. They are

$1 I
$2 like
$3 money.

Field zero ($0) refers to the entire line.

awk scans lines from a file(s) or standard input.
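These field variables can be tried directly from the command line by passing the awk program as a quoted argument instead of a file:

```shell
# One-liners: awk splits "I like money." into three whitespace-separated fields.
echo "I like money." | awk '{ print $1 }'    # prints: I
echo "I like money." | awk '{ print $3 }'    # prints: money.
echo "I like money." | awk '{ print $0 }'    # prints the whole line
```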

Running an awk program

To run an awk program stored in a file (here myawk1, containing the single
rule { print $0 }), type the following command:

awk -f myawk1 /etc/group

awk interprets the actions specified in the program file myawk1, and applies this
to each line read from the file /etc/group. The effect is to print out each input line
read from the file, in effect, displaying the file on the screen (same as the Unix
command cat).

Searching for a string within an input line

To search for an occurrence of a string in an input line, specify it as a
pattern and enclose it in forward slashes. In the example below, each input
line is searched for the string brian, and the action prints the entire line.

/brian/ { print $0 }

Using awk programs with form files

awk programs are particularly suited to generating reports or forms. In the


following examples, we shall use the following textual data as the input file. The
file is called awktext. A heading has been provided here for clarity, there is no
header in the data file.
Type Memory (Kb) Location Serial # HD Size (Mb)

XT 640 D402 MG0010 0

386 2048 D403 MG0011 100

486 4096 D404 MG0012 270

386 8192 A423 CC0177 400

486 8192 A424 CC0182 670

286 4096 A423 CC0183 100

286 4096 A425 CC0184 80

Mac 4096 B407 EE1027 80

Apple 4096 B406 EE1028 40

68020 2048 B406 EE1029 80

68030 2048 B410 EE1030 100

$unix 16636 A405 CC0185 660

"trs80" 64 Z101 EL0020 0


Simple Pattern Selection

This involves specifying a pattern to match for each input line scanned. The
following awk program (myawk2) compares field one ($1) and if the field matches
the string "386", the specific action is performed (the entire line is printed).

$1 == "386" { print $0 }
Note: The == symbol represents an equality test, thus in the above pattern, it
compares the string of field one against the constant string "386", and performs
the action if it matches.

Create the program

$ cat - > myawk2


$1 == "386" { print $0 }
< ctrl-d>
$

Note: <ctrl-d> is a keypress that terminates input to the shell. Hold down
the ctrl key and then press d.

Run The Program

$ awk -f myawk2 awktext

Sample Program Output

386 2048 D403 MG0011 100

386 8192 A423 CC0177 400

The program prints out all input lines where the computer type is a "386".

Using Comments In awk Programs

Comments begin with the hash (#) symbol and continue till the end of the line.
The awk program below adds a comment to a previous awk program shown
earlier

#myawk3, same as myawk2 but has a comment in it
$1 == "386" { print $0 }

Comments can be placed anywhere on the line. The example below shows the
comment placed after the action.

$1 == "386" { print $0 } # print all records where the computer is a 386


Remember that the comment ends at the end of the line. The following program
is thus wrong, as the closing brace of the action is treated as part of the comment.

$1 == "386" { print $0 #print out all records }

Relational Expressions

We have already seen the equality test. Detailed below are the other relational
operators used in comparing expressions.

<     less than

<=    less than or equal to

==    equal to

!=    not equal

>=    greater than or equal to

>     greater than

~     matches

!~    does not match

Some Examples Of Using Relational Operators

# myawk4, an awk program to display all input lines for computers


# with less than 1024 Kb of memory
$2 < 1024 { print $0 }

myawk4 Program Output

XT 640 D402 MG0010 0


"trs80" 64 Z101 EL0020 0

# myawk5
# an awk program to print the location/serial number of 486 computers
$1 == "486" { print $3, $4 }

myawk5 Program Output

D404 MG0012

A424 CC0182

# myawk6
# an awk program to print out all computers belonging to management.
/MG/ { print $0 }

myawk6 Program Output

XT 640 D402 MG0010 0

386 2048 D403 MG0011 100

486 4096 D404 MG0012 270

The awk program myawk6 scans each input line searching for the occurrence of
the string MG. When found, the action prints out the line. The problem with this
is it might be possible for the string MG to occur in another field, but the serial
number indicates that it belongs to another department.

What is necessary is a means of matching only a specific field. To apply a search to


a specific field, the match (~) symbol is used. The modified awk program shown
below searches field 4 for the string MG.
# myawk6A
# improved awk program, print out all computers belonging to management.
$4 ~ /MG/ { print $0 }

myawk6a Program Output

XT 640 D402 MG0010 0

386 2048 D403 MG0011 100

486 4096 D404 MG0012 270

Making the output a bit more meaningful

In all the previous examples, the output of the awk program has been either the
entire line or fields within the line. Lets add some text to make the output more
meaningful. Consider the following awk program,

# myawk7
# list computers located in D block, type and location
$3 ~ /D/ { print "Location = ", $3, " type = ", $1 }

myawk7 Program Output


Location = D402 type = XT
Location = D403 type = 386
Location = D404 type = 486

Text And Formatted Output Using printf


We shall tidy the output information by using a built in function of awk called
printf. C programmers will have no difficulty using this, as it operates the same
way as in the C programming language.
Printing A Text String

Lets examine how to print out some simple text. Consider the following
statement,

printf( "Location : " );

The printf statement is terminated by a semi-colon. Brackets are used to enclose


the argument, and the text is enclosed using double quotes. Now lets combine it
into an actual awk program which displays the location of all 286 type computers.

#myawk8
$1 == "286" { printf( "Location : "); print $3 }

myawk8 Program Output


Location : A423
Location : A425

Printing A Field Which Is A Text String

Lets now examine how to use printf to display a field which is a text string. In the
previous program, a separate statement (print $3) was used to write the room
location. In the program below, this will be combined into the printf statement
also.

#myawk9
$1 == "286" { printf( "Location is %s\n", $3 ); }
myawk9 Program Output
Location is A423
Location is A425
Note: The symbol \n causes subsequent output to begin on a new line. The symbol
%s informs printf to print out a text string, in this case it is the contents of the
field $3.
Consider the following awk program which prints the location and serial number
of all 286 computers.

#myawk10
$1=="286" { printf( "Location = %s, serial # = %s\n", $3, $4 ); }

myawk10 Program Output


Location = A423, serial # = CC0183
Location = A425, serial # = CC0184

Printing A Numeric Value


Lets now see how to print a numeric value. The symbol %d is used for numeric
values. The following awk program lists the location and disk capacity of all 486
computers.

#myawk11
$1=="486" { printf("Location = %s, disk = %dKb\n", $3, $5 ); }

myawk11 Program Output


Location = D404, disk = 270Kb
Location = A424, disk = 670Kb

Formatting Output

Lets see how to format the output information into specific field widths. A
modifier to the %s symbol specifies the size of the field width, which by default
is right justified.
#myawk12
# formatting the output using a field width
$1=="286" {printf("Location = %10s, disk = %5dKb\n",$3,$5);}

myawk12 Program Output

Location = A423, disk = 100Kb

Location = A425, disk = 80Kb

%10s specifies to print out field $3 using a field width of 10 characters,
and %5d specifies to print out field $5 using a field width of 5 digits.

Summary of printf so far

Below lists the options to printf covered above. [n] indicates optional arguments.

%[n]s print a text string

%[n]d print a numeric value

\n print a new-line

The BEGIN And END Statements Of An awk Program

The keywords BEGIN and END are used to perform specific actions relative to the
programs execution.

BEGIN The action associated with this keyword is executed before the first
input line is read.
END The action associated with this keyword is executed after all input lines
have been processed.
The BEGIN keyword is normally associated with printing titles and setting default
values, whilst the END keyword is normally associated with printing totals.
Consider the following awk program, which uses BEGIN to print a title.
#myawk13
BEGIN { print "Location of 286 Computers" }
$1 == "286" { print $3 }

myawk13 Program Output


Location of 286 Computers
A423
A425

Introducing awk Defined Variables


awk programs support a number of pre-defined variables.

NR the current input line number


NF number of fields in the input line

#myawk14
# print the number of computers
END { print "There are ", NR, "computers" }

myawk14 Program Output


There are 13 computers

User Defined Variables In An awk Program

awk programs support the use of variables. Consider an example where we want
to count the number of 486 computers we have. Variables are automatically
initialised to zero by awk, so there is no need to assign a value of zero to
them.
The following awk program counts the number of 486 computers, and uses the
END keyword to print out the total after all input lines have been processed.
When each input line is read, field one is checked to see if it matches 486. If it
does, the awk variable computers is incremented (the symbol ++ means
increment by one).

#myawk15
$1 == "486" { computers++ }
END { printf("The number of 486 computers is %d\n", computers); }

myawk15 Program Output


The number of 486 computers is 2

Note: There is no need to explicitly initialise the variable 'computers'


to zero. awk does this by default.

Regular Expressions

awk provides pattern searching which is more comprehensive than the simple
examples outlined previously. These patterns are called regular expressions, and
are similar to those supported by other UNIX utilities like grep.
The simplest regular expression is a string enclosed in slashes,

/386/
In the above example, any input line containing the string 386 will be printed. To
restrict a search to a specific field, the match (or not match) symbol is used. In the
following example, field one of the input line is searched for the string 386.

$1 ~ /386/
In regular expressions, the following symbols are metacharacters with special
meanings.

\^$.[]*+?()|

^ matches the first character of a string


$ matches the last character of a string
. matches a single character of a string
[ ] defines a set of characters
( ) used for grouping
| specifies alternatives

A group of characters enclosed in brackets matches to any one of the enclosed


characters. In the example below (myawk16), field one is matched against either
"8" or "6".
#myawk16, display all x8x computer types
$1 ~ /[86]/ { print $0 }

myawk16 Program Output


386 2048 D403 MG0011 100
486 4096 D404 MG0012 270
386 8192 A423 CC0177 400
486 8192 A424 CC0182 670
286 4096 A423 CC0183 100
286 4096 A425 CC0184 80
68020 2048 B406 EE1029 80
68030 2048 B410 EE1030 100
"trs80" 64 Z101 EL0020 0

Note: In this example, field one is searched for the character '8' and '6', in any
order of occurrence and position.

If the first character after the opening bracket ([) is a caret (^) symbol, this
complements the set so that it matches any character NOT IN the set. The
following example (myawk17) shows this, matching field one with any character
except "8" or "6".

#myawk17
# display all which do not contain 2, 3, 4, 6 or 8 in the first field
$1 ~ /[^23468]/ { print $0 }

myawk17 Program Output


XT 640 D402 MG0010 0
Mac 4096 B407 EE1027 80
Apple 4096 B406 EE1028 40
68020 2048 B406 EE1029 80
68030 2048 B410 EE1030 100
$unix 16636 A405 CC0185 660
"trs80" 64 Z101 EL0020 0

#myawk18
# display all lines where field one contains A-Z, a-z
$1 ~ /[a-zA-Z]/ { print $0 }
myawk18 Program Output

XT 640 D402 MG0010 0


Mac 4096 B407 EE1027 80
Apple 4096 B406 EE1028 40
$unix 16636 A405 CC0185 660
"trs80" 64 Z101 EL0020 0
Parentheses are used to group options together, and the vertical bar is used for
alternatives. In the following example (myawk19), it searches all input lines for
the string "Apple", "Mac", "68020" or "68030".

#myawk19
# illustrate multiple searching using alternatives
/(Apple|Mac|68020|68030)/ { print $0 }

myawk19 Program Output

Mac 4096 B407 EE1027 80


Apple 4096 B406 EE1028 40
68020 2048 B406 EE1029 80
68030 2048 B410 EE1030 100
To use metacharacters as part of a search string, their special meaning must be
disabled. This is done by preceding them with the backslash (\) symbol. The
following example prints all input lines which contain the string "b$".

/b\$/ { print $0 }

Special symbols recognised by awk

In addition to metacharacters, awk recognises the following C programming


language escape sequences within regular expressions and strings.

\b backspace
\f formfeed
\n newline
\r carriage return
\t tab
\" double quote
\\ backslash

The following example prints all input lines which contain a tab character

/\t/ { print $0 }

Consider also the use of repetition in pattern matching. The plus (+) symbol
matches one or more occurrences of the preceding expression. The following awk
program (myawk16a) searches for computer types which begin with a dollar ($)
symbol, followed by one or more alphabetic characters (a-z, A-Z), where the last
character in the string is the symbol x.

#myawk16a
$1 ~ /^\$+[a-zA-Z]+x$/ { print $0 }

myawk16a Program Output


$unix 16636 A405 CC0185 660

awk interprets any string or variable on the right side of a ~ or !~ as a regular


expression. This means the regular expression can be assigned to a variable, and
the variable used later in pattern matching. An earlier awk program (myawk17)
searched for input lines where field one did not contain the digits 2, 3, 4, 6 or 8.

#myawk17
# display all which do not contain 2, 3, 4, 6 or 8 in first field
$1 ~ /[^23468]/ { print $0 }

The awk program below shows how to rewrite this (myawk17) using a variable
which is assigned the regular expression.

#myawk20
BEGIN { matchstr = "[^23468]" }
$1 ~ matchstr { print $0 }

myawk20 Program Output


XT 640 D402 MG0010 0
Mac 4096 B407 EE1027 80
Apple 4096 B406 EE1028 40
68020 2048 B406 EE1029 80
68030 2048 B410 EE1030 100
$unix 16636 A405 CC0185 660
"trs80" 64 Z101 EL0020 0

Consider the following example, which searches for all lines which contain the
double quote character (").

#myawk21
BEGIN { matchstr = "\"" }
$1 ~ matchstr { print $0 }

myawk21 Program Output

"trs80" 64 Z101 EL0020 0

Combining Patterns

Patterns can be combined to provide more powerful and complex matching. The
following symbols are used to combine patterns.

|| logical or, either pattern can match


&& logical and, both patterns must match
! logical not, patterns not matching
Let's suppose we want a list of all "486" computers which have more than 250Mb
of hard disk space. The following awk pattern uses the logical and to construct the
necessary pattern string.

#myawk22
$1 == "486" && $5 > 250 { print $0 }

myawk22 Program Output


486 4096 D404 MG0012 270
486 8192 A424 CC0182 670
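The logical or (||) and not (!) operators can be combined in the same way. The
sketch below runs two such patterns from the shell; the here-document stands in
for the awktext data file, whose exact contents are an assumption reconstructed
from the program outputs shown in this section.

```shell
# stand-in for the awktext data file (contents assumed from the outputs above)
cat > awktext <<'EOF'
XT 640 D402 MG0010 0
386 2048 D403 MG0011 100
486 4096 D404 MG0012 270
386 8192 A423 CC0177 400
486 8192 A424 CC0182 670
286 4096 A423 CC0183 100
286 4096 A425 CC0184 80
Mac 4096 B407 EE1027 80
Apple 4096 B406 EE1028 40
68020 2048 B406 EE1029 80
68030 2048 B410 EE1030 100
$unix 16636 A405 CC0185 660
"trs80" 64 Z101 EL0020 0
EOF

# logical or: list every computer that is an XT or has no hard disk at all
awk '$1 == "XT" || $5 == 0 { print $0 }' awktext

# logical not: count every computer that is NOT a 486
awk '!($1 == "486") { print $0 }' awktext | wc -l
```

Against the data above, the first program matches the XT and the "trs80"
records, and the second counts the eleven non-486 records.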

awk Pattern Ranges


A pattern range is two patterns separated by a comma. The action is performed
for each input line between the occurrence of the first and second pattern.

#myawk23
# demonstrate the use of pattern ranges
/XT/, /Mac/ { print $0 }

myawk23 Program Output

XT 640 D402 MG0010 0


386 2048 D403 MG0011 100
486 4096 D404 MG0012 270
386 8192 A423 CC0177 400
486 8192 A424 CC0182 670
286 4096 A423 CC0183 100
286 4096 A425 CC0184 80
Mac 4096 B407 EE1027 80

The awk program myawk23 prints out all input lines between the first occurrence
of "XT" and the next occurrence of "Mac".

Write an awk program using a pattern range to print out all input lines beginning
with the first computer fitted with 8192Kb of memory, up to the next computer
which has less than 80Mb of hard disk. After running the program successfully,
enter it in the space provided below.
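One possible solution is sketched below; the awktext contents are an assumption,
reconstructed from the program outputs shown earlier in this section.

```shell
# stand-in for the awktext data file (contents assumed from the outputs above)
cat > awktext <<'EOF'
XT 640 D402 MG0010 0
386 2048 D403 MG0011 100
486 4096 D404 MG0012 270
386 8192 A423 CC0177 400
486 8192 A424 CC0182 670
286 4096 A423 CC0183 100
286 4096 A425 CC0184 80
Mac 4096 B407 EE1027 80
Apple 4096 B406 EE1028 40
68020 2048 B406 EE1029 80
68030 2048 B410 EE1030 100
$unix 16636 A405 CC0185 660
"trs80" 64 Z101 EL0020 0
EOF

# the range starts at the first machine fitted with 8192Kb of memory and
# ends at the next machine with less than 80Mb of hard disk
awk '$2 == 8192, $5 < 80 { print $0 }' awktext
```

Run against the data above, the range covers the 386 with 8192Kb of memory
through to the Apple (40Mb of disk) — six records in all.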

awk's Built-In Variables

awk provides a number of internal variables which it uses to process files. These
variables are accessible by the programmer. The following is a summary of awk's
built-in variables.

ARGC number of command-line arguments


ARGV array of command-line arguments
FILENAME name of current input file
FNR record number in current file
FS input field separator (default= space and tab characters)
NF number of fields in input line
NR number of input lines read so far
OFMT output format for numbers (default=%.6g)
OFS output field separator (default=space)
ORS output line separator (default=newline)
RS input line separator (default=newline)
RSTART index of first character matched by match()
RLENGTH length of string matched by match()
SUBSEP subscript separator (default="\034")

#myawk24
# print the first five input lines of a file, bit like head
FNR == 1, FNR == 5 { print $0 }

myawk24 Program Output

XT 640 D402 MG0010 0


386 2048 D403 MG0011 100
486 4096 D404 MG0012 270
386 8192 A423 CC0177 400
486 8192 A424 CC0182 670

#myawk25
# print each input line preceded with a line number
# print the heading which includes the name of the file
BEGIN { print "File:", FILENAME }
{ print NR, ":\t", $0 }

myawk25 Program Output


File: awktext
1: XT 640 D402 MG0010 0
2: 386 2048 D403 MG0011 100
3: 486 4096 D404 MG0012 270
4: 386 8192 A423 CC0177 400
5: 486 8192 A424 CC0182 670
6: 286 4096 A423 CC0183 100
7: 286 4096 A425 CC0184 80
8: Mac 4096 B407 EE1027 80
9: Apple 4096 B406 EE1028 40
10: 68020 2048 B406 EE1029 80
11: 68030 2048 B410 EE1030 100
12: $unix 16636 A405 CC0185 660
13: "trs80" 64 Z101 EL0020 0

#myawk26
# demonstrate use of argc and argv parameters
BEGIN { print "There are ",ARGC, "parameters on the command line";
print "The first argument is ", ARGV[0];
print "The second argument is ", ARGV[1]
}

myawk26 Program Output


(invoked using awk -fmyawk26 awktext)

There are 2 parameters on the command line


The first argument is awk
The second argument is awktext

#myawk27
# print out the number of fields in each input line
{ print "Input line", NR, "has", NF, "fields" }

myawk27 Program Output

Input line 1 has 5 fields


Input line 2 has 5 fields
Input line 3 has 5 fields
Input line 4 has 5 fields
Input line 5 has 5 fields
Input line 6 has 5 fields
Input line 7 has 5 fields
Input line 8 has 5 fields
Input line 9 has 5 fields
Input line 10 has 5 fields
Input line 11 has 5 fields
Input line 12 has 5 fields
Input line 13 has 5 fields

It is often desirable to use the BEGIN statement to change both FS (the symbol
used to separate fields) and RS (the symbol used to separate input lines). The
following text file (awktext2) is used for the program myawk28. The text file
separates each field using a dollar symbol ($), and each input line by a caret
symbol (^). The program reads the file and prints out the username and password
for each user record. A heading is shown only for clarity.

awktext2 data format

(username$address$password$privilege$downloadlimit$protocol^)

Joe Bloggs$767 Main Rd Tawa$smidgy$clerk$500$zmodem^Sam Blue$1023 Kent Drive Porirua$yougessedit$normal$100$xmodem^Bobby Williams$96 Banana Grove$mymum$sysop$3000$zmodem^

#myawk28

# a program which shows use of FS and RS, scans awktext2


BEGIN { FS = "\$"; RS = "\^" }
{ print "User = ", $1, " Password:", $3 }

myawk28 Program Output

User = Joe Bloggs Password: smidgy


User = Sam Blue Password: yougessedit
User = Bobby Williams Password: mymum
User = Password:
awk's Assignment and Arithmetic Operators

The following is a summary of awk's arithmetic and assignment operators.

+ add
- subtract
* multiply
/ divide
++ increment
-- decrement
% modulus
^ exponential
+= plus equals
-= minus equals
*= multiply equals
/= divide equals
%= modulus equals
^= exponential equals

Now some examples,

sum = sum + 3 # same as sum += 3


sum = x / y
n++ # same as n = n + 1
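These operators can be tried out directly from a BEGIN block, which runs without
reading any input (the values are arbitrary):

```shell
awk 'BEGIN {
    sum = 10
    sum += 3            # same as sum = sum + 3, sum is now 13
    n = 5
    n++                 # same as n = n + 1, n is now 6
    n ^= 2              # same as n = n ^ 2, n is now 36
    print sum, n, 17 % 5
}'
```

This prints 13 36 2 (17 % 5 being the remainder of integer division).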

The following awk program displays the average installed memory capacity for an
IBM type computer (XT - 486). Note the use of %f within the printf statement to
print out the result in floating point format. The .2 between the % and f
symbols specifies two decimal places.

#myawk29
/(XT|286|386|486)/ { computers++; ram += $2 }
END { avgmem = ram / computers;
printf(" The average memory per PC = %.2f", avgmem )
}
myawk29 Program Output
The average memory per PC = 4480.00

awk's Built-In String Functions

The following is a summary of awk's built-in string functions. An awk string is


created by enclosing characters within quotes ("). A string can contain C language
escape sequences. The following awk string contains the escape sequence for a
new-line character.

"hello\n"

gsub(r,s) substitutes s for r globally in current input line,


returns the number of substitutions
gsub(r,s,t) substitutes s for r in t globally, returns number of
substitutions
index(s,t) returns position of string t in s, 0 if not present
length(s) returns length of s
match(s,r) returns position in s where r occurs, 0 if not
present
split(s,a) splits s into array a on FS, returns number of fields
split(s,a,r) splits s into array a on r, returns number of fields
sprintf(fmt, expr-list) returns expr-list formatted according to format
string specified by fmt
sub(r,s) substitutes s for first r in current input line,
returns number of substitutions
sub(r,s,t) substitutes s for first r in t, returns number of
substitutions
substr(s,p) returns the suffix of s starting at position p
substr(s,p,n) returns the substring of s of length n starting at
position p
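A few of these functions can be checked quickly from a BEGIN block (a sketch;
the sample string is arbitrary):

```shell
awk 'BEGIN {
    s = "486 8192 A424"
    print length(s)             # the string is 13 characters long
    print index(s, "8192")      # "8192" starts at position 5
    print substr(s, 10, 4)      # the 4 characters from position 10: A424
    n = split(s, parts, " ")    # split on whitespace into parts[1..3]
    print n, parts[2]
}'
```

The four lines printed are 13, 5, A424 and 3 8192.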

The following awk program (myawk31) uses the string function gsub to replace
each occurrence of 286 with the string AT.
#myawk31
{ gsub( /286/, "AT" ); print $0 }

myawk31 Program Output

XT 640 D402 MG0010 0


386 2048 D403 MG0011 100
486 4096 D404 MG0012 270
386 8192 A423 CC0177 400
486 8192 A424 CC0182 670
AT 4096 A423 CC0183 100
AT 4096 A425 CC0184 80
Mac 4096 B407 EE1027 80
Apple 4096 B406 EE1028 40
68020 2048 B406 EE1029 80
68030 2048 B410 EE1030 100
$unix 16636 A405 CC0185 660
"trs80" 64 Z101 EL0020 0

awk Control Flow Statements

awk provides a number of constructs to implement selection and iteration. These


are similar to C language constructs.

if ( expression ) statement1 else statement2

The expression can include the relational operators, the regular expression
matching operators, the logical operators and parentheses for grouping.
expression is evaluated first, and if NON-ZERO then statement1 is executed,
otherwise statement2 is executed.
In the following awk program (myawk32), each input line is scanned and field $5
is compared against the value of the awk user defined variable disksize (awk
initialises it to 0). When field $5 is greater, it is assigned to disksize, and the input
line is saved in the other user defined variable computer. Note the use of the
braces { } to group the program statements as belonging to the if statement
(same syntax as in the C language).
#myawk32
#demonstrate use of if statement, find biggest disk
{ if( disksize < $5 )
{
disksize = $5
computer = $0
}
}
END { print computer }

myawk32 Program Output


486 8192 A424 CC0182 670

while ( expression ) statement

expression is evaluated, and if NON-ZERO then statement is executed, then


expression is re-evaluated. This continues until expression evaluates as ZERO, at
which time the while statement terminates.

#myawk33
# a while statement to print out each second field only for "286" computers
BEGIN { printf("Type\tLoc\tDisk\n") }
/286/ { field = 1
while( field <= NF )
{
printf("%s\t", $field )
field += 2
}
print ""
}

myawk33 Program Output

Type Loc Disk


286 A423 100
286 A425 80
The following data file (awktext3) contains a list of computers per department in
an organisation.

Management 22 Electronics 46 Engineering 12


Health_Science 5 Tourism 20 Registry 18
Computing_Centre 300 Library 4 Halls 2

for ( expression1; expression; expression2 ) statement

The for statement provides repetition of a statement. expression1 is executed


first, and is normally used to initialise variables used within the for loop.
expression is a re-evaluation which determines whether the loop should continue.
expression2 is performed at the end of each iteration of the loop, before the re-
evaluation test is performed.

1. expression1
2. expression is evaluated. If non-zero goto step 3, else exit
3. statement is executed
4. expression2 is executed
5. goto step 2

Consider the following awk program (myawk34) which is the same as myawk33
shown earlier.

#myawk34
# a for statement to print out each second field only for "286" computers
BEGIN { printf("Type\tLoc\tDisk\n") }
/286/ { for( field = 1; field <= NF; field += 2) printf("%s\t", $field )
print ""
}

myawk34 Program Output

Type Loc Disk


286 A423 100
286 A425 80

do statement while( expression )

The statement is executed repeatedly until the value of expression is ZERO.


statement is executed at least once.

#myawk35
# print out every second field for "286" computers
BEGIN { field = 1 }
$1 == "286" { do {
printf("%s\t", $field)
field += 2
} while( field <= NF )
}

myawk35 Program Output

286 A423 100

break, continue, next, exit

The break statement causes an immediate exit from within a while or for loop.
The continue statement causes the next iteration of a loop.
The next statement skips to the next input line then re-starts from the first
pattern-action statement.
The exit statement causes the program to branch to the END statement (if one
exists), else it exits the program.

#myawk36
#print out computer types "286" using a next statement
{ while( $1 != "286" ) next; print $0 }

myawk36 Program Output

286 4096 A423 CC0183 100


286 4096 A425 CC0184 80
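The exit statement gives an easy way to stop reading input early. The sketch
below stops at the first "Mac" record; the awktext contents are an assumption,
reconstructed from the outputs shown earlier.

```shell
# stand-in for the awktext data file (contents assumed from the outputs above)
cat > awktext <<'EOF'
XT 640 D402 MG0010 0
386 2048 D403 MG0011 100
486 4096 D404 MG0012 270
386 8192 A423 CC0177 400
486 8192 A424 CC0182 670
286 4096 A423 CC0183 100
286 4096 A425 CC0184 80
Mac 4096 B407 EE1027 80
Apple 4096 B406 EE1028 40
68020 2048 B406 EE1029 80
68030 2048 B410 EE1030 100
$unix 16636 A405 CC0185 660
"trs80" 64 Z101 EL0020 0
EOF

# print input lines until field one is "Mac", then stop reading input;
# because exit runs before the print action, the Mac line itself is not shown
awk '$1 == "Mac" { exit } { print $0 }' awktext
```

With the data above, the seven records before the Mac line are printed.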
Arrays in awk programs

awk provides single dimensioned arrays. Arrays need not be declared, they are
created in the same manner as awk user defined variables.
Elements can be specified as numeric or string values. Consider the following awk
program (myawk37) which uses arrays to hold the number of "486" computers
and the disk space totals for all computers.

#myawk37
# diskspace[] holds sum of disk space for all computers
# computers[] holds number of computers of specified type
$1 == "486" { computers["486"]++ }
$5 > 0 { diskspace[0] += $5 }
END { print "Number of 486 computers =", computers[486];
print "Total disk space = ",diskspace[0]
}

myawk37 Program Output

Number of 486 computers = 2


Total disk space = 2580

Note that the previous program (myawk37) uses TWO pattern action statements
for each input line. The first pattern action statement handles the number of
"486" type computers, whilst the second handles the total disk space for all
computer types.

Consider the following awk program (myawk38), which uses the in statement to
step through an array. The program counts how many computers of each type
appear in field one of the input.

#myawk38
{ computers[$1]++ }
END { for ( name in computers )
print "The number of ",name,"computers is",computers[name]
}
myawk38 Program Output

The number of "trs80" computers is 1


The number of $unix computers is 1
The number of 286 computers is 2
The number of 386 computers is 2
The number of 486 computers is 2
The number of 68020 computers is 1
The number of 68030 computers is 1
The number of Apple computers is 1
The number of Mac computers is 1
The number of XT computers is 1

awk User Defined Functions

awk supports user defined functions. The syntax is

function name( argument-list ) {


statements
}

The definition of a function can occur anywhere a pattern-action statement can.


Argument-list is a list of variable names separated by commas. There must be NO
space between the function name and the left parenthesis of the argument-list.

The return statement is used to return a value by the function.

Consider the following awk program (myawk39), which calculates the factorial of
a number entered on standard input.

#myawk39
function factorial( n ) {
if( n <= 1 ) return 1
else return n * factorial( n - 1)
}
{ print "the factorial of ", $1, "is ", factorial($1) }
Sample myawk39 Program Output (awk -fmyawk39)
10
the factorial of 10 is 3628800
3
the factorial of 3 is 6
1
the factorial of 1 is 1
4
the factorial of 4 is 24

awk Output

The statements print and printf are used in awk programs to generate output.
awk uses two variables, OFS (output field separator) and ORS (output record
separator) to delineate fields and output lines. These can be changed at any time.
The special characters used in printf, which follow the % symbol, are,

c single character
d decimal integer
e double number, scientific notation
f floating point number
g use e or f, whichever is shortest
o octal
s string
x hexadecimal
% the % symbol

The default output format is %.6g and is changed by assigning a new value to
OFMT.
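A short sketch of the common conversion characters in use (the values printed
are arbitrary):

```shell
awk 'BEGIN {
    printf("%d\n", 13)        # decimal integer
    printf("%x\n", 255)       # hexadecimal: ff
    printf("%o\n", 8)         # octal: 10
    printf("%c\n", 65)        # single character with that code: A
    printf("%.2f\n", 4480)    # floating point, 2 decimal places: 4480.00
    printf("%s%%\n", "100")   # string followed by a literal % symbol
}'
```

Unlike print, printf supplies no automatic newline, so each format string here
ends with \n.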

awk Output To Files

awk's output generated by print and printf can be redirected to a file by using
the redirection symbols > (create/write) and >> (append). The names of files
MUST be in quotes.

#myawk40
# demonstrates sending output to a file
$1 == "486" { print "Type=",$1, "Location=",$3 > "comp486.dat" }

Sample output contained in 'comp486.dat'


Type= 486 Location= D404
Type= 486 Location= A424

awk Output To Pipes

The output of awk programs can be piped into a UNIX command. The statement

print " ",$1 | "sort"


causes the output of the print command to be piped to the UNIX sort command.

awk Input

Data Files

We have seen TWO methods to give file input to an awk program. The first
specified the filename on the command line, the other left it blank, and awk read
from the keyboard (examples were myawk30 and myawk39).
Program Files
We have used the -f parameter to specify the file containing the awk program.
awk programs can also be specified on the command-line enclosed in single
quotes, as the following example shows.

awk '/286/ {print $0 }' awktext


Note: For MSDOS systems, a double quote must be used to enclose the awk
program when specified on the command line.

The getline function

awk provides the function getline to read input from the current input file or from
a file or pipe.
getline reads the next input line, splitting it into fields and setting NF, NR
and FNR accordingly. It returns 1 for success, 0 for end-of-file, and -1 on error.
The statement

getline data

reads the next input line into the user defined variable data. No splitting of fields
is done and NF is not set.

The statement

getline < "temp.dat"

reads the next input line from the file "temp.dat", field splitting is performed, and
NF is set.

The statement

getline data < "temp.dat"

reads the next input line from the file "temp.dat" into the user defined variable
data, no field splitting is done, and NF, NR and FNR are not altered.
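Putting the file form to work (a sketch; temp.dat is the file name used in the
statements above, created here with assumed contents):

```shell
# create a small input file to read from
printf 'first line\nsecond line\n' > temp.dat

# getline returns 1 while lines remain, so loop until it returns 0 (or -1 on
# error); each line is read into the variable data with no field splitting
awk 'BEGIN {
    while ( (getline data < "temp.dat") > 0 )
        print "read:", data
    close("temp.dat")
}'
```

Note the parentheses around (getline data < "temp.dat"): without them, the <
would be parsed as a comparison rather than input redirection.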
Consider the following example, which pipes the output of the UNIX command
who into getline. Each time through the while loop, another line is read from who,
and the user defined variable users is incremented. The program counts the
number of people on the host system.

while ( "who" | getline ) users++

awk Summary

The following is a summary of the most common awk statements and features.

Command Line
awk program filenames
awk -f program-file filenames
awk -Fs filenames
(sets the field separator to string s; -F'\t' sets the separator to tab)

Patterns

BEGIN
END
/regular expression/
relational expression
pattern && pattern
pattern || pattern
(pattern)
!pattern
pattern, pattern

Control Flow Statements

if ( expr) statement [ else statement]


if ( subscript in array) statement [ else statement]
while ( expr) statement
for ( expr ; expr ; expr ) statement
for ( var in array ) statement
do statement while ( expr)
break
continue
next
exit [ expr]
return [ expr]

Input Output

close( filename ) close file


getline set $0 from next input line; set NF, NR, FNR
getline < file set $0 from next input line of file; set NF
getline var set var from next input line; set NR, FNR
getline var < file set var from next input line of file
print print current input line
print expr-list print expressions
print expr-list > file print expressions to file
printf fmt, expr-list format and print
printf fmt, expr-list > file format and print to file
system( cmd-line ) execute command cmd-line, return status

In print and printf above, >> appends to a file, and | command writes to a
pipe. Similarly, command | getline pipes into getline. The function getline
returns 0 on end of file, and -1 on an error.

Functions

func name( parameter list ) { statement }


function name ( parameter list ) { statement }
function-name ( expr, expr, ... )

String Functions

gsub(r,s,t) substitutes s for r in t globally, returns number of substitutions


index(s,t) returns position of string t in s, 0 if not present
length(s) returns length of s
match(s,r) returns position in s where r occurs, 0 if not present
split(s,a,r) splits s into array a on r, returns number of fields
sprintf(fmt, expr-list) returns expr-list formatted according to format
string specified by fmt
sub(r,s,t) substitutes s for first r in t, returns number of substitutions
substr(s,p,n) returns substring of s length n starting at position p

Arithmetic Functions

atan2(y,x) arctangent of y/x in radians


cos(x) cosine of x, with x in radians
exp(x) exponential function of x
int(x) integer part of x truncated towards 0
log(x) natural logarithm of x
rand() random number between 0 and 1
sin(x) sine of x, with x in radians
sqrt(x) square root of x
srand(x) x is new seed for rand()
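A quick check of a few of these functions (a sketch; the fixed seed for srand
is arbitrary):

```shell
awk 'BEGIN {
    # sqrt(16)=4, int() truncates towards 0, exp(0)=1, log(1)=0
    print sqrt(16), int(3.9), int(-3.9), exp(0), log(1)
    srand(42)                  # fix the seed so rand() is repeatable
    r = rand()
    print (r >= 0 && r < 1)    # prints 1 (true): rand() lies in [0, 1)
}'
```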

Operators (increasing precedence)

= += -= *= /= %= ^= assignment
?: conditional expression
|| logical or
&& logical and
~ !~ regular expression match, negated
match
< <= > >= != == relationals
blank string concatenation
+ - add, subtract
* / % multiply, divide, modulus
+ - ! unary plus, unary minus, logical
negation
^ exponentiation
++ -- increment, decrement
$ field

Regular Expressions (increasing precedence)

c matches non-metacharacter c
\c matches literal character c
. matches any character except newline
^ matches beginning of line or string
$ matches end of line or string
[abc...] character class matches any of abc...
[^abc...] negated class matches any but abc... and newline
r1 | r2 matches either r1 or r2
r1r2 concatenation: matches r1, then r2
r+ matches one or more r's
r* matches zero or more r's
r? matches zero or one r
(r) grouping: matches r

Built-In Variables

ARGC number of command-line arguments


ARGV array of command-line arguments (0..ARGC-1)
FILENAME name of current input file
FNR input line number in current file
FS input field separator (default blank)
NF number of fields in input line
NR number of input lines read so far
OFMT output format for numbers (default=%.6g)
OFS output field separator (default=space)
ORS output line separator (default=newline)
RS input line separator (default=newline)
RSTART index of first character matched by match()
RLENGTH length of string matched by match()
SUBSEP subscript separator (default="\034")

Limits
Each implementation of awk imposes some limits. Below are typical limits

100 fields
2500 characters per input line
2500 characters per output line
1024 characters per individual field
1024 characters per printf string
400 characters maximum quoted string
400 characters in character class
15 open files
1 pipe
sleep
This command suspends execution for a specified time interval, expressed in
seconds.
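For example, timed with the date command:

```shell
# suspend execution for 2 seconds and measure the elapsed wall-clock time
start=$(date +%s)
sleep 2
end=$(date +%s)
echo "slept for $((end - start)) seconds"
```

Since sleep waits at least the requested interval, the elapsed time reported
here is 2 seconds or slightly more.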

UNIT – IV

Process related commands (ps, top, pstree, nice, renice), Introduction to the
Linux kernel, getting started with the kernel (obtaining the kernel source,
installing the kernel source, using patches, the kernel source tree, building
the kernel), process management (process descriptor and the task structure,
allocating the process descriptor, storing the process descriptor, process
state, manipulating the current process state, process context, the process
family tree, the Linux scheduling algorithm, overview of system calls),
introduction to kernel debuggers (in Windows and Linux)

Introduction to UNIX:
Lecture Four

4.1 Objectives
This lecture covers:

  The concept of a process. 


 Passing output from one process as input to another using pipes. 
  Redirecting process input and output. 
  Controlling processes associated with the current shell. 
 Controlling other processes. 
4.2 Processes

A process is a program in execution. Every time you invoke a system utility or an


application program from a shell, one or more "child" processes are created by
the shell in response to your command. All UNIX processes are identified by a
unique process identifier or PID. An important process that is always present is
the init process. This is the first process to be created when a UNIX system starts
up and usually has a PID of 1. All other processes are said to be "descendants" of
init.

4.3 Pipes

The pipe ('|') operator is used to create concurrently executing processes that
pass data directly to one another. It is useful for combining system utilities to
perform more complex functions. For example:

$ cat hello.txt | sort | uniq

creates three processes (corresponding to cat, sort and uniq) which execute
concurrently. As they execute, the output of the cat process is passed on to
the sort process, which is in turn passed on to the uniq process. uniq displays
its output on the screen (a sorted copy of hello.txt with duplicate lines
removed).

Similarly:

$ cat hello.txt | grep "dog" | grep -v "cat"


finds all lines in hello.txt that contain the string "dog" but do not contain the
string "cat".

4.4 Redirecting input and output

The output from programs is usually written to the screen, while their input
usually comes from the keyboard (if no file arguments are given). In technical
terms, we say that processes usually write to standard output (the screen) and
take their input from standard input (the keyboard). There is in fact another
output channel called standard error, where processes write their error
messages; by default error messages are also sent to the screen.

To redirect standard output to a file instead of the screen, we use the > operator:

$ echo hello
hello
$ echo hello > output
$ cat output
hello

In this case, the contents of the file output will be destroyed if the file already
exists. If instead we want to append the output of the echo command to the file,
we can use the >> operator:

$ echo bye >> output
$ cat output
hello
bye
To capture standard error, prefix the > operator with a 2 (in UNIX the file numbers
0, 1 and 2 are assigned to standard input, standard output and standard error
respectively), e.g.:

$ cat nonexistent 2>errors
$ cat errors
cat: nonexistent: No such file or directory
$

You can redirect standard error and standard output to two different files:

$ find . -print 1>files 2>errors

or to the same file:

$ find . -print 1>output 2>output
or
$ find . -print >& output
Standard input can also be redirected using the < operator, so that input is read
from a file instead of the keyboard:

$ cat < output
hello
bye
You can combine input redirection with output redirection, but be careful not to
use the same filename in both places. For example:

$ cat < output > output

will destroy the contents of the file output. This is because the first thing the shell
does when it sees the > operator is to create an empty file ready for the output.

One last point to note is that many system utilities accept the filename "-" to
mean standard input, so we can pass piped data to them:

$ cat package.tar.gz | gzip -d | tar tvf -

Here the output of the gzip -d command is used as the input file to the tar
command.
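A minimal illustration of "-" standing for standard input, without needing a
tar archive to hand:

```shell
# cat is given "-" as its file argument, so it reads the pipe (its standard
# input) rather than a named file
echo "hello" | cat -
```

This prints hello, exactly as if cat had been given a file containing that line.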

4.5 Controlling processes associated with the


current shell

Most shells provide sophisticated job control facilities that let you control many
running jobs (i.e. processes) at the same time. This is useful if, for example, you
are editing a text file and want to interrupt your editing to do something else.
With job control, you can suspend the editor, go back to the shell prompt, and
start work on something else. When you are finished, you can switch back to the
editor and continue as if you hadn't left.

Jobs can either be in the foreground or the background. There can be only one job
in the foreground at any time. The foreground job has control of the shell with
which you interact - it receives input from the keyboard and sends output to the
screen. Jobs in the background do not receive input from the terminal, generally
running along quietly without the need for interaction (and drawing it to your
attention if they do).

The foreground job may be suspended, i.e. temporarily stopped, by pressing the
Ctrl-Z key. A suspended job can be made to continue running in the foreground or
background as needed by typing "fg" or "bg" respectively. Note that suspending a
job is very different from interrupting a job (by pressing the interrupt key, usually
Ctrl-C); interrupted jobs are killed off permanently and cannot be resumed.
Background jobs can also be run directly from the command line, by appending a
'&' character to the command line. For example:

$ find / -print 1>output 2>errors &
[1] 27501
$

Here the [1] returned by the shell represents the job number of the background
process, and the 27501 is the PID of the process. To see a list of all the jobs
associated with the current shell, type jobs:

$ jobs
[1]+ Running find / -print 1>output 2>errors &
$
Note that if you have more than one job you can refer to the job as %n where n is
the job number. So for example fg %3 resumes job number 3 in the foreground.
To find out the process IDs of the underlying processes associated with the
shell and its jobs, use ps (process status):
$ ps
PID TTY TIME CMD
17717 pts/10 00:00:00 bash
27501 pts/10 00:00:01 find
27502 pts/10 00:00:00 ps

So here the PID of the shell (bash) is 17717, the PID of find is 27501 and the PID of
ps is 27502.

To terminate a process or job abruptly, use the kill command. kill allows jobs
to be referred to in two ways - by their PID or by their job number. So
$ kill %1
or
$ kill 27501

would terminate the find process. Actually kill only sends the process a signal
requesting it shutdown and exit gracefully (the SIGTERM signal), so this may not
always work. To force a process to terminate abruptly (and with a higher
probability of success), use a -9 option (the SIGKILL signal):

$ kill -9 27501

kill can be used to send many other types of signals to running processes. For
example a -19 option (SIGSTOP) will suspend a running process. To see a list of
such signals, run kill -l.
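The whole sequence can be tried non-interactively (a sketch; the 200-second
sleep mirrors the background job started below):

```shell
# start a long-running job in the background and record its PID
sleep 200 &
pid=$!

kill $pid                      # send SIGTERM, asking it to exit gracefully
wait $pid 2>/dev/null || true  # collect the terminated job

# kill -0 sends no signal at all; it simply tests whether the process
# still exists
if kill -0 $pid 2>/dev/null; then
    echo "still running"
else
    echo "terminated"
fi
```

This prints terminated, since sleep exits as soon as it receives SIGTERM.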

Type the command,

$ sleep 200&

This will run the sleep program in the background for 200 seconds.

pstree (process status, tree display)

This command reports on the active processes being run.


Type the command

$ pstree -p

You may even see the sleep process listed if it is still running in the background.

Note: If pstree is not available on your system, try ps as an alternative.

Using stty

This sets certain terminal I/O options for the device associated with stdin. When
typed without arguments, it displays the current settings of these options.
The format of the command is
stty [-a] [-g] [options]

The various options supported are

parity
character size
baud rate
stop bits
flow control, cts, xon-xoff
upper-lowercase translation
cr/cr-lf translation, delays after cr/cr-lf, formfeed, tabs
echo, full/half-duplex

A - sign preceding an argument turns that option off. Some examples are,

stty -echo ;suppress echoing of input


stty erase # ;set erase character to #
stty quit @ ;set SIGQUIT to @ (initially ctrl-\)

Using tput

This command uses the terminfo database to make the values of


terminal dependent attributes available to the shell.

The format of the command is

tput -Ttermtype attribute

Some examples are,

tput clear ;echo clear screen sequence


tput cols ;print the number of columns

bold=`tput smso`
norm=`tput rmso`
echo "${bold}Hello there${norm}"

The last example assigns the codes for bold-on and bold-off to shell variables.
These are then used to highlight a text string.

Using tset

This command is used to specify the erase and kill characters for the terminal, as
well as specifying the terminal type and exporting terminal information for use in
the shell environment.

This command is used at login time (.profile) to determine the terminal type.
Examples of commands are,

tset vt100 ;set terminal type to vt100


tset -e# ;set erase character to # (default ctrl-H)
tset -k@ ;set kill character to @ (default ctrl-U)

4.6 Controlling other processes

You can also use ps to show all processes running on the machine (not just the
processes in your current shell):

$ ps -fae (or ps -aux on BSD machines)

ps -aeH displays a full process hierarchy (including the init process).

Many UNIX versions have a system utility called top that provides an interactive
way to monitor system activity. Detailed statistics about currently running
processes are displayed and constantly refreshed. Processes are displayed in
order of CPU utilization. Useful keys in top are:
s - set update frequency
k - kill process (by PID)
u - display processes of one user
q - quit

On some systems, the utility w is a non-interactive substitute for top.

One other useful process control utility that can be found on most UNIX systems
is the pkill command. You can use pkill to kill processes by name instead of PID or
job number. So another way to kill off our background find process (along with
any another find processes we are running) would be:

$ pkill find
[1]+ Terminated find / -print 1>output 2>errors
$

Note that, for obvious security reasons, you can only kill processes that belong to
you (unless you are the superuser).

The Process

A process is a UNIX abstraction that enables us to look at files and programs in
another way. A file is treated as a simple file when it lies in a dormant state on
disk; it becomes a process when it is executed.

Like living beings, processes are born, can give birth to other processes, and die.

Hundreds or thousands of processes can run at a time, UNIX being a multitasking
system. Processes belong to the domain of the kernel, which is responsible for
their management. Processes can be controlled by moving them between
foreground and background and by killing them when they get out of control.

Basics of Processes:

A process is an instance of a running program. A process is said to be born when
the program starts execution and remains alive as long as the program is active.
After the execution is complete, the process is said to die.
Every process has a name, and when two users run the same program there is one
program on disk but two processes in memory.

It is the kernel that is ultimately responsible for the management of these
processes. It determines the time slices and priorities allocated to processes so
that multiple processes can share CPU resources. It provides a mechanism by
which a process is able to execute for a finite period of time and then
relinquish control to another process. The kernel sometimes has to store pages of
a process in the swap area of the disk before calling them in again for running.

Processes, like files, have attributes. Some attributes of every process are
maintained by the kernel in memory in a separate structure called the process
table; the process table is to processes what the inode is to files. Two important
attributes of a process are:

The Process Id (PID)


Each process is uniquely identified by an integer called the
process-id (PID) that is allotted by the kernel when the process is born. We need
this PID to control the process, e.g. to kill it.

The Parent PID (PPID):


The PID of the parent is also available as a process attribute. When
several processes have the same PPID, it often makes sense to kill the parent
instead of killing all the children separately.

The other attributes also get inherited by the child from its parent.

The Shell Process:

The shell process is a process which immediately gets set up by the


kernel when you log on to the system. This process represents a Unix command,
which may be sh, ksh, csh or bash. Any command you type at the prompt is actually
standard input to the shell process. This process remains alive until you log
out, when it is killed by the kernel.
The shell maintains a set of environment variables, like PATH and SHELL. The
shell's pathname is stored in SHELL, but its PID is stored in a special "variable",
$$. To know the PID of the current shell, type

$ echo $$

The PID of the current shell is displayed.
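A quick way to see that every new shell gets its own PID is to compare $$ in the current shell with $$ as reported by a child shell. A minimal sketch:

```shell
#!/bin/sh
echo "current shell PID: $$"
sh -c 'echo "child shell PID: $$"'   # a freshly forked shell reports a new PID
```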

Parents and Child:

Every process has a parent, and a process born from it is said to be its
child. For example, when you run the command:

$ cat emp.lst

this process is started by the shell process and remains active as long as the
command is active. The shell (sh, ksh, bash or csh) is said to be the parent of
cat. Every process has a parent; no process can be an orphan.

The command:

$ cat emp.lst | grep ‘director’

sets up two processes for the two commands. UNIX's multitasking nature permits
a process to generate (or spawn) one or more children.

Process status (ps):
The ps command is used to display some process attributes; ps can be
seen as the process counterpart of the file system's ls command. The command
reads through the kernel's data structures and process tables to fetch the
characteristics of processes.

ps displays the processes associated with a user at the terminal if you execute
the command immediately after logging in.

Like who, ps also displays header information: it shows the PID, the TTY (the
terminal with which the process is associated), the cumulative processor
time (TIME) and the process name (CMD).

ps Options:

ps is a highly variant command; its output varies across different
UNIX flavours.

Full Listing (-f):
To get a detailed listing which also shows the parent of every
process, use the -f (full) option:

$ ps -f

Displaying the Processes of a User (-u):

The system administrator uses this option to watch the activities of any user:

$ ps -u shan

Displaying All User Processes (-a):

The -a (all) option lists the processes of all users but doesn't display the system
processes:

$ ps -a

Displaying a Long Listing (-l):

The -l (long) option displays a long listing showing memory-related information:

$ ps -l

System Processes (-e or -A):

The -e (or -A) option displays the list of all processes running on your system,
along with the controlling terminal (TTY), TIME and CMD.
System processes are easily identified by a ? in the TTY column. These processes
are known as daemons because they are called without a specific request from a
user. Many of these daemons are actually sleeping and wake up only when they
receive input.

Daemons do important work for the system:

The lpsched daemon controls all printing activity.
The sendmail daemon handles both incoming and outgoing mail.
The inetd daemon is used to run Telnet and FTP services on a TCP/IP network.
The cron daemon looks at its control file once a minute to decide what it should do.

MECHANISM OF PROCESS CREATION

There are three distinct phases, using three important system calls:

 fork
 exec
 wait

Fork:
A process in UNIX is created with the fork system call, which creates a copy
of the process that invokes it.
The process image is practically identical to that of the calling
process, except for a few parameters like the PID.
When a process is forked in this way, the child gets a new PID.
The forking mechanism is responsible for the multiplication of processes in
the system.

Exec:
The child then overwrites the image that it has just created with a copy of
the program that has to be executed.
This is done with a system call belonging to the exec family, and the parent
is said to exec this process.
No additional process is created here; the existing image is simply
replaced with the new program. This process has the same PID as the child that
was just forked.
Wait:

The parent then executes the wait system call to wait for the child process
to complete.
It picks up the exit status of the child and then continues with its other
functions.
A parent may decide not to wait for the death of its child.

In brief, when you run a command from the shell, the shell first forks another shell
process.
The newly forked shell then overlays itself with the executable image of the
command (say cat), which then starts to run.
The parent (shell) waits for the command (cat) to terminate and then picks up the
exit status of the child.
This is the number returned by the child to the kernel, and it has great significance
in both shell programming and systems programming.
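The exit status that the parent picks up with wait is the same number the shell exposes in the $? variable. A minimal sketch, assuming /etc/passwd contains a root entry and relying on grep's conventional statuses (0 for a match, 1 for none):

```shell
#!/bin/sh
# The shell forks, execs grep, waits, and stores the child's exit status in $?
grep -q root /etc/passwd
echo "exit status: $?"        # 0 when a match is found

grep -q no_such_user /etc/passwd
echo "exit status: $?"        # 1 when there is no match

# The same status drives shell conditionals:
if grep -q root /etc/passwd; then
    echo "root is present"
fi
```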

When a process is forked, the child has a different PID and PPID from its
parent. However, it inherits most of the environment of its parent. The important
attributes that are inherited are:

The real UID and real GID of the process.


The effective UID and effective GID of the process. These are normally the same
as their "real" counterparts.
The priority with which the process runs.
The current directory from where the process was run.
The descriptors of all the files opened by the parent process.
Some environment variables (like HOME and PATH).

How the Shell is Created:

When the system moves to multiuser mode, init forks and execs a getty for every
active communication port (or line).
Each of these gettys prints the login prompt on its respective terminal and
then goes off to sleep.

When a user attempts to log in, getty wakes up and fork-execs the login program
to verify the login name and the password entered.

On successful login, login fork-execs the process representing the login shell.

Repeated overlaying ultimately results in init becoming the immediate ancestor of
the shell, as can be seen from this sequence:

init --fork--> getty --fork-exec--> login --fork-exec--> shell

init goes off to sleep, waiting for the death of its children. The other processes,
getty and login, have extinguished themselves by overlaying. When the user logs
out, her shell is killed, and the death is intimated to init. init then wakes up and
spawns another getty for that line to monitor the next login.

Internal And External Commands:

From the process point of view, the shell recognizes three types of commands:

 External Commands 

 Shell Scripts 

 Internal Commands 

External Commands:

The most commonly used ones are the UNIX utilities and
programs like cat, ls etc. The shell creates a process for each of these commands
that it executes, while remaining their parent.
Shell Scripts:
The shell executes these scripts by spawning another shell,
which then executes the commands listed in the script. The child shell becomes
the parent of the commands that feature in the script.

Internal Commands:

The shell has a number of built-in commands as well. Some of
them, like cd and echo, don't generate a process and are executed directly by the
shell. Similarly, a variable assignment like x=5 doesn't generate a process either.

Running Jobs In Background:

A multitasking system lets a user do more than one job at a time. Since there can
be only one job in the foreground, the rest of the jobs have to run in the
background. There are two ways of doing this:
With the shell's & operator
The nohup command

&: No Logging Out

The & is the shell's operator used to run a process in the
background. The parent in this case doesn't wait for the child's death.
Just terminate the command line with an &, and the command will run in the
background:

e.g. sort -o emp.lst emp.lst &
550                                The job's PID

The shell immediately returns a number, the PID of the invoked command. The
prompt is returned, and the shell is ready to accept another command even
though the previous command has not terminated yet. The shell, however,
remains the parent of the background process.
Using &, you can run as many jobs in the background as the system load permits.
Depending on the shell you are using, the standard output and the standard error
of a job running in the background may or may not come to the terminal.
Background execution is a useful feature: relegate time-consuming or
low-priority jobs to the background, and run the important ones in the
foreground.

nohup (Log Out Safely)

Background jobs cease to run when a user logs out (the C shell and Bash
excepted). That happens because her shell is killed, and when the parent is killed,
its children are normally killed too.
The UNIX system permits variation in this default behaviour. The nohup (no
hangup) command, when prefixed to a command, permits execution of the
process even after the user has logged out. You must use the & with it as well:

$ nohup sort emp.lst &
586
Sending output to nohup.out

The shell returns the PID this time as well; nohup announces that output will go
to nohup.out.

Note: Using the ps command after using nohup from another terminal few
important things are to be noticed like:

The shell died on logging out but its child (sort) didn't; it turned into an orphan.
The kernel handles such situations by reassigning the PPID of the orphan to the
system's init process (PID 1), the parent of all shells. When the user logs out, init
takes over the parentage of any process run with nohup. In this way you can kill
a parent without killing its child.

If you run more than one command in a pipeline, then you should use the nohup
command at the beginning of each command in the pipeline:

nohup grep 'director' emp.lst | nohup sort &

nice: Job Execution with Low Priority


Processes in the UNIX system are usually executed with equal
priority. This is not always desirable, since high-priority jobs must be completed
at the earliest. UNIX offers the nice command, which is used with the & operator
to reduce the priority of jobs. More important jobs can then have greater access
to the system resources (being 'nice' to your neighbours).

To run a job with a low priority, the command name should be prefixed with nice:

nice wc -l manual
or
nice wc -l manual &

nice is a built-in command in the C shell. nice values are system-dependent and
typically range from 1 to 19. Commands execute with a nice value that is
generally in the middle of the range, usually 10. A higher nice value implies a
lower priority. nice reduces the priority of any process, thereby raising its nice
value. You can also specify the nice value explicitly with the -n option:

nice -n 5 wc -l manual &

A nonprivileged user cannot increase the priority of a process; that power is
reserved for the superuser. The nice value is displayed with the ps -o nice
command.
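With the GNU version of nice, running it without a command prints the nice value of the invoking process, which makes the effect easy to observe. A small sketch (the default increment of 10 is typical but system-dependent):

```shell
#!/bin/sh
nice                 # nice value of the current process, normally 0
nice nice            # run nice under nice: reports the default increment, usually 10
nice -n 5 nice       # explicit increment: reports 5 when starting from 0
```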

Killing Processes With Signals


The UNIX system often needs to communicate the
occurrence of an event to a process. This is done by sending a signal to the
process. Each signal is identified by a number and designed to perform a specific
function.

As the same signal number may represent two different signals on two different
machines, it is better to refer to signals by their names, which carry the SIG prefix.

Signals can be generated by the kill command or from the keyboard.

If a program is running longer than you anticipated and you want to terminate it,
you normally press the interrupt key. This sends the process the SIGINT (2) signal.
There is one signal that a process cannot ignore or handle with user-defined code:
the SIGKILL (9) signal, which terminates a process immediately.

kill(Premature Termination Of A Process):

kill is an internal command used to terminate a process. The command takes
one or more PIDs as its arguments, and by default sends the SIGTERM (15) signal:

kill 111

Here 111 is the PID of a process.

To facilitate premature termination, the & operator displays the PID of the
process that runs in the background. If you don't remember the PID, use the ps
command to find it, and then kill.
If you want to kill more than one job, either in the background or in different
windows, all of them can be killed with a single kill statement listing their
respective PIDs:

kill 121 111 125 138 144

Killing the last Background Job

In most shells the system variable $! stores the PID of the last background job,
the same number seen when the & is affixed to a command. So you can kill the
last background process without using the ps command to find out its PID:

$ sort -o emp.lst emp.lst &

$ kill $!

Using kill with Other Signals:

By default, kill uses the SIGTERM signal (15) to terminate the process.

Processes can also be killed using the SIGKILL signal (9). This signal can't be
generated at the press of a key, so you must use kill with the signal name
(without the SIG prefix) after the -s option:
kill -s KILL 121

A simple kill command won't kill the login shell. You can kill your login shell by
using either of these commands:

kill -9 $$            $$ stores the PID of the current (login) shell

kill -s KILL 0        Kills all processes, including the login shell

kill -l
Used to view all signal names

JOB CONTROL:

A job is a name given to a group of processes. The easiest way of creating a job is
to run a pipeline of two or more commands. Jobs are manipulated with the
shell's job control facilities. Job control in the shell means you can:

Relegate a job to the background(bg)


Bring it back to the foreground(fg)
List the active jobs(jobs)
Suspend a foreground job([Ctrl+z])
Kill a job(kill)

To push the current foreground job to the background, the bg command is used.

The & at the end of the line indicates that the job is now running in the
background. Any number of foreground jobs can be sent to the background, first
with [Ctrl-z] and then with the bg command.

The list of active jobs can be viewed using the jobs command, and any
background job can be brought to the foreground using the fg command.

Both the fg and bg commands can take the job number, the job name or a string
as argument, prefixed with the % symbol.

e.g.
fg %1          Brings job 1 to the foreground
fg %sort       Brings the job whose name begins with sort to the foreground
bg %2          Sends job 2 to the background
bg %?perm      Sends to the background the job containing the string perm
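fg and bg need an interactive terminal, but the rest of the job machinery ($!, jobs, kill, wait) also works inside a script. A small bash sketch:

```shell
#!/bin/bash
sleep 300 &               # start a job in the background
pid=$!                    # $! holds the PID of the last background job
jobs                      # lists it, e.g.  [1]+  Running   sleep 300 &
kill "$pid"               # send SIGTERM to it
wait "$pid"               # reap it; wait returns 128 + signal number
echo "exit status: $?"    # 143 = 128 + 15 (SIGTERM)
```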

The related terminal-driver control keys (as shown by stty -a) are:

start [Ctrl-q]
stop [Ctrl-s]
susp [Ctrl-z]
dsusp [Ctrl-y]

at and batch [Execute Later]

at: One Job at a Time

at takes as its argument the time the job is to be executed and
displays the at> prompt. Input has to be supplied from the standard input, and
[Ctrl-d] is used to return to the shell prompt:

at 14:08
at> abc.sh
[Ctrl-d]

at doesn't indicate the name of the script to be executed; that is something the
user has to remember. A user may prefer to redirect the output of the command
itself:

at 15:08
at> script1.sh > resu.lst

The -f option is used to take commands from a file.

To have job completion mailed to the user, use the -m option.

at also offers keywords like now, noon, midnight, today and tomorrow.

e.g. at 9am tomorrow

A month name or day of the week can also be used; remember that it must be
either fully spelled out or abbreviated to three letters.
Jobs can be listed with at -l and removed with at -r.

Unfortunately, there's no way to find out the name of the program scheduled for
execution. This can create problems, especially when you are unable to recall
whether a specific job has actually been scheduled for later execution.

Batch: Execute in Batch Queue


The batch command also schedules jobs for later execution, but
unlike at, jobs are executed as soon as the system load permits.

This command doesn't take any arguments but uses an internal algorithm to
determine the execution time. This prevents too many CPU-hungry jobs from
running at the same time. The response of batch is similar to that of at.

$ batch < manual.sh

Any job scheduled with batch goes to a special at queue, and can also be removed
with at -r.

cron: Running job Periodically

The ps -e command always shows the cron daemon running. This is the UNIX
system's chronograph, ticking away every minute. Unlike at and batch, which are
meant for one-time execution, cron executes programs at regular intervals. It is
mostly dormant, but every minute it wakes up and looks in a control file (the
crontab file) in /var/spool/cron/crontabs for instructions to be performed at that
instant. After executing them, it goes back to sleep, only to wake up the next minute.

A specimen entry in the file /var/spool/cron/crontabs/id can look like this:

00-10 17 * 3,6,9,12 5 find / -newer .last_time -print > backuplist

Each line consists of six fields separated by whitespace. The complete command
line is shown in the last field. The first five fields together determine when and
how often the command will be executed.
A * used in any of the first five fields implies that the command is to be executed
every period depending on the field where it is placed.

The first field (legal values 00 to 59) specifies the number of minutes after the
hour when the command is to be executed.
The range 00-10 schedules execution every minute during the first 10 minutes of
the hour.

The second field (17, i.e. 5 p.m.) indicates the hour in 24-hour format (legal
values 0 to 23).

The third field (legal values 1 to 31) controls the day of the month. This field
(here a *), read with the other two, implies that the command is to be executed
every minute, for the first 10 minutes, starting at 5 p.m. every day.

The fourth field (3,6,9,12) specifies the month (legal values 1 to 12).

The fifth field (5, Friday) indicates the day of the week (legal values 0 to 6),
Sunday having the value 0.

The find command will thus be executed every minute during the first 10
minutes after 5 p.m., every Friday of the months March, June, September and
December, every year.
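Putting the five fields together, a complete crontab file might look like this (the second line and all pathnames are illustrative):

```
# min    hour  day   month     weekday  command
00-10    17    *     3,6,9,12  5        find / -newer .last_time -print > backuplist
30       1     *     *         *        /home/kumar/cleanup.sh
```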

crontab: creating a crontab File:

You need to use the crontab command to place the file in the directory containing
crontab files, so that cron will read it:

crontab cron.txt

If the user kumar runs this command, a file named kumar will be created in
/var/spool/cron/crontabs containing the contents of cron.txt.
When crontab is used with the -e option, it calls up the editor set in the EDITOR
variable (typically vi); after editing, you can quit the editor and the commands
automatically get scheduled for execution.

To see the contents of your crontab file, use crontab -l; to remove it, use
crontab -r.

cron is basically used by the system administrator to perform housekeeping
chores, like removing outdated files or collecting data on system performance. It
is also useful for periodically dialing up an Internet mail server to send and
retrieve mail.

Timing Processes : time

The time command accepts the entire command line to be timed as its argument.
It executes the program and also displays the time usage on the terminal.
This enables programmers to tune their programs to keep CPU usage at an
optimum level.
To find out the time taken to perform a sorting operation, precede the sort
command with time:

$ time sort -o newlist invoice.lst

It shows three times:

a) real: the clock elapsed time from the invocation of the command till its
termination.

b) user: the time spent by the program in executing itself.

c) sys: the time used by the kernel in doing work on behalf of the user process.

The sum of the user time and the sys time represents the CPU time.
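Because time writes its report to standard error, the timed command's own output streams stay untouched and the report itself can be captured separately. A small bash sketch (the file names are illustrative):

```shell
#!/bin/bash
# The sorted output goes to the file; the real/user/sys report goes to stderr.
time sort /etc/passwd -o /tmp/sorted.$$

# The report can itself be captured by redirecting stderr:
{ time sleep 1 ; } 2> timing.txt
cat timing.txt                      # shows the real, user and sys lines
rm -f /tmp/sorted.$$ timing.txt
```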

UNIT-V
Shell Programming: Shell scripts and execution methods. User's initialization files
(.profile and rc, etc.). The dot command. Interactive execution and command line
arguments ($1, $2 etc). Meta characters - syntactic (&&, (), &, ||, ;;, <, > etc),
pattern matching, substitution of shell variables. Quoting, test commands. Control
flow with for, if, while, case. The here document, string handling and
computation using expr. Setting positional parameters (set command), and shift.
Shell functions. Interrupt handling (trap), Korn and Bash shell features, the let
command, arrays.

Introduction to UNIX:
Lecture Five
5.1 Shells and Shell Scripts

A shell is a program which reads and executes commands for the user. Shells also
usually provide features such as job control, input and output redirection, and a
command language for writing shell scripts. A shell script is simply an ordinary
text file containing a series of commands in a shell command language (just like a
"batch file" under MS-DOS).

The shell is a unique and multi-faceted program. It is a command interpreter and
a programming language rolled into one. It is also a process that creates an
environment for you to work in.

There are many different shells available on UNIX systems
(e.g. sh, bash, csh, ksh, tcsh etc.), and they each support a different command
language. Here we will discuss the command language for the Bourne
shell sh, since it is available on almost all UNIX systems (and is also supported
under bash and ksh).

5.3 Shell's Interactive Cycle

When you log on to a UNIX machine, you first see a prompt. This prompt
remains there till you key in something. Even though it may appear that the
system is idling, a UNIX command is in fact running at the terminal. This command
runs all the time and never terminates unless you log out. This command is the shell.

Even though the shell appears not to be doing anything meaningful when there is
no activity at the terminal, it swings into action the moment you key in something
from the keyboard. You can see that the shell is running by using the ps command.

The Bash shell here runs at the terminal /dev/pts/2. The following are the
activities performed by the shell in its interpretive cycle:

The shell issues the prompt and waits for you to enter a command.

After a command is entered, the shell scans the command line for metacharacters
and expands abbreviations (like the * in rm *) to recreate a simplified command
line.

It then passes on the command line to the kernel for execution.


The shell waits for the command to complete and normally can't do any work
while the command is running.

After command execution is complete, the prompt reappears and the shell
returns to its waiting role to start the next cycle. You are now free to enter
another command.

You can change this behaviour and instruct the shell not to wait, so you can fire
one job after another without waiting for the previous one to complete, i.e.
background processing using the metacharacter &.

Redirection-The three standard Files:

There are three standard files, actually streams of characters, which many
commands see as input and output. A stream is simply a sequence of bytes.
When a user logs in, the shell makes available three files representing three
streams. Each stream is associated with a default device:

Standard Input-The file representing input, which is connected to the keyboard

Standard Output- The file representing output, which is connected to the display

Standard Error-The file representing error messages that emanate from the
command or shell, also connected to display.

Standard Input: cat and wc are commands which, when used without
arguments, read the file representing standard input. This file is indeed special; it
can represent three input sources:

 The keyboard, the default source.

 A file using redirection with the < symbol 



 Another program using a pipeline 
The symbols used with the wc command are < and | (the pipe).

$ wc < sample.txt

This redirects the standard input to originate from a file on disk. This
reassignment requires the < symbol.
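One visible difference between the two forms: when wc opens the file itself it knows the filename and prints it, but with < the shell opens the file and wc reads an anonymous stream, so no name appears. A minimal sketch:

```shell
#!/bin/sh
wc -l /etc/passwd     # wc opens the file itself, so the filename is shown
wc -l < /etc/passwd   # the shell opens it; only the count is shown
```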

Taking Input from both file and standard input

When a command takes input from multiple sources, say a file and standard
input, the - symbol must be used to indicate the sequence of taking input:

$ cat - foo          First from standard input and then from foo
$ cat foo - bar      First from foo, then standard input, and then bar

The third source of standard input is the pipe (|).

Standard Output:

All commands displaying output on the terminal actually write to the standard
output file as a stream of characters and not directly to the terminal. The three
possible destinations are:

The terminal (default)

 A file using redirection symbols >,>> 


 As input to another program using pipeline 

$ wc sample.txt > newfile
$ cat newfile

This sends the word count to the file newfile instead of sending it to the
standard output.

To append to the contents of a file, the >> operator is used:

$ wc sample.txt >> newfile

If the output file doesn't exist, the shell creates it before executing the command;
otherwise it overwrites it. To avoid overwriting, use the append operator >>.
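The difference between the two operators can be seen by counting lines after each redirection (the file name is a throwaway, assumed for illustration):

```shell
#!/bin/sh
echo first  > note.txt     # > creates (or truncates) the file
echo second >> note.txt    # >> appends to it
wc -l < note.txt           # prints 2
echo third  > note.txt     # > truncates again
wc -l < note.txt           # prints 1
rm -f note.txt
```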

Standard Error :

To handle the error stream (and the other streams), file descriptors are used. A
file descriptor is a number used to represent each of the three files:

 Standard Input (0)
 Standard Output (1)
 Standard Error (2)

When you enter an incorrect command or try to open a nonexistent file, an error
message is displayed on the screen. This is the default destination, i.e. the
terminal. The error message can be redirected as follows:

 $ cat foo > errorfile        Will not work; the error still reaches the terminal
 $ cat foo 2> errorfile       Works; descriptor 2 is redirected to errorfile
 $ cat foo 2>> errorfile      Appends the error messages to errorfile
 $ cat foo > err 2> errorfile Redirects output and errors to separate files
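The two streams can also be merged or discarded. The 2>&1 notation points descriptor 2 at wherever descriptor 1 currently goes, so the order of the redirections matters. A sketch (file names are illustrative):

```shell
#!/bin/sh
# Separate destinations for output and errors
ls /etc /nosuchdir > out.lst 2> err.lst

# Merge errors into the output stream (redirect stdout first, then 2>&1)
ls /etc /nosuchdir > all.lst 2>&1

# Throw errors away entirely
ls /etc /nosuchdir 2> /dev/null

rm -f out.lst err.lst all.lst
```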

Two Special Files(/dev/null and /dev/tty):

/dev/null is a special file that simply accepts any stream without growing in size.

$cmp foo1 foo2 >/dev/null


$cat /dev/null

If you check the file size, it will be zero. /dev/null simply incinerates all output
written to it. No matter whether you direct or append output to this file, its size
always remains zero. It is actually a pseudo-device which is useful for redirecting
error messages away from the terminal.

/dev/tty: The second special file in the UNIX system is the one indicating one's
terminal, /dev/tty. This is not the file that represents standard output or standard
error. Commands generally don't write to this file, but you will need to redirect
some statements to it in shell scripts:

$ who > /dev/tty

If two users issue the same command, each gets the output on their own
terminal, say /dev/pts/1 and /dev/pts/2.

/dev/tty can thus be accessed independently by several users without conflict.

$ foo.sh > redirect.txt

Redirecting a script's output implies redirecting the output of all statements in the
script. However, if an echo statement in the script writes to /dev/tty, its output is
not affected by the redirection.

8.6 Start-up Shell Scripts

When you first login to a shell, your shell runs a systemwide start-up script
(usually called /etc/profile under sh, bash and ksh and /etc/.login under csh). It
then looks in your home directory and runs your personal start-up script (.profile
under sh, bash and ksh and .cshrc under csh and tcsh). Your personal start-up
script is therefore usually a good place to set up environment variables such as
PATH, EDITOR etc. For example with bash, to add the directory ~/bin to your
PATH, you can include the line:

export PATH=$PATH:~/bin

in your .profile. If you subsequently modify your .profile and you wish to import
the changes into your current shell, type:

$ source .profile
or
$ . .profile

The source command is built into the shell. It ensures that changes to the
environment made in .profile affect the current shell, and not the shell that would
otherwise be created to execute the .profile script.
With csh, to add the directory ~/bin to your PATH, you can include the line:
set path = ( $path $HOME/bin )

in your .cshrc.
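The difference between running and sourcing a file can be seen with any file of shell commands, not just .profile. A minimal sketch (the file name demo_profile and the variable MY_EDITOR are arbitrary):

```shell
MY_EDITOR=""    # start with the variable empty in the current shell

# A small file that sets a variable, much as .profile might:
cat > demo_profile <<'EOF'
MY_EDITOR=vi
export MY_EDITOR
EOF

sh demo_profile         # runs in a child shell: the current shell is unaffected
echo "after sh: MY_EDITOR='$MY_EDITOR'"

. ./demo_profile        # sourcing runs the commands in the current shell
echo "after source: MY_EDITOR='$MY_EDITOR'"

rm -f demo_profile
```

Only after sourcing does MY_EDITOR hold the value vi in the current shell, which is exactly why changes to .profile must be sourced to take effect.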

Command Substitution

Command Substitution is a very handy feature of the bash shell. It enables you to
take the output of a command and treat it as though it was written on the
command line. For example, if you want to set the variable X to the output of a
command, the way you do this is via command substitution.

There are two forms of command substitution:

  $( ) expansion
 backtick expansion

$( ) expansion works as follows: $(commands) expands to the output of
commands. This permits nesting, so commands can themselves contain $( )
expansions.

Backtick expansion expands `commands` to the output of commands.

An example:

#!/bin/bash
files="$(ls)"
web_files=`ls public_html`
echo $files
echo $web_files
X=`expr 3 \* 2 + 4` # expr evaluates arithmetic expressions; man expr for details
echo $X

Note that even though the output of ls contains newlines, the variables do not.
Bash variables can not contain newline Anyway, the advantage of the $()
substitution method is almost self evident: it is very easy to nest. It is supported
by most of the bourne shell varients.However, the backtick substitution is slightly
more readable, and is supported by even the most basic.
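Nesting is where $( ) really pays off. A minimal sketch (the variable names are arbitrary):

```shell
# The inner $(echo hello) runs first; its output is embedded in the outer command.
outer=$(echo "$(echo hello) world")
echo "$outer"

# Backticks do the same job for a single level of substitution:
year=`date +%Y`
echo "this year is $year"
```

Nesting backticks is possible but requires awkward escaping (`` \` ``), which is why $( ) is preferred for anything non-trivial.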
Single Quotes versus double quotes

Basically, variable names are expanded within double quotes, but not within
single quotes. If you do not need to refer to variables, single quotes are
good to use, as the results are more predictable.
An example

#!/bin/bash
echo -n '$USER='    # -n option stops echo from breaking the line
echo "$USER"
echo "\$USER=$USER" # this does the same thing as the first two lines
The output looks like this (assuming your username is elflord)
$USER=elflord

$USER=elflord

so a literal $ can still be produced inside double quotes by escaping it.
Double quotes are more flexible, but less predictable. Given the choice
between single quotes and double quotes, use single quotes.

Using Quotes to enclose your variables

It is a good idea to protect variable names in double quotes. This is most
important if your variable's value either (a) contains spaces or (b) is the
empty string. An example is as follows:

#!/bin/bash
X=""
if [ -n $X ]; then # -n tests to see if the argument is non empty
echo "the variable X is not the empty string"
fi

This script will give the following output:

the variable X is not the empty string


Because the shell expands $X to the empty string. The expression [ -n ] returns
true (since it is not provided with an argument). A better script would have been:

#!/bin/bash
X=""
if [ -n "$X" ]; then # -n tests to see if the argument is non empty
echo "the variable X is not the empty string"
fi

In this example, the expression expands to [ -n "" ] which returns false, since the
string enclosed in inverted commas is clearly empty.

Variable Expansion in action

The shell really does "expand" variables. Here is an example:

#!/bin/bash
LS="ls"
LS_FLAGS="-al"

$LS $LS_FLAGS $HOME

This looks a little enigmatic. It actually executes the command

ls -al /home/elflord

(assuming that /home/elflord is your home directory). That is, the shell simply
replaces the variables with their values, and then executes the command.

Using Braces to Protect Your Variables

Suppose you want to echo the value of the variable X, followed immediately by
the letters "abc". Let's have a try :

#!/bin/bash
X=ABC
echo "$Xabc"
This gives no output. The reason is that the shell thought we were asking for
the variable Xabc, which is uninitialised. The way to deal with this is to put
braces around X to separate it from the other characters. The following gives
the desired result:

#!/bin/bash
X=ABC
echo "${X}abc"

Capturing command output

Any UNIX command or program can be executed from a shell script just as if you
would execute it on the command line. You can also capture the output of a
command and assign it to a variable by using backquotes ` `:

#!/bin/sh
lines=`wc -l $1`
echo "the file $1 has $lines lines"

This script outputs the number of lines in the file passed as the first parameter.

Arithmetic operations

The Bourne shell doesn't have any built-in ability to evaluate simple
mathematical expressions. Fortunately the UNIX expr command is available to do
this. It is frequently used in shell scripts with backquotes to update the
value of a variable. For example:

lines=`expr $lines + 1`

adds 1 to the variable lines (note that there must be no spaces around the =).

expr supports the operators +, -, *, /, % (remainder), <, <=, =, !=, >=, >, | (or) and
& (and).
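A minimal sketch of expr in use (the variable names here are arbitrary):

```shell
# Update a counter with expr, as a script might inside a loop.
lines=10
lines=`expr $lines + 1`     # note: no spaces around the =
echo "lines is now $lines"

# expr also handles the remainder operator:
rem=`expr 17 % 5`
echo "17 % 5 = $rem"
```

The * operator must be escaped as \* when used with expr, since the shell would otherwise expand it as a filename wildcard.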
8.3 Shell Programming / Shell Scripts

Shell Script: When a group of commands has to be executed regularly, they
should be stored in a file and the file itself executed as a shell script or
shell program. The extension conventionally used is .sh

Shell scripts are executed in a separate child shell process, and this
sub-shell need not be the same as your login shell.

#: anything to the right of a # on a line is a comment, and it can be placed
anywhere in the program. If #! appears at the start of the first line, it is
called the interpreter line; it is followed by the path name of the shell to
be used for running the script.

E.g.:

#!/bin/sh
# script.sh: a sample shell script
echo "Today's date: `date`"
echo "This month's calendar:"
cal `date "+%m 20%y"`
echo "My shell: $SHELL"

To run the script, make it executable first and then invoke the script name.

$chmod +x script.sh
$script.sh

Explicitly child can be spawned with the script name as argument.

$sh script.sh
Making Scripts Interactive (read)

The read statement is the shell's internal tool for taking input from the
user, i.e. making a script interactive. It is used with one or more variables.
Input supplied through the standard input is read into these variables.

read name

The script pauses at this point to take input from the keyboard. Whatever you
enter gets stored in the variable name.

#!/bin/sh
#emp1.sh
#
echo "Enter the pattern to be searched: \c"
read pname
echo "Enter the file to be used: \c"
read flname
echo "Searching for $pname from $flname"
grep "$pname" $flname
echo "Selected records shown above"

8.4 Shell Variables and the Environment

A shell lets you define variables (like most programming languages). A variable is a
piece of data that is given a name. Once you have assigned a value to a variable,
you access its value by prepending a $ to the name:

$ bob='hello world'
$ echo $bob
hello world
$

Variables created within a shell are local to that shell, so only that shell can access
them. The set command will show you a list of all variables currently defined in a
shell. If you wish a variable to be accessible to commands outside the shell, you
can export it into the environment:

$ export bob
(under csh you used setenv). The environment is the set of variables that are
made available to commands (including shells) when they are executed. UNIX
commands and programs can read the values of environment variables, and
adjust their behaviour accordingly. For example, the environment variable PAGER
is used by the man command (and others) to see what command should be used
to display multiple pages. If you say:

$ export PAGER=cat

and then try the man command (say man pwd), the page will go flying past
without stopping. If you now say:

$ export PAGER=more

normal service should be resumed (since now more will be used to display the
pages one at a time). Another environment variable that is commonly used is
the EDITOR variable, which specifies the default editor to use (so you can set
this to vi or emacs or whichever other editor you prefer). To find out which environment
variables are used by a particular command, consult the man pages for that
command.
Another interesting environment variable is PS1, the main shell prompt string
which you can use to create your own custom prompt. For example:

$ export PS1="(\h) \w> "
(lumberjack) ~>

The shell often incorporates efficient mechanisms for specifying common parts of
the shell prompt (e.g. in bash you can use \h for the current host, \w for the
current working directory, \d for the date, \t for the time, \u for the current user
and so on - see the bash man page).

Another important environment variable is PATH. PATH is a list of directories that


the shell uses to locate executable files for commands. So if the PATH is set to:

/bin:/usr/bin:/usr/local/bin:.

and you typed ls, the shell would look for /bin/ls, /usr/bin/ls etc. Note that
the PATH contains '.', i.e. the current working directory. This allows you to
create a shell script or program and run it as a command from your current
directory without having to explicitly say "./filename".

Note that PATH has nothing to do with filenames that are specified as
arguments to commands (e.g. cat myfile.txt would only look for ./myfile.txt,
not for /bin/myfile.txt, /usr/bin/myfile.txt etc.)
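The PATH lookup can be inspected directly. A minimal sketch (appending /tmp is purely illustrative; any directory would do):

```shell
# command -v reports where on the PATH a command would be found.
ls_path=`command -v ls`
echo "ls resolves to: $ls_path"

# Append a directory to the PATH for this shell session:
PATH=$PATH:/tmp
echo "$PATH" | grep /tmp > /dev/null && echo "/tmp is now on the PATH"
```

Export PATH (as shown in the start-up script section) if child processes should also see the modified search path.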

8.4 Simple Shell Scripting


USING COMMAND LINE ARGUMENTS:

Shell scripts can also accept arguments from the command line. This way they
can be run non-interactively and be used with redirection and pipelines. When
arguments are specified with a shell script, they are assigned to certain
special variables, known as positional parameters.

The first argument is read into the parameter $1, the second argument into $2
and so on. A few more special parameters are:

$* It stores the complete set of positional parameter as a single string

$# It is set to the number of arguments specified. This lets you design
scripts that check whether the right number of arguments has been
entered.

$0 Holds the command name itself.

$! PID of the last background job

$$ PID of the current shell

$? Exit status of last command
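The set command replaces the current positional parameters, which makes it easy to experiment with these variables without writing a separate script. A minimal sketch (the argument words are arbitrary):

```shell
set alpha beta gamma    # now $1=alpha, $2=beta, $3=gamma

echo "number of arguments: $#"
echo "all arguments: $*"
echo "first argument: $1"
echo "PID of this shell: $$"

count=$#                # capture for later use
first=$1
```

Inside a real script, these same parameters would instead hold whatever the user typed after the script's name on the command line.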


exit and Exit status of command:

The two common exit values are:

exit 0    Used when everything went fine
exit 1    Used when something went wrong

e.g.

$ grep director emp.lst >/dev/null; echo $?
0                                          Success

$ grep manager emp.lst >/dev/null; echo $?
1                                          Failure (pattern not found)

$ grep director emp3.lst >/dev/null; echo $?
grep: can't open emp3.lst
2                                          Failure (file not found)
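The same exit-status behaviour can be captured in $? from a script. A minimal sketch (emp_demo.lst is a throwaway file created just for the demonstration):

```shell
# A successful grep returns 0; an unsuccessful search returns 1.
echo "director smith" > emp_demo.lst

grep director emp_demo.lst > /dev/null
ok=$?                                   # 0: pattern found

grep manager emp_demo.lst > /dev/null
notfound=$?                             # 1: pattern not found

echo "ok=$ok notfound=$notfound"
rm -f emp_demo.lst
```

Note that $? must be saved immediately: it is overwritten by the very next command.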

The logical operators && and ||-Conditional Execution:

cmd1 && cmd2
cmd1 || cmd2

The && delimits two commands; the command cmd2 is executed only when cmd1
succeeds. You can use it with grep in this way:

$ grep 'director' emp.lst && echo "pattern found in file"

The || operator plays an inverse role; the second command is executed only
when the first fails. If you grep a pattern from a file without success, you
can notify the failure:

$ grep 'manager' emp.lst || echo "Pattern not found"
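Both operators can be exercised against a throwaway file (emp_demo.lst is created purely for this sketch):

```shell
echo "director smith" > emp_demo.lst

# && : the echo runs only because the grep succeeds.
msg1=$(grep director emp_demo.lst > /dev/null && echo "pattern found in file")

# || : the echo runs only because the grep fails.
msg2=$(grep manager emp_demo.lst > /dev/null || echo "Pattern not found")

echo "$msg1 / $msg2"
rm -f emp_demo.lst
```

This conditional execution is what makes && and || useful as a compact substitute for a full if statement.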


Consider the following simple shell script, which has been created (using an
editor) in a text file called simple:

#!/bin/sh
# this is a comment
echo "The number of arguments is $#"
echo "The arguments are $*"
echo "The first is $1"
echo "My process number is $$"
echo "Enter a number from the keyboard: "
read number
echo "The number you entered was $number"

The shell script begins with the line "#!/bin/sh" . Usually "#" denotes the start of a
comment, but #! is a special combination that tells UNIX to use the Bourne shell
(sh) to interpret this script. The #! must be the first two characters of the script.

The arguments passed to the script can be accessed through $1, $2, $3 etc. $*
stands for all the arguments, and $# for the number of arguments. The process
number of the shell executing the script is given by $$. The read number
statement assigns keyboard input to the variable number.

To execute this script, we first have to make the file simple executable:

$ ls -l simple
-rw-r--r-- 1 will finance 175 Dec 13 simple

$ chmod +x simple

$ ls -l simple

-rwxr-xr-x 1 will finance 175 Dec 13 simple

$ ./simple hello world
The number of arguments is 2
The arguments are hello world
The first is hello
My process number is 2669
Enter a number from the keyboard:
5
The number you entered was 5
$

We can use input and output redirection in the normal way with scripts, so:

$ echo 5 | ./simple hello world

would produce similar output but would not pause to read a number from the
keyboard.

8.5 More Advanced Shell Scripting

The if statement makes two-way decisions depending on the fulfilment of a
certain condition.

 if-then-else statements

Shell scripts are able to perform simple conditional branches:

if [ test ]
then
  commands-if-test-is-true
else
  commands-if-test-is-false
fi
Syntax 1

if command is successful
then
  execute commands
else
  execute commands
fi

Syntax 2

if command is successful
then
  execute commands
fi

Syntax 3

if command is successful
then
  execute commands
elif command is successful
then
  execute commands
else
  ...
fi
Using test and [] to Evaluate Expression:

When you use if to evaluate expressions, you require the test statement,
because the true and false values returned by expressions can't be handled by
if directly. test uses certain operators to evaluate the condition on its
right and returns either a true or false exit status, which is then used by if
for making decisions. test works in three ways:

Compares two numbers
Compares two strings, or a single one for a null value
Checks a file's attributes

Numeric Comparison:

The numerical comparison operators used by test have a form different from
what you may have seen elsewhere. They always begin with a - (hyphen),
followed by a two-character word, and are enclosed on either side by
whitespace.

Operator Meaning

-eq    Equal to
-ne    Not equal to
-gt    Greater than
-ge    Greater than or equal to
-lt    Less than
-le    Less than or equal to
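These operators compare values as integers, not as strings. A minimal sketch (the values 10 and 7 are arbitrary):

```shell
x=10
y=7
if [ $x -gt $y ]; then
  result="x is greater"
else
  result="x is not greater"
fi
echo "$result"

# -eq compares numbers, = compares strings: 07 and 7 are the same
# number but different strings.
[ 07 -eq 7 ] && num="numerically equal"
[ 07 = 7 ] || str="different as strings"
echo "$num; $str"
```

This distinction matters whenever input may carry leading zeros or whitespace.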

String Tests Used by test

Test        True if

s1 = s2     string s1 equals s2
s1 != s2    string s1 is not equal to s2
-n stg      string stg is not a null string
-z stg      string stg is a null string
stg         string stg is assigned and not null
s1 == s2    string s1 equals s2 (Korn and Bash only)

File Tests

test can be used to test various file attributes, like a file's type (file,
directory or symbolic link) or its permissions (read, write, execute, SUID).

Test        True if

-f file     file exists and is a regular file
-r file     file exists and is readable
-w file     file exists and is writable
-x file     file exists and is executable
-d file     file exists and is a directory
-s file     file exists and has a size greater than 0
-e file     file exists (Korn and Bash only)
-u file     file exists and has its SUID bit set
-k file     file exists and has its sticky bit set
f1 -nt f2   f1 is newer than f2
f1 -ot f2   f1 is older than f2
f1 -ef f2   f1 is linked with f2
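A few of these file tests in action, against objects created just for the demonstration (the names demo_file and demo_dir are arbitrary):

```shell
# Create a regular, non-empty file and a directory to test against.
echo "some data" > demo_file
mkdir -p demo_dir

[ -f demo_file ] && kind="regular file"
[ -s demo_file ] && size="non-empty"
[ -d demo_dir ] && dir="directory"

echo "demo_file is a $size $kind; demo_dir is a $dir"
rm -rf demo_file demo_dir
```

Scripts commonly combine these with if to refuse to run on a missing or unreadable input file before doing any real work.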

The test condition may involve file characteristics or simple string or
numerical comparisons. The [ used here is actually the name of a command
(/bin/[) which performs the evaluation of the test condition. Therefore there
must be spaces before and after it, as well as before the closing bracket.
Some common test conditions are:

-s file          true if file exists and is not empty
-f file          true if file is an ordinary file
-d file          true if file is a directory
-r file          true if file is readable
-w file          true if file is writable
-x file          true if file is executable
$X -eq $Y        true if X equals Y
$X -ne $Y        true if X not equal to Y
$X -lt $Y        true if X less than Y
$X -gt $Y        true if X greater than Y
$X -le $Y        true if X less than or equal to Y
$X -ge $Y        true if X greater than or equal to Y
"$A" = "$B"      true if string A equals string B
"$A" != "$B"     true if string A not equal to string B
$X ! -gt $Y      true if X is not greater than Y
$E -a $F         true if expressions E and F are both true
$E -o $F         true if either expression E or expression F is true
String Comparisons:

test can be used to compare strings with yet another set of operators.
Equality is performed with = and inequality with !=.

e.g.
#!/bin/sh
# Checks user input for null values
#
if [ $# -eq 0 ]; then
  echo "Enter the string to be searched: \c"
  read pname
  if [ -z "$pname" ]; then
    echo "You have not entered the string"
    exit 1
  fi

  echo "Enter the filename to be used: \c"
  read flname
  if [ ! -n "$flname" ]; then
    echo "You have not entered the filename"
    exit 2
  fi
  emp3a.sh "$pname" "$flname"
else
  emp3a.sh $*
fi

e.g. 2

#!/bin/sh
# Using test, $0 and $# in an if-elif-else
#
if test $# -eq 0; then
  echo "Usage: $0 pattern file" >/dev/tty
elif test $# -eq 2; then
  grep "$1" $2 || echo "$1 not found in $2" >/dev/tty
else
  echo "You didn't enter two arguments" >/dev/tty
fi

 for loops

Sometimes we want to loop through a list of files, executing some commands on
each file. We can do this by using a for loop:

for variable in list
do
  statements (referring to $variable)
done

The following script sorts each text file in the current directory:

#!/bin/sh
for f in *.txt
do
echo sorting file $f
cat $f | sort > $f.sorted
echo sorted file has been output to $f.sorted
done

 while loops

Another form of loop is the while loop:

while [ test ]
do
statements (to be executed while test is true)
done
The following script waits until a non-empty file input.txt has been created:

#!/bin/sh
while [ ! -s input.txt ]
do
  echo waiting...
  sleep 5
done
echo input.txt is ready

You can abort a shell script at any point using the exit statement, so the following
script is equivalent:

#!/bin/sh
while true
do
  if [ -s input.txt ]
  then
    echo input.txt is ready
    exit
  fi
  echo waiting...
  sleep 5
done
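A while loop combined with expr gives Bourne-compatible counted iteration. A minimal sketch (summing 1 to 5; the bound is arbitrary):

```shell
# Sum the integers 1..5 using while and expr.
i=1
sum=0
while [ $i -le 5 ]
do
  sum=`expr $sum + $i`
  i=`expr $i + 1`
done
echo "sum of 1..5 is $sum"
```

The loop ends as soon as the [ $i -le 5 ] test returns a false exit status.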

 case statements

case statements are a convenient way to perform multiway branches where one
input pattern must be compared to several alternatives:

case variable in
pattern1)
  statement (executed if variable matches pattern1)
  ;;
pattern2)
  statement
  ;;
etc.
esac
The following script uses a case statement to have a guess at the type of
non-directory, non-executable files passed as arguments on the basis of their
extensions (note how the "or" operator | can be used to denote multiple
patterns, how "*" has been used as a catch-all, and the effect of the
backquotes `):

#!/bin/sh
for f in $*
do
  if [ -f $f -a ! -x $f ]
  then
    case $f in
    core)
      echo "$f: a core dump file"
      ;;
    *.c)
      echo "$f: a C program"
      ;;
    *.cpp|*.cc|*.cxx)
      echo "$f: a C++ program"
      ;;
    *.txt)
      echo "$f: a text file"
      ;;
    *.pl)
      echo "$f: a PERL script"
      ;;
    *.html|*.htm)
      echo "$f: a web document"
      ;;
    *)
      echo "$f: appears to be "`file -b $f`
      ;;
    esac
  fi
done
