
See discussions, stats, and author profiles for this publication at:
https://www.researchgate.net/publication/277138016

Linux Kernel Internals

Research · May 2015
DOI: 10.13140/RG.2.1.2427.2802

Author: Ata-ur-Rasool Haq, University of Johannesburg

All content following this page was uploaded by Ata-ur-Rasool Haq on 25 May 2015.


10/13/2014 Linux Kernel Internals
An introductory overview focusing on the
SLAB Allocator, Process Scheduler and
I/O Scheduler

Ata-ur-Rasool Haq
ATANASEERI@YAHOO.COM
Contents
Abstract
Introduction
  Operating System
  Kernel
  Linux Operating System
Process Scheduler
  Process Scheduling Policy
  Understanding Timeslice
I/O Scheduler
  The Linus Elevator
  The Deadline I/O Scheduler
  The Anticipatory I/O Scheduler
SLAB Allocator
Evolution and Conclusion
References

Abstract
To gain a sufficient understanding of the Linux operating system, or of any other
operating system for that matter, and of what makes it functional at its very core, we
need to thoroughly understand its kernel. The kernel, in an abstract sense, connects the
body of the computer to its soul: it is the bridge between the hardware and the software
of a computer. It also manages resource allocation and the distribution of processing
time among programs running on the hardware.

Introduction
To understand the concept of kernels, or more specifically the Linux kernel, one should
first understand what Linux is and, in turn, what an operating system is. Knowing the
general notions first makes it easier to grasp the more specialized concepts that follow.

Operating System
As the name itself suggests, an operating system is, in an abstract sense, what enables us
to operate a computer system; it is the driving force of any computer, especially where
human-computer interaction is concerned. Since human-computer interaction involves
providing input through some kind of hardware, this discussion extends to the working of
the whole computer system, including hardware-software collaboration.

The operating system software is the chief piece of software, the portion of the computing
system that manages all of the hardware and all of the other software. To be specific, it
controls every file, every device, every section of main memory, and every nanosecond
of processing time. It controls who can use the system and how. In short, it’s the boss.
Therefore, each time the user sends a command, the operating system must make sure
that the command is executed; or, if it’s not executed, it must arrange for the user to get a
message explaining the error. Remember: This doesn’t necessarily mean that the
operating system executes the command or sends the error message—but it does control
the parts of the system that do. (McHoes & Flynn, 2011)

FIGURE 1: OPERATING SYSTEMS TURN UGLY HARDWARE INTO BEAUTIFUL ABSTRACTIONS. (TANENBAUM, 2009)

To put it in modest terms: in a modern computer architecture, the user typically interacts
only with an application program of some sort to achieve a goal. That application sits on
top of the operating system, which in turn relies on the kernel to carry out the requested
action or command on our behalf. Note that top-level application programs are usually
very user-friendly and require little effort to operate, precisely because all the
complicated back-end work is handled by the OS.

Operating systems fall into a few main categories: Batch systems, designed to handle sets
of jobs processed in sequence; Interactive systems, intended to support interactive
computing for users connected to the computer system via communication lines;
Real-time systems, built to sustain application programs with very tight timing
constraints; and, finally, Hybrid systems, which support both batch and interactive
computing. (Garrido, Schlesinger, & Hoganson, 2013)

Kernel
Previously, the discussion of operating systems took off from the idea of human-
computer interaction, which led to hardware-software interaction. Simply put, hardware-
software interaction has a “middle-man” called the kernel. It is the central part of any
operating system, and it is itself a kind of software.

Between these bits of hardware and the applications you use every day is a layer of
software that is responsible for making all of the hardware work efficiently and building
an infrastructure on top of which the applications you use can work. This layer of
software is the operating system, and its core is the kernel. In modern operating systems,
the kernel is responsible for the things you normally take for granted: virtual memory,
hard-drive access, input/output handling and so forth. Generally larger than most user
applications, the kernel is a complex and fascinating piece of code that is usually written
in a mix of assembly, the low-level machine language, and C. In addition, the kernel uses
some underlying architecture properties to separate itself from the rest of the running
programs. (Perla & Oldani, 2011)

This critically important role requires the kernel to be active and efficient at any time
the machine is in use. Therefore, when the computer starts its normal operations, one of
the first tasks is to load the kernel into main memory, where it occupies a reserved
space and actively serves the operating system by performing its tasks until the
computer is shut off.

For a deeper look, if we were to divide the OS structure into a few layers, we would
first have the Graphical User Interface (GUI) at the very top, and beneath it the System
Call Interface (SCI), which ‘calls’ system or hardware operations through the kernel
sitting under it.
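The crossing from user space into the kernel through the SCI can be illustrated with a tiny user-space sketch (Python is used here purely for brevity; in practice the C library's syscall wrappers play this role). Every call below ends in a trap into the kernel:

```python
import os
import tempfile

def sci_demo() -> int:
    """Issue a handful of system calls and return the bytes written."""
    pid = os.getpid()               # getpid(2): trap into the kernel for our PID
    assert pid > 0                  # the kernel always hands back a positive PID
    fd, path = tempfile.mkstemp()   # open(2) happens under the hood here
    try:
        n = os.write(fd, b"hello via write(2)\n")  # write(2): the kernel does the I/O
    finally:
        os.close(fd)                # close(2)
        os.unlink(path)             # unlink(2)
    return n
```

Each `os.*` call is a thin wrapper: the privileged work (file-table manipulation, disk I/O) is done by the kernel, and only the result crosses back over the SCI into the program.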

The system services interface (or the system call interface) separates the kernel
from the application layer. Located above the hardware, the kernel is the core and
the most critical part of the operating system and needs to be always resident in
memory. A detailed knowledge about the different components of the operating
system, including these lower-level components, corresponds to an internal view of
the system. (Garrido, Schlesinger, & Hoganson, 2013)

The following figure places the concepts mentioned above into perspective, showing
where the kernel sits in this system of interactions.

FIGURE 2: ABSTRACT IDEA OF THE KERNEL’S POSITION

Linux Operating System


Linux has been around for a long time; it actually emerged out of the renowned UNIX
operating system. It is now perhaps the most powerful, open-source (free) and most
reliable operating system available. With an enormous range of commercial and non-
commercial distributions on offer, the OS comprises so many features that one can find
a distribution suited to almost any specific need.

Linux’s life began in the hands of Linus Torvalds at the University of Helsinki in
Finland. While the Linux we know today has been developed with the assistance of
programmers world-wide, Linus Torvalds still retains control of the evolving core of the
Linux operating system: the kernel. Torvalds originally intended to develop Linux as a
hobby. Early versions didn’t have the end-user in mind, instead providing the barest
bones of functionality to allow UNIX programmers the apparent joy of programming the
kernel. (Danesh, 1999)

Today, Linux has reached the point of full commercial application in all fields, from
scientific research to the business environment, and it is making its way toward, but is
not limited to, compact mobile operating systems. It is also overcoming the interface-
friendliness issues and the shortage of hardware and software compatibility support
which, in the past, made the general public reluctant to use it.

If the world hadn't contributed to building Linux from scratch in the 1990s, FreeBSD
operating system might be more advanced than any other OS is today. Plenty of wheels
were reinvented during Linux's formative years, and perhaps without the need to take
those steps back, FreeBSD OS may have taken faster steps forward. (Venezia, 2012)

FIGURE 3: LINUX MILESTONES TIMELINE (CRAIG, 2013)


Standards such as the Linux Standard Base, and the organizations behind them, make it
possible for Linux to continue to be a stable, productive operating system into the
future. (Negus, 2012)

Process Scheduler
In reality a computer can only do one task at a time, even if that task runs for a
microsecond, because it has limited resources, especially processing capacity and main
memory. The operating system therefore has to manage these resources in such a way
that we do not have to wait for one task to finish before performing another. This
“handling” of resources is achieved through process scheduling in the operating system
kernel.

The Scheduler is the component of the kernel that selects which process to run next. This
process scheduler can be viewed as the code that divides the finite resource of processing
time between the runnable processes on a system. (Love, 2004)

One notable advantage of scheduling processes is that it allows multitasking, the
seemingly simultaneous execution of tasks, by swapping resources between processes
transparently, without disturbing the user. Consequently it gives the impression that all
tasks are running at the same time, even though some are not actually running but are
waiting in main memory for their turn.

Multitasking operating systems come in two flavors: cooperative multitasking and
preemptive multitasking. Linux, like all UNIX variants and most modern operating
systems, provides preemptive multitasking, in which the scheduler decides when a
process is to stop running and a new process is to resume running. The act of
involuntarily suspending a running process is called preemption. The time a process
runs before it is preempted is predetermined and is called the timeslice of the process;
it gives each process a slice of the processor’s time. Managing the timeslice enables the
scheduler, or more generally the kernel, to make global scheduling decisions for the
system. It also prevents any one process from monopolizing the system. (Love, 2004)

Cooperative multitasking, on the other hand, allows a process to stop only when the
process itself decides to, which clearly poses problems: the resources may be occupied
by one process for an indeterminable amount of time. This approach is therefore nearly
obsolete and is encountered mainly in academic discussions of related concepts.

Process Scheduling Policy

Policy, in this context of handling processes and resource allocation, refers to the rules
or algorithms deployed by the kernel to determine which process should run first or
next. This policy is of the utmost importance for the flow of the system, and it also has
a great impact on the end-user experience. As mentioned before, the Linux OS makes
heavy use of time-sharing techniques as part of its scheduling policy.

In addition to time-slicing, the Linux kernel’s scheduling policy also includes the concept of
ranking processes according to their priority. Complicated algorithms are sometimes used
to derive the current priority of a process, but the end result is the same: each process is
associated with a value that tells the scheduler how appropriate it is to let the process run
on a CPU. In Linux, process priority is dynamic. The scheduler keeps track of what
processes are doing and adjusts their priorities periodically; in this way, processes that
have been denied the use of a CPU for a long time interval are boosted by dynamically
increasing their priority. Correspondingly, processes running for a long time are
penalized by decreasing their priority. When speaking about scheduling, processes are
traditionally classified as I/O-bound or CPU-bound. The former make heavy use of I/O
devices and spend much time waiting for I/O operations to complete; the latter carry on
number-crunching applications that require a lot of CPU time. (Bovet & Cesati, 2006)
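The boost-and-penalize behaviour described above can be sketched as a toy simulation. The numbers and the −5/+1 adjustments here are invented for illustration; they are not the kernel's real formula:

```python
def recompute_priorities(procs, ran_pid):
    """After a scheduling round, penalize the process that just ran and
    boost everyone who had to wait, as the text describes."""
    for p in procs:
        if p["pid"] == ran_pid:
            p["prio"] = max(p["prio"] - 5, 0)     # CPU-bound hog loses priority
        else:
            p["prio"] = min(p["prio"] + 1, 139)   # waiters are boosted

def pick_next(procs):
    """The scheduler simply runs the highest-priority runnable process."""
    return max(procs, key=lambda p: p["prio"])["pid"]
```

Running a few rounds shows that a process which keeps winning is gradually penalized until a long-waiting process overtakes it, which is exactly the starvation-avoidance effect dynamic priorities exist to provide.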

Understanding Timeslice
The timeslice, in essence, is just a number determining how long a process stays in the
run state and how long it waits, preempted, in memory. The scheduling policy must
decide the timeslice duration, which is a difficult decision: if the timeslice is too long,
interactive performance suffers; if it is too short, a significant share of CPU time is
wasted on swapping processes in and out of the processor.

In Linux the timeslice is tightly related to process priority, meaning that processes with
a higher priority or rank will run before those with a lower priority or rank.

The Linux scheduler bumps the priority of interactive tasks, enabling them to run more
frequently. Consequently, the Linux scheduler offers a relatively high default timeslice,
as shown in the figure below. Implementing dynamic time-slices and priorities provides
robust scheduling performance. (Love, 2004)

FIGURE 4: PROCESS TIMESLICE CALCULATION (LOVE, 2004)
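The priority-to-timeslice relationship can be approximated with a simple linear mapping. The bounds (5 ms minimum, 800 ms maximum) follow Love's description of the 2.6 O(1) scheduler; the linear interpolation itself is an illustrative assumption, not the kernel's exact formula:

```python
MIN_TIMESLICE_MS, MAX_TIMESLICE_MS = 5, 800   # bounds per Love (2004)

def timeslice_ms(nice: int) -> int:
    """Map a nice value (-20 = highest priority .. 19 = lowest) to a
    timeslice: higher-priority tasks get longer slices."""
    assert -20 <= nice <= 19
    frac = (19 - nice) / 39                   # 1.0 at nice -20, 0.0 at nice 19
    return MIN_TIMESLICE_MS + round(frac * (MAX_TIMESLICE_MS - MIN_TIMESLICE_MS))
```

The mapping makes the trade-off from the text concrete: an interactive, high-priority task gets a long slice and so runs frequently and uninterrupted, while a low-priority task is bounded to the minimum.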

I/O Scheduler
Input/output scheduling refers to the kernel’s handling of block devices and interaction
with them. Like the process scheduler, the I/O scheduler is a mandatory kernel
component for a good OS experience. It manages one of the slowest devices in a
computer system: the hard drive. Because of the disk’s mechanical nature, a lot of time
is lost in physical operations such as moving the disk head (seeking), and of course
there is a great deal of platter rotation as well, so the kernel has to have a way to work
with and around these shortcomings and still give a smooth feel to computer operations.

When a kernel component wishes to read or write some disk data, it actually creates a
block device request. That request essentially describes the requested sectors and the kind
of operation to be performed on them (read or write). However, the kernel does not
satisfy a request as soon as it is created—the I/O operation is just scheduled and will be
performed at a later time. This artificial delay is paradoxically the crucial mechanism for
boosting the performance of block devices. When a new block data transfer is requested,
the kernel checks whether it can be satisfied by slightly enlarging a previous request that
is still waiting (i.e., whether the new request can be satisfied without further seek
operations). Because disks tend to be accessed sequentially, this simple mechanism is
very effective. (Bovet & Cesati, 2006)

The I/O scheduler works by managing a block device’s request queue and performs two
primary tasks to minimize seeks: merging and sorting. Merging is the coalescing of two
or more requests into one. (Love, 2004) For example, if two different requests want to
read from two different but adjacent sectors, then these requests can be merged
together; this is efficient because the data access time is reduced.

Sorting refers to ordering the requests in the queue in such a way that requests for one
sector group end up next to each other in the queue.
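The two tasks can be sketched in a few lines. The request representation (a starting sector plus a count of sectors) is a simplification of the real block-layer structures:

```python
def submit(queue, sector, count):
    """Add a request for `count` sectors starting at `sector`: merge it
    into an adjacent pending request if possible, otherwise insert it and
    keep the queue sorted by sector."""
    for req in queue:
        if req["sector"] + req["count"] == sector:   # new request follows req: back merge
            req["count"] += count
            return
        if sector + count == req["sector"]:          # new request precedes req: front merge
            req["sector"] = sector
            req["count"] += count
            return
    queue.append({"sector": sector, "count": count})
    queue.sort(key=lambda r: r["sector"])            # sorting keeps disk neighbours adjacent
```

Submitting requests for sectors 0–7 and then 8–15, for instance, yields a single 16-sector request, so the disk head services both with one sequential pass.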

The Linus Elevator

The Linus Elevator, named after Linus Torvalds, was the default I/O scheduler in Linux
2.4. It performs both merging and sorting. When a request is added to the queue, it is
first checked against every other pending request to see if it is a possible candidate for
merging. (Love, 2004) When a request is added to the queue, four operations are
possible:

 First, if a request to an adjacent on-disk sector is in the queue, the existing request and the new one
merge.
 Second, if a request in the queue is sufficiently old, the new request is inserted at the tail of the
queue, to keep the older request from being starved any further.
 Next, if there is a suitable location sector-wise in the queue, the new request is inserted there.
 Finally, if no such suitable insertion point exists, the request is inserted at the tail of the queue.
(Love, 2004)
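The four rules above translate almost directly into code. This is an illustrative sketch, not the 2.4 implementation: requests are plain dicts, and the "sufficiently old" threshold is an invented constant:

```python
AGE_LIMIT = 8  # ticks; hypothetical threshold for "sufficiently old"

def elevator_add(queue, new, now):
    """Apply the four Linus-elevator cases, in order, to place `new`."""
    # 1. Merge with an adjacent pending request if one exists.
    for req in queue:
        if req["sector"] + req["count"] == new["sector"]:
            req["count"] += new["count"]
            return
    # 2. If any queued request has waited too long, append at the tail
    #    rather than inserting ahead of it.
    if any(now - req["queued"] > AGE_LIMIT for req in queue):
        queue.append(new)
        return
    # 3. Insert at the first sector-wise suitable position ...
    for i, req in enumerate(queue):
        if new["sector"] < req["sector"]:
            queue.insert(i, new)
            return
    # 4. ... or at the tail if no such position exists.
    queue.append(new)
```

Note that rule 2 only dampens starvation rather than preventing it: an old request stops yielding its place to newcomers, but no deadline forces it to be served, which is the weakness the Deadline scheduler addresses next.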

The Deadline I/O Scheduler
The “Deadline” elevator makes use of four queues. Two of them (the sorted queues)
include the read and write requests, respectively, ordered according to their initial sector
numbers. The other two (the deadline queues) include the same read and write requests
sorted according to their “deadlines.” These queues are introduced to avoid request
starvation, which occurs when the elevator policy ignores for a very long time a request
because it prefers to handle other requests that are closer to the last served one. A request
deadline is essentially an expire timer that starts ticking when the request is passed to the
elevator. By default, the expire-time of read requests is 500 milliseconds, while the expire
time for write requests is 5 seconds—read requests are privileged over write requests
because they usually block the processes that issued them. The deadline ensures that the
scheduler looks at a request if it’s been waiting a long time, even if it is low in the sort.
(Bovet & Cesati, 2006)

FIGURE 5: QUEUES OF THE DEADLINE I/O SCHEDULER (LOVE, 2004)
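The dispatch decision can be sketched as follows. Each request lives in both a sector-sorted queue and a FIFO deadline queue; which queue drives the choice depends on whether the oldest deadline has expired. The queues here are simplified Python lists, not the kernel's actual data structures:

```python
def next_request(sorted_q, deadline_q, now_ms):
    """Serve the sector-sorted queue normally, but if the oldest request's
    deadline has passed, serve that request first to prevent starvation.
    Every request is present in both queues and is removed from both."""
    if deadline_q and now_ms >= deadline_q[0]["deadline"]:
        req = deadline_q.pop(0)          # expired: starvation avoidance wins
        sorted_q.remove(req)
    else:
        req = sorted_q.pop(0)            # normal case: best-positioned sector
        deadline_q.remove(req)
    return req
```

Before its deadline, a badly-positioned request keeps losing to requests that are cheaper to reach; once the deadline passes, it jumps the queue regardless of head position.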

The Anticipatory I/O Scheduler


The “Anticipatory” elevator is the most sophisticated I/O scheduler algorithm offered by
Linux. Basically, it is an evolution of the “Deadline” elevator, from which it borrows the
fundamental mechanism: there are two deadline queues and two sorted queues; the I/O
scheduler keeps scanning the sorted queues, alternating between read and write requests,
but giving preference to the read ones. The scanning is basically sequential, unless a
request expires. The default expire time for read requests is 125 milliseconds, while the
default expire time for write requests is 250 milliseconds. (Bovet & Cesati, 2006) The
elevator, however, follows some additional heuristics:

 The elevator might choose a request behind the current position in the sorted
queue, thus forcing a backward seek of the disk head.
 The elevator collects statistics about the patterns of I/O operations triggered by
every process in the system.
(Bovet & Cesati, 2006)

SLAB Allocator
Allocating and freeing data structures is one of the most common operations inside any
kernel. Linux achieves this efficiently by using its slab allocator mechanism.

The basic idea behind the slab allocator is to have caches of commonly used objects kept
in an initialized state available for use by the kernel. Without an object based allocator,
the kernel will spend much of its time allocating, initializing and freeing the same object.
The slab allocator aims to cache the freed object so that the basic structure is preserved
between uses. (Bonwick, 1994)

The slab allocator consists of a variable number of caches that are linked together on a
doubly linked circular list called a cache chain. A cache, in the context of the slab
allocator, is a manager for a number of objects of a particular type, like the mm_struct
or fs_cache cache, and is managed by a struct kmem_cache_s. The caches are linked via
the next field in the cache struct. (Bonwick, 1994)

The slab allocator has three principal aims:

 The allocation of small blocks of memory, to help eliminate the internal
fragmentation that would otherwise be caused by the buddy system;
 The caching of commonly used objects, so that the system does not waste time
allocating, initializing and destroying objects. Benchmarks on Solaris showed
excellent speed improvements for allocations with the slab allocator in use;
 The better utilization of hardware caches, by aligning objects to the L1 or L2
caches.

(Bonwick, 1994)

The slab layer divides different objects into groups called caches, each of which stores a
different type of object. There is one cache per object type. The caches are then divided
into slabs which are composed of one or more physically contiguous pages. Typically,
slabs are composed of only a single page. Each slab contains some number of objects
which are the data structures being cached. (Love, 2004)

Each slab is in one of three states: full, partial or empty. A full slab has no free objects.
An empty slab has no allocated objects. A partial slab has some allocated objects and
some free objects. When some part of the kernel requests a new object, the request is
satisfied from a partial slab first, if one exists. (Love, 2004)
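The full/partial/empty bookkeeping can be modelled with a toy cache. Object slots are just integers here; the real allocator hands out pre-initialized kernel structures carved from contiguous pages:

```python
class Slab:
    """A slab: a fixed set of object slots carved from contiguous pages."""
    def __init__(self, nobjs):
        self.free = list(range(nobjs))
        self.used = set()

    def state(self):
        if not self.free:
            return "full"
        if not self.used:
            return "empty"
        return "partial"

    def alloc(self):
        idx = self.free.pop()
        self.used.add(idx)
        return idx

class Cache:
    """A per-object-type cache: allocations come from a partial slab first,
    then an empty one; a new slab is added only when everything is full."""
    def __init__(self, objs_per_slab=2):
        self.objs_per_slab = objs_per_slab
        self.slabs = [Slab(objs_per_slab)]

    def alloc(self):
        for wanted in ("partial", "empty"):
            for slab in self.slabs:
                if slab.state() == wanted:
                    return slab.alloc()
        grown = Slab(self.objs_per_slab)   # grow the cache with a fresh slab
        self.slabs.append(grown)
        return grown.alloc()
```

Preferring partial slabs keeps allocations concentrated on a few slabs, so empty ones can eventually be returned to the page allocator whole.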

FIGURE 6: LAYOUT OF THE SLAB ALLOCATOR (BONWICK, 1994)

Evolution and Conclusion

The 1.2 Linux scheduler used a circular queue for runnable task management that
operated with a round-robin scheduling policy. This scheduler was efficient for adding
and removing processes (with a lock to protect the structure). In short, the scheduler
wasn't complex but was simple and fast.

Linux version 2.2 introduced the idea of scheduling classes, permitting scheduling
policies for real-time tasks, non-preemptible tasks, and non-real-time tasks. The 2.2
scheduler also included support for symmetric multiprocessing (SMP).

The 2.4 kernel included a relatively simple scheduler that operated in O(N) time (as it
iterated over every task during a scheduling event). The 2.4 scheduler divided time into
epochs, and within each epoch, every task was allowed to execute up to its time slice.
If a task did not use all of its time slice, then half of the remaining time slice was added
to the new time slice to allow it to execute longer in the next epoch. The scheduler
would simply iterate over the tasks, applying a goodness function (metric) to determine
which task to execute next. Although this approach was relatively simple, it was
relatively inefficient, lacked scalability, and was weak for real-time systems. It also
lacked features to exploit new hardware architectures such as multi-core processors.

The early 2.6 scheduler, called the O(1) scheduler, was designed to solve many of the
problems with the 2.4 scheduler—namely, the scheduler was not required to iterate the
entire task list to identify the next task to schedule (resulting in its name, O(1), which
meant that it was much more efficient and much more scalable).

Given the issues facing the O(1) scheduler and other external pressures, something
needed to change. That change came in the way of a kernel patch from Con Kolivas,
with his Rotating Staircase Deadline Scheduler (RSDL), which included his earlier work
on the staircase scheduler. The result of this work was a simply designed scheduler that
incorporated fairness with bounded latency. Kolivas' scheduler impressed many (with
calls to incorporate it into the current 2.6.21 mainline kernel), so it was clear that a
scheduler change was on the way. Ingo Molnar, the creator of the O(1) scheduler, then
developed the Completely Fair Scheduler (CFS) based around some of the ideas from
Kolivas' work. (Jones, 2009)
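The 2.4 carry-over rule quoted above ("half of the remaining time slice was added to the new time slice") is simple arithmetic and can be checked directly:

```python
def next_epoch_slice(base_slice, used):
    """2.4-style epoch rule: a task that leaves part of its slice unused
    gets half of the remainder added to its slice for the next epoch."""
    remaining = base_slice - used
    return base_slice + remaining // 2
```

A mostly-sleeping interactive task thus accumulates a longer slice, which is how the 2.4 scheduler rewarded I/O-bound tasks without an explicit interactivity heuristic.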

The field of the Linux kernel, and of Linux as a whole, is quite interesting. One should
definitely consider Linux kernel development if one has an interest in computer
operating systems; after all, these systems are becoming part of our everyday lives. The
best part of it all is that Linux is an open-source system, which gives us all the power
we need to adapt any component to our own taste, and it is a great tool for empowering
the coming generations. This research has discussed some of the crucial parts of kernel
development: the process scheduler, the I/O scheduler and the slab allocator. These are
among the foundations of the Linux operating system.

FIGURE 7: LINUX GUI (HTTP://TECHTRAVELANDTALKS.BLOGSPOT.COM)

References
Bonwick, J. (1994). The slab allocator: An object-caching kernel memory allocator.
USENIX Summer, 87-98.

Bovet, D. P., & Cesati, M. (2006). Understanding the Linux Kernel (3rd Edition).
Sebastopol, CA: O’Reilly Media, Inc.

Craig. (2013, January 31). Memorable Linux Milestones. Retrieved from Linux
Foundation: http://www.linuxfoundation.org/news-media/infographics/memorable-linux-milestones

Danesh, A. (1999). Mastering Linux: The Linux Resource for the Non-Unix User.
Alameda, CA: SYBEX Inc.

Garrido, J. M., Schlesinger, R., & Hoganson, K. (2013). Principles of Modern Operating
Systems, SE. Burlington, MA: Jones & Bartlett Learning.

Jones, M. T. (2009). Inside the Linux 2.6 Completely Fair Scheduler. Providing fair
access to CPUs since 2.6.23, 1-2.

Love, R. (2004). Linux Kernel Development. Sams Publishing.

McHoes, A. M., & Flynn, I. M. (2011). Understanding Operating Systems. Boston, MA:
Course Technology, Cengage Learning.

Negus, C. (2012). Linux Bible, 8th Edition. Indianapolis, IN: John Wiley & Sons, Inc.

Perla, E., & Oldani, M. (2011). A Guide to Kernel Exploitation: Attacking the Core.
Burlington, MA: Elsevier Inc.

Tanenbaum, A. S. (2009). Modern Operating Systems. Upper Saddle River, NJ:
Pearson-Prentice Hall.

Venezia, P. (2012, November 12). A world without Linux: Where would Apache,
Microsoft -- even Apple be today? Retrieved from Infoworld:
http://www.infoworld.com/article/2616083/data-center/a-world-without-linux--where-would-apache--microsoft----even-apple-be-today-.html
