
To this aim, the kernel provides macros and functions to abstract the architecture-specific details:
▶ Endianness
▶ cpu_to_be32()
▶ cpu_to_le32()
▶ be32_to_cpu()
▶ le32_to_cpu()
▶ I/O memory access
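
To make the endianness helpers concrete, here is a minimal user-space sketch of what a conversion such as cpu_to_be32() does. The demo_ functions are hypothetical stand-ins written for illustration; they are not the kernel macros, which are resolved at compile time for the target architecture.

```c
#include <stdint.h>

/* Returns 1 when the CPU stores the least significant byte first. */
int demo_cpu_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}

/* Unconditionally reverses the four bytes of a 32-bit value. */
uint32_t demo_swab32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical equivalent of cpu_to_be32(): swap only on little-endian CPUs. */
uint32_t demo_cpu_to_be32(uint32_t v)
{
    return demo_cpu_is_little_endian() ? demo_swab32(v) : v;
}

/* The reverse conversion is the same operation. */
uint32_t demo_be32_to_cpu(uint32_t v)
{
    return demo_cpu_to_be32(v);
}
```

Whatever the CPU, the first byte in memory of demo_cpu_to_be32(0x11223344) is 0x11, which is what a big-endian device register or network header expects.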

Kernel Memory Constraints


No memory protection
▶ The kernel doesn’t try to recover from attempts to access illegal memory locations.
It just dumps oops messages on the system console.
▶ Fixed size stack (8 or 4 KB). Unlike in user space, no mechanism was
implemented to make it grow.
▶ Swapping is not implemented for kernel memory either.

User Space Driver


Possibilities for user space device drivers:
▶ USB with libusb, http://www.libusb.info/
▶ SPI with spidev, Documentation/spi/spidev
▶ I2C with i2cdev, Documentation/i2c/dev-interface
▶ Memory-mapped devices with UIO, including interrupt handling,
driver-api/uio-howto

Advantages
▶ No need for kernel coding skills. Easier to reuse code between devices.
▶ Drivers can be written in any language, even Perl!
▶ Drivers can be kept proprietary.
▶ Driver code can be killed and debugged. Cannot crash the kernel.
▶ Can be swapped out (kernel code cannot be).
▶ Can use floating-point computation.
▶ Less in-kernel complexity.
▶ Potentially higher performance, especially for memory-mapped devices, thanks to the
avoidance of system calls.

Drawbacks
▶ Less straightforward to handle interrupts.
▶ Increased interrupt latency vs. kernel code.

Kernel browsing: cscope

Kernel Compilation Steps


1) Download the kernel from kernel.org
cd /usr/src
wget ftp://ftp.kernel.org/pub/linux/kernel/v4.x/linux-4.8.4.tar.gz
tar xvf linux-4.8.4.tar.gz
2) Create the .config file and run make oldconfig
cd /usr/src/linux-4.8.4/
cp /boot/config-$(uname -r) .config
make oldconfig
3) Compile and create the .deb packages
fakeroot make-kpkg --initrd --append-to-version=-digivalet kernel-image kernel-headers

Kernel config of your running kernel


cp /boot/config-`uname -r` .config
make menuconfig
The root filesystem driver should be built in (=y), not built as a module: a module is a separate file
stored on that filesystem, so it cannot be loaded before the filesystem itself is mounted.

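Kernel options
1. Tristate (y, m or n)
2. Int, hex and string values are also possible

Examples from a .config file (=y: built into the kernel image, =m: built as a module):

# CONFIG_MSDOS_FS is not set
CONFIG_UDF_NLS=y
CONFIG_UDF_NLS=m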

Run make oldconfig first, then make menuconfig.

make oldconfig
▶ Needed very often!
▶ Useful to upgrade a .config file from an earlier kernel release
▶ Asks for values for new parameters.
▶ ... unlike make menuconfig and make xconfig which silently set default values
for new parameters.
If you edit a .config file by hand, it’s useful to run make oldconfig afterwards, to set
values to new parameters that could have appeared because of dependency changes.

Preparation

1. You need the kernel headers.

2. cd /usr/src/linux-headers-3.5.0-17
sudo make modules_prepare
This makes sure the kernel tree contains the information required. The target
exists solely as a simple way to prepare a kernel source tree for
building external modules.

vmlinux: the uncompressed kernel image, in ELF format.

Clean-up generated files (to force re-compilation):


make clean
▶ Removes most generated files (object files, kernel image) but keeps your .config file.
make mrproper
▶ Removes all generated files. Needed when switching from one
architecture to another. Caution: it also removes your .config
file!

3. Steps
1. make oldconfig
2. make menuconfig
3. make (builds the kernel image and the modules)
4. make modules_install (installs the modules)
5. make install (installs the kernel and updates the GRUB configuration)

DTS -> DTB


The bootloader must load both the kernel image and the Device Tree Blob in
memory before starting the kernel.

Recent versions of U-Boot can boot the zImage directly.

Only U-Boot needs the zImage -> uImage conversion, and only in older versions. U-Boot can also load the DTB.

make modules_install also creates the module dependency file, modules.dep.

Note that you can write to the kernel log from user space too:
echo "<n>Debug info" > /dev/kmsg

/proc/kmsg is read-only.

modprobe -r <module> removes the module and the modules it depends on, if they are no longer in use.

modinfo <module> shows information about the module, including its parameters.

Ways to pass parameters:

insmod <module> parameter=value

modprobe.conf (or a file in /etc/modprobe.d/) can also hold parameters:
options <module> parameter=value

The kernel command line, when the driver is built statically into the
kernel:
usb-storage.delay_use=0

sysfs:

/sys/module/<name>/parameters

How to find/edit the current values for the parameters of a loaded module?
▶ Check /sys/module/<name>/parameters.
▶ There is one file per parameter, containing the parameter value.
▶ Also possible to change parameter values if these files have write permissions
(depends on the module code).
▶ Example:
echo 0 > /sys/module/usb_storage/parameters/delay_use

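__init and __exit are not compulsory: they only let the kernel free the init code after initialization and discard the exit code when it can never run.
module_init() and module_exit() are needed to declare the entry and exit functions.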

Code marked as __init:


▶ Removed after initialization (static kernel or module.)
▶ See how init memory is reclaimed when the kernel finishes booting:
[ 2.689854] VFS: Mounted root (nfs filesystem) on device 0:15.
[ 2.698796] devtmpfs: mounted
[ 2.704277] Freeing unused kernel memory: 1024K
[ 2.710136] Run /sbin/init as init process

Code marked as __exit:


▶ Discarded when module compiled statically into the kernel, or when module
unloading support is not enabled.

Exporting symbols (variables or functions)

From a kernel module, only a limited number of kernel functions can be called:
▶ Functions and variables have to be explicitly exported by the kernel to be visible
to a kernel module
▶ Two macros are used in the kernel to export functions and variables:
▶ EXPORT_SYMBOL(symbolname), which exports a function or variable to all modules
▶ EXPORT_SYMBOL_GPL(symbolname), which exports a function or variable only to GPL
modules

5.4 is the latest kernel version at the time of these notes.

Only GPL-licensed modules can use symbols exported via EXPORT_SYMBOL_GPL (e.g. kernel functions).

KASAN: Kernel Address Sanitizer

GCC option -fsanitize=kernel-address (uses about 1/8 of memory and makes the kernel about 3 times slower)


module_param(): declares a parameter that can be given when inserting the module

Network Device Driver

Bus infrastructure: PCI


Kernel framework: network
Probe function
This function is responsible for
▶ Initializing the device, mapping I/O memory, registering the interrupt handlers. The
bus infrastructure provides methods to get the addresses, interrupt numbers and
other device-specific information.
▶ Registering the device to the proper kernel framework, for example the network
infrastructure.
Platform devices: non-discoverable devices (as are I2C and SPI devices)
Resource allocation and settings are described in the DTS file.

The sysfs virtual filesystem offers a mechanism to export such information to
user space.

Block and character devices are identified by a major and a minor number.

On old kernels (2.6.32), you had to run mknod /dev/.. major minor yourself to create the device node.
Now the sysfs device API is used: the driver registers the device, a udev event is sent and udev creates the entry in /dev.

Exchanging data between user space and kernel space

get_user() and copy_from_user(): user space to kernel


put_user() and copy_to_user(): kernel to user space
devm_kmalloc
Managed kmalloc. Memory allocated with this function is
automatically freed on driver detach.
kmalloc: at most 4 MB per allocation, up to 128 MB in total
vmalloc: very large allocations, up to almost all of RAM
Allocations of fairly large areas are possible (almost as big as total available
memory, see http://j.mp/YIGq6W again), since physical memory fragmentation is
not an issue, but areas cannot be used for DMA, as DMA usually requires
physically contiguous buffers.
ioremap
To access I/O memory, drivers need to have a virtual address that the processor
can handle, because I/O memory is not mapped by default in virtual memory.
Wait queues
While writing modules, there may be situations where you have to wait for some input or condition before
proceeding further. Tasks that need such behavior can use the sleep functionality available in the kernel.
In Linux, sleeping is handled by a data structure called a wait queue, which is simply a list of processes waiting for an
input or event.
Network RX processing runs in a softirq.
Mutex and Semaphore in Kernel

Mutexes and semaphores are also available in the kernel.


The process requesting the lock blocks when the lock is already held. Mutexes can therefore only be
used in contexts where sleeping is allowed.

Wait queue
Whenever a process must wait for an event (such as the arrival of data or the termination of a process),
it should go to sleep. Sleeping suspends the execution of the process, freeing the processor for
other uses. When the awaited event occurs, the process is woken up and continues with its job.

The wait queue is the mechanism the kernel provides to implement this wait. As the name suggests, a wait
queue is a list of processes waiting for an event.
A kernel module is only compiled (make -C …); it is not linked into the kernel image.

▶ void spin_lock_irqsave(spinlock_t *lock,
unsigned long flags);
▶ void spin_unlock_irqrestore(spinlock_t *lock,
unsigned long flags);
▶ Disables / restores IRQs on the local CPU.
▶ Typically used when the lock can be accessed in both process and interrupt context,
to prevent preemption by interrupts.

Character and block devices are accessed through names in the filesystem.
The inode behind such a name contains a reference to the cdev structure.

There are many ways to communicate between user space and kernel space:
▶ IOCTL
▶ Procfs
▶ Sysfs
▶ Configfs
▶ Debugfs
▶ Sysctl
▶ UDP Sockets
▶ Netlink Sockets

Some real-world uses of ioctl are ejecting the media from a CD drive, changing the baud rate of a serial
port, adjusting the volume, and bringing a network link up or down.

Define the ioctl code

#define <ioctl name> _IOx(<magic number>, <command number>, <argument type>)

where _IOx can be:


_IO: an ioctl with no parameters
_IOW: an ioctl with write parameters (copy_from_user)
_IOR: an ioctl with read parameters (copy_to_user)
_IOWR: an ioctl with both write and read parameters

In the driver's file_operations, the handler goes in the .unlocked_ioctl member.

In user space, call it as follows:


ioctl(fd, WR_VALUE, (int32_t*) &number);

ioctl(fd, RD_VALUE, (int32_t*) &value);
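
As a rough illustration of how such command codes are built, the sketch below defines WR_VALUE and RD_VALUE with the _IOW and _IOR macros and decodes them again in user space. The magic number 'a', the command numbers 1 and 2 and the demo_ helpers are assumptions made for this example; a real driver chooses its own values.

```c
#include <stdint.h>
#include <stdio.h>
#include <linux/ioctl.h>

/* Assumed values for illustration: magic 'a', command numbers 1 and 2. */
#define DEMO_MAGIC 'a'
#define WR_VALUE _IOW(DEMO_MAGIC, 1, int32_t *)
#define RD_VALUE _IOR(DEMO_MAGIC, 2, int32_t *)

/* Names the data direction encoded in an ioctl command code. */
const char *demo_ioctl_dir(unsigned int cmd)
{
    switch (_IOC_DIR(cmd)) {
    case _IOC_NONE:
        return "none";
    case _IOC_WRITE:
        return "write";
    case _IOC_READ:
        return "read";
    default:
        return "read+write";
    }
}

/* Prints the four fields packed into an ioctl command code. */
void demo_describe(const char *name, unsigned int cmd)
{
    printf("%s: magic='%c' nr=%u dir=%s size=%u\n", name,
           (int)_IOC_TYPE(cmd), (unsigned int)_IOC_NR(cmd),
           demo_ioctl_dir(cmd), (unsigned int)_IOC_SIZE(cmd));
}
```

Calling demo_describe("WR_VALUE", WR_VALUE) from main() shows that the direction is named from the point of view of user space: a "write" command is one where the driver reads the argument with copy_from_user().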

Lockdep: detects potential deadlocks

Lock-free design
1. RCU: lock-free reads through the kernel RCU API
2. Atomic operations
3. Per-CPU variables

i++ may not be atomic on all platforms.


atomic_inc(&i), with i declared as atomic_t, is atomic.
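
The difference can be tried out in user space. The sketch below is only an analogy: it uses C11 atomic_fetch_add() and POSIX threads in place of the kernel's atomic_inc() and kernel threads, and all demo_ names and constants are made up for the example.

```c
#include <pthread.h>
#include <stdatomic.h>

#define DEMO_THREADS 4
#define DEMO_INCREMENTS 100000

static atomic_int demo_counter;

/* Each thread increments the shared counter atomically. */
static void *demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < DEMO_INCREMENTS; i++)
        atomic_fetch_add(&demo_counter, 1);
    return NULL;
}

/* Runs the workers and returns the final counter value. */
int demo_run_counter(void)
{
    pthread_t threads[DEMO_THREADS];
    int started = 0;

    atomic_store(&demo_counter, 0);
    for (int i = 0; i < DEMO_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, demo_worker, NULL) != 0)
            break;
        started++;
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return atomic_load(&demo_counter);
}
```

With the atomic increment, no update is lost: every started thread contributes exactly DEMO_INCREMENTS. Replacing it with a plain int and i++ would allow lost updates, because the read, add and write-back steps of different threads can interleave.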

Usage of locks
▶ Use mutexes in code that is allowed to sleep
▶ Use spinlocks in code that is not allowed to sleep
(interrupts) or for which sleeping would be too
costly (critical sections)
▶ Use atomic operations to protect integers or
addresses

A workqueue handler can take a mutex; a tasklet cannot.

Debugging
=========
/proc/sys/kernel/printk sets the log levels: only some messages reach the console, but all of them go to the kernel log buffer (dmesg).

1. pr_info(): for modules
dev_info(): for device drivers

2.
#define DEBUG
enables pr_debug() and dev_dbg()

3. debugfs
Copyright 2009 Jonathan Corbet <corbet@lwn.net>

Debugfs exists as a simple way for kernel developers to make information
available to user space. Unlike /proc, which is only meant for information
about a process, or sysfs, which has strict one-value-per-file rules,
debugfs has no rules at all. Developers can put any information they want
there. The debugfs filesystem is also intended to not serve as a stable
ABI to user space.
JTAG types
1. OpenOCD compatible
2. GDB compatible
Coherent vs Streaming DMA

MMAP
Perf
The perf tool is tightly coupled to the kernel:
the perf version must match the version of the running kernel.

Perf on Distribution Kernel

apt-get install linux-tools-common linux-tools-$(uname -r)

Perf on Custom Kernel (vanilla kernel from kernel.org)

perf needs to be compiled (this requires flex and bison), from the kernel source tree:

cd tools/perf
make
cp perf /usr/bin/.
chmod 755 /usr/bin/perf
Interrupt Handler Flags
The third parameter of request_irq(), flags, can be either zero or a bit mask of one or more of the flags
defined in <linux/interrupt.h>. Among these flags, the most important are:
▶ IRQF_DISABLED: When set, this flag instructs the kernel to disable all interrupts
when executing this interrupt handler. When unset, interrupt handlers run with all
interrupts except their own enabled. Most interrupt handlers do not set this flag, as
disabling all interrupts is bad form. Its use is reserved for performance-sensitive interrupts
that execute quickly. This flag is the current manifestation of the SA_INTERRUPT
flag, which in the past distinguished between “fast” and “slow” interrupts.

▶ IRQF_SAMPLE_RANDOM: This flag specifies that interrupts generated by this device
should contribute to the kernel entropy pool. The kernel entropy pool provides
truly random numbers derived from various random events. If this flag is specified,
the timing of interrupts from this device is fed to the pool as entropy.

Note: recent kernels no longer provide IRQF_DISABLED or IRQF_SAMPLE_RANDOM.

IRQF_SHARED
The fifth parameter, dev, is used for shared interrupt lines. When an interrupt handler
is freed (discussed later), dev provides a unique cookie to enable the removal of only the
desired interrupt handler from the interrupt line. Without this parameter, it would be
impossible for the kernel to know which handler to remove on a given interrupt line. You
can pass NULL here if the line is not shared, but you must pass a unique cookie if your
interrupt line is shared.

Interrupt handler return value


1. IRQ_NONE: on a shared interrupt line, the handler returns this to indicate that its device did not raise the interrupt.
2. IRQ_HANDLED: the handler was invoked correctly and its device did raise the interrupt.
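
How the two return values work together on a shared line can be sketched with a small user-space simulation. Everything here (the demo_ types, the irq_pending field, the dispatch loop) is hypothetical and only mimics the idea: each handler receives its own dev cookie, checks whether its device raised the interrupt, and answers accordingly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED. */
typedef enum { DEMO_IRQ_NONE = 0, DEMO_IRQ_HANDLED = 1 } demo_irqreturn_t;

/* Hypothetical device state; a real driver would read a status register. */
struct demo_device {
    bool irq_pending;
    int handled_count;
};

/* Handler: dev_id is the per-device cookie passed at registration time. */
demo_irqreturn_t demo_handler(int irq, void *dev_id)
{
    struct demo_device *dev = dev_id;

    (void)irq;
    if (!dev->irq_pending)
        return DEMO_IRQ_NONE;      /* not our device, let others check */
    dev->irq_pending = false;      /* acknowledge the interrupt */
    dev->handled_count++;
    return DEMO_IRQ_HANDLED;
}

/* Calls every handler sharing the line; returns how many handled it. */
int demo_dispatch(int irq, struct demo_device **devs, size_t count)
{
    int handled = 0;

    for (size_t i = 0; i < count; i++)
        if (demo_handler(irq, devs[i]) == DEMO_IRQ_HANDLED)
            handled++;
    return handled;
}
```

With two devices on the line and only the second one pending, demo_dispatch() invokes both handlers but only the second reports DEMO_IRQ_HANDLED.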

When a given interrupt handler is executing,
the corresponding interrupt line is masked out on all processors, preventing another
interrupt on the same line from being received.

Stack in interrupt context

The setup of an interrupt handler’s stacks is a configuration option. Historically, interrupt
handlers did not receive their own stacks. Instead, they would share the stack of the
process that they interrupted. The kernel stack is two pages in size; typically, that is 8KB
on 32-bit architectures and 16KB on 64-bit architectures.
Early in the 2.6 kernel process, an option was added to reduce the stack size from two
pages down to one, providing only a 4KB stack on 32-bit systems. This reduced memory
pressure because every process on the system previously needed two pages of contiguous,
nonswappable kernel memory. To cope with the reduced stack size, interrupt handlers
were given their own stack, one stack per processor, one page in size. This stack is referred
to as the interrupt stack.

Disabling and Enabling Interrupts on a single CPU


To disable interrupts locally for the current processor (and only the current processor) and
then later reenable them, do the following:
local_irq_disable();
/* interrupts are disabled .. */
local_irq_enable();
These functions are usually implemented as a single assembly operation. (Of course,
this depends on the architecture.) Indeed, on x86, local_irq_disable() is a simple cli
and local_irq_enable() is a simple sti instruction. cli and sti are the assembly instructions
that clear and set the allow-interrupts flag, respectively.

local_irq_save(flags) and local_irq_restore(flags) save the interrupt state before disabling and restore it afterwards.
They are safer than the functions above, which re-enable interrupts unconditionally.
Disabling a particular IRQ line on all CPUs
void disable_irq(unsigned int irq);
void disable_irq_nosync(unsigned int irq);
void enable_irq(unsigned int irq);
void synchronize_irq(unsigned int irq);

Status of the Interrupt System


in_irq() and in_interrupt() check whether the kernel is currently executing in interrupt context.

SoftIRQ
Softirqs are a set of statically defined bottom halves that
can run simultaneously on any processor; even two of the same type can run concurrently.

Softirqs are useful when performance is critical, such as with
networking. Using softirqs requires more care, however, because two of the same
softirq can run at the same time. In addition, softirqs must be registered statically at compile
time.

Softirqs are required only for high-frequency and highly threaded uses.

The per-CPU ksoftirqd kernel threads execute softirqs when the softirq load is high.

Currently, only two subsystems (networking and block devices) directly use softirqs.

Multiple CPUs can run the same softirq at once, so locking is needed.

If the handler needs locks, prefer a tasklet. If it can work without locks, using per-CPU variables,
then use a softirq (when the path is critical and performance is required). There is no point in combining locks with a softirq,
as the whole benefit of running in parallel on all cores is lost.

Tasklets are essentially softirqs in which multiple instances of the same handler cannot run concurrently on multiple
processors.

Tasklet
Two different tasklets can run concurrently on
different processors, but two of the same type of tasklet cannot run simultaneously. Code can register tasklets dynamically.
In short, a tasklet is a softirq of which only a single instance runs at a time, so no lock is needed for its own data.

Softirq with lock = tasklet

Workqueue
1. Runs in process context

Bottom Half Mechanisms


Softirq
Tasklet
Workqueue
Kernel timer
Kthread
Threaded IRQ

As with softirqs, tasklets cannot sleep. This means you cannot use semaphores or other
blocking functions in a tasklet. Tasklets also run with all interrupts enabled, so you must
take precautions (for example, disable interrupts and obtain a lock) if your tasklet shares
data with an interrupt handler.

Semaphores and mutexes cannot be used in a softirq or tasklet because they sleep; only spinlock variants can be used.
In a workqueue handler or a kthread, a semaphore or mutex can be used, as these contexts can sleep.

Ksoftirqd
1. Tasklets and softirqs are run by a per-processor kernel thread.

Workqueue
Work queues are schedulable and can therefore sleep.
They are run by kworker kernel threads.

If the deferred work need not sleep,
softirqs or tasklets are used. Indeed, the usual alternative to work queues is kernel threads.
Because the kernel developers frown upon creating a new kernel thread (and, in some
locales, it is a punishable offense), work queues are strongly preferred. They are really easy
to use, too.

Why not use a kthread instead of a workqueue? Simply because having too many threads in the kernel is not recommended;
instead, the work items are put in a queue and a kworker thread processes them one by one.

This means work queues are useful for situations in which you
need to allocate a lot of memory, obtain a semaphore, or perform block I/O.

INIT_WORK() initializes a work item; schedule_work() queues it.

Which type of BH to use


1. Softirq

Softirqs, by design, provide the least serialization. This requires softirq handlers to go
through extra steps to ensure that shared data is safe because two or more softirqs of the
same type may run concurrently on different processors. If the code in question is already
highly threaded, such as in a networking subsystem that is chest-deep in per-processor
variables, softirqs make a good choice. They are certainly the fastest alternative for timing-critical
and high-frequency uses.

Parallel, high performance.

2. Tasklet

Tasklets make more sense if the code is not finely threaded. They have a simpler interface
and, because two tasklets of the same type might not run concurrently, they are easier
to implement. Tasklets are effectively softirqs that do not run concurrently.

3. Workqueue
Work queues involve the highest overhead
because they involve kernel threads and, therefore, context switching. This is not to
say that they are inefficient, but in light of thousands of interrupts hitting per second (as
the networking subsystem might experience), other methods make more sense. For most
situations, however, work queues are sufficient.

On a uniprocessor machine, spinlocks are compiled away.

Atomic operations on integers or bits


- Counters

Atomicity: an operation executes fully or not at all.


Ordering: enforced with barriers.

Need for locks when we have atomic instructions


We still need locks because atomic operations only cover a single integer or bit; updating a larger data structure (for example, modifying several fields of a structure) needs a lock.

Spin locks can be used in interrupt handlers, whereas semaphores cannot be used because
they sleep. If a lock is used in an interrupt handler, you must also disable local interrupts
(interrupt requests on the current processor) before obtaining the lock. Otherwise, it
is possible for an interrupt handler to interrupt kernel code while the lock is held and attempt
to reacquire the lock.

Note that you need to disable interrupts only on the current processor.
spin_lock_irqsave(&mr_lock, flags);
/* critical region ... */
spin_unlock_irqrestore(&mr_lock, flags);
The routine spin_lock_irqsave() saves the current state of interrupts, disables them
locally, and then obtains the given lock. Conversely, spin_unlock_irqrestore() unlocks
the given lock and returns interrupts to their previous state.

spin_lock_irq() is normally not recommended; prefer spin_lock_irqsave() and spin_unlock_irqrestore().

Spinlock on uniprocessor
On uniprocessor systems, the previous example must still disable interrupts to prevent
an interrupt handler from accessing the shared data, but the lock mechanism is compiled
away. The lock and unlock also disable and enable kernel preemption, respectively.

Table 10.4 Spin Lock Methods

Method                      Description
spin_lock()                 Acquires given lock
spin_lock_irq()             Disables local interrupts and acquires given lock
spin_lock_irqsave()         Saves current state of local interrupts, disables local interrupts,
                            and acquires given lock
spin_unlock()               Releases given lock
spin_unlock_irq()           Releases given lock and enables local interrupts
spin_unlock_irqrestore()    Releases given lock and restores local interrupts to given previous
                            state
spin_lock_init()            Dynamically initializes given spinlock_t
spin_trylock()              Tries to acquire given lock; returns nonzero if acquired, zero if
                            unavailable
spin_is_locked()            Returns nonzero if the given lock is currently acquired, otherwise
                            it returns zero

Semaphores
Semaphores in Linux are sleeping locks. When a task attempts to acquire a semaphore
that is unavailable, the semaphore places the task onto a wait queue and puts the task to
sleep. The processor is then free to execute other code. When the semaphore becomes
available, one of the tasks on the wait queue is awakened so that it can then acquire the
semaphore.

Reader-writer spinlock and reader-writer semaphore APIs also exist.

Sequence Lock
A prominent user of the seq lock is jiffies, the variable that stores a Linux machine’s
uptime.
On machines that cannot atomically
read the full 64-bit jiffies_64 variable, get_jiffies_64() is implemented using
seq locks:
u64 get_jiffies_64(void)
{
        unsigned long seq;
        u64 ret;

        do {
                seq = read_seqbegin(&xtime_lock);
                ret = jiffies_64;
        } while (read_seqretry(&xtime_lock, seq));
        return ret;
}
Updating jiffies during the timer interrupt, in turn, grabs the write variant of the
seq lock:
write_seqlock(&xtime_lock);
jiffies_64 += 1;
write_sequnlock(&xtime_lock);
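
The retry logic behind these calls can be sketched in user space with a sequence counter. This is a simplified model, not the kernel implementation: the demo_ names are invented, a single writer is assumed (the real seqlock_t also contains a spinlock to serialize writers), and no memory barriers beyond the default C11 atomics are shown.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Sequence counter: odd while a write is in progress, even otherwise. */
typedef struct {
    atomic_uint sequence;
} demo_seqlock_t;

demo_seqlock_t demo_lock;
uint64_t demo_jiffies;

/* Reader side: remember the sequence number seen before reading. */
unsigned int demo_read_seqbegin(demo_seqlock_t *sl)
{
    return atomic_load(&sl->sequence);
}

/* Reader side: retry if a write was running or has happened since. */
bool demo_read_seqretry(demo_seqlock_t *sl, unsigned int start)
{
    return (start & 1u) || atomic_load(&sl->sequence) != start;
}

/* Writer side: make the counter odd, update, make it even again. */
void demo_write_seqlock(demo_seqlock_t *sl)
{
    atomic_fetch_add(&sl->sequence, 1);
}

void demo_write_sequnlock(demo_seqlock_t *sl)
{
    atomic_fetch_add(&sl->sequence, 1);
}

/* Same shape as get_jiffies_64() above. */
uint64_t demo_get_jiffies(void)
{
    unsigned int seq;
    uint64_t ret;

    do {
        seq = demo_read_seqbegin(&demo_lock);
        ret = demo_jiffies;
    } while (demo_read_seqretry(&demo_lock, seq));
    return ret;
}

/* Same shape as the timer-interrupt update above. */
void demo_tick(void)
{
    demo_write_seqlock(&demo_lock);
    demo_jiffies += 1;
    demo_write_sequnlock(&demo_lock);
}
```

Readers never block the writer: they simply read again whenever the sequence number shows that a write overlapped with their read.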

Kernel Preemption
The most frequent of these situations is per-processor data. If the
data is unique to each processor, there might be no need to protect it with a lock because
only that one processor can access the data. But kernel preemption must be disabled,
so that only one task running on that CPU can modify it.

Otherwise the task accessing a per-CPU variable could be preempted and resumed on another core.
Bracket the access with preempt_disable() and preempt_enable().

A more elegant API is
get_cpu() and put_cpu().
get_cpu() also disables kernel preemption.
We can use get_cpu() and put_cpu() to replace

preempt_disable()
cpu = smp_processor_id() and
preempt_enable()
for slightly better code.

Barriers prevent reordering.

A sequence lock is used to read the 64-bit jiffies counter: a dedicated seqlock, xtime_lock,
protects access to jiffies_64.
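Endianness check at run time:
#include <stdio.h>

int main(void)
{
        int x = 1;
        if (*(char *)&x == 1)   /* lowest address holds the least significant byte */
                puts("little endian");
        else
                puts("big endian");
        return 0;
}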
Slab Layer
Allocating and freeing data structures is one of the most common operations inside any
kernel. To facilitate frequent allocations and deallocations of data, programmers often
introduce free lists. A free list contains a block of available, already allocated, data structures.
When code requires a new instance of a data structure, it can grab one of the structures
off the free list rather than allocate the sufficient amount of memory and set it up for the
data structure. Later, when the data structure is no longer needed, it is returned to the free
list instead of deallocated. In this sense, the free list acts as an object cache, caching a frequently
used type of object.
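
The free-list idea described above can be sketched in a few lines of user-space C. This illustrates the plain free list only, not the slab allocator itself; struct demo_obj and the demo_ functions are hypothetical, and malloc() stands in for the underlying allocator.

```c
#include <stdlib.h>

/* Hypothetical cached object; next_free links it while it sits on the list. */
struct demo_obj {
    int payload;
    struct demo_obj *next_free;
};

static struct demo_obj *demo_free_list;

/* Takes an object from the free list, or allocates one if the list is empty. */
struct demo_obj *demo_obj_get(void)
{
    struct demo_obj *obj = demo_free_list;

    if (obj)
        demo_free_list = obj->next_free;
    else
        obj = malloc(sizeof(*obj));
    if (obj)
        obj->next_free = NULL;
    return obj;
}

/* Returns an object to the free list instead of deallocating it. */
void demo_obj_put(struct demo_obj *obj)
{
    obj->next_free = demo_free_list;
    demo_free_list = obj;
}
```

An object released with demo_obj_put() is handed out again by the next demo_obj_get(), so the allocator is not called a second time. The weakness noted for free lists in general still applies: nothing here can shrink the cache when memory runs low, which is one of the problems the slab layer addresses.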
ECN vs PFC

Ethernet: Pause Frame


A pause frame is a special type of control frame transmitted by the MAC layer to control the rate of incoming
packets. Basically this frame says "I am now overwhelmed, please don't send any frame for this amount of
time".

You can easily find pause frames in Wireshark with the filter "macc.opcode == pause".

Exported symbols are visible in /proc/kallsyms (after inserting the module).


Users of High Resolution Timer
▶ The primary users of precision timers are user-space applications that use the nanosleep, POSIX timers and interval timer
(itimer) interfaces.
▶ In-kernel users such as drivers and subsystems that require precisely timed events (e.g. multimedia).

Trigger interrupt
static ssize_t etx_read(struct file *filp, char __user *buf, size_t len, loff_t *off)
{
        printk(KERN_INFO "Read Function\n");
        asm("int $0x3B"); /* triggers the interrupt corresponding to IRQ 11 */
        return 0;
}

Sending a signal from kernel to user space

▶ Kernel module: send_sig_info(SIGRTMIN)
▶ User process: sigaction(SIGRTMIN) (waits for the signal in its sa_sigaction handler)

kthread_bind() binds a kernel thread to a particular core.

container_of()
Suppose you know the address of one member of a structure, but not the address of the structure itself,
which may have many members. The container_of macro in the Linux kernel computes the address of the
enclosing structure from the address of that member.

#include <stddef.h>     /* offsetof */
#include <stdio.h>

/* Relies on GNU C extensions (typeof, statement expressions), like the kernel macro. */
#define container_of(ptr, type, member) ({ \
        const typeof( ((type *)0)->member ) *__mptr = (ptr); \
        (type *)( (char *)__mptr - offsetof(type,member) );})

int main(void)
{
        struct sample {
                int mem1;
                char mem2;
        };

        struct sample sample1;

        printf("Address of Structure sample1 (Normal Method) = %p\n", (void *)&sample1);

        printf("Address of Structure sample1 (container_of Method) = %p\n",
               (void *)container_of(&sample1.mem2, struct sample, mem2));

        return 0;
}

container_of(&sample1.mem2, struct sample, mem2)
        ||
        ||
        \/
const char *__mptr = &sample1.mem2;
(struct sample *)((char *)__mptr - offsetof(struct sample, mem2))
        ||
        ||
        \/
const char *__mptr = 0x7FFD0D058784;    /* address of mem2 is 0x7FFD0D058784 */
(struct sample *)(0x7FFD0D058784 - 4)
        ||
        ||
        \/
(struct sample *)0x7FFD0D058780         /* this is the address of the container structure */

IRQ0 is mapped to its vector using the macro

#define IRQ0_VECTOR (FIRST_EXTERNAL_VECTOR + 0x10)

where FIRST_EXTERNAL_VECTOR = 0x20.
So to raise IRQ 11 programmatically, we have to add 11 to the vector of IRQ0:

0x20 + 0x10 + 11 = 0x3B (59 in decimal).

Hence executing asm("int $0x3B") raises the interrupt for IRQ 11.
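
The same arithmetic written as a small helper, using the values quoted above. The DEMO_ macros and demo_irq_to_vector() are illustrative names for this sketch, not kernel definitions, and the mapping may differ between kernel versions and configurations.

```c
/* Values as quoted in the notes above. */
#define DEMO_FIRST_EXTERNAL_VECTOR 0x20
#define DEMO_IRQ0_VECTOR (DEMO_FIRST_EXTERNAL_VECTOR + 0x10)

/* Vector number for a given IRQ, assuming the simple IRQ0-relative mapping. */
int demo_irq_to_vector(int irq)
{
    return DEMO_IRQ0_VECTOR + irq;
}
```

For IRQ 11 this yields 0x30 + 11 = 0x3B, the operand of the int instruction used in etx_read().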

Proc vs Sysfs
From the Linux 2.5 development cycle, a new interface called the /sys file system has been introduced. Sysfs is a
RAM based file system.
It is designed to export the kernel data structures and their attributes from the kernel to the user space, which then
avoids cluttering the /proc file system.
The advantages of sysfs over procfs are as follows:
- A cleaner, well-documented programming interface
- Automatic clean-up of directories and files, when the device is removed from the system
- The enforced one item per file rule, which makes for a cleaner user interface

The O(1) scheduler was good for server loads but not for interactive processes.
It was replaced by CFS (the Completely Fair Scheduler).

Scheduler Classes
The Linux scheduler is modular, enabling different algorithms to schedule different types
of processes. This modularity is called scheduler classes. Scheduler classes enable different,
pluggable algorithms to coexist, scheduling their own types of processes.

•Zones: to accommodate distinct regions
•Three ‘zones’ on 80x86:
– ZONE_DMA (memory below 16-MB)
– ZONE_NORMAL (from 16-MB to 896-MB)
– ZONE_HIGHMEM (memory above 896-MB)
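
As a quick illustration of these boundaries, the following sketch classifies a physical address into one of the three zones. The enum and function names are made up for the example, and the limits are the classic 32-bit x86 values listed above.

```c
#include <stdint.h>

enum demo_zone { DEMO_ZONE_DMA, DEMO_ZONE_NORMAL, DEMO_ZONE_HIGHMEM };

/* Maps a physical address to its zone on classic 32-bit x86. */
enum demo_zone demo_zone_of(uint64_t phys_addr)
{
    const uint64_t mib = 1024ull * 1024ull;

    if (phys_addr < 16 * mib)
        return DEMO_ZONE_DMA;       /* below 16 MB */
    if (phys_addr < 896 * mib)
        return DEMO_ZONE_NORMAL;    /* 16 MB up to 896 MB */
    return DEMO_ZONE_HIGHMEM;       /* 896 MB and above */
}
```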
sk_buff structures are kept in doubly linked lists.

This is what happens with NAPI devices:
1. The first packet causes the network adapter to issue an IRQ. To prevent further packets from
causing more IRQs, the driver turns off Rx IRQs for the adapter. Additionally, the adapter is
placed on a poll list.
2. The kernel then polls the device on the poll list until no further packets wait to be processed
on the adapter.
3. Rx interrupts are re-enabled again.

The weight is the maximum number of packets processed in a single poll.
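
The three steps and the role of the weight can be modeled with a short user-space simulation. The struct demo_nic and the demo_ functions are hypothetical; the sketch only mirrors the control flow (disable Rx IRQs, poll up to weight packets, re-enable when the adapter is drained), not the real NAPI API.

```c
#include <stdbool.h>

/* Hypothetical adapter state. */
struct demo_nic {
    int rx_pending;          /* packets waiting on the adapter */
    bool rx_irq_enabled;
};

/* Step 1: the first packet raises an IRQ; the driver turns Rx IRQs off. */
void demo_rx_interrupt(struct demo_nic *nic)
{
    nic->rx_irq_enabled = false;
}

/* Step 2: process at most 'weight' packets per poll.
 * Step 3: if fewer were found, the adapter is drained: re-enable Rx IRQs. */
int demo_poll(struct demo_nic *nic, int weight)
{
    int done = 0;

    while (done < weight && nic->rx_pending > 0) {
        nic->rx_pending--;
        done++;
    }
    if (done < weight)
        nic->rx_irq_enabled = true;
    return done;
}
```

With 10 pending packets and a weight of 4, the polls process 4, 4 and then 2 packets; only the last poll re-enables the Rx interrupt.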


IFF_PROMISC

This flag is set (by the networking code) to activate promiscuous operation. By default,
Ethernet interfaces use a hardware filter to ensure that they receive broadcast packets and
packets directed to that interface's hardware address only. Packet sniffers such as tcpdump set
promiscuous mode on the interface in order to retrieve all packets that travel on the interface's
transmission medium
