IOS AUGMENTED TOPIC


CONTENT BEYOND THE SYLLABUS TOPICS – IOS

1. KERNEL DATA STRUCTURES

The kernel data structures are very important as they store data about the current state of the system. For
example, if a new process is created in the system, a kernel data structure is created that contains the details
about the process.

Most of the kernel data structures are only accessible by the kernel and its subsystems. They may contain data
as well as pointers to other data structures.

Kernel Components

The kernel stores and organizes a lot of information: which processes are running in the
system, their memory requirements, the files in use, and so on. To handle all this, three important
structures are used: the process table, the file table, and the v-node/i-node tables.

Details about these are as follows:

Process Table

The process table stores information about all the processes running in the system. These include the storage
information, execution status, file information etc.

When a process forks a child, its entry in the process table is duplicated, including the file information and file
pointers, so the parent and the child process share open files.
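
As a brief illustration (a minimal sketch assuming a POSIX system; the file name shared.txt is an invented example), the following C program shows the consequence: because the child inherits the parent's file table entry, writes from both processes advance a single shared file offset:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/wait.h>

int main(void) {
    /* Open the file before forking, so parent and child share one file table entry */
    int fd = open("shared.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); exit(1); }
    if (fork() == 0) {            /* child process */
        write(fd, "child ", 6);
        _exit(0);
    }
    wait(NULL);                   /* parent waits for the child to finish */
    write(fd, "parent\n", 7);     /* lands after the child's data: shared offset */
    close(fd);
    return 0;
}

After it runs, shared.txt contains "child parent", showing that both writes moved the same offset.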

File Table

The file table contains entries for all the files in use in the system. If two or more processes use the same file,
their file table entries contain the same file information and file descriptor number.

Each file table entry contains information about the file such as file status (file read or file write), file offset etc.
The file offset specifies the position for the next read or write in the file.
The file table also contains v-node and i-node pointers which point to the virtual node and index node
respectively. These nodes contain information on how to read a file.

V-Node and I-Node Tables

Both the v-node and the i-node are references to the file's storage system and storage mechanisms. They
connect the hardware to the software.

The v-node is an abstract concept that defines the method to access file data without worrying about the actual
structure of the system. The i-node specifies file access information like file storage device, read/write
procedures etc.
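
As a rough sketch, a file table entry along the lines described above might look like this in C (the field and type names are hypothetical, loosely modeled on traditional UNIX designs rather than any real kernel's definitions):

struct file_entry {
    int           status;     /* file status flags, e.g. read or write mode  */
    long          offset;     /* position of the next read or write          */
    int           ref_count;  /* number of descriptors sharing this entry    */
    struct vnode *v_ptr;      /* abstract, filesystem-independent interface  */
    struct inode *i_ptr;      /* index node with storage-level details       */
};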

2. Implicit Threading in OS

Implicit Threading

One way to address the difficulties and better support the design of multithreaded
applications is to transfer the creation and management of threading from application
developers to compilers and run-time libraries. This, termed implicit threading, is a
popular trend today.

Implicit threading is mainly the use of libraries or other language support to hide the
management of threads. The most common implicit threading library is OpenMP, in
the context of C.

OpenMP is a set of compiler directives as well as an API for programs written in C, C++,
or FORTRAN that provides support for parallel programming in shared-memory
environments. OpenMP identifies parallel regions as blocks of code that may run in
parallel. Application developers insert compiler directives into their code at parallel
regions, and these directives instruct the OpenMP run-time library to execute the region
in parallel. The following C program illustrates a compiler directive above the parallel
region containing the printf() statement:

Example


#include <omp.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    /* sequential code */
    #pragma omp parallel
    {
        printf("I am a parallel region.");
    }
    /* sequential code */
    return 0;
}

Output
I am a parallel region.

When OpenMP encounters the directive

#pragma omp parallel

it creates as many threads as there are processing cores in the system. Thus, for a dual-core
system, two threads are created; for a quad-core system, four are created; and so
forth. All the threads then execute the parallel region simultaneously, and as each thread
exits the parallel region, it is terminated. OpenMP provides several additional directives
for running code regions in parallel, including parallelizing loops.

In addition to providing directives for parallelization, OpenMP allows developers to
choose among several levels of parallelism. For example, they can set the number of threads
manually. It also allows developers to identify whether data are shared between threads
or are private to a thread. OpenMP is available on several open-source and commercial
compilers for Linux, Windows, and Mac OS X systems.
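
A small sketch of those features (the array, its size, and the thread count of 4 are arbitrary choices for illustration, not taken from the text above): the directive below divides the loop iterations among the threads, with the array shared and the loop index private to each thread:

#include <omp.h>
#include <stdio.h>

#define N 1000

int main(void) {
    int a[N];
    int i;
    omp_set_num_threads(4);   /* set the number of threads manually */
    /* Parallelize the loop: iterations are divided among the threads */
    #pragma omp parallel for shared(a) private(i)
    for (i = 0; i < N; i++) {
        a[i] = 2 * i;
    }
    printf("a[%d] = %d\n", N - 1, a[N - 1]);
    return 0;
}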

Grand Central Dispatch (GCD)

Grand Central Dispatch (GCD)—a technology for Apple’s Mac OS X and iOS operating
systems—is a combination of extensions to the C language, an API, and a run-time
library that allows application developers to identify sections of code to run in parallel. Like
OpenMP, GCD manages most of the details of threading. It defines extensions to
the C and C++ languages known as blocks. A block is simply a self-contained unit of
work. It is specified by a caret ^ inserted in front of a pair of braces { }. A simple
example of a block is shown below −

^{
    printf("This is a block");
}

GCD schedules blocks for run-time execution by placing them on a dispatch queue. When
GCD removes a block from a queue, it assigns the block to an available thread from the
thread pool it manages. It identifies two types of dispatch queues: serial and concurrent.
Blocks placed on a serial queue are removed in FIFO order. Once a block has been
removed from the queue, it must complete execution before another block is removed.
Each process has its own serial queue (known as the main queue). Developers can create
additional serial queues that are local to particular processes. Serial queues are useful
for ensuring the sequential execution of several tasks. Blocks placed on a concurrent
queue are also removed in FIFO order, but several blocks may be removed at a time,
thus allowing multiple blocks to execute in parallel. There are three system-wide
concurrent dispatch queues, and they are distinguished according to priority: low,
default, and high. Priorities represent an estimation of the relative importance of blocks.
Quite simply, blocks with a higher priority should be placed on the high-priority dispatch
queue. The following code segment illustrates obtaining the default-priority concurrent
queue and submitting a block to the queue using the dispatch_async() function:

dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

dispatch_async(queue, ^{ printf("This is a block."); });

Internally, GCD’s thread pool is composed of POSIX threads. GCD actively manages the
pool, allowing the number of threads to grow and shrink according to application
demand and system capacity.
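
For completeness, a serial queue can be created and used in a similar way (a minimal sketch; the queue label com.example.worker is an arbitrary choice). Blocks submitted to it run strictly one at a time, in FIFO order:

dispatch_queue_t serial = dispatch_queue_create("com.example.worker", NULL);

/* These blocks never run concurrently; "first" completes before "second" starts */
dispatch_async(serial, ^{ printf("first\n"); });
dispatch_async(serial, ^{ printf("second\n"); });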

Threads as Objects

As an alternative, traditional object-oriented languages provide explicit multithreading
support with threads as objects. In these kinds of languages, classes are written to
either extend a thread class or implement a corresponding interface. This style
resembles the Pthreads approach, because the code is written with explicit thread
management. However, the encapsulation of data within the classes and the extra
synchronization options simplify the task.
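
For contrast, the explicit thread management mentioned above looks like the following minimal Pthreads sketch in C (the worker function and its message are invented for illustration):

#include <pthread.h>
#include <stdio.h>

/* Entry point of the new thread */
static void *worker(void *arg) {
    printf("Hello from the worker thread\n");
    return NULL;
}

int main(void) {
    pthread_t tid;
    pthread_create(&tid, NULL, worker, NULL); /* explicitly create the thread */
    pthread_join(tid, NULL);                  /* explicitly wait for it to finish */
    return 0;
}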

Java Threads

Java provides a Thread class and a Runnable interface that can be used. Each requires
implementing a public void run() method that defines the entry point of the thread.
Once an instance of the object is allocated, the thread is started by invoking the start()
method on it. As with Pthreads, starting the thread is asynchronous, so the
timing of its execution is non-deterministic.

Python Threads

Python also provides two mechanisms for multithreading. One approach is
comparable to the Pthreads style, where a function name is passed to a library method
thread.start_new_thread(). This approach is very limited, as it lacks the ability to join or
terminate the thread once it starts. A more flexible technique is to use the threading
module to define a class that extends threading.Thread. As with the Java approach,
the class must have a run() method that provides the thread's entry point. Once an
object is instantiated from this class, it can be explicitly started and joined later.

Concurrency as Language Design

Newer programming languages have avoided race conditions by building assumptions of
concurrent execution directly into the language design itself. As an example, Go combines
a simple implicit threading technique (goroutines) with channels, a well-defined form of
message-passing communication. Rust adopts an explicit threading approach similar to
Pthreads. However, Rust has very strong memory protections that require no extra work
from the programmer.

Goroutines
The Go language includes a trivial mechanism for implicit threading: place the keyword
go before a function call. The new thread is passed a connection to a message-passing channel.
In a guessing-game example, the main thread then calls success := <-messages, which performs a blocking read
on the channel. Once the user has entered the correct guess of seven, the keyboard
listener thread writes to the channel, allowing the main thread to progress.

Channels and goroutines are core components of the Go language, which was designed
under the assumption that most programs would be multithreaded. This design
choice streamlines the development model, allowing the language itself to take on the
responsibility for managing the threads and scheduling.

Rust Concurrency

Rust is another language that has been created in recent years, with concurrency as a
central design feature. The following example illustrates the use of thread::spawn() to
create a new thread, which can later be joined by invoking join() on it. The argument to
thread::spawn(), beginning at the ||, is known as a closure, which can be thought of as
an anonymous function. That is, the child thread here will print the value of a.

Example

use std::thread;

fn main() {
    /* Initialize a mutable variable a to 7 */
    let mut a = 7;
    /* Spawn a new thread, which captures its own copy of a (note the move) */
    let child_thread = thread::spawn(move || {
        /* Decrement the child's copy of a, then print it */
        a -= 1;
        println!("a = {}", a);
    });
    /* Change a in the main thread and print it */
    a += 1;
    println!("a = {}", a);
    /* Wait for the child thread to finish */
    child_thread.join().unwrap();
}

However, there is a subtle point in this code that is central to Rust's design. Within the
new thread (executing the code in the closure), the a variable is distinct from the a in
other parts of this code. Rust enforces a very strict memory model (known as "ownership")
which prevents multiple threads from accessing the same memory. In this example, the
move keyword indicates that the spawned thread will receive a separate copy of a for its
own use. Regardless of the scheduling of the two threads, the main and child threads
cannot interfere with each other's modifications of a, because they are distinct copies. It
is not possible for the two threads to share access to the same memory.
3. Memory Management in Android

Java has automatic memory management: it performs routine garbage collection
to clean up unused objects and free up memory. However, it is very important
for us to know how the garbage collector works in order to manage the
application’s memory effectively, and thus avoid OutOfMemoryError and/or
StackOverflowError exceptions.
Let’s start with the memory structure first. For effective memory management,
JVM divides memory into Stack and Heap.
Stack Memory
Java stack memory is used for the execution of threads. Each stack frame contains
method-specific values that are short-lived, as well as references to other objects in
the heap that are referred to from the method.
Example:
public void methodA() {
    int a = 10;        // local to methodA's stack frame
    methodB(a);        // pushes a new frame for methodB onto the stack
}

public void methodB(int value) {
    int b = 10;        // local to methodB's stack frame
    // Rest of the code.
}

[Figure: stack frames for methodA and methodB]

Local variables are created in the stack frame of the method they belong to. For
example, variable “b” of “methodB” can be accessed by “methodB” only and not by
“methodA”, as “methodA” is in a separate frame. Once the “methodB” execution is
completed, control goes back to the calling method, in this case “methodA”. The frame
for “methodB” is then removed from the stack, and all the variables in that frame are
flushed out as well. Likewise for “methodA”.
Heap Memory
Java heap space is used to allocate memory to the objects. Whenever we create
Java/Kotlin objects, these will be allocated in the Heap memory.
The garbage collection process runs in the heap memory. Let’s go through the basic
garbage collection process and the structure of the heap memory in detail.

Garbage Collection Process


Garbage Collection is the process of cleaning up the heap memory. The garbage
collector identifies unreferenced objects and removes them to free up memory space.
The objects that are being referenced are called ‘live objects’ and those which
are not referenced are called ‘dead objects’.
This process can be triggered at any time, and we don’t have any control over it.
We can also request the system to initiate the GC process if we want to, but
there is no guarantee that it will be initiated; it is up to the system
to decide.
Let’s go through the basic process involved in Garbage collection.
Step 1: Marking
Most of us think that Garbage Collector marks dead objects and removes them. In
reality, it is exactly the opposite. Garbage Collector first finds the ‘Live objects’
and marks them. This means the rest of the objects that are not marked are ‘Dead
objects’.
Step 2: Normal Deletion
Once Garbage Collector finds the ‘Dead objects’, it will remove them from the
memory.
Step 3: Deletion with Compacting
The memory allocator keeps references to the free memory space and searches
them whenever new memory has to be allocated. To improve performance, it is
better to move all the referenced objects to one place; this step therefore helps
improve the memory allocation process.

[Figure: the basic GC process]
This algorithm is called a mark-sweep-compact algorithm.
As the number of objects increases, the above process (marking, deletion, and
deletion with compacting) becomes inefficient. Empirical analysis shows that most
objects are short-lived. Based on this analysis, the heap structure is divided into
three generations.
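To make the marking and sweeping steps concrete, here is a toy mark-sweep sketch in C (the object layout, field names, and heap array are invented for illustration; real collectors such as the ones discussed below are far more sophisticated):

#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS 4

/* A toy heap object: a mark bit plus references to other objects */
struct object {
    bool marked;
    struct object *refs[MAX_REFS];
};

/* Step 1 (marking): recursively mark everything reachable from a root */
static void mark(struct object *obj) {
    if (obj == NULL || obj->marked)
        return;                       /* no object, or already visited */
    obj->marked = true;               /* this is a live object */
    for (int i = 0; i < MAX_REFS; i++)
        mark(obj->refs[i]);
}

/* Step 2 (deletion): any object left unmarked is dead and is reclaimed */
static void sweep(struct object *heap[], size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (heap[i] != NULL && !heap[i]->marked)
            heap[i] = NULL;           /* a real collector would free it here */
        else if (heap[i] != NULL)
            heap[i]->marked = false;  /* reset the mark for the next GC cycle */
    }
}

int main(void) {
    struct object a = {0}, b = {0}, c = {0};
    struct object *heap[] = { &a, &b, &c };
    a.refs[0] = &b;   /* a references b; c is unreferenced (dead) */
    mark(&a);         /* a is the GC root */
    sweep(heap, 3);   /* c is swept; a and b survive */
    return 0;
}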
Heap Structure
The heap structure is divided into three divisions, namely the Young Generation,
the Tenured or Old Generation, and the Permanent Generation.

[Figure: heap structure with Young, Old, and Permanent generations]
Young Generation – This is where all the new objects are allocated and aged.
This generation is split into Eden Space and two Survivor spaces.
Eden Space – All new objects are allocated here. Once this space is full, minor
Garbage Collection will be triggered. As mentioned, when the Garbage Collection
is triggered, it first marks all the live objects in Eden Space and moves them to
one of the Survivor spaces. Thus, Eden space is cleared so that the new objects
can be allocated there again.
Survivor Space – After Minor GC, the live objects from Eden space will be moved
to one of the survivor spaces S0 or S1.
The diagram below describes the Garbage Collection process in the Young
Generation.

[Figure: GC process in the Young Generation]


Let’s see in detail how an object is allocated and either flushed (the Garbage Collection
process) or moved to an older generation. Each point below explains the
corresponding state number in the above diagram:

1. Initially, all objects are allocated in Eden Space.

2. Once Eden Space is full, Minor GC will be triggered. Minor GC is always a
“Stop the World” event, meaning that while this process executes, the
application threads are stopped.

3. The first step in the GC process, as mentioned, is marking. The Garbage Collector
identifies live objects and marks them. As shown in the above picture, two
of the objects are referenced while the others are unreferenced.

4. Once marking is done, the live objects in Eden Space are copied to one of the
Survivor spaces and the rest of the objects are removed, thus clearing
Eden Space. This algorithm is called a mark-copy algorithm. To
understand the aging of objects, consider the age of the objects now in
S0 to be 1. This is the state of the Young Generation after step 4.

5. Now, if new objects are to be created, they will be allocated in Eden
Space.

6. Once again Eden Space becomes full, which in turn triggers Minor GC. This time,
objects in Eden Space and Survivor space S0 are scanned, and the Garbage
Collector marks all the live objects in both. As shown in the
diagram, at this stage Eden Space and S0 each have one live object.

7. Once the marking process is completed, GC moves all the live
objects from Eden Space and S0 to S1. This clears both Eden Space and S0.
Let’s calculate the age of the objects. The age of the object moved from S0
to S1 is incremented by 1, making it 2. The age of the object
moved from Eden Space to S1 will be 1.

8. Now, if new objects are to be created, they will again be allocated in Eden Space
of the Young Generation.

9. Once again Eden Space becomes full, which in turn triggers Minor GC. This time,
objects in Eden Space and Survivor space S1 are scanned, and the Garbage
Collector marks all the live objects in both. As shown in the
diagram, at this stage Eden Space and S1 have two live objects.

10. All the live objects from Eden Space and S1 are moved to S0. The ages of
the objects moving from S1 to S0 become 3 and 2 respectively, as shown
in the diagram; the age is incremented by 1 in this process. The age of the
objects moving from Eden Space to S0 will be 1.

11. This stage shows an object moving from the Young Generation to the Old
Generation. Let’s set the threshold age for moving to the Old Generation
at 9. When the age of an object in the Young Generation reaches this
threshold, the object is moved to the Old Generation.
Note: Observe that, at any given time, only one survivor space holds objects. Also
note that the age of an object keeps increasing as it switches between the
survivor spaces.
Old Generation – Here, long-surviving objects are stored. As mentioned, a
threshold is set for each object; on meeting it, the object is moved from the Young
Generation to the Old or Tenured Generation. Eventually the Old Generation needs to be
collected; this event is called a major garbage collection.
Major garbage collections are also Stop the World events. Often a major collection
is much slower because it involves all live objects, so for responsive applications,
major garbage collections should be minimized. Also note that the length of the
Stop the World event for a major garbage collection is affected by the kind of
garbage collector used for the Old Generation space.
Note: Responsiveness means how fast an application can respond. Applications
that focus on responsiveness should not have large pause times, which in turn
means that memory management should be done effectively.
Permanent generation – This contains metadata required by the JVM to describe
the classes and methods used in the application. The permanent generation is
populated by the JVM at runtime based on the classes in use by the application.
In addition, Java SE library classes and methods may be stored here.

Types of Garbage Collectors

1. Serial GC

2. Parallel GC

3. Concurrent Mark and Sweep (CMS) collector

4. G1 collector

These garbage collectors have their own advantages and disadvantages. As the
Android Runtime (ART) uses the concept of the CMS collector, we will only discuss
the Concurrent Mark and Sweep (CMS) collector here.
GC Algorithms
An important aspect to remember is that usually two different GC algorithms are
needed – one for the Young Generation and the other for the Old Generation.
We have seen the core concepts of the GC process. Let’s move on to the specific
GC type which is used by default by the Android Runtime: the CMS collector.
Let’s look into it in further detail.

Concurrent Mark & Sweep (CMS) Collector


This collector is used to avoid long pauses during the garbage collection process.
It scans heap memory using multiple threads. It uses a parallel Stop the World
mark-copy algorithm in the Young Generation and a concurrent mark-sweep
algorithm in the Old Generation.
As discussed, Minor GC occurs in the Young Generation whenever Eden Space is full,
and it is a “Stop the World” event.
The GC process in the Old Generation is called Major GC. This garbage collector attempts
to minimize the pause duration that occurs during the GC process by doing most
of the Garbage Collection work concurrently with the application threads.
We can split the Major GC into the following phases:
Phase 1 – Initial marking
This is one of the “Stop the World” events in CMS. In this phase, the objects that
are either direct GC roots or are referenced from some live objects in the Young
Generation are all marked. The latter is important since the Old Generation is
collected separately.
Note: Every application will have a starting point from where objects get
instantiated. These objects are called “roots”. Some objects are referenced with
these roots directly and some indirectly. GC tracks the live objects from those GC
roots.
Phase 2 – Concurrent Marking
During this phase the Garbage Collector traverses the Old Generation and marks
all live objects, starting from the roots found in the previous phase of “Initial
Mark”. This phase runs concurrently with the application thread. Thus, the
application thread will not be stopped.
Phase 3 – Concurrent pre-clean
This is again a concurrent phase running in parallel with the application thread.
While marking the live objects in the previous phase, it is possible that
a few of the references have changed. Whenever that happens, the JVM marks
the area of the heap called a “Card” that contains the mutated object as “dirty”.
This is known as Card Marking.
In the pre-cleaning phase, these dirty objects are accounted for, and the objects
reachable from them are also marked. The cards are cleaned when this is done.
Phase 4 – Concurrent Abortable Preclean
This phase again runs in parallel with the application thread. The purpose of this
phase is to mark most of the live objects, so that the next phase will not take
much time to complete. This phase iterates through the Old Generation objects to
identify the live objects. The duration of this phase depends on a few abort
conditions, such as the number of iterations, the elapsed wall-clock time, the
amount of useful work done, etc. When one of these conditions is met,
this phase is stopped.
Phase 5 – Final remark
This is the second and last stop-the-world phase during the event. The goal of this
stop-the-world phase is to finalize marking all live objects in the Old Generation.
Since the previous preclean phases were concurrent, they may have been unable
to keep up with the application’s mutating speeds. A stop-the-world pause is
required to finish the marking.
Usually, CMS tries to run the final remark phase when the Young Generation is as empty
as possible, in order to eliminate the possibility of several stop-the-world phases
happening back-to-back.
Phase 6 – Concurrent Sweep
The purpose of this phase is to sweep off the dead objects in the old generation.
As the final marking is done, there is no dependency on the application thread
now. Thus, this phase runs concurrently with the application thread.
Phase 7 – Concurrent reset
This phase which runs concurrently with the application thread, resets the inner
data structures of the CMS algorithm, preparing them for the next cycle.

4. NFS (Network File System)

The advent of distributed computing was marked by the introduction of distributed file systems.
Such systems involved multiple client machines and one or a few servers. The server stores data
on its disks and the clients may request data through some protocol messages. Advantages of a
distributed file system:
 Allows easy sharing of data among clients.
 Provides centralized administration.
 Provides security, i.e. one must only secure the servers to secure data.
Distributed File System Architecture:

Even a simple client/server architecture involves more components than the physical file systems
discussed previously in OS. The architecture consists of a client-side file system and a server-side
file system. A client application issues a system call (e.g. read(), write(), open(), close()
etc.) to access files on the client-side file system, which in turn retrieves files from the server. It is
interesting to note that to a client application, the process seems no different from requesting data
from a physical disk, since there is no special API required to do so. This phenomenon is known
as transparency in terms of file access. It is the client-side file system that executes commands to
service these system calls. For instance, assume that a client application issues the read() system
call. The client-side file system then messages the server-side file system to read a block from the
server’s disk and return the data back to the client. Finally, it buffers this data into the read() buffer
and completes the system call. The server-side file system is also simply called the file server.

Sun’s Network File System: The earliest successful distributed file system could be
attributed to Sun Microsystems, which developed the Network File System (NFS). NFSv2 was the
standard protocol followed for many years, designed with the goal of simple and fast server crash
recovery. This goal is of utmost importance in multi-client, single-server network
architectures, because a single instant of server crash means that all clients are unserviced: the
entire system goes down.

Stateful protocols make things complicated when it comes to crashes.
Consider a client A trying to access some data from the server. However, just after the first read,
the server crashes. Now, when the server is up and running again, client A issues the second read
request. However, the server does not know which file the client is referring to, since all that
information was temporary and was lost during the crash. Stateless protocols come to our rescue.
Such protocols are designed so as not to store any state information on the server. The server is
unaware of what the clients are doing — what blocks they are caching, which files they have opened
and where their current file pointers are. The server simply delivers all the information that is
required to service a client request. If a server crash happens, the client simply retries
the request. Because of this simplicity, NFS implements a stateless protocol.

File Handles: NFS uses file handles to uniquely identify the file or directory that the current
operation is being performed upon. A file handle consists of the following components:
 Volume Identifier – An NFS server may have multiple file systems or partitions. The volume
identifier tells the server which file system is being referred to.
 Inode Number – This number identifies the file within the partition.
 Generation Number – This number is used when an inode number is being reused.

File Attributes: “File attributes” is a term commonly used in NFS terminology. It is a collective
term for the tracked metadata of a file, including file creation time, time of last modification, size,
ownership permissions etc. The attributes can be accessed by calling stat() on the file.

NFSv2 Protocol: Some of the common protocol messages are listed below.
Message            Description

NFSPROC_GETATTR    Given a file handle, returns file attributes.

NFSPROC_SETATTR    Sets/updates file attributes.

NFSPROC_LOOKUP     Given a file handle and the name of the file to look up, returns the file handle.

NFSPROC_READ       Given file handle, offset, count data and attributes, reads the data.

NFSPROC_WRITE      Given file handle, offset, count data and attributes, writes data into the file.

NFSPROC_CREATE     Given the directory handle, name of file and attributes, creates a file.

NFSPROC_REMOVE     Given the directory handle and name of file, deletes the file.

NFSPROC_MKDIR      Given the directory handle, name of directory and attributes, creates a new directory.

The LOOKUP protocol message is used to obtain the file handle for further accessing data. The
NFS mount protocol helps obtain the directory handle for the root (/) directory in the file system. If
a client application opens a file /abc.txt, the client-side file system will send a LOOKUP request to
the server through the root (/) file handle, looking for a file named abc.txt. If the lookup is
successful, the file handle and the file attributes are returned.

Client-Side Caching: To improve the performance of NFS, distributed
file systems cache the data as well as the metadata read from the server onto the
clients. This is known as client-side caching, and it reduces the time taken for subsequent client
accesses. The cache is also used as a temporary buffer for writing, which helps improve efficiency
even more, since the buffered writes can be sent to the server together.
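
As a rough sketch, the three components of a file handle listed above could be represented like this in C (the field names and fixed widths are hypothetical, not the actual NFSv2 wire format):

#include <stdint.h>

struct nfs_fhandle {
    uint32_t volume_id;    /* which file system/partition on the server     */
    uint32_t inode_number; /* which file within that partition              */
    uint32_t generation;   /* distinguishes reuses of the same inode number */
};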

5. Ubuntu OS Architecture

Ubuntu is a general-purpose, free-as-in-speech, zero-cost operating system based on Debian GNU/Linux,
designed for use on desktops, laptops, servers, and mobile devices. The project is committed to a regular six-
monthly release schedule, security updates for 9 months after release (or longer for Long Term Support
releases), and to providing a single installation CD with further packages available for download. All of these
commitments have an effect on its overall architecture. This page sketches the main architectural features of
Ubuntu, in order that ongoing development work can be designed in a consistent and elegant fashion.

For more detail on the processes involved here, see UbuntuDevelopment.

Design Principles

 There should be exactly one recommended way to accomplish a task in the default installation, to
promote consistent documentation and support
 Applications should work immediately upon installation, without requiring additional steps, whenever
possible. This "instant gratification" makes users more productive and allows them to explore the
software choices available
 Features should be easily discoverable, or they will go unused
 The system should fit on a single CD, to make it available to the widest audience (no longer valid
since Ubuntu 12.10)
 Dialogs should conform to the GNOME Human Interface Guidelines (HIG) to promote ease of use
 User-visible text should be written in plain language so that non-technical users can understand what the
system is telling them

Packages
Ubuntu packages are created in the Debian format. That is, source packages consist of a .dsc control file, plus
either a .tar.gz tarball or an .orig.tar.gz tarball plus a .diff.gz patch, while binary packages
are .deb files (or .udeb files for specialised use by the installer). For further information, see
the dpkg manual pages and the Debian policy manual's sections on binary and source packages.

The Ubuntu policy manual applies, and is maintained collaboratively by the Ubuntu core developers. It is
derived from the Debian policy manual, and technical standards are similar to those in Debian. Different
standards apply to some subsystems.

Ubuntu-specific changes are logged in debian/changelog as usual, with ubuntu1, ubuntu2, etc. being
appended to the version number, or -0ubuntu1 appended to the upstream version in the event that a new
upstream release is needed in Ubuntu. The presence of ubuntu in a version number means that the package
will not be automatically synced from Debian, and must have changes merged by hand.

Ubuntu was originally initialised by taking a copy of Debian unstable, and new versions are merged from
Debian at the start of every Ubuntu release cycle in order to be able to make good use of the expertise and effort
available in the Debian development community. Ubuntu developers are expected to bear this process in mind
when making changes.

Archive
The Ubuntu archive is a Debian-style pooled package repository. The entries in the dists tree are divided into
one set of entries per Ubuntu release, with one entry per "pocket" (the main release, security updates, general
updates, proposed general updates, and backports of new features). A new development tree is created every six
months, and all routine development work takes place in the current development branch. Stable releases
receive updates from time to time for severe problems, but as a general rule stable releases are best kept stable
simply by not changing them unless absolutely necessary. As such, great effort goes into ensuring that each
release is as high-quality as possible by its release date, and the release cycle is structured around an initial
flurry of invasive feature development moving through to an atmosphere of great caution and conservative bug-
fixes.

At the beginning of each release cycle, the new development branch is resynchronised with the current contents
of Debian unstable, and other relevant "upstream" projects (i.e. those from which Ubuntu draws code). Packages
that have not been changed in Ubuntu are updated semi-automatically by one of the archive administration
team; others must be merged by hand.

Builds
Normal developer uploads to Ubuntu consist of source packages only. The upload of a source package and its
acceptance into the archive cause build daemons for each architecture to fetch the source and build binary
packages, which are in turn uploaded automatically to the archive if successfully built.

Occasionally, builds with circular build-dependencies must be bootstrapped by hand (e.g. new architectures,
compilers and language runtimes, etc.). This task must be performed by hand by a build administrator, usually
repeating the build process more than once in order to ensure that the new packages can continue to build
themselves.

Flavours and Derivatives


The modern Linux world is one of a small number of general-purpose distributions and many special-purpose
distributions. Ubuntu acknowledges this and strives to support it. We define two different levels of derivation. A
"flavour" is one hosted within the Ubuntu archive, which requires that it draw from the same pool of
packages and so cannot make incompatible or go-it-alone changes, but it may select a different set of packages. A
full "derivative" may make more extensive changes, but must provide its own archive hosting.

As it turns out, flavours are quite adequate for many purposes. Kubuntu, Edubuntu, and Xubuntu all make
extensive changes to their own specialised packages. Gobuntu is very similar to Ubuntu but removes packages
with restrictive licensing terms (Ubuntu does not support many of these, but makes exceptions for items like
firmware).

Acknowledging the prevalence and usefulness of derivation, packages in the Ubuntu archive should generally
avoid explicit user-visible Ubuntu branding.

Package Control
When the Ubuntu project started, the founding developers were faced with a huge selection of packages from
the Debian archive out of which to build Ubuntu. Among their considerations were ensuring that the main
supported set of packages was reasonably secure and sufficiently stable that it would be possible to support
them for 18 months after release; avoiding excessive duplication; selecting a set of packages known to be
widely useful; allowing users to install without excessive amounts of manual configuration; and maintaining a
level of quality and standards-compliance such that developers could do efficient work across the distribution.
These considerations live on in the UbuntuMainInclusionRequirements. Packages which have not been selected
for the main supported set are available from the universe and multiverse components of Ubuntu, and
may be promoted to main or restricted (according to their licensing terms) if they are necessary and meet
the requirements.

It quickly became necessary to maintain a list of those packages which had been selected for Ubuntu main, and
the reasons why they were selected; thus the seeds were created, and the germinate program to expand their
dependencies into full lists of packages. The initial seed list was base (the minimum set of packages required
to run an Ubuntu system), desktop (the set installed in a default Ubuntu desktop installation),
and supported (everything else in main). Nowadays, every object (CD/DVD images, live filesystems,
various installation profiles) built from the Ubuntu archive has a corresponding set of seeds listing the packages
it contains; the seeds and germinate are essential components of the Ubuntu archive and CD image building
systems.

The seeds are stored in Bazaar branches, one per flavour.

Installation
Quick and straightforward installation is tremendously important to Ubuntu, as is the ability to build installation
images with a minimum of human intervention. We support two basic installer designs:

 The "alternate" installer was the only supported installation mechanism up to and including Ubuntu
5.10, and is still supported as a secondary mechanism for desktop installs and as the primary mechanism
for server installs. It starts with a small self-hosting initial ramdisk which is capable of using
extra .udeb packages to build up its capabilities at run-time, like a miniature Linux distribution. It then
builds a functional base system on the target system from individual .deb packages
using debootstrap, and goes on from there to install extra packages appropriate for the installation
profile (desktop, server, etc.).
 The "desktop" installer was introduced in Ubuntu 6.06, and is now the primary mechanism promoted
for use by desktop users. It runs from a live CD, and operates by copying the contents of the live
filesystem to the target system and then making small configuration adjustments to it.
The live filesystem itself is produced by the livecd-rootfs package, using debootstrap to build a base
system and then installing desktop packages on top of that; the casper package deals with booting the live
filesystem and making the necessary small number of tweaks at run-time in order for it to function well as a live
CD.

Thus, all Ubuntu installations start out with an invocation of debootstrap, directly or
indirectly. debootstrap itself installs all packages
with Priority: required or Priority: important, which are set from the expansions of
the required and minimal seeds respectively. The remaining packages are typically installed using either
binaries from *-meta source packages (ubuntu-meta, kubuntu-meta, etc.), or using Task fields. Both
of these are generated automatically or semi-automatically from the corresponding seeds.

The upshot is that seed changes are typically propagated to installation images with at most an updated upload
of *-meta.

System boot
The boot process of an Ubuntu system starts with the boot loader (typically GRUB) which loads the kernel and
an initial ramdisk (initramfs). The initramfs sets up the console, usually including a splash screen, starts
the udev daemon to process device events from the kernel, finds and mounts the root filesystem, and chains to
it, mounting some important virtual filesystems along the way.

Once the root filesystem is mounted, upstart takes over as PID 1, and begins to start tasks specified
in /etc/init/. While this will change in the future, at the moment these tasks are essentially shims for
System V init scripts and runlevels, defined in /etc/rc*.d/. In the default configuration, upstart runs the
contents of /etc/rcS.d/, which sets up hardware and filesystems, and then runs the contents
of /etc/rc2.d/, which starts a variety of system services.

In a desktop installation, one of the services launched is an X display manager, which presents a login screen.
Otherwise, the user may log in on a virtual console using one of the getty instances started by upstart.
