
Linux Filesystems

Course: Operating Systems

Author: Halil Berisha Professor: Stefan Traub


Table of Contents
Introduction to File Systems
EXT4 (Fourth extended file system)
XFS (Extended File System)
ZFS (Zettabyte File System)
BTRFS (B-Tree File System)
Linux File Directory
References
Introduction to File Systems
Storage devices store data in binary form. Everything kept on a storage device, whether we see
it as a picture, a text file, or a video, is ultimately nothing more than a collection of 0s
and 1s.
Organizing, managing, and manipulating this mass of binary data is one of the main functions of
a file system.
The operating system sits on top of the file system, and together they make it possible for
people to view, create, and manage this data in a simple way.
So, every time a file is opened on a computer or a smart device, the operating system uses its
file system internally to load it from the storage device.
The file system works under the hood the whole time the device is running: when copying,
editing, deleting, or downloading a file, or even when accessing a web page.
Every storage device must be formatted before its first use, and a file system must be chosen
before formatting starts.
This matters because file systems differ from one another, and each has its own way of managing
data.
The operating system that will use the device should also be taken into consideration, because
many file systems are compatible only with particular operating systems.
UNIX and BSD operating systems, for example, can work with some file systems that Windows
operating systems do not support.
A standard Linux distribution offers a choice of several file systems when partitioning the
disk.
The most common file systems found in today's Linux distributions are:
• ext2
• ext3
• ext4
• jfs
• ReiserFS
• XFS
• ZFS
• Btrfs
Only some of these file systems are analyzed here: ext4, XFS, ZFS, and Btrfs.
EXT4 (Fourth extended file system)
EXT4 is the latest version of the EXT family of file systems. The disk is divided into equally
sized sectors.
The file system groups these sectors into blocks. A block can be from 1 KiB to 64 KiB, but is
typically 4 KiB.
Blocks are in turn grouped together to form what are known as block groups.
Instead of mapping every block of a file individually, as the older EXT versions did, EXT4 uses
the concept of "extents".
An extent is a range of contiguous physical blocks. A single extent can map up to 128 MiB of
contiguous space if the block size is 4 KiB.
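The extent size quoted above can be checked with a little arithmetic. The sketch below assumes that the length field of an initialized ext4 extent has 15 usable bits (at most 2^15 blocks); that detail is not stated in this paper, and the constant names are only illustrative.

```python
# Assumption (not from the paper): an initialized ext4 extent can cover
# at most 2**15 = 32768 blocks.
BLOCK_SIZE = 4 * 1024            # 4 KiB block size, as in the text
MAX_BLOCKS_PER_EXTENT = 2 ** 15  # assumed limit of the extent length field


def max_extent_bytes(block_size=BLOCK_SIZE, max_blocks=MAX_BLOCKS_PER_EXTENT):
    """Largest contiguous range, in bytes, that one extent can describe."""
    return block_size * max_blocks


if __name__ == "__main__":
    print(max_extent_bytes() // (1024 * 1024), "MiB")
```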
Files are not always stored sequentially on the storage device. One part of a file may be
stored near the beginning of the device and another part near the end. This makes working with
big files in particular harder, because accessing them may require jumping between different
parts of the storage device.
The role of the extent is to improve large-file performance by placing data in contiguous
space. Because one extent covers many blocks, a large file is described by a small number of
extents, so fewer pointers are needed to locate all of its data. This also reduces
fragmentation.
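To illustrate why extents need fewer pointers, the following sketch collapses a list of physical block numbers into (start, length) pairs. It is a simplified model, not ext4's actual extent tree, and the function name is hypothetical.

```python
def blocks_to_extents(blocks):
    """Collapse block numbers (in file order) into (start, length) extents.

    Illustrative only: a real ext4 extent also records the logical offset
    inside the file and is stored in a tree rooted in the inode.
    """
    extents = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # The block directly follows the previous extent, so extend it.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # A gap on disk: start a new extent.
            extents.append((block, 1))
    return extents


if __name__ == "__main__":
    # Six blocks stored in two contiguous runs need only two extents.
    print(blocks_to_extents([100, 101, 102, 103, 500, 501]))
```

A file stored in one contiguous run needs a single extent regardless of its length, whereas a per-block map needs one pointer for every block.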
A storage device formatted with EXT4 is organized in the following way.
The initial disk blocks are allocated for special purposes. The first block stores the boot
sector and the next one is called the "super block"; every partition has a super block that
contains metadata about the other file system structures.
Then come the bitmap blocks (a block bitmap and an inode bitmap), which are maps of 0s and 1s
that tell whether each block or inode is occupied or vacant.
Next is the inode table, whose inodes store the metadata about files and the pointers to their
physical block addresses.
At the end are the data blocks, which contain the actual file contents.
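The bitmap idea described above can be sketched in a few lines: each position holds 0 (vacant) or 1 (occupied), and allocation means finding a 0 and flipping it. This is a toy model using a Python list, not the packed on-disk bitmap that ext4 uses; the function names are hypothetical.

```python
def first_free(bitmap):
    """Return the index of the first vacant (0) entry, or None if full."""
    for index, bit in enumerate(bitmap):
        if bit == 0:
            return index
    return None


def allocate(bitmap):
    """Mark the first vacant entry as occupied and return its index."""
    index = first_free(bitmap)
    if index is None:
        raise RuntimeError("no free entries left")
    bitmap[index] = 1
    return index


def release(bitmap, index):
    """Mark an entry as vacant again, for example when a file is deleted."""
    bitmap[index] = 0


if __name__ == "__main__":
    bitmap = [1, 1, 0, 1, 0]
    print(allocate(bitmap), bitmap)
```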
Figure 1 - EXT4 File System structure

EXT4 addresses the blocks of a file with 32-bit logical block numbers, and with a block size of
4 KiB each block holds $2^{12}$ bytes:

$2^{32} \times 2^{12} = 2^{44}$ bytes

Therefore, for a block size of 4 KiB the maximum file size allowed is 16 TiB, and the maximum
volume size is 1 EiB. This is sufficient for most regular use.
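The file size limit quoted above can be reproduced as a quick check, assuming 32-bit block numbers within a file and 4 KiB blocks. The function name is only illustrative.

```python
def max_file_size_bytes(block_number_bits=32, block_size=4096):
    """Number of addressable blocks multiplied by the size of one block."""
    return (2 ** block_number_bits) * block_size


if __name__ == "__main__":
    size = max_file_size_bytes()
    print(size == 2 ** 44, size // 2 ** 40, "TiB")
```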
The EXT4 file system is widely used on personal computers that run Linux.
Features of the EXT4
• Journaling File System – EXT4 maintains a journal that records changes that are in progress
and have not yet been committed to their final location on disk.
If the system crashes or loses power, the changes recorded in the journal are replayed at the
next mount. This helps the system recover more quickly and with a lower likelihood of the
file system becoming corrupted.
• Delayed Allocation – Data is held in a memory buffer before it is written to the data
blocks. This allows the file system to make better choices about how to allocate those
blocks, which reduces fragmentation and increases performance significantly.
• Online Defragmentation – In the EXT4 file system there is no need to unmount the disk in
order to defragment it.
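The journaling behaviour described above can be sketched with a toy write-ahead journal. This is a simplified model under stated assumptions (in-memory dictionaries, one commit flag per transaction) and not the actual ext4 journal format; the class and method names are hypothetical.

```python
class ToyJournal:
    """Toy write-ahead journal; not the real ext4 on-disk format."""

    def __init__(self):
        self.disk = {}     # block number -> data at its final location
        self.journal = []  # transactions in the order they were logged

    def log(self, writes, committed):
        # Record the intended block writes. 'committed' models whether the
        # commit record reached the journal before a crash.
        self.journal.append({"writes": dict(writes), "committed": committed})

    def recover(self):
        # After a crash: replay committed transactions in order and
        # discard the ones that never finished committing.
        for txn in self.journal:
            if txn["committed"]:
                self.disk.update(txn["writes"])
        self.journal.clear()


if __name__ == "__main__":
    j = ToyJournal()
    j.log({1: "a", 2: "b"}, committed=True)
    j.log({2: "x"}, committed=False)  # crash happened before commit
    j.recover()
    print(j.disk)
```

Because incomplete transactions are dropped as a whole, the file system ends up in a state where each change is either fully applied or not applied at all.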

XFS (Extended File System)


In contrast to EXT4, XFS is widely used in enterprise environments. It is used in well-known
scientific institutions such as CERN and Fermilab to manage petabytes of storage for scientific
experiments.
These institutions use XFS because it is well suited to storing and managing large amounts of
data and large files.
It also provides great I/O performance and works well in computing environments with a large
number of CPUs and huge disk arrays.
Like EXT4, it is a journaling file system. The difference is that XFS is a 64-bit file
system.
Since it has a 64-bit address space, it supports a theoretical maximum file system size of
16 EiB.
Its inode pointers are also 64-bit, giving a maximum file size of 8 EiB and a maximum extent
size of 8 GiB.
The disk is partitioned into equally sized allocation groups.
They are similar to the block groups of EXT4 but typically much larger.
An allocation group can have a maximum size of 1 TiB, but they are typically between 0.5 and
4 GiB.
Each allocation group manages its own inodes and free space separately.

Figure 2 – XFS file system structure

Each allocation group can almost be thought of as an individual file system that maintains its
own space usage.
This provides scalability and parallelism: multiple processes and threads can perform I/O
operations on the same file system simultaneously.
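A minimal sketch of the allocation group idea follows. It assumes a simplified linear mapping from a file-system block number to a group index, which is not the exact encoding XFS uses; the group size and all names are hypothetical.

```python
AG_BLOCKS = 1_000_000  # hypothetical allocation group size, in blocks


def locate(fs_block, ag_blocks=AG_BLOCKS):
    """Return (allocation group index, block offset inside that group)."""
    return divmod(fs_block, ag_blocks)


class AllocationGroup:
    """Each group keeps its own free-space count, independent of the others."""

    def __init__(self, blocks=AG_BLOCKS):
        self.free_blocks = blocks

    def allocate(self, count):
        if count > self.free_blocks:
            raise ValueError("allocation group is full")
        self.free_blocks -= count


if __name__ == "__main__":
    groups = [AllocationGroup() for _ in range(4)]
    groups[0].allocate(10)   # one writer works in group 0
    groups[3].allocate(500)  # another writer works in group 3
    print(locate(2_500_000), [g.free_blocks for g in groups])
```

Since each group tracks its own state, two writers working in different groups do not need to touch the same bookkeeping data, which is what makes the parallelism possible.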
Another difference from EXT4 is that, instead of using bitmaps to track free disk blocks, XFS
uses B-trees, which are far more efficient.
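The advantage of an ordered index over a bitmap can be sketched as follows. A sorted Python list searched with `bisect` stands in for a B-tree keyed by extent size; this is an assumption made for illustration and not the XFS implementation.

```python
import bisect


def best_fit(free_by_size, wanted):
    """Find the smallest free extent that is at least 'wanted' blocks long.

    free_by_size is a sorted list of (length, start) tuples with
    non-negative start values. The lookup is a binary search, so it does
    not have to scan every block the way a bitmap search would.
    """
    index = bisect.bisect_left(free_by_size, (wanted, 0))
    if index == len(free_by_size):
        return None
    return free_by_size[index]


if __name__ == "__main__":
    free = sorted([(4, 100), (16, 300), (8, 200)])
    print(best_fit(free, 5))
```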
Features of XFS
• Direct I/O – With this feature the kernel file cache can be bypassed, and data is
transferred directly between the user buffer and the underlying I/O hardware.
This avoids the copy into the kernel address space, which significantly reduces CPU
utilization for large I/O requests. Direct I/O also makes multiple parallel writes to the
same file possible.
• Guaranteed I/O – This allows an application to reserve bandwidth to the file
system. XFS dynamically calculates the performance available from the storage devices
and can reserve sufficient bandwidth to meet the requested performance.
• Online Defragmentation – As with EXT4, an XFS disk can be defragmented without having to
unmount it.
• Online Resizing – The file system can be enlarged without having to unmount the device.

ZFS (Zettabyte File System)


ZFS is quite different from other file systems. The biggest difference is its limits, which
are so large by design that they should never be encountered in practice.
To fully populate a single zpool (a virtual storage pool constructed of virtual devices, which
are in turn constructed of block devices such as files, hard drive partitions, or entire
drives) would require $3 \times 10^{24}$ TB hard disk drives.
Its limits are hard even to imagine:
• Maximum file size is 16 EiB (exbibytes), or $2^{64}$ bytes.
• Maximum size of a zpool is 256 quadrillion zebibytes, or $2^{128}$ bytes.
• Maximum of $2^{64}$ devices in a zpool.
• Maximum of $2^{64}$ zpools in a system.
As can be seen, the limits of ZFS cannot be reached in practice. ZFS can address such a large
amount of data because it is a 128-bit file system.
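The unit conversions behind the limits listed above can be verified with integer arithmetic. The sketch assumes binary prefixes (1 EiB = 2^60 bytes, 1 ZiB = 2^70 bytes) and reads "quadrillion" loosely as 2^50; the constant names are only illustrative.

```python
EIB = 2 ** 60  # bytes in one exbibyte
ZIB = 2 ** 70  # bytes in one zebibyte

max_file_bytes = 2 ** 64
max_pool_bytes = 2 ** 128

max_file_eib = max_file_bytes // EIB  # maximum file size in EiB
max_pool_zib = max_pool_bytes // ZIB  # maximum pool size in ZiB

if __name__ == "__main__":
    print(max_file_eib, "EiB per file")
    print(max_pool_zib == 256 * 2 ** 50, "pool is 256 * 2**50 ZiB")
```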
Another big difference between ZFS and other file systems is that ZFS is a combined system
that acts as both a volume manager and a file system.
ZFS is mainly used in server operating systems, and Ubuntu offers it as an installation option.
Disk Layout
ZFS takes a very different approach to internal file management. All of the available disks
are combined to create one large storage pool. A Storage Pool Allocator (SPA) acts as the
volume manager.
It decides how files are stored on the physical disks and performs all of the
logical-to-physical address conversion and data management.
So, in ZFS the first step is creating a storage pool.
After that, one or more ZFS file systems are created under it. Each file system appears as
nothing more than a directory under the pool's root, and all file systems in the pool share
its disk space.
With ZFS a separate file system could be created for each department, for example, and it is
no longer necessary to decide the size of each file system in advance, since disk space is
allocated automatically from the storage pool.
When a disk is added to the pool, every file system within the pool can immediately use the
additional disk space, without any additional work.

Figure 3 – ZFS file system using different storage devices

This is similar to RAM: when you add a new memory module to the computer, there is no need to
run any configuration or commands, because all processes on the system automatically use the
additional memory.
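The pooled storage model can be sketched as a toy class in which every file system draws from one shared capacity. This is only a model of the idea; it does not use or imitate the real ZFS tools, and all names and sizes are hypothetical.

```python
class ToyPool:
    """Toy storage pool: file systems have no fixed size of their own."""

    def __init__(self, *disk_sizes):
        self.capacity = sum(disk_sizes)
        self.used = {}  # file system name -> bytes used

    def create_fs(self, name):
        self.used[name] = 0

    def add_disk(self, size):
        # New space is available to every file system immediately.
        self.capacity += size

    def free(self):
        return self.capacity - sum(self.used.values())

    def write(self, name, nbytes):
        if nbytes > self.free():
            raise OSError("pool is full")
        self.used[name] += nbytes


if __name__ == "__main__":
    pool = ToyPool(100)
    pool.create_fs("sales")
    pool.create_fs("development")
    pool.write("sales", 80)
    pool.add_disk(100)              # grow the pool
    pool.write("development", 50)   # fits only because of the new disk
    print(pool.free())
```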
Features
Transactional Semantics – During a write operation existing data is not overwritten. The new
data is written to a separate location, a technique called copy-on-write, which ensures that
the data on disk stays consistent even if the system loses power or crashes during the write.
Checksums – Every block is checksummed so that corrupted data can be detected.
Snapshots – Snapshots of the file system can be taken, which makes quick system recovery
possible in case of any failure.
RAID-Z – A RAID configuration specific to ZFS that protects data integrity.
Built-in Scrub – Regularly examines all the data and repairs silent corruption and other
problems.
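Copy-on-write, checksums, and snapshots fit together naturally, as the following toy sketch suggests. It assumes a dictionary of blocks and SHA-256 checksums chosen for illustration; ZFS's actual block pointers and checksum algorithms are not modelled, and all names are hypothetical.

```python
import hashlib


class CowStore:
    """Toy copy-on-write store with checksummed pointers."""

    def __init__(self):
        self.blocks = {}   # address -> bytes; existing entries are never rewritten
        self.next_addr = 0
        self.root = None   # (address, checksum) of the current version

    def write(self, data):
        # Write the new version to a fresh address, then switch the root
        # pointer. A crash before the switch leaves the old version intact.
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = data
        self.root = (addr, hashlib.sha256(data).hexdigest())
        return self.root

    def read(self, pointer=None):
        # Verify the checksum stored in the pointer before returning data.
        addr, checksum = pointer or self.root
        data = self.blocks[addr]
        if hashlib.sha256(data).hexdigest() != checksum:
            raise ValueError("checksum mismatch: silent corruption detected")
        return data


if __name__ == "__main__":
    store = CowStore()
    snapshot = store.write(b"version 1")  # keeping the old pointer is a snapshot
    store.write(b"version 2")
    print(store.read(), store.read(snapshot))
```

A snapshot costs almost nothing in this model because old blocks are never overwritten; keeping the old pointer is enough to read the previous version.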
BTRFS (B-Tree File System)
BTRFS is a high-performance copy-on-write file system.
It offers features similar to those of ZFS, along with some additional ones. BTRFS is often
seen as a next-generation Linux file system and a possible successor to EXT4.
It is also used by many big businesses, including tech giants like Meta, to manage their
storage requirements.
BTRFS uses B-trees as the underlying data structure to manage every element of the file system.
It has one storage pool that consists of all storage disks, and inside the pool we can create
subvolumes, which are like partitions that can be dynamically resized at will.
In contrast to ZFS it is a 64-bit file system, which means that the maximum volume size and the
maximum file size are both 16 EiB.
Features
Single Volume Snapshots – A snapshot of a single volume can be taken, and in case of a failure
the volume can be rolled back to that snapshot. Snapshots can be read-only or read-write.
Transparent File Compression – When enabled, compression is done automatically in the
background to optimize storage capacity utilization. The default compression method is zlib,
but LZO and ZSTD can also be used.
Built-in RAID Support – RAID 1, 5, 6, and 10 are all supported to protect data integrity.
SSD Optimization – BTRFS is an SSD-aware file system. It avoids optimizations that only make
sense for rotating disks, as there are no moving parts in an SSD, and it sends writes in
clusters, which results in larger write operations and faster write throughput.
Online File System Defragmentation – Defragmentation can be done without unmounting the
disk.
BTRFS-Convert – With this tool an existing EXT2, EXT3, or EXT4 file system can be converted
into BTRFS.
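The effect of transparent compression can be demonstrated with Python's standard `zlib` module, which implements the same compression format named in the feature list above. This only shows what compressing a block achieves; it does not interact with BTRFS, and the helper name and compression level are assumptions.

```python
import zlib


def compressed_size(data, level=3):
    """Size in bytes of 'data' after zlib compression at the given level."""
    return len(zlib.compress(data, level))


if __name__ == "__main__":
    block = b"log line: everything is fine\n" * 128  # highly repetitive data
    packed = zlib.compress(block)
    # Decompression returns the original bytes, so the process is
    # invisible to the application reading the file.
    print(len(block), compressed_size(block), zlib.decompress(packed) == block)
```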
Linux File Directory
The directory layout of Linux is similar to that of macOS but very different from that of
Windows operating systems.
The layout is for the most part outlined in the FHS (Filesystem Hierarchy Standard), which
defines the structure and layout and is maintained by the Linux Foundation.
Even so, not all distributions follow this structure and layout; some customize it for their
own needs.
What all Linux distributions have in common is that the file directory is organized as a tree.
As in a tree, the Linux file directory has a root, which is written as a single slash, "/".
This root has a number of children, which are briefly explained here.

Figure 4 – Linux File Directory

/bin is short for binaries. The most basic binaries are found in this directory, such as "cat",
which displays the contents of a file, and "ls", which lists a directory.
/sbin holds system binaries. These are used by the system administrator, and a standard user
cannot run them without permission. Both /bin and /sbin contain the files that need to be
accessible when running in single-user mode. Single-user mode boots you in as the root user to
allow system repairs, testing, or upgrading.
/boot is a directory that stores everything the OS needs to boot. The bootloader is in here.
/dev is the directory where devices live. Every disk connected to the system can be found
here; for example, a disk would be /dev/sda and the partitions of this disk would be /dev/sda1,
/dev/sda2, and so on. Other devices, such as webcams, can also be found here.
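The disk and partition naming described above can be taken apart with a small regular expression. The sketch only handles names of the /dev/sdXN form used in the example; other naming schemes are not covered, and the function name is hypothetical.

```python
import re

# Matches names such as /dev/sda (whole disk) or /dev/sda1 (partition).
DEVICE_PATTERN = re.compile(r"^/dev/(sd[a-z]+)(\d+)?$")


def split_device(path):
    """Return (disk name, partition number or None), or None if no match."""
    match = DEVICE_PATTERN.match(path)
    if match is None:
        return None
    disk, partition = match.groups()
    return disk, int(partition) if partition else None


if __name__ == "__main__":
    print(split_device("/dev/sda"), split_device("/dev/sda2"))
```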
/etc is where configuration is stored. It holds system-wide configuration files, like
sources.list, which lists the repositories the system can connect to. Settings that are not
system-wide are not found here; OpenOffice, for example, has different settings for each
user.
/lib is where libraries are stored. Libraries are files needed and used by applications to
perform various functions.
/media and /mnt are the directories where mounted drives can be found. These can be USB sticks,
external hard drives, network drives, etc. In Windows, partitions and disks are shown with a
letter such as C:, D:, or E:; on Linux their equivalents are found here.
The difference between /media and /mnt is that most distros automatically mount devices under
/media, while /mnt is used for devices that are mounted manually. /media is managed by the
OS.
/opt is the optional folder, where manually installed software from vendors usually resides,
although some software packages found in the repositories also find their way here. It is
also a place where you can install software you have created yourself.
/proc is where pseudo files are found. They contain information about system processes and
resources. For example, every process has a directory containing all kinds of information
about that process. They are called pseudo files because the files in a process's directory
are not actually stored on disk; the kernel is just presenting other information so that it
appears as files.
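Because the kernel presents this information as files, it can be read with ordinary file operations. The sketch below reads the `comm` entry of a process directory, which on Linux holds the command name; it returns None on systems where /proc is not available. The function name is hypothetical.

```python
import os


def process_name(pid="self"):
    """Read the command name of a process from its /proc directory.

    'self' refers to the calling process. Returns None when the pseudo
    file does not exist, for example on a system without /proc.
    """
    path = f"/proc/{pid}/comm"
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    print(process_name())
```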
/root is the home folder of the root user. Unlike a regular user's home folder, it does not
contain the typical directories inside, and it does not reside in /home. Files can be stored
here, but root permission is needed.
/run is different from the other directories. It is a tmpfs file system, which means it lives
in RAM, so everything in it is gone when the system is rebooted or shut down. It is used by
processes that start early in the boot procedure to store runtime information that they need
to function.
/srv is the service directory, where service data is stored. If you run a server such as a web
server or FTP server, files that are accessed by external users are stored here. This allows
for better security, since it is at the root of the drive, and it also allows you to easily
mount this folder from another hard drive.
/sys is the system folder, a way to interact with the kernel. Like /run, it is not physically
written to the disk; it is created every time the system boots up.
/tmp is a temporary directory. This is where applications temporarily store files that they
use during the session.
/usr is the user application space, where applications are installed. Any application
installed here is considered non-essential for basic system operation.
/var is the variable directory. It contains files and directories that are expected to grow in
size. For example, /var/crash holds information about processes that have crashed.
/var/log contains log files for the system and for many applications, which constantly grow in
size as the system is used. Databases for mail and temporary storage for printer queues are
also found under /var.
/home holds the users' home directories. Each user has their own. This is where personal files
and documents are stored. In some cases users choose to mount /home on a different partition
or drive, which makes it possible to reinstall the system while preserving the files.
References
• Kevin D. Fairbanks, An analysis of Ext4 for digital forensics, Digital Investigation,
Volume 9, Supplement, August 2012, Pages S118-S130
• Bill McCarty, Learning Debian GNU/Linux, 1st Edition, September 1999, Chapter 4
“Issuing Linux Commands”
• Tolga Bagci, What is ZFS (Zettabyte File System)? - SYSNETTECH Solutions,
accessible at: https://www.sysnettechsolutions.com/en/what-is-zfs/
• Y. Park, H. Chang, T. Shon, Data investigation based on XFS file system metadata,
Multimed Tools Appl 75, 14721–14743 (2016). https://doi.org/10.1007/s11042-015-2713-3
