File Systems

FILE SYSTEMS

Introduction
At the core of a computer, it's all 1s and 0s, but the organization of that
data is not quite as simple. A bit is a 1 or a 0, a byte is composed of 8
bits, a kilobyte is 1024 (i.e. 2^10) bytes, a megabyte is 1024 kilobytes,
and so on. All these bits and bytes are permanently stored on a hard
drive. A hard drive stores all your data: any time you save a file,
you're writing thousands of 1s and 0s to a metallic disc, changing
magnetic properties that can later be read back as 1 or 0. There is so
much data on a hard drive that there has to be some way to organize it,
much as a library relies on the old card drawers that indexed all of its
books; without that index, we'd be lost. Libraries, for the most part,
use the Dewey Decimal System to organize their books. Other systems
exist, but none has attained the same fame as Mr. Dewey's invention.

File systems are the same way. The ones most users are aware of are
the ones Windows uses by default: VFAT and NTFS.

Several attributes are needed to define a file system, including:
 • max file size,
 • max partition size,
 • whether it journals or not.
Journaling
A journaling file system has a dedicated area, the journal,
where all changes are tracked. When the system crashes, file
system corruption is less likely because of the journal.

A journaling file system is more reliable when it comes to data storage.


Journaling file systems do not necessarily prevent corruption, but they
do prevent inconsistency and are much faster at file system checks than
non-journaled file systems. Without a journal, if a power failure happens
while you are saving a file, the save will not complete and you can end
up with corrupted data and an inconsistent file system. Instead of
writing directly to the part of the disk where the file is stored, a
journaling file system first writes the data to another part of the hard
drive and notes the necessary changes in a log. In the background it then
goes through each journal entry and carries out the task, and when the
task is complete, it checks it off the list. Thus the file system is
always in a consistent state: the file got saved, the journal reports it
as not completely saved, or the journal itself is inconsistent (but can
be rebuilt from the file system). Some journaling file systems can
prevent corruption as well by writing data twice.
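The log-then-apply sequence described above can be sketched in a few lines of Python. This is a toy model, not how any real file system is implemented: the class name `JournaledStore`, the `crash_before_apply` flag, and the in-memory dictionary standing in for the disk are all assumptions made for illustration.

```python
class JournaledStore:
    """Toy model of a journaling file system (illustrative only)."""

    def __init__(self):
        self.files = {}     # stands in for the main area of the disk
        self.journal = []   # stands in for the dedicated journal area

    def write(self, name, data, crash_before_apply=False):
        # Step 1: note the intended change in the journal first.
        entry = {"name": name, "data": data, "done": False}
        self.journal.append(entry)
        if crash_before_apply:
            return  # simulated power failure: the main area is untouched
        # Step 2: carry out the change in the main area.
        self._apply(entry)

    def _apply(self, entry):
        self.files[entry["name"]] = entry["data"]
        entry["done"] = True  # Step 3: check the entry off the list

    def recover(self):
        """Replay every journal entry that was never checked off."""
        replayed = 0
        for entry in self.journal:
            if not entry["done"]:
                self._apply(entry)
                replayed += 1
        # Completed entries are no longer needed.
        self.journal = [e for e in self.journal if not e["done"]]
        return replayed
```

After a simulated crash the old contents are still intact, and calling `recover()` completes the interrupted save, so the store is never left half-written.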
Table
Below is a very brief comparison of the most common file systems in use
in the Linux world.
File System  Max File Size  Max Partition Size  Journaling      Notes
Fat16        2 GB           2 GB                No              Legacy
Fat32        4 GB           8 TB                No              Legacy
NTFS         2 TB           256 TB              Yes             (For Windows compatibility) NTFS-3g is installed by default in Ubuntu, allowing read/write support
ext2         2 TB           32 TB               No              Legacy
ext3         2 TB           32 TB               Yes             Standard Linux filesystem for many years. Best choice for super-standard installation.
ext4         16 TB          1 EB                Yes             Modern iteration of ext3. Best choice for new installations where super-standard isn't necessary.
reiserFS     8 TB           16 TB               Yes             No longer well-maintained.
JFS          4 PB           32 PB               Yes (metadata)  Created by IBM. Not well maintained.
XFS          8 EB           8 EB                Yes (metadata)  Created by SGI. Best choice for a mix of stability and advanced journaling.

GB = Gigabyte (1024 MB) :: TB = Terabyte (1024 GB) :: PB = Petabyte (1024
TB) :: EB = Exabyte (1024 PB)
The table above gives a brief comparison of two main attributes of each
file system: the maximum file size and the largest partition it can
occupy.
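To make the units and limits concrete, the figures from the comparison table can be put into a small lookup. This is only a sketch: the names `LIMITS` and `candidates` are made up for this example, and the numbers are copied from the table rather than checked against each file system's documentation.

```python
# Binary units, as defined in the legend (each step is a factor of 1024).
KB = 1024
MB = 1024 * KB
GB = 1024 * MB
TB = 1024 * GB
PB = 1024 * TB
EB = 1024 * PB

# (max file size, max partition size) in bytes, copied from the table.
LIMITS = {
    "Fat16":    (2 * GB, 2 * GB),
    "Fat32":    (4 * GB, 8 * TB),
    "NTFS":     (2 * TB, 256 * TB),
    "ext2":     (2 * TB, 32 * TB),
    "ext3":     (2 * TB, 32 * TB),
    "ext4":     (16 * TB, 1 * EB),
    "reiserFS": (8 * TB, 16 * TB),
    "JFS":      (4 * PB, 32 * PB),
    "XFS":      (8 * EB, 8 * EB),
}


def candidates(file_size, partition_size):
    """List the file systems whose limits allow both sizes (in bytes)."""
    return [
        name
        for name, (max_file, max_partition) in LIMITS.items()
        if file_size <= max_file and partition_size <= max_partition
    ]
```

For example, `candidates(5 * TB, 10 * TB)` keeps only the file systems that can hold a single 5 TB file on a 10 TB partition.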
Of the above file systems, the only one you cannot install Linux on is
NTFS. Installing Linux on any type of FAT file system is not
recommended, because FAT does not have any of the permissions of a true
Unix file system.
Editing Files
Those used to a Windows file system (NTFS, FAT) know that it isn't
normally possible to change files while they are open. This restriction
does not exist in a Unix file system. This is because in Unix file systems,
files are indexed by a number, called the inode, and each inode has
several attributes associated with it, like permissions, ownership, and
timestamps. The file name is not stored in the inode itself; a directory
entry links the name to the inode. When you delete a file, what really
happens is that the filename is unlinked from the inode, but if some
other program is using the file, that program still holds an open
reference to it through the OS and can keep reading and writing it. A
file is not really deleted until all links have been removed and no
program has it open (even then, the data is still on the disk, but not
indexed in any way and thus very hard to recover). All of this means that
you can delete executing programs while they're running without crashing
them, and move files (within the same file system) before they're
finished downloading without corruption.
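The unlink behaviour can be observed directly from Python on a Unix-like system. The sketch below assumes a POSIX platform (on Windows the `os.unlink` call would normally fail while the file is open); the function name `read_after_unlink` is made up for this example.

```python
import os
import tempfile


def read_after_unlink():
    """Delete a file's name while it is open, then keep reading it."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w+") as handle:
        handle.write("still here")
        handle.flush()
        inode = os.fstat(handle.fileno()).st_ino    # the file's inode number
        os.unlink(path)                             # remove the only name
        links = os.fstat(handle.fileno()).st_nlink  # typically 0 by now
        handle.seek(0)
        data = handle.read()  # the open handle still reaches the data
    return inode, links, os.path.exists(path), data
```

The name disappears from the directory immediately, but the contents remain readable until the last open handle is closed.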
Fragmentation
Another common Windows practice that is not needed in Unix is
defragmenting the hard drive. When NTFS and FAT write files to the
hard drive, they don't always keep the pieces of a file (known as blocks)
together. Therefore, to maintain the performance of the computer, the
hard drive needs to be "defragged" every once in a while. This is
unnecessary on Unix file systems because of the way they were designed.
When ext3 was developed, it was coded so that it would keep the blocks of
a file together or at least near each other.
No true defragmenting tools exist for the ext3 file system, but tools for
defragmenting are included with the ext4 file system.
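One simple way to picture fragmentation is to count how many separate runs of consecutive blocks a file occupies. The helper below is a hypothetical illustration, not a tool from any file system; it assumes a file is described by the ordered list of block numbers that hold its data.

```python
def count_fragments(blocks):
    """Count runs of consecutive block numbers in a file's block list."""
    if not blocks:
        return 0
    fragments = 1
    for previous, current in zip(blocks, blocks[1:]):
        if current != previous + 1:
            fragments += 1  # a jump on disk starts a new fragment
    return fragments
```

A file stored in blocks 10 to 13 is a single fragment, while one stored in blocks 10, 11, 40, 41 and 7 is split into three.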
Linux File Systems: Ext2 vs Ext3 vs Ext4

ext2, ext3 and ext4 are all filesystems created for Linux.
This article explains the following:
 • High level difference between these filesystems.
 • How to create these filesystems.
 • How to convert from one filesystem type to another.

Ext2
 • Ext2 stands for second extended file system.
 • It was introduced in 1993. Developed by Rémy Card.
 • It was developed to overcome the limitations of the
   original ext file system.
 • Ext2 does not have a journaling feature.
 • On flash drives and USB drives, ext2 is recommended, as it
   doesn’t incur the overhead of journaling.
 • Maximum individual file size can be from 16 GB to 2 TB
 • Overall ext2 file system size can be from 2 TB to 32 TB
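The 16 GB to 2 TB range for a single file depends on the block size chosen when the file system is created. The sketch below rests on assumptions that are not stated in this article: the classic ext2 block map (12 direct block pointers plus single, double and triple indirect blocks, with 4-byte block numbers) and a 32-bit counter of 512-byte sectors that caps a file at 2 TB. The function name is made up for the example.

```python
def ext2_max_file_size(block_size):
    """Approximate ext2 per-file limit in bytes for a given block size."""
    pointers = block_size // 4  # 4-byte block numbers per indirect block
    # 12 direct blocks + single + double + triple indirect blocks.
    blocks = 12 + pointers + pointers ** 2 + pointers ** 3
    limit_from_block_map = blocks * block_size
    # Assumed cap: a 32-bit count of 512-byte sectors per file.
    limit_from_sector_count = 2 ** 32 * 512
    return min(limit_from_block_map, limit_from_sector_count)
```

Under these assumptions, 1 KB blocks give roughly 16 GB per file and 4 KB blocks hit the 2 TB cap, which matches the two ends of the range quoted above.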

Ext3
 • Ext3 stands for third extended file system.
 • It was introduced in 2001. Developed by Stephen Tweedie.
 • Ext3 has been available since Linux kernel 2.4.15.
 • The main benefit of ext3 is that it allows journaling.
 • Journaling has a dedicated area in the file system,
   where all the changes are tracked. When the system
   crashes, the possibility of file system corruption is less
   because of journaling.
 • Maximum individual file size can be from 16 GB to 2 TB
 • Overall ext3 file system size can be from 2 TB to 32 TB
 • There are three types of journaling available in the ext3
   file system.
    ◦ Journal: Metadata and content are saved in the
      journal.
    ◦ Ordered: Only metadata is saved in the journal.
      Metadata is journaled only after the content has been
      written to disk. This is the default.
    ◦ Writeback: Only metadata is saved in the journal.
      Metadata might be journaled either before or after
      the content is written to the disk.
 • You can convert an ext2 file system to an ext3 file system
   directly (without backup/restore).
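The three journaling types differ in only two respects: what goes into the journal, and whether the content is guaranteed to reach the disk before its metadata is journaled. The table-like sketch below restates the list above in Python; the dictionary layout and the `describe` helper are made up for illustration, and the `data=` prefix is assumed to mirror the name of the corresponding mount option.

```python
MODES = {
    "journal": {
        "journaled": ("metadata", "content"),
        "ordering": "content goes through the journal together with metadata",
    },
    "ordered": {
        "journaled": ("metadata",),
        "ordering": "content is written to disk before metadata is journaled",
    },
    "writeback": {
        "journaled": ("metadata",),
        "ordering": "no ordering between content writes and the journal",
    },
}
DEFAULT_MODE = "ordered"  # per the list above


def describe(mode):
    """Return a one-line summary of a journaling mode."""
    info = MODES[mode]
    what = " and ".join(info["journaled"])
    return f"data={mode}: journals {what}; {info['ordering']}"
```

Calling `describe(DEFAULT_MODE)` summarises the default behaviour in one line.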

Ext4
 • Ext4 stands for fourth extended file system.
 • It was introduced in 2008.
 • Ext4 has been available since Linux kernel 2.6.19.
 • Supports huge individual file size and overall file system
   size.
 • Maximum individual file size can be from 16 GB to 16 TB
 • Overall maximum ext4 file system size is 1 EB
   (exabyte). 1 EB = 1024 PB (petabyte). 1 PB = 1024 TB
   (terabyte).
 • A directory can contain a maximum of 64,000
   subdirectories (as opposed to 32,000 in ext3)
 • You can also mount an existing ext3 fs as an ext4 fs
   (without having to upgrade it).
 • Several other new features are introduced in ext4:
   multiblock allocation, delayed allocation, journal
   checksum, fast fsck, etc. All you need to know is that
   these new features have improved the performance and
   reliability of the filesystem when compared to ext3.
 • In ext4, you also have the option of turning the
   journaling feature “off”.
Use the method we discussed earlier to identify whether
you have an ext2, ext3 or ext4 file system.
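One possible way to check the type from a script, which may differ from the method referred to above, is to read the mount table that Linux exposes in `/proc/mounts`. The parser below is a minimal sketch: the function name and the sample device names are hypothetical, and it assumes the usual layout of one mount per line with the device, mount point and file system type as the first three whitespace-separated fields.

```python
def filesystem_types(mounts_text):
    """Map mount point -> file system type from /proc/mounts-style text."""
    types = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            _device, mount_point, fs_type = fields[:3]
            types[mount_point] = fs_type
    return types


# Hypothetical sample in the same layout as /proc/mounts.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /mnt/usb ext2 rw 0 0
"""
```

On a Linux system, passing the contents of `/proc/mounts` to `filesystem_types` instead of `SAMPLE` would report the types of the file systems that are actually mounted.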

ReiserFS

ReiserFS is a general-purpose, journaled computer file system designed
and implemented by a team at Namesys led by Hans Reiser. ReiserFS
is currently supported on Linux (without quota support). Introduced in
version 2.4.1 of the Linux kernel, it was the first journaling file system to
be included in the standard kernel. ReiserFS is the default file system on
the Elive, Xandros, Linspire, GoboLinux, and Yoper Linux distributions.
ReiserFS was the default file system in Novell's SUSE Linux Enterprise
until Novell decided to move to ext3 on October 12, 2006 for future
releases.[3]
Namesys considered ReiserFS (now occasionally referred to as Reiser3)
stable and feature-complete and, with the exception of security updates
and critical bug fixes, ceased development on it to concentrate on its
successor, Reiser4. Namesys went out of business in 2008 after Hans
Reiser was charged with the murder of his wife. However, volunteers
continue to work on the open source project.
Journaled File System (JFS)

The JFS file system is a 64-bit file system created by IBM and ported to Linux in 1999. A
stable version was released in 2001. The first implementation was in Linux kernel
2.4.18.

JFS was originally released in 1990 with AIX version 3.1. It is sometimes referred to as
JFS1.
The file name size limit is 255 characters. To support large files and a larger partition
(more addressing values), the file system is 64-bit. The file and space limitations are as
follows:

File size: 4 PB
File system: 32 PB

To recover from an improper shutdown, the file system uses a journal to track file
metadata and performs a recovery when the system is restarted. Metadata can be
restored, so information is recovered instead of lost. This journal is where the file
system gets its name.

A B+ Tree is used to track directories/files and extent locations. The B+ Tree allows
searches to be performed much faster than with most other ways of storing this data.

The JFS file system allows for the use of Dynamic Inode Allocation. The inodes are 512
bytes each, with 32 inodes in a 16 KB extent (32 x 512 bytes = 16 KB). A file system
normally has a limited number of inodes, and when they are used up, files cannot be
added until files have been deleted and inodes freed. With Dynamic Inode Allocation,
more inodes can be created beyond the standard limit.

Extents are used to help prevent fragmentation. When Extents are used, a “reserved” free
space is kept after files. These contiguous free blocks are to allow the file to grow and not
cause fragmentation by placing parts of a file in non-contiguous spaces. When files are
spread out, or fragmented, system performance can be affected.

Allocating space on the file system is accomplished by using Extents. To manage the free
space on the file system, B+ Trees are used to track these spaces. Other file systems use a
bitmap to track free and used space. Two B+ Trees are used to track free space on a JFS
file system. One tree is used to store the starting block of the free extents, while the
second B+ Tree indexes the number of free extents for each starting block. To write a file,
the file system can check for a free space with enough contiguous extents, and then find
the starting block to begin writing.
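The idea of keeping two indexes over the same free space, one ordered by starting block and one ordered by size, can be sketched with sorted Python lists standing in for the B+ Trees. This is an illustration of the lookup pattern only, not JFS's actual on-disk structures; the class name `FreeSpace`, the `(start_block, length)` representation and the smallest-fit policy are all assumptions.

```python
import bisect


class FreeSpace:
    """Two sorted indexes over free runs of blocks (stand-ins for B+ Trees)."""

    def __init__(self, extents):
        extents = list(extents)  # (start_block, length) pairs
        self.by_start = sorted(extents)
        self.by_size = sorted((length, start) for start, length in extents)

    def allocate(self, needed):
        """Return the start block of a free run of at least `needed` blocks."""
        # Find the smallest run that is large enough.
        i = bisect.bisect_left(self.by_size, (needed, 0))
        if i == len(self.by_size):
            return None  # no contiguous run is big enough
        length, start = self.by_size.pop(i)
        self.by_start.remove((start, length))
        if length > needed:
            # Put the unused tail back into both indexes.
            tail_start, tail_length = start + needed, length - needed
            bisect.insort(self.by_start, (tail_start, tail_length))
            bisect.insort(self.by_size, (tail_length, tail_start))
        return start
```

The size index answers "is there a run big enough?", and the matching entry then gives the starting block where writing can begin.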

NOTE: Bitmaps are used to track used and unused space. These bitmaps are not images,
but a file where each bit represents an addressable block. Each bit is either on (1) or off
(0) to represent if it is used or free.
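A block bitmap of the kind described in the note can be sketched as follows. The class and method names are made up for this example, and a real file system would store the bitmap on disk rather than in a Python `bytearray`.

```python
class BlockBitmap:
    """One bit per addressable block: 1 = used, 0 = free."""

    def __init__(self, block_count):
        self.block_count = block_count
        self.bits = bytearray((block_count + 7) // 8)  # 8 blocks per byte

    def mark_used(self, block):
        self.bits[block // 8] |= 1 << (block % 8)

    def mark_free(self, block):
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def is_used(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def first_free(self):
        """Return the lowest-numbered free block, or None if all are used."""
        for block in range(self.block_count):
            if not self.is_used(block):
                return block
        return None
```

Sixteen blocks fit in just two bytes, which is why a bitmap is a compact way to track used and free space.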

To provide more storage, Compression can be used (on AIX only) to compress files so
more data can fit than without it being compressed.

JFS also allows for Concurrent I/O (CIO) for shared read and write access to a file.
Normally, when a file is read or written to, the file is locked to prevent other
processes from performing any I/O on it. With CIO, the lock is a shared lock, which means
that other I/O can be performed at the same time. Reads and writes are normally done in a
serial fashion: when applications send read or write requests, the requests are fulfilled
as they come, first come, first served, or first in, first out (FIFO). To improve the
performance of these reads and writes, Direct I/O is used.

JFS obtains faster throughput by using Allocation Groups. These are sections of a disk
volume where a read/write can occur simultaneously with other Allocation Groups in the
same disk volume. The process works better when the volume spans multiple disks.
Allocation Groups may store related files within the same group. The relation may
be that they are from the same directory or from the same application. When a file is
opened, the Allocation Group as a whole is locked to prevent the files within the group
from being allocated elsewhere. JFS also supports Sparse Files, in which regions of a
file that have never been written do not occupy disk blocks.
