
Zettabyte File System (ZFS)

When originally developed, ZFS stood for Zettabyte File System. The name is no longer treated
as an acronym and does not stand for anything.

ZFS is a 128-bit file system developed by Sun Microsystems and released in 2005 as part of
OpenSolaris.

The maximum size of a single file on ZFS is 2 to the power of 64 bytes, or about 18 exabytes
(16 EiB); the 128-bit design allows storage pools far larger than that. The maximum number of
files which ZFS supports in a directory is 2 to the power of 48, or 281,474,976,710,656 files.
A filename can have a maximum length of 255 characters.
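
These figures can be checked with a few lines of Python. The sketch below assumes the 2 to the
power of 64 limit is counted in bytes, and the variable names are only illustrative.

```python
# Size limits quoted in the text, expressed as plain integers.
max_bytes = 2 ** 64            # 2 to the power of 64, assumed to be bytes
max_dir_entries = 2 ** 48      # files per directory

EXABYTE = 10 ** 18             # decimal exabyte (EB)
EXBIBYTE = 2 ** 60             # binary exbibyte (EiB)

print(f"{max_bytes:,} bytes")
print(f"= {max_bytes / EXABYTE:.2f} EB (decimal)")
print(f"= {max_bytes // EXBIBYTE} EiB (binary)")
print(f"{max_dir_entries:,} files per directory")
```

The same quantity reads as roughly 18 in decimal units and exactly 16 in binary units, which is
why both numbers appear in descriptions of ZFS.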

ZFS supports deduplication.

ZFS supports RAID. It offers mirroring in the style of RAID-1, except that more than two disks
can be mirrored. Its other redundancy levels are not the standard RAID types but RAID-Z, in
levels 1, 2, and 3. RAID-Z1 uses single parity, so one disk can fail while the data remains
accessible; small blocks are mirrored across disks instead of using parity. RAID-Z2 uses double
parity, so up to two disks can fail and the data on the RAID volume remains accessible. RAID-Z3
uses triple parity, so the volume stays accessible with up to three failed disks. When a large
disk fails on a RAID system, it takes a long time to reconstruct the data from the parity. Disks
with capacities in the high terabyte range can take weeks to reconstruct. A higher RAID-Z level
keeps the volume protected against a further disk failure during that time, and the disks
remain accessible while the data is being repaired.
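
Reconstruction from parity can be illustrated with plain XOR parity, the idea behind
single-parity schemes. This is a generic sketch, not the RAID-Z on-disk layout; the helper
name xor_blocks and the sample data are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks, each imagined to live on its own disk.
data = [b"ZFS!", b"pool", b"vdev"]
parity = xor_blocks(data)          # imagined to live on a fourth disk

# Simulate losing the disk that held data[1]:
# XOR of the surviving blocks and the parity gives the missing block back.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Every surviving disk has to be read in full to rebuild the lost one, which is one reason
reconstruction time grows with disk size.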

For increased size, ZFS supports resizing. The file system can span multiple block devices:
multiple drives can be joined in a ZFS storage pool (zpool). Each device, such as a hard disk,
is presented as a virtual device (vdev). If one vdev fails, the whole zpool goes offline. To
prevent this, a zpool can be built with RAID so that it has the redundancy to remain online in
case of a failure. A zpool can hold up to 18,446,744,073,709,551,616 (2 to the power of 64)
vdevs, and a system can hold the same number of zpools.
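
The availability rule can be modeled in a few lines. The sketch below is a toy model under two
assumptions: a pool needs every one of its vdevs, and a mirrored vdev survives while at least
one member disk is healthy. The class names Vdev and Zpool are illustrative, not a ZFS API.

```python
from dataclasses import dataclass, field

@dataclass
class Vdev:
    """Hypothetical model: a vdev built from one or more mirrored disks."""
    disks_ok: list  # one boolean per member disk, True = healthy

    def online(self) -> bool:
        # A mirror survives as long as any one member disk is healthy.
        return any(self.disks_ok)

@dataclass
class Zpool:
    vdevs: list = field(default_factory=list)

    def online(self) -> bool:
        # The pool needs every vdev; losing one takes the pool offline.
        return all(v.online() for v in self.vdevs)

# Two single-disk vdevs, one of which has failed.
plain = Zpool([Vdev([True]), Vdev([False])])
# Two mirrored vdevs, one of which has lost a single member disk.
mirrored = Zpool([Vdev([True, False]), Vdev([True, True])])

print(plain.online(), mirrored.online())
```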

Note that when a volume uses striping (RAID 0), it can be enlarged by resizing. When a new
drive is added to the volume, the stripes are dynamically resized so that the new drive is
included in the RAID set.
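
As a rough picture of striping across a growing set of drives, the sketch below hands out
block numbers in round-robin order. This is a deliberate simplification: the function
place_blocks and the device names are hypothetical, and a real allocator weighs more than
simple rotation.

```python
from itertools import cycle

def place_blocks(n_blocks, devices):
    """Assign block numbers to devices in round-robin order (a simplification)."""
    layout = {d: [] for d in devices}
    for block, dev in zip(range(n_blocks), cycle(devices)):
        layout[dev].append(block)
    return layout

# Six blocks striped over two drives, then over three drives.
before = place_blocks(6, ["disk0", "disk1"])
after = place_blocks(6, ["disk0", "disk1", "disk2"])

print(before)
print(after)
```

With the third drive in the set, each drive holds two of the six blocks instead of three.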

Snapshots provide a point-in-time image that can readily be used for making a backup without
requiring files to be locked. In some cases a file can also be skipped if it is open and being
modified at the time of the backup. Snapshots are read-only; where a writable copy is needed, a
clone can be used.
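
The relationship between a live file system, a read-only snapshot, and a writable clone can be
sketched with a toy model. The Dataset class and its methods are hypothetical and only mimic
the behavior; they are not how ZFS stores snapshots.

```python
from types import MappingProxyType

class Dataset:
    """Toy file system: a mapping from file name to immutable contents."""

    def __init__(self, files=None):
        self.files = dict(files or {})

    def snapshot(self):
        # Copy only the name -> contents references, then freeze the mapping.
        return MappingProxyType(dict(self.files))

    @staticmethod
    def clone(snapshot):
        # A clone starts from a snapshot but accepts writes of its own.
        return Dataset(snapshot)

live = Dataset({"a.txt": b"v1"})
snap = live.snapshot()
live.files["a.txt"] = b"v2"          # later write to the live file system
copy = Dataset.clone(snap)
copy.files["b.txt"] = b"new"         # this write goes to the clone only

print(snap["a.txt"], live.files["a.txt"], sorted(copy.files))
```

A backup taken from snap sees a consistent view even while the live file system keeps changing.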

A zpool can support quotas that limit the space available to a user or group. Without quotas,
certain users or groups could fill the drives to full capacity.

To compensate for drive speed, ZFS uses a caching algorithm called ARC (Adaptive Replacement
Cache). ZFS keeps frequently accessed data in RAM, which is faster than a hard disk. When data
is no longer accessed as often, it is dropped from the RAM cache. If the system has little RAM,
data is not cached and is read from disk only. ZFS works on low-memory systems, but works
better with larger amounts of RAM.
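
The effect of keeping recently used data in RAM can be shown with a much simpler policy than
ARC. The sketch below is a plain least-recently-used (LRU) cache, used here only as a stand-in;
the LRUCache class and the disk dictionary are hypothetical.

```python
from collections import OrderedDict

class LRUCache:
    """Plain least-recently-used cache; a stand-in, not the ARC algorithm."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key, read_from_disk):
        if key in self.entries:
            self.entries.move_to_end(key)      # mark as recently used
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        value = read_from_disk(key)            # slow path: go to "disk"
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used
        return value

disk = {n: f"block-{n}" for n in range(5)}     # pretend slow storage
cache = LRUCache(capacity=2)
for n in [0, 1, 0, 2, 0, 1]:
    cache.read(n, disk.__getitem__)

print(cache.hits, cache.misses)
```

A larger capacity turns more of the misses into hits, which mirrors the point that ZFS works
better with more RAM.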

Every pointer to a block carries a 256-bit checksum of that block to provide data integrity.
Data is written using copy-on-write (COW): data is written to new blocks before the pointer is
changed to the new block location. Once that is done, the old blocks are marked as unused.
Blocks are never overwritten in place.
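
The write sequence can be sketched as a toy store that follows the same three steps. SHA-256
is used here simply as a convenient 256-bit checksum, and the CowStore class is hypothetical;
it tracks a single pointer rather than a tree of blocks.

```python
import hashlib

class CowStore:
    """Toy copy-on-write store: blocks are appended, never overwritten."""

    def __init__(self):
        self.blocks = []        # append-only "disk"
        self.pointer = None     # (block index, checksum) of the live data
        self.free = set()       # indexes of blocks marked unused

    def write(self, data: bytes):
        self.blocks.append(data)                          # 1. write a new block
        old = self.pointer
        checksum = hashlib.sha256(data).hexdigest()
        self.pointer = (len(self.blocks) - 1, checksum)   # 2. switch the pointer
        if old is not None:
            self.free.add(old[0])                         # 3. mark old block unused

    def read(self) -> bytes:
        index, checksum = self.pointer
        data = self.blocks[index]
        if hashlib.sha256(data).hexdigest() != checksum:
            raise IOError("checksum mismatch")            # corruption detected
        return data

store = CowStore()
store.write(b"version 1")
store.write(b"version 2")
print(store.read(), store.free)
```

Because the checksum lives with the pointer rather than with the block, a block that is
damaged on disk no longer matches and the read fails instead of returning bad data.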

Blocks can be of variable size, up to a maximum of 1,024 KB. When compression is enabled,
variable block sizes allow a smaller block to be used when a file shrinks.

ZFS supports compression, which reduces the size of data before it is stored on disk.
Compression saves space on the drives and can produce faster reads: since the data is
compressed, there is less data to read from the disk. Write times can also be reduced, although
there is a little overhead to compress before a write and decompress after a read. Compression
and decompression are performed by the CPU. The available compression methods include LZJB
and gzip.
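
The space saving is easy to observe with Python's zlib module, which implements the DEFLATE
algorithm that gzip is built on. The sample data below is invented and highly repetitive, so
it compresses far better than typical files would.

```python
import zlib

original = b"ZFS stores blocks. " * 200      # repetitive sample data, 3800 bytes
compressed = zlib.compress(original, level=6)
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print("round trip intact:", restored == original)
```

Fewer bytes reach the disk on a write and fewer come back on a read, at the cost of the CPU
time spent in compress and decompress.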

To help reduce the space taken up by data, ZFS supports deduplication. Deduplication removes
repeated data so that it is stored only once, making it smaller on disk. Like caching data in
RAM, deduplication requires RAM to do its work. The suggested amount is one to two gigabytes of
RAM for every terabyte of drive space on the ZFS volume.
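
One common way to deduplicate is to key each block by its checksum and store a block only the
first time that checksum is seen. The sketch below assumes that approach; the function names
are hypothetical, and the in-memory table hints at why the process needs RAM.

```python
import hashlib

def dedup_store(blocks):
    """Store each distinct block once; return (table, references)."""
    table = {}         # checksum -> block data, held in RAM
    references = []    # one checksum per logical block, in order
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        table.setdefault(digest, block)    # keep only the first copy
        references.append(digest)
    return table, references

def dedup_ram_estimate_gb(pool_tb):
    """Rule of thumb from the text: 1 to 2 GB of RAM per TB of storage."""
    return (1 * pool_tb, 2 * pool_tb)

blocks = [b"AAAA", b"BBBB", b"AAAA", b"AAAA", b"CCCC"]
table, references = dedup_store(blocks)
restored = [table[d] for d in references]

print(len(blocks), "logical blocks,", len(table), "stored")
print("RAM for an 8 TB volume (GB):", dedup_ram_estimate_gb(8))
```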

ZFS supports encryption, which occurs in the I/O pipeline. Encryption transforms the data with
an encryption key so that it is unreadable even if the drive is removed and placed into another
system.

Before a write, the order of precedence is:

1. Compression: data is compressed (if enabled)
2. Encryption: data is encrypted (if enabled)
3. Checksum: a checksum is created for the data
4. Deduplication: duplicated data sections are removed

When data is read from the disk and sent back through the I/O pipeline, these steps are
performed in reverse.
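
The four write steps and their reversal on read can be put together in one small sketch. It
is a toy under stated assumptions: zlib stands in for the compressor, SHA-256 for the
checksum, and toy_cipher is an XOR placeholder that is NOT real encryption. The Pipeline
class is hypothetical.

```python
import hashlib
import zlib
from itertools import cycle

def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR placeholder standing in for real encryption. NOT secure."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

class Pipeline:
    def __init__(self, key: bytes):
        self.key = key
        self.stored = {}   # checksum -> block on "disk"

    def write(self, data: bytes) -> str:
        block = zlib.compress(data)                    # 1. compression
        block = toy_cipher(block, self.key)            # 2. encryption
        checksum = hashlib.sha256(block).hexdigest()   # 3. checksum
        self.stored.setdefault(checksum, block)        # 4. deduplication
        return checksum

    def read(self, checksum: str) -> bytes:
        block = self.stored[checksum]                  # find the stored block
        if hashlib.sha256(block).hexdigest() != checksum:
            raise IOError("checksum mismatch")         # verify the checksum
        block = toy_cipher(block, self.key)            # decrypt (XOR undoes XOR)
        return zlib.decompress(block)                  # decompress

pipe = Pipeline(key=b"k3y")
first = pipe.write(b"hello" * 100)
second = pipe.write(b"hello" * 100)    # identical data, written a second time

print(first == second, len(pipe.stored))
```

Because the checksum is taken after compression and encryption, two writes of identical data
yield the same checksum, and the deduplication step stores the block only once.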
