Analyzing Real Time Behaviour of Flash Memories
Real-Time Group
Diploma Thesis
Flash Memories have an increasing importance for the construction of mechanically robust embedded
computer systems and consumer electronics. For consumer applications, at least five different solutions
exist: Compact Flash (CF), Sony Memory Stick (MS), Secure Digital (SD), Multimedia Card (MMC)
and the xD-Picture Card. Of every type, different generations with different technical parameters (access
speed, capacity) exist.
Usually, the embedded controller of the medium is responsible for wear-levelling and error correction.
If no controller exists, the file system must take care of these aspects. Therefore, a number of specialized
file systems have been developed, among them the Journalling Flash File System (JFFS2) and the Yet
Another Flash File System (YAFFS2).
The aim of this thesis is to develop methods to characterize the timing of access operations for flash
memories of different types and with different file systems. Among other things, the work should present
a detailed analysis of the following aspects:
The term real-time signifies that not only an average value has to be obtained for every parameter, but
that the worst-case timing is of interest as well.
Abstract
Flash memories are used as the main storage in many portable consumer electronic devices because they
are more robust than hard drives. This document gives an overview of existing consumer flash memory
technologies which are mostly removable flash memory cards. It discusses to which degree consumer
flash devices are suitable for real-time systems and provides a detailed timing analysis of some consumer
flash devices. Further, it describes methods to analyze mount times, access performance and timing
predictability of flash memories. Important factors which influence access timings of flash memories
are pointed out and different flash file systems are evaluated with regard to their suitability for real-time
systems. Some remaining problems of existing flash file system implementations concerning real-time
use are discussed.
Contents
1 Introduction
2 Basics
2.1 Flash Memory Technologies
2.2 Media Types
2.3 Interfaces
2.4 Flash Management
2.5 File Systems for Flash
8 Further Work
8.1 Collect more data
8.2 Worst-case analysis of flash file systems
8.3 Real-time support in flash file systems
9 Conclusion
A Glossary
1 Introduction
Over the last decade, flash memories have become cheaper and steadily more important.
They enable the user to quickly transfer data from one device to another without the need for a
computer network. They are compatible with a wide range of platforms because most of them support
the FAT/FAT32 file system which has become a standard for removable flash devices.
Flash based USB drives have almost completely replaced the conventional floppy disk because they
are small in size, easy to handle and have a much higher storage capacity.
Flash technology is superior to conventional random access memory because it is non-volatile. This
means that none of the already stored data is lost when power is turned off or a power failure occurs. In
contrast to hard disk drives, flash storage is silent, robust against shock, and the raw device is accessible
shortly after power is turned on because there are no moving parts inside of a flash memory. A hard disk
drive needs several seconds after power-on until its disks have spun up and its data is accessible [1].
With flash memory, by contrast, no disks have to be spun up, no head has to be moved and there is no need
to wait for the disk to rotate until the data appears under the head, because flash memory is implemented
using transistor cells. This makes flash an interesting technology for building robust and fault-tolerant
embedded systems which need a quick startup. However, the properties of flash devices require further
investigation, especially with regard to their suitability for real-time systems.
Chapter 2 gives an overview of the properties of common removable and non-removable flash tech-
nologies. It also describes interfaces which can be used to connect flash memory devices to a host.
Chapter 3 presents a collection of tools which can be used to analyse flash devices and file systems. Fur-
ther, it points to previous benchmarks which analysed the average read and write performance of many
different flash devices and card readers. Chapter 4 describes the usage and functionality of a tool which
was developed for this work to perform specific tests on flash cards. Chapter 5 describes the setup and
discusses the results of the benchmarks which were performed on different flash cards. It highlights the
most relevant tests and anomalies. In order to store files on flash devices, a file system is required, and
when it comes to flash file systems, the mount time is often an issue. So the mount times of the VFAT
file system, which is normally used on removable flash devices, were also measured. The results of these
tests are described in chapter 6. This chapter is directly followed by chapter 7 which explains the main
problems of conventional flash file systems and why they should not be used in hard real-time systems.
Chapter 7 also highlights issues which should be considered in the design of new real-time file systems
for flash devices. Chapter 8 summarizes work which still needs to be done and highlights topics for
further research.
2 Basics
In order to analyse the behavior and anomalies of flash memories, one has to understand the technology
behind them. The following sections give an overview of basic flash technologies and common types of
flash memories.
2.1.2 NAND Flash
The NAND flash technology is used in most current devices and connects the transistor memory cells in
series. This makes a higher data density and lower cost per chip possible. A deficiency of this approach
is that direct bit access is not possible. Any cell of a series which shall not be modified has to be
masked with an offset voltage. Therefore, the data is written to an internal register which takes care of
addressing the flash cell. Afterwards, a write command on the command bus of the flash device initiates
a data transfer from the register to the flash cell [2]. Because of electron leakage in neighbour cells
when writing a certain cell, NAND flash only allows a maximum number of about 10 write operations
to a single erase unit until the whole erase unit must be erased. Erase units in NAND flash are usually
divided into pages of 512 bytes. Each of these pages allows storing 16 extra bytes of out-of-band (OOB)
information, like error correction codes (ECC), erase counters, and other meta-information [3]. NAND
flash always requires ECC data since this technology is prone to page errors by design.
There are two different types of NAND flash which heavily affect the speed and capacity of flash
devices. A single level cell (SLC) can only store a single bit in its floating gate, whereas the multi level cell
(MLC) technology allows storing several bits in one cell. This can be accomplished either by using
several floating gates or different levels of amperage to encode different states [2]. The x4 NAND flash
technology [4] from m-systems allows storing 4 bits in each flash cell.
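The relation between the number of bits stored per cell and the number of states a cell has to distinguish can be illustrated with a short sketch. The function name is chosen for illustration only and is not taken from any of the cited sources.

```python
def states_per_cell(bits_per_cell: int) -> int:
    """Number of distinguishable states a cell needs to store the given bits."""
    if bits_per_cell < 1:
        raise ValueError("a cell stores at least one bit")
    return 2 ** bits_per_cell


for name, bits in [("SLC", 1), ("2-bit MLC", 2), ("x4 NAND", 4)]:
    print(f"{name}: {bits} bit(s) per cell, {states_per_cell(bits)} states")
```

Each additional bit doubles the number of states that must be told apart when reading a cell, which gives an intuition for why the cell type affects both the speed and the capacity of a device.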
be implemented in external hardware or software. Originally, they were called Solid State Floppy Disk,
being only a bit smaller than usual floppy disks. Later, they were renamed to Smart Media. With the help
of an adapter, the 22-pad-connector of Smart Media devices can be connected to a floppy cable. There
are also adapters which connect them to a PCMCIA slot or a Compact Flash Type II slot. A memory
card reader can connect them via a controller to the Universal Serial Bus or IEEE-1394 (FireWire) port.
A deficiency of the Smart Media technology is that its maximum capacity is limited to 128 MiB, which
resulted in its replacement by the xD-Picture Card.
• Application layer takes care of file contents
• File management layer handles the FAT file system and defines the logical device structure
• Protocol layer provides a serial or parallel interface to the stick and handles incoming commands
• Physical layer specifies physical and electrical properties of the Memory Stick bus
Only the lower three layers are covered by the specification. It is not possible to bypass the protocol layer
in order to access flash memory directly. Memory Stick Pro devices are always controlled through a command
interface provided by a controller. There are three types of commands:
• Transfer Protocol Commands (TPC) are used to control the memory stick and access the data
buffer and registers of the Memory Stick Pro.
• Memory Access Commands are used to access the flash memory and allow READ/WRITE access
to the user or information block area. This set of commands also includes an ERASE command
which deletes data from the current address in the user area and a STOP command which terminates
any of the above operations.
• Function Commands provide special functions to control the device. The FORMAT command
formats the whole device and the SLEEP command allows the host controller to put the device
into a low power consumption state.
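[Figure: block diagram of a Memory Stick. Interface pins: VSS, VCC, BS, SDIO/DATA0, DATA1, DATA2, DATA3, INS, SCLK. Blocks of the Memory Stick interface controller: serial interface, parallel interface, register, data buffer, OSC. Blocks of the flash memory controller: memory interface, flash memory interface sequencer, ECC. The controller is connected to the flash memory.]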
Both Memory Stick and Memory Stick Pro are available in a smaller form factor of 31x20x1.6mm.
These versions have a suffix "Duo" and are called Memory Stick Duo (MSD) and Memory Stick Pro Duo
(MSPD). Both Duo variants are usually shipped with simple plastic adapters to fit them mechanically
into normal sized Memory Stick slots because their electrical interface is identical to the interface of
their larger counterparts.
Memory Stick Micro was introduced at the beginning of 2006. It also uses the Memory Stick Pro
technology, but with a size of only 12.5x15x1.2mm it is even smaller than Memory Stick Pro Duo. Like
Memory Stick Pro, it allows high capacities of up to 32 GB as well as high transfer rates of up to 20 MB/s.
error correction on newly erased units in order to verify if the data was written correctly. The internal
controller also implements an error management algorithm to correct and replace bits or even whole
sectors by reserved ones, which at least applies to the Samsung MMC device MC56U032NCFA [12].
MMC cards may have an automatic sleep mode to reduce power consumption when no command was
received for 5 milliseconds. This is why MMC cards are in sleep mode most of the time when they are
not being accessed. The sleep mode is transparent to the host, and the device automatically wakes up
when any command is received. Even though the duration between sleep mode and ready state might be
interesting to real-time developers, it is not mentioned in Samsung's data sheet [12].
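Since the wake-up latency is not documented, it has to be measured. The following is a minimal sketch of such a measurement, not the tool developed for this work. It assumes a hypothetical callable `read_once` that performs one uncached access to the card, for example an unbuffered read of a single sector from the raw device. If the host caches the access, the result is meaningless.

```python
import time
from typing import Callable, Dict, List


def measure_wakeup_penalty(read_once: Callable[[], None],
                           idle_seconds: float = 0.05,
                           samples: int = 20) -> Dict[str, float]:
    """Compare back-to-back access times with access times after an idle gap."""

    def timed() -> float:
        start = time.perf_counter()
        read_once()
        return time.perf_counter() - start

    read_once()  # warm-up access so that the card is awake for the first sample
    busy: List[float] = [timed() for _ in range(samples)]

    idle: List[float] = []
    for _ in range(samples):
        # Assumption: the gap is much longer than the 5 ms sleep timeout.
        time.sleep(idle_seconds)
        idle.append(timed())

    return {
        "busy_max": max(busy),
        "idle_max": max(idle),
        "penalty_estimate": max(idle) - max(busy),
    }
```

The sketch reports maxima rather than averages because a real-time developer is interested in the worst observed case. A measured maximum is only a lower bound for the true worst case and does not replace a guarantee from the vendor.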
The MMC specification [11] recommends that external MMC adapters should implement caching in
order to increase throughput by speeding up random access or read-modify-write operations, where a
cache can be very efficient on frequent updates. This recommendation indicates that MMC cards most
probably contain only a small cache or no cache at all.
2.2.6 Compact Flash (CF)
Compact Flash Cards were developed by SanDisk in 1994 as one of the first flash devices. They have
become a de facto standard for professional digital photography because of their high capacity, high speed
and strong mechanical robustness. Nevertheless, they are likely to be replaced by Secure Digital cards
because SD cards are smaller and are becoming cheaper than CF cards. The interface of Compact Flash cards
is compatible with ATA and consists of 50 pins, of which 40 form an ATA connector and 10 are reserved for
alternative operating modes and the power supply.
There are two form factors for compact flash cards: Type I with dimensions of 42.8x36.4x3.3mm is
usually used for flash memories, because most digital cameras merely contain a thin CF slot. The thicker
type II with the dimensions 42.8x36.4x5.0mm is common for devices which need more space. Exam-
ples are devices like WLAN, ethernet and bluetooth adapters or barcode scanners which are connected
through the Compact Flash In/Out (CFIO) interface. Hard disks called Microdrives also require a Compact
Flash type II slot since they would not fit in the smaller type I slot. Microdrives do not contain any
flash memory, which is why they are only of minor interest in this document.
Four Compact Flash interface standards have been developed over time, and all of these standards are
fully compatible with each other. The CF 1.0 standard allows programmed I/O (PIO) mode 2 with a maximum
transfer rate of 8.3 MByte/s, the CF 2.0 standard allows PIO 4 with a transfer rate of 16.6 MByte/s,
and CF 3.0 allows using the UDMA-66 mode with a maximum theoretical transfer rate of 66 MByte/s. The
current CF 4.0 standard [15] supports UDMA-133 with a maximum transfer rate of 133 MByte/s, which
even exceeds the USB high-speed bandwidth of 480 MBit/s (60 MByte/s). Therefore, CF 4.0 based SanDisk Extreme
IV cards can only reach their full performance when used in a card reader which allows a maximum
theoretical bandwidth of at least 100 MByte/s, e. g. a card reader supporting IEEE 1394b (FireWire-800
or FireWire-1600) [2].
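The comparison between the Compact Flash transfer modes and USB requires a unit conversion, which the following sketch makes explicit. It uses the USB 2.0 signalling rate of 480 MBit/s given in section 2.3.1 and ignores protocol overhead, so the real USB payload rate is lower. All names in the code are illustrative.

```python
USB2_HIGH_SPEED_MBIT = 480  # USB 2.0 signalling rate, see section 2.3.1

# Theoretical maximum rates of the CF interface standards in MByte/s
CF_MODES_MBYTE = {
    "CF 1.0 (PIO 2)": 8.3,
    "CF 2.0 (PIO 4)": 16.6,
    "CF 3.0 (UDMA-66)": 66.0,
    "CF 4.0 (UDMA-133)": 133.0,
}


def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert MBit/s to MByte/s (8 bits per byte, protocol overhead ignored)."""
    return mbit_per_s / 8.0


def modes_exceeding(bus_mbyte_per_s: float) -> list:
    """Return the CF modes whose theoretical rate exceeds the given bus rate."""
    return [name for name, rate in CF_MODES_MBYTE.items()
            if rate > bus_mbyte_per_s]


usb2 = mbit_to_mbyte(USB2_HIGH_SPEED_MBIT)
print(f"USB 2.0 high-speed: {usb2:.1f} MByte/s")
print("CF modes that USB 2.0 cannot saturate:", modes_exceeding(usb2))
```

These are interface limits only. Whether a particular card reaches them depends on its flash chips and controller.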
device class (MSC) which means that the drive exports a linear device which hides the physical block
structure, number of flash chips, erasure of blocks, flash addressing and wear levelling from the host
system. This indirection allows using the same generic mass storage drivers for all USB drives but
makes it more complicated to analyze the behavior of USB flash drives.
Performance of USB drives may be influenced by the USB chipset on the mainboard, but for modern
mainboards this is rarely the case, according to a c’t article about USB flash drives [20]. The USB
protocol overhead may also slow down the transfer. An important parameter for USB keydrives is the
number of flash chips used because this determines whether the drive can be run in dual-channel mode
with interleaved memory access (which can almost double the effective transfer rate) or needs to be run
in single-channel mode. Another bottleneck for USB keydrives is the integrated mass storage controller
because vendors tend to keep their devices as cheap as possible and therefore use cheap controllers.
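The effect of interleaved access on the transfer rate can be approximated with an idealised model. The sketch below assumes that consecutive pages are distributed round-robin over the flash chips and that all channels work fully in parallel. Controller and USB overhead are ignored, and the page time is a hypothetical parameter rather than a measured value.

```python
def channel_for_page(page_index: int, channels: int) -> int:
    """Round-robin interleaving: consecutive pages go to alternating channels."""
    return page_index % channels


def transfer_time(pages: int, page_time: float, channels: int) -> float:
    """Idealised transfer time when all channels operate in parallel."""
    pages_on_busiest_channel = -(-pages // channels)  # ceiling division
    return pages_on_busiest_channel * page_time


single = transfer_time(pages=1000, page_time=0.0002, channels=1)
dual = transfer_time(pages=1000, page_time=0.0002, channels=2)
print(f"single-channel: {single:.3f} s, dual-channel: {dual:.3f} s")
```

In this model the dual-channel transfer takes half the time. Real drives stay below this factor, which is consistent with the statement that interleaving can almost double the effective transfer rate.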
2.3 Interfaces
Card readers and flash drives may be connected to the host system in various ways. The following
sections briefly summarize the most common interfaces and their capabilities.
2.3.1 USB - Universal Serial Bus
A popular interface which can be found in all modern PCs and some embedded devices is the universal
serial bus interface. The latest standard, USB 2.0, supports transfer rates up to 480 MBit/s. The actual
transfer rate of a transmission is influenced by transfer mode, bus transaction delays and USB protocol
overhead. The calculation of worst case transaction delays and protocol overhead can be obtained from
the USB specification [21]. Chapter 5 of the specification describes the four different transfer modes supported
by USB: Control transfers are bidirectional, best-effort transfers where a packet has a maximum
payload size of 64 bytes and every frame is acknowledged. These are intended to detect and configure
USB devices but not to transmit real payload data. Isochronous transfers use a synchronized clock to
deliver data in a timely manner in preference to correctness. This mode is best suited for multimedia
streams, which require data to be either delivered in time or discarded if it contained one or more er-
rors. Interrupt transfers provide a guaranteed maximum service period for transmission attempts, which
is useful for real-time applications. After the desired period, the transmission is either finished or has
failed, and it may be retried in the next period, to which the same guarantee applies. The last transfer
mode is called bulk transfer. This mode does not provide any timing guarantee, but tries to take advan-
tage of the highest bandwidth available. So the recommendation for real-time programmers is to make
use of either isochronous or interrupt transfer, but not bulk transfer.
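The properties of the four transfer modes can be summarised in a small table from which the recommendation follows directly. The sketch below encodes the description above. The field names are illustrative, and the table is a simplification of the USB specification, not a replacement for it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TransferMode:
    name: str
    timing_guarantee: bool   # does the mode bound the service period?
    retries_on_error: bool   # is faulty data retransmitted?


MODES = [
    TransferMode("control", timing_guarantee=False, retries_on_error=True),
    TransferMode("isochronous", timing_guarantee=True, retries_on_error=False),
    TransferMode("interrupt", timing_guarantee=True, retries_on_error=True),
    TransferMode("bulk", timing_guarantee=False, retries_on_error=True),
]


def modes_for_realtime() -> List[str]:
    """Names of the transfer modes that provide a timing guarantee."""
    return [mode.name for mode in MODES if mode.timing_guarantee]


print(modes_for_realtime())
```

The table also shows the trade-off between the two suitable modes: isochronous transfers sacrifice correctness for timeliness, whereas interrupt transfers retry a failed transmission in a later period.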
The maximum theoretical transfer rate of USB 2.0 is high enough to access most of the currently
available flash memories at their maximum transfer rate. USB 2.0 can still be a bottleneck for fast flash
memory devices which comply with the Compact Flash 4.0 standard as explained in section 2.2.6. So
there is a need for alternative interfaces to connect high-speed card readers to a host system.
2.3.3 ATA
Compact Flash cards can be connected to an ATA port using a simple Compact Flash to ATA adapter and
communicate by using their True IDE I/O Transfer Function as specified in chapter 4.7 of the Compact
Flash specification [15]. It is important to note that the Compact Flash interface is hot-pluggable, but the
ATA interface is not. A Compact Flash card must therefore not be unplugged from a CF/ATA adapter while it is
connected to the ATA port of a running PC; otherwise the system might lock up.
2.3.4 Memory Technology Devices
Some flash memories, especially on-board devices and devices connected via serial peripheral interface,
can be directly accessed using an appropriate hardware driver. Such a driver can be either generic code
like the configurable MTD NAND driver which works out of the box, or a vendor customized driver
which can perform better on the specific flash chip [25].
Since neither block nor character devices differentiate between read, write and erase operations, Linux
supports a special Memory Technology Device (MTD), which provides a function set to access all fea-
tures of flash memories. The MTD subsystem in the operating system provides generic access to the
out-of-band and user area of flash memories while limiting the hardware specific driver part to the smal-
lest amount possible and keeping driver development simple. The MTD interface allows application
developers to create special flash file systems and block mapping layers like UBI (cf. section 2.4.3).
Software based on the MTD subsystem is independent from specific hardware while still being able to
access flash specific functions. There is a utility mtd_debug which uses the MTD subsystem to display
detailed information about the underlying memory technology device, e. g. the type of flash (NAND or
NOR) and the erase unit size. This information is gathered from the flash chip using the Common Flash
Memory Interface (CFI) [26].
2.3.5 Summary
Timings can only be guaranteed if all layers from the physical flash memory device layer to the applica-
tion layer specify and guarantee worst case timing constraints. Therefore, the data bus and its arbitration
and transfer modes need to be carefully selected in order to guarantee the bandwidth and latency that a
real-time application or memory device requires. Hard real-time systems additionally require real-time
drivers for the chosen interface which are able to guarantee maximum latencies.
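The argument can be expressed as a simple additive model: if an access passes through all layers in sequence, the end-to-end worst case is bounded by the sum of the per-layer worst cases, and a single layer without a bound removes the guarantee for the whole chain. The sketch below assumes such a serial chain. The layer names and latency values are hypothetical and only serve as an example.

```python
import math
from typing import Mapping


def end_to_end_worst_case(layer_bounds_ms: Mapping[str, float]) -> float:
    """Sum of per-layer worst-case latencies; math.inf marks an unbounded layer."""
    return sum(layer_bounds_ms.values())


# Hypothetical worst-case bounds in milliseconds
layers = {
    "application": 0.5,
    "file system": 4.0,
    "driver": 1.0,
    "bus transfer": 2.0,
    "flash device": math.inf,  # no worst-case timing specified by the vendor
}

bound = end_to_end_worst_case(layers)
if math.isinf(bound):
    print("no end-to-end guarantee: at least one layer is unbounded")
else:
    print(f"end-to-end worst case: {bound:.1f} ms")
```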
Figure 2.2 shows a model of several possible flash architectures in which Linux is used as the operating
system. It illustrates the involved layers and their purposes in each architecture.
[Figure 2.2: Layer model of flash architectures under Linux. Architectures shown: SM / xD card, flash card, USB drive, MTD and UBI. Layers shown: driver, card reader protocol, firmware, controller, flash management and flash hardware (NAND flash; NAND, NOR or OneNAND for MTD and UBI).]
Flash Management: Bad Block Handling + Wear Levelling + Block Mapping Algorithm (surrounded by black line)
UBI: Logical Volume Manager for Embedded Flash
MSC: Mass Storage Device Class (over USB or Firewire IEEE 1394)
data structures to increase flash lifetime and does not deal with real-time aspects.
Most vendors implement wear levelling techniques on the internal microcontroller of their flash memories,
but keep source code and algorithms secret. Smart Media and xD-Picture Cards do not provide
any wear levelling because they do not include a controller. The simplest approach to implement wear
levelling in software is to emulate a block device using a flash translation layer (FTL) as described
in section 2.4.2. Another approach is to develop specialized flash file systems which handle the wear
levelling problem and take the erase block size and other specifics of the flash memory into account, but their
use only makes sense when no additional wear levelling mechanisms like flash translation layers or
hardware-based wear levelling are in place. Stacking several wear levelling layers would only decrease
performance, so wear levelling should be implemented in exactly one layer.
DiskOnChip devices like SanDisk mDOC H3 implement the TrueFFS file system as firmware on the
integrated microcontroller. Even though SanDisk advertises all their DiskOnChip devices with buzz-
words like wear levelling, the mDOC H1 only implements the TrueFFS file system as a software driver
on the host system, in order to keep the hardware price low.
access to the device requires erasing a whole flash erase unit. If the data units written are smaller than
one erase unit of the flash device, and the file system is not aware of the underlying hardware, it uses
the hardware in a very inefficient way. The FTL is now deprecated because of the above problems
and patent issues. It is not further developed and only supported by the Linux kernel for backward
compatibility [30]. Modern approaches use log-structured flash file systems instead, which directly take
care of wear levelling and other specifics of the underlying flash hardware. Some of these file systems
will be presented in section 2.5.
2.5.1 FAT/FAT32
The most common file system used on removable flash memories is the File Allocation Table (FAT or
FAT32) file system which is supported by many platforms and provides long filename support. For
compatibility reasons, most digital cameras and operating systems can read and write this file system
and it is used on many removable flash devices, even though it does not perform any wear levelling. This
is not an issue because most of today’s removable flash devices do wear levelling in firmware and can be
seen as a block-oriented mass storage device to the outside world.
FAT and FAT32 do not provide any access restrictions, therefore anyone with access to a FAT file
system can create, alter or delete any file on it. There are some security mechanisms defined in the flash
memory standards which work on the physical device layer, but these approaches have not become popu-
lar yet. There are also some memory devices (e. g. USB keydrives) which are bundled with software that
adds security on the file system layer by creating an encrypted and an unencrypted partition. This vendor-
specific software often runs only on a single operating system which nullifies the wide compatibility of
the FAT file system and makes these partitions unreadable on other operating systems or devices.
With regard to compatibility, FAT is the best choice for most modern flash memories which do internal
wear levelling, but when using the FAT file system on a flash memory device without any wear levelling
mechanism, the following problem occurs: The file system driver frequently changes the file allocation
table at the beginning of the device. This happens at least each time a file is created, resized or deleted.
If there is no wear levelling layer below the FAT file system, the same flash block is likely to be updated
many times and wear out quickly, which renders the flash device unusable very soon. This is why
FAT/FAT32 is often combined with software or firmware to provide a simple form of wear levelling
and equally distribute write operations across the physical blocks of the device. Any file system can then
work the same way as on other block devices because it sees the flash device as a logical block device
which indirectly accesses the real physical device through the wear levelling component.
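The wear problem can be made concrete with a toy simulation. The sketch below counts erase operations per physical block when one logical block, standing in for the file allocation table, is rewritten repeatedly. The round-robin remapping is a deliberately simple stand-in for a wear levelling layer and does not correspond to the algorithm of any particular device, which would typically also track erase counters and bad blocks.

```python
from collections import Counter


def simulate_fat_updates(updates: int, physical_blocks: int,
                         wear_levelling: bool) -> Counter:
    """Count erases per physical block for repeated updates of one logical block."""
    erases: Counter = Counter()
    next_free = 0
    for _ in range(updates):
        if wear_levelling:
            # Remap the logical block to the next physical block in turn.
            target = next_free
            next_free = (next_free + 1) % physical_blocks
        else:
            # The table always lives in the same physical block.
            target = 0
        erases[target] += 1
    return erases


plain = simulate_fat_updates(10_000, physical_blocks=100, wear_levelling=False)
levelled = simulate_fat_updates(10_000, physical_blocks=100, wear_levelling=True)
print("max erases without wear levelling:", max(plain.values()))
print("max erases with wear levelling:   ", max(levelled.values()))
```

Without remapping, a single block absorbs all erase operations. With remapping, the same number of updates is spread over all blocks, so the most heavily used block sees only a fraction of the wear.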
2.5.2 ExFAT
The extended File Allocation Table [34] has been developed for mobile devices with non-removable NOR
or NAND flash memory. It is an improved version of the FAT file system which supports files larger than
4GB and allows the file system to be customized [35] for device specific parameters like the physical
erase block size of flash devices. Additionally, devices using an exFAT file system may optionally
implement a transaction-safe FAT file system (TFAT) [36] which increases reliability and allows crash
recovery on interrupted file system operations. It is not recommended to use TFAT on removable devices
in combination with desktop operating systems which do not know about TFAT because then the file
system is treated like FAT and file system operations are not transaction-safe. TFAT manages two copies
of the file allocation table and reroutes the FAT chain on file modifications. This causes a performance
decrease in comparison with FAT.
2.5.3 JFFS2
The Journalling Flash File System is a log-structured file system based on a simple log-structured file
system (LFS). It was released by Axis Communications AB in Sweden in 1999 under the GNU
General Public License. JFFS only supported one inode type (raw inode) and therefore lacked support
for hard links. Later, an advanced, portable version named JFFS2, with several inode types (raw inode,
directory entry, clean marker), was developed.
JFFS2 [3] is designed to work on NOR and NAND flash chips and provides error correction (ECC)
which is obligatory for NAND flash. The file system accesses flash devices through the Memory Tech-
nology Device (MTD) subsystem of Linux and takes care of wear levelling and bad block management.
This avoids the inefficiencies which would occur with a journalling file system on top of the flash translation
layer, which is itself a kind of journalling file system.
JFFS2 supports compression on the file system level. With file compression enabled, it is difficult to
predict the duration of file system operations. Additionally, it provides a background garbage collection
task which is triggered when necessary and reclaims unused erase units marked for deletion. Such
a background task may cause unpredictable delays or even deadlocks, which makes it impossible to
guarantee worst case delays for file operations on JFFS2. There is a problem that JFFS2 write access
operations will hang during garbage collection, which has been confirmed on the Linux MTD mailing
list by David Woodhouse, the maintainer of JFFS2 [37].
Another deficiency of JFFS2 is that it is only suitable for smaller flashes. It does not store a central
index on flash, but each JFFS2 node contains all index information about itself. At mount time, it
performs a multi-stage process including a full scan of the physical device in order to gather all inode
information and store it in RAM, which is much faster than flash. For all nodes, the CRC value is checked
and only the valid nodes are stored in inode cache structures, collected in a hash table [3]. The full scan
requires a lot of time and largely depends on the size and read performance of the flash device. It also
requires a lot of memory to hold the inode cache structures in RAM. Small embedded systems might
not have enough RAM to cache all inode information and successfully mount a large JFFS2 partition
containing multimedia files.
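The cost of such a mount procedure can be seen in a toy model of the scan. The sketch below walks over a complete flash image, verifies a checksum for every node and keeps the newest valid node per inode in a table in RAM. The node layout (inode number, version, payload length and CRC32 of the payload) is hypothetical and much simpler than the real JFFS2 on-flash format.

```python
import struct
import zlib

# Hypothetical node header: inode number, version, payload length, payload CRC32
HEADER = struct.Struct("<IIII")


def make_node(inode: int, version: int, payload: bytes) -> bytes:
    """Serialise one node in the toy format."""
    return HEADER.pack(inode, version, len(payload), zlib.crc32(payload)) + payload


def mount_scan(flash: bytes) -> dict:
    """Scan the whole image and keep the newest valid node per inode."""
    index = {}
    offset = 0
    while offset + HEADER.size <= len(flash):
        inode, version, length, crc = HEADER.unpack_from(flash, offset)
        start = offset + HEADER.size
        payload = flash[start:start + length]
        offset = start + length
        if len(payload) != length or zlib.crc32(payload) != crc:
            continue  # truncated or corrupt node: not added to the index
        if inode not in index or version > index[inode][0]:
            index[inode] = (version, payload)
    return index


image = make_node(1, 1, b"old") + make_node(1, 2, b"new") + make_node(3, 1, b"x")
print(mount_scan(image))
```

Both costs discussed above are visible in the sketch: the loop touches every byte of the image, so the mount time grows with the device size, and the table grows with the number of inodes, so the memory demand grows with the amount of stored data.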
JFFS2 prolongs flash lifetime by implementing wear levelling and garbage collection, but no one has yet
proven the formal correctness of the JFFS2 garbage collection algorithm, and as long as it is not
proven to be deadlock-free, JFFS2 is not a good choice for real-time systems. The above-mentioned
problems make the JFFS2 file system unsuitable for hard real-time systems which require large flash devices
or have little RAM. After a power failure, the file system needs to be remounted and the complete device
scanned before the file system is ready. This makes the file system unavailable for several seconds
or even minutes, which might lead to missed deadlines.
Since JFFS2 does not scale well and only works on small flashes, it will likely be superseded by ad-
vanced flash file systems like JFFS3 [38] or LogFS [39] when they have reached production state. Both of
these file systems use hierarchical tree structures and are designed to work on large flashes, save memory
and aim to reduce mount time by implementing check point nodes which store accumulated information
about inodes. The design papers do not discuss real-time issues, but there are already extensions [40] to
JFFS3 and any further improvements are welcome on the linux-mtd mailing list.
2.5.4 LogFS
LogFS [39] is a new log-structured flash file system designed to be scalable and provide an efficient,
deterministic garbage collection algorithm which never uses more space than it reclaims. It aims to
mitigate the deficiencies of JFFS2 and is designed so that mount times do not linearly depend on device
size. The LogFS file system is still under heavy development, but in contrast to JFFS2, the relatively
simple garbage collection strategy of LogFS can be proven to be deadlock-free. Jörn Engel, the main
developer of LogFS, explained the garbage collection algorithm and its correctness in a talk [41] at the
linux.conf.au conference in Sydney 2007.
LogFS does not keep track of deleted nodes. Instead, it uses a tree whose root is stored in a journal in a few flash
erase units. To prevent the wearout of such essential journal blocks, they can be indirectly referenced
by another block which points to the actual journal. This pointer can be updated and the journal can be
moved around when its blocks approach the maximum erase count.
The LogFS file system handles inodes like files. In fact, LogFS stores all inodes in a file which is
referenced by the journal. When files are added or changed, it allocates new erase blocks, writes the
data and restructures the tree to reference the new data. This produces some garbage nodes which can be
collected afterwards by walking down the tree to the level where data has been changed.
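The way a tree update produces garbage can be illustrated with an in-memory sketch. Because flash cannot be overwritten in place, changing a leaf means writing a new copy of every node on the path from the root to that leaf, while the old copies become garbage. The binary tree and all names below are illustrative assumptions and do not reflect the actual LogFS data structures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    data: Optional[bytes] = None


def update(node: Node, path: str, data: bytes, garbage: List[Node]) -> Node:
    """Return a new subtree in which the leaf reached via `path` holds `data`.

    `path` is a string of 'L' and 'R' steps. Every node on the path is
    rewritten; the superseded copies are appended to `garbage`.
    """
    garbage.append(node)
    if not path:
        return Node(data=data)
    if path[0] == "L":
        return Node(left=update(node.left, path[1:], data, garbage),
                    right=node.right)
    return Node(left=node.left,
                right=update(node.right, path[1:], data, garbage))


root = Node(Node(Node(data=b"a"), Node(data=b"b")), Node(data=b"c"))
garbage: List[Node] = []
new_root = update(root, "LR", b"B", garbage)
print(len(garbage), "nodes became garbage")
```

Subtrees that are not on the path are shared between the old and the new tree, so only the changed path has to be visited when garbage is collected.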
2.5.5 YAFFS2
YAFFS2 is the successor of YAFFS, which is an acronym for Yet Another Flash Filing System. The initial
YAFFS version was developed in 2002 by Charles Manning for the Aleph One company because the
first JFFS/JFFS2 versions were mainly developed for NOR flash and did not scale well (cf. section 2.5.3).
YAFFS2 has much code in common with YAFFS. It is a log-structured, tree-based file system which is
specifically designed for newer NAND flashes. The main focus lies on high performance, robustness and
data integrity. A Linux kernel patch which supports the YAFFS2 file system can be obtained from [42].
A YAFFS file system image can be created with the tool mkyaffsimg. Alternatively, a new YAFFS
file system can be created directly on NAND flash using the tool mkyaffs. This tool simply erases all
undamaged blocks of a NAND memory technology device. Any erased blocks are considered empty
space in the file system. YAFFS obeys the requirements of the most restrictive NAND flashes and takes
care to write the pages of each erase block in sequential order. YAFFS2 writes each page only once
before the block is erased. When the size of a file is reduced, a new chunk called a shrink header is
written. Such shrink headers allow marking previous blocks as dirty without the need to overwrite the
discarded chunks. In order to reduce wear, YAFFS2 does not erase blocks until they are full of data
chunks which are marked dirty. Mount time is improved by a feature called checkpointing, which stores
accumulated file system metadata on flash when the device is unmounted. This trade-off slows down
the unmount process, but allows the necessary file system information to be reconstructed very quickly
when the device is inserted and mounted. Since JFFS2 does not support checkpointing, YAFFS2 mount
times are usually five times shorter than those of JFFS2 [43], [44]. Note, however, that after a power
loss no clean unmount takes place and consequently no checkpoint data is written. Paper [45]
describes how to improve crash recovery time for log-structured file systems like YAFFS by introducing
a RAM-based log-record manager. The log-record manager maintains checkpoint information in RAM
and regularly updates check regions on flash. This helps to reduce the number of pages scanned at mount
time when a file system has not been cleanly unmounted.
The YAFFS file system has a modular architecture which separates file system functions from flash
management functions. The YAFFS Direct Infrastructure (YDI) documentation [46] claims that ’YAFFS
has highly optimised and predictable garbage collection strategies’ and describes how YAFFS can be
integrated into real-time operating systems.
2.5.6 Summary
Most current flash file systems put great effort into wear-levelling and robustness, but care less about
predictability, timing guarantees or real-time aspects. A flash file system which takes care of
flash-specific requirements and flash lifetime and which provides strongly deterministic timing for
read/write operations would be desirable.
Regardless of the file system chosen, it is recommended to mount flash devices with the
noatime option. This prevents the file system from updating file access timestamps when files are
only read. Omitting these write operations while reading leads to higher read performance,
reduces wear and therefore increases the lifetime of the flash device.
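Whether a mounted device actually uses this option can be checked in the mount table. The following
Python sketch parses a line in the format of /proc/mounts on Linux. The helper name has_noatime and
the sample entry are illustrative assumptions and not part of any tool described in this thesis.

```python
def has_noatime(mounts_line):
    """Return True if a /proc/mounts-style line lists the noatime option."""
    fields = mounts_line.split()
    # Fields: device, mount point, fs type, comma-separated options, dump, pass
    if len(fields) < 4:
        return False
    return "noatime" in fields[3].split(",")


# Hypothetical entry, as it might appear in /proc/mounts
sample = "/dev/mtdblock0 /mnt/flash jffs2 rw,noatime 0 0"
print(has_noatime(sample))
```

Splitting the option field at commas avoids false matches on similar option names such as nodiratime.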
3 State Of The Art
This chapter discusses the capabilities of existing file system and device benchmarks.
3.1.2 IOzone
IOzone [49] is a comprehensive file system benchmark available on many architectures. It supports a lot
of features, measures a variety of file operations and can be used to evaluate the overall performance of
different file systems. It provides gnuplot compatible output and supports synchronous operation which
flushes writes immediately to the storage device. Even though cache effects can be analysed with the
help of IOzone by varying the unit and file sizes, it displays only average values, so outliers in the
timing results are hard to detect with IOzone.
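A short Python sketch shows why average values alone are insufficient for this purpose. The sample
values are hypothetical and were chosen only to illustrate the effect: a single long stall barely
moves the arithmetic mean, but it dominates the maximum.

```python
def summarize(latencies_ms):
    """Return (minimum, arithmetic mean, maximum) of a list of latencies."""
    return (min(latencies_ms),
            sum(latencies_ms) / len(latencies_ms),
            max(latencies_ms))


# Hypothetical data: 99 writes at 27 ms and a single stall of 270 ms
samples = [27.0] * 99 + [270.0]
best, mean, worst = summarize(samples)
print(f"min={best:.2f} ms  avg={mean:.2f} ms  max={worst:.2f} ms")
```

Here the mean rises only from 27 ms to 29.43 ms, while the worst case is ten times the typical value.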
The original plan was to use IOzone for comparing the ext2 and jffs2 file systems on the BlackFin
BF537 board’s embedded 4MB flash memory technology device, but unfortunately, IOzone ran out of
memory when the cross-compiled generic target was executed under uClinux on the BlackFin board. On
desktop or server machines with plenty of RAM it should run well.
3.1.3 Tiobench
Tiobench [50] is a portable, threaded file system benchmark program. It is licensed under the GNU
General Public License and runs on any POSIX operating system which supports pthreads. Before
running the tool, a file system needs to be created. Each thread then creates a file in this file system to
measure read and write operations. Tiobench can be used to determine average and maximum access
latencies of read and write operations on different file systems and the average read and write throughput
at sequential or random offsets inside of files or raw block devices.
3.1.5 h2benchw
Heise has developed a disk benchmark [51] for DOS (h2bench) and for Windows (h2benchw) which
makes use of raw physical devices and measures the minimum, average and maximum access times as
well as the sustained read/write performance.
• A rough comparison report on USB flash drives, their prices, access times and average throughput
can be found at Tom’s Hardware [52].
• The CF/SD performance database [53] collects average performance data of various Compact
Flash and Secure Digital cards.
• Hans-Jürgen Reggel collects information about the compatibility of card readers and removable
flash memory cards from various vendors in his cardspeed project [54]. This project gives a broad
overview on which cards really provide the advertised bandwidth and which card reader is best
suited for which type of card. The results are based on the output of h2bench and a tool written by
himself.
• c’t magazine has also published an article [55] with comprehensive benchmark tables covering the
read and write throughput of many flash cards available, but these benchmarks also put their main
focus on average bandwidth.
3.3 Summary
Existing flash benchmark tools and reports usually compare average or best access times and data transfer
rates of different cards. Most of these benchmarks do not go into detail and often neglect the analysis
of special properties like anomalies, extreme cases or timeouts of flash memories and their interfaces.
Average bandwidth values are simply summarized in tables in order to compare the general performance
of certain devices and to make recommendations which ones to buy. This might be sufficient for most
end-users, but real-time developers will also be interested in worst-case results in order to estimate worst-
case execution time of flash access operations.
Most of the above disk benchmark tools specialize in hard disk anomalies related to moving parts, but
there are no moving parts in flash at all. Since flash is based on completely different technology, other
anomalies are to be expected and new tools are required to cover them. In order to implement some
special tests which were deemed relevant for flash memories, a tool named flanatoo was developed for
this thesis.
4 Flanatoo - A Flash Analysis Tool
This tool analyzes the behavior and the best, average and worst case performance of flash memories. Its
source code can be found in directory ./src/flanatoo of the enclosed tarball and can be compiled using
the make command. It provides features to perform a detailed analysis of a raw flash device: First, it
supports reading and writing at different positions to check whether performance depends on position.
Second, it allows writing different data patterns in order to check whether some patterns can be written
faster or slower than others. Third, it makes use of direct physical access to the device in order to bypass
caches. Finally, it records best, average and worst case times of access operations, to detect exceptionally
high delays and to get an impression of the worst case measurement results. The following sections will
describe the functionality in more detail.
Mode is the desired benchmark mode and filename is the raw block device on which the test will be run.
Both mode and filename are mandatory options.
The complete syntax and a list of available tests can be requested as follows:
# ./flanatoo --help
There are some additional parameters which may be applied to several tests: --max-blocksize
sets the maximum block size for tests where the flash is benchmarked using differently sized read/write
units. The --max-devsize option sets an upper size boundary on the device which may not be crossed
when reading or writing.
used to calculate the average number of processor cycles which passed per second. The result is stored
in variable options->cycles_per_second.
Another important parameter is the measurement overhead which is the minimum number of pro-
cessor cycles needed for the measurement itself. This measurement delay is determined in function
calibrate_tsc by running a loop with ten iterations and reading the time stamp counter twice in each
pass. The minimum number of cycles from all iterations is considered the minimum overhead of reading
the time stamp counter and is stored in variable options->rdtsc_overhead. The arithmetic mean of the
overhead is not used, because if only one pass requires much more time (e. g. if the process is inter-
rupted between the two read operations), this will have a great impact on the arithmetic mean of the
overhead, even though all the other values were much smaller. Unlike the arithmetic mean, the minimum
overhead represents the additional time required for each measurement and may be subtracted from any
measurement value.
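The calibration idea can be sketched in Python as follows. This is not the flanatoo source: the
function name calibrate_overhead is hypothetical, and time.perf_counter_ns serves as a portable
stand-in for reading the processor's time stamp counter. The sketch takes ten pairs of back-to-back
timer reads and keeps the minimum difference.

```python
import time


def calibrate_overhead(iterations=10):
    """Estimate the cost of one pair of timer reads as the minimum over several passes."""
    deltas = []
    for _ in range(iterations):
        start = time.perf_counter_ns()   # first read
        end = time.perf_counter_ns()     # second read, nothing in between
        deltas.append(end - start)
    # The minimum is robust against a pass that was interrupted;
    # the mean is returned only for comparison.
    return min(deltas), sum(deltas) / len(deltas)


overhead_ns, mean_ns = calibrate_overhead()
print(f"minimum overhead: {overhead_ns} ns, mean: {mean_ns:.1f} ns")
```

If one pass is interrupted, only the mean is inflated, which is the reason given above for preferring
the minimum.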
mitigated by filling random data into the buffer before it is written to the device. The O_SYNC flag
prevents an early return before each write operation is finished.
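The two measures named above, a buffer filled with random data and a file opened with O_SYNC, can be
combined as in the following Python sketch. It is an illustration under assumptions rather than the
flanatoo implementation: the function name timed_sync_write is hypothetical, and the demonstration
writes to a temporary file instead of a raw device.

```python
import os
import tempfile
import time


def timed_sync_write(path, block_size=64 * 1024):
    """Write one block of random data with O_SYNC; return (bytes written, seconds)."""
    buf = os.urandom(block_size)            # random payload
    # O_SYNC is not defined on every platform, so fall back to 0 if it is missing
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        written = os.write(fd, buf)         # with O_SYNC this blocks until the write completed
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return written, elapsed


# Demonstration on a temporary file instead of a raw device
with tempfile.TemporaryDirectory() as tmp:
    written, elapsed = timed_sync_write(os.path.join(tmp, "block.bin"))
    print(written, f"{elapsed * 1000:.3f} ms")
```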
offset of each part and the minimum, average and maximum number of milliseconds needed to read four
blocks from these positions. These results can be used to check if read operations are equally fast for all
positions.
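A position-dependent read test of this kind could look like the following Python sketch. The function
name read_at_positions and its parameters are assumptions made for illustration. Unlike flanatoo, the
sketch does not bypass the operating system's page cache, so repeated reads of the same offset may be
served from RAM.

```python
import os
import tempfile
import time


def read_at_positions(path, parts=4, block_size=4096, blocks=4, repeats=5):
    """Time reads of `blocks` blocks at the start of each of `parts` equal regions.

    Returns a list of (offset, min_ms, avg_ms, max_ms) tuples.
    """
    results = []
    with open(path, "rb", buffering=0) as dev:
        size = dev.seek(0, os.SEEK_END)      # also works for block devices
        for part in range(parts):
            offset = (size // parts) * part
            times_ms = []
            for _ in range(repeats):
                dev.seek(offset)
                start = time.perf_counter()
                dev.read(block_size * blocks)
                times_ms.append((time.perf_counter() - start) * 1000.0)
            results.append((offset, min(times_ms),
                            sum(times_ms) / len(times_ms), max(times_ms)))
    return results


# Demonstration on a 64 KiB temporary file instead of a raw device
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(64 * 1024))
    demo_path = tmp.name
rows = read_at_positions(demo_path)
os.remove(demo_path)
for offset, best, avg, worst in rows:
    print(f"{offset:8d}  {best:.3f}  {avg:.3f}  {worst:.3f}")
```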
5 Removable Flash Card Analysis
Only the card reader or USB drive was plugged into the USB 2.0 port of the host system and no other
USB devices were connected. The test system was an AMD Athlon64 3000+ machine with 1 GB of
RAM running a 2.6.18 Linux kernel. The one exception, where the card reader was not involved, was the
4 MB parallel NOR flash [56]. This chip is found on the BlackFin BF537 STAMP board and can be directly
accessed on the board through the MTD subsystem of uClinux. The Microdrive (MD1GHIT) contains a
small hard disk instead of flash memory, but due to its Compact Flash interface it can be plugged into
the card reader like a normal flash card and so it was included for comparison.
RESULTS=Name-of-Flash-Device
DEVICE=/dev/sdb
cd /dev/shm
mkdir $RESULTS
cd $RESULTS
benchmark.sh $DEVICE
The output files were then copied from RAM disk to a non-volatile medium and converted into diagrams
using gnuplot and a configuration file which can be found in measurements/gnuplotrc:
cp -a /dev/shm/$RESULTS $HOME
cd $HOME/$RESULTS
gnuplot gnuplotrc
The results can be found in the measurements subdirectory of the enclosed archive.
So the flash test is rather useless since the assumption that solely blocks filled with 0xFF trigger block
erases turned out to be false. Nevertheless, the next section shows that comparing the overall perfor-
mance for different block sizes can help to estimate the physical erase block size if the device cannot be
opened and the flash chip is not directly accessible.
Figure 5.1: SanDisk 2 GB Compact Flash: Requires block size 64 KB to achieve full performance
5.3.3 Reading is faster than writing
When the logical block size is set to at least 64 KB, write bandwidth is lower than read bandwidth for
each of the devices. This confirms the claim that reading from flash memories is usually faster than
writing to them. However, figure 5.1 shows that this is not always the case: in some exceptional cases
reading may even be slower than writing when block sizes below the physical erase block size are used.
1. Writing 0xFF blocks is slightly (about 1 to 8ms) faster than writing any other data
2. Writing 0xFF blocks takes similar time as writing any other data
3. Writing 0xFF blocks is about 32ms slower than writing any other data
Figure 5.2 illustrates that on the CF256TOSH card writing 0xFF blocks (below 158ms) on average is
slightly faster than writing any other data patterns, so it can be assigned to category 1. Sporadically, a
write operation requires about 270ms instead of about 160ms, no matter which data has been written.
This might be an indication of the device performing additional block erasures on some of the write
operations. The CF32MED, CF64-1, CF64-2, CF64-3, CF64-4, MMC32, MS16SONY, MS32PROSAN,
NOR4ST and SD32MIT devices all show similar characteristics: writing the 0xFF pattern produces the
shortest delay in comparison to writing any other patterns. Additionally, every write operation on the
MultiMediaCard MMC32 took between 193 and 196ms, no matter which data was written. Within
this range, the card shows characteristic patterns depending on which data was stored on the card before
the write operation started. Write operations were rather fast (193.0 to 194.2ms) when there was 0x00,
0x33, 0x66 or 0x99 on the card, while write operations on blocks filled with 0x11, 0x88, 0xBB or 0xEE
were rather slow (194.9 to 196ms). The speed of any other write combinations varied, but did not leave
the above-mentioned range. The delay for writing data to the embedded parallel NOR flash (NOR4ST)
strongly depends on both the previously stored and the newly written data patterns. Figure 5.3 shows
there are three groups of data patterns which produce different write delays. The rightmost data column
also reflects these three groups, but each of them has a write delay which is about 200ms smaller. This
means that when the device contains the 0xFF pattern before any new data is written, writing is about 5
percent faster.
Figure 5.2: Toshiba 256 MB Compact Flash: Writing 0xFF is on average faster than other data
Figure 5.3: ST M29W320DB: Writing is clearly faster when there was an 0xFF pattern before
Devices MS16SONY (figure 5.4) and SD32MIT (figure 5.5) show a clear distinction between the
delays when writing 0xFF versus other data. Writing 0xFF to the 16 MB Sony Memory Stick was 8ms
faster than the worst case of other data, and writing four 0xFF blocks to the 32 MB Secure Digital Card
from Mitsuca was 5ms faster than the worst case of writing other data. The difference between
the best and worst case for writing data other than 0xFF to the SD32MIT device was only 7ms, which
clearly shows that writing blocks filled with 0xFF is an exceptional case and handled differently.
Since this behavior applies to many of the tested cards, it seems to be common for flash memories. A
freshly erased physical block contains only 0xFF. It is therefore plausible that the 0xFF pattern is
written fastest, because no memory cells need to be modified after an erase operation. The controller
can simply erase one physical block and leave it as is. Writing any other data requires addressing and
modifying memory cells, which takes time. Since erase and write are two separate flash commands, a
simple erase command suffices to fill one or more blocks with 0xFF and the write command does not even
need to be issued.
Figure 5.5: Mitsuca 32 MB Secure Digital card: Writing 0xFF is clearly faster
This category includes all devices where writing speed is not significantly determined by the written data
pattern. In contrast to the devices from category 1, these devices do not expose the internal flash behavior
to the outside world and do not show any significant timing difference between writing blocks filled with
0xFF and other blocks. The CF1GSAN compact flash card from SanDisk belongs to category 2 because
the delay for writing blocks filled with 0xFF does not significantly differ from blocks with other data
patterns. However, this compact flash card shows another interesting write pattern: While most write
combinations require between 26.5 and 27.5ms, some data patterns have a better best case taking about
1 or 2ms less time, especially when the blocks contained 0x33, 0x77, 0xBB or 0xFF before the new
pattern was written. The CF2GSAN card also belongs to category 2. It behaves very deterministically:
writing any data always requires between 26.25 and 26.60ms at position 0. However, there may be other
characteristics like sensitivity to position (cf. section 5.3.5), which can affect the worst case writing
time. Writing delays varied among devices with different interfaces, like an 8 MiB compact flash card
(CF8HP), a 512 MiB dual-voltage reduced-size MultimediaCard Plus (MMC512MOB), a 1 GiB Secure
Digital Card from Hama (SD1GHAMA), the 512 MiB Lexar USB drive (USB512LEX) and of course
the 1 GiB Microdrive (MD1GHIT) from Hitachi which will not show flash specific behavior because it
does not contain any flash memory at all.
A few flash devices wrote four 0xFF blocks about 32ms slower than other data patterns, which
is the exact opposite of category 1 devices, where writing 0xFF blocks is faster than writing other data
patterns. Category 1 devices still have one thing in common with category 3 devices: write time for
0xFF blocks clearly contrasts with writing blocks containing different data patterns.
Slow writing of 0xFF blocks was observed on a 128 MB USB drive (USB128SEC) and on a 512 MB
Secure Digital Card from SanDisk (SD512SAN). Figure 5.6 illustrates that writing four 0xFF blocks to
the 512 MB Secure Digital Card required between 282 and 284ms while any other data patterns required
only 250 to 252ms. A variability of only 2ms for 0xFF data as well as non-0xFF data but a difference of
32ms between 0xFF and non-0xFF data indicates that additional internal activity is going on in the flash
device in the special case where blocks filled with 0xFF are written to the device.
Figure 5.6: SanDisk 512 MB Secure Digital Card: Writing 0xFF blocks requires 282-284ms
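The assignment of a device to one of the three categories can be expressed as a simple comparison of
mean write delays, as in the following Python sketch. The function name classify_device and the
threshold of 1 ms are assumptions chosen for illustration; the thesis does not define a numerical
threshold. The category 2 and 3 examples use the delay ranges reported in the text for CF2GSAN and
SD512SAN, while the category 1 values are hypothetical.

```python
def classify_device(ff_times_ms, other_times_ms, threshold_ms=1.0):
    """Assign a device to category 1, 2 or 3 by comparing mean write delays."""
    mean_ff = sum(ff_times_ms) / len(ff_times_ms)
    mean_other = sum(other_times_ms) / len(other_times_ms)
    diff = mean_ff - mean_other
    if diff <= -threshold_ms:
        return 1   # writing 0xFF is faster than writing other data
    if diff >= threshold_ms:
        return 3   # writing 0xFF is slower than writing other data
    return 2       # no significant difference


print(classify_device([150.0, 152.0], [158.0, 160.0]))   # hypothetical category 1 device
print(classify_device([26.25, 26.60], [26.25, 26.60]))   # range reported for CF2GSAN
print(classify_device([282.0, 284.0], [250.0, 252.0]))   # ranges reported for SD512SAN
```

A comparison of means is only a first filter. For real-time use, the worst-case delays of each pattern
would have to be compared as well.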
Several devices including CF256TOSH, CF1GSAN, MS32PROSAN, SD32MIT and SD512SAN show
clearly better write performance at position 0 than at any other position. This might be an indication
of optimizations in the region of the device where the file allocation table is usually stored. However,
there is also a device (Lexar USB drive USB512LEX) which writes data significantly more slowly to
position 0, while write performance at all other positions varies heavily.
On the CF2GSAN and MS16SONY devices, the write delay of four 64KB blocks more than doubled
at the middle position of the device. This might indicate the existence of two memory banks which are
accessed sequentially and have to be switched back and forth. Writing to the middle of a flash device
might lead to worst case performance if there are several memory banks present. For hard real-time
systems, it is therefore recommended to take memory bank switching into account and to check whether
sequential access across several memory banks has an impact on the worst case. If direct access to
embedded flash memory chips were possible, several memory banks could be accessed in parallel using
several processing units. The problem with removable flash devices, however, is that the memory chips
are usually hidden behind a single controller which does not allow direct access to the memory chips.
The 512 MB MultiMedia Card mobile (MMC512MOB) shows another interesting effect. Figure 5.7
illustrates that above 256 MiB the read performance suddenly becomes worse. This indicates that the
device probably contains two memory banks, a fast and a slow one. The number, type and quality of
memory banks clearly matter, and position-dependent read/write tests can help to identify such memory
bank related problems.
Figure 5.7: ExtreMemory 512 MB MMC mobile card: Worse read performance above 256 MB
Some of the devices do not show characteristic anomalies, but their write performance varies across the
device. For example, the write performance of the 1 GB Secure Digital card from Hama (SD1GHAMA)
varied widely, in a range from 100ms to 500ms.
Position independent devices
There was also a group of devices where position does not matter: Performance of reading/writing four
64 KiB blocks from or to the devices CF32MED, CF64-1, CF64-2, CF64-3, CF64-4, CF8HP, MD1GHIT,
MMC32 and USB128SEC is not significantly determined by the positions accessed.
5.3.7 Conclusion
The above sections have shown that even two devices with the same interface may expose a totally
different timing because the behavior of removable flash memories strongly depends on the integrated
controller and flash chips. Devices from the same vendor might contain different hardware, but devices
from the same product line are likely to contain the same hardware, e. g. the four 64 MiB compact
flash cards have shown a very similar behavior in all the tests. Without having profound knowledge
of the firmware used, controller and flash chip, and without extensively measuring it, the general and
worst-case behavior of removable flash devices cannot be reliably predicted.
In order to determine best and worst-case behavior of flash memories without knowing the internal
hardware details, one needs to run at least the flash tests for writing different data patterns, writing data
at different positions and with different block sizes, in order to find out basic anomalies of a specific
device and to estimate the access block size required for optimal throughput. By looking at the test
results and graphics one can decide whether the specific device is suitable for real-time use and which
block size the application or file system should use.
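One possible way to derive the access block size from block size measurements is to pick the smallest
block size which reaches a given fraction of the peak bandwidth. The following Python sketch
illustrates this under assumptions: the function name, the 95 percent criterion and the bandwidth
values are hypothetical and do not stem from the measurements of this thesis.

```python
def smallest_sufficient_block_size(bandwidth_by_block_size, fraction=0.95):
    """Return the smallest block size whose bandwidth reaches `fraction` of the peak."""
    peak = max(bandwidth_by_block_size.values())
    for block_size in sorted(bandwidth_by_block_size):
        if bandwidth_by_block_size[block_size] >= fraction * peak:
            return block_size


# Hypothetical write bandwidths in KB/s, keyed by block size in KB
measured = {4: 310.0, 8: 620.0, 16: 1200.0, 32: 2300.0, 64: 4500.0, 128: 4550.0}
print(smallest_sufficient_block_size(measured))
```

With these sample values the function returns 64, because the bandwidth at 64 KB is the first one
within 5 percent of the peak.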
6 VFAT Mount Times
Flash file systems usually use a virtual to physical block mapping which is stored in volatile random
access memory during runtime. On the flash device, each physical block keeps track of which logical
unit it belongs to. The logical to physical mapping information has to be rebuilt when the device is
plugged in and the file system is mounted. Depending on the type of file system and the size of the flash
device, this can result in a long delay when mounting the device. In contrast to flash file systems, the
VFAT file system should produce rather short and deterministic mount times because it was originally
not designed for flash and does not care about wear levelling or remapping of blocks. It simply stores its
file system information at the beginning of the device and relies on lower layers like the hardware to do
the necessary wear levelling.
6.1 Setup
Mount times of the VFAT file system were compared on different flash cards using the HAMA card
reader. The host and kernel setup was the same as for the removable flash card analysis which has been
described in chapter 5. The script src/scripts/fill-directory was used to create 30 files containing 1 MB
of random data from /dev/urandom. Then a VFAT file system was created on each flash device using the
script src/scripts/create-vfat. This script also copied the files containing random data to the device and
then unmounted it.
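The contents of the scripts are not reproduced here. A Python sketch of the file creation step might
look as follows. The function name fill_directory and the file naming scheme are assumptions, and
os.urandom is used as the source of random data in place of reading /dev/urandom directly.

```python
import os
import tempfile


def fill_directory(directory, count=30, size=1024 * 1024):
    """Create `count` files of `size` random bytes each and return their paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for index in range(count):
        path = os.path.join(directory, f"random-{index:02d}.bin")
        with open(path, "wb") as handle:
            handle.write(os.urandom(size))   # random bytes from the operating system
        paths.append(path)
    return paths


# Small demonstration: 3 files of 1 KiB instead of 30 files of 1 MB
with tempfile.TemporaryDirectory() as tmp:
    created = fill_directory(tmp, count=3, size=1024)
    sizes = [os.path.getsize(p) for p in created]
print(len(created), sizes)
```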
6.2 Measurements
The script src/scripts/mnt takes a device as the first argument and measures the time to mount it using
the time utility. The device is automatically mounted and unmounted 10 times, and between each pass
it waits for a key press. This allows the user to remove and re-insert the device before the script tries to
mount it again.
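The measurement loop can be sketched in Python as follows. This is an illustration and not the mnt
script itself: the function names are hypothetical, the mount and umount invocations assume a Linux
system and root privileges, and the demonstration at the end only times a harmless command.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run a command and return its wall-clock duration in milliseconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return (time.perf_counter() - start) * 1000.0


def measure_mounts(device, mount_point, passes=10, wait=input):
    """Mount and unmount `device` repeatedly; return the mount times in ms."""
    times_ms = []
    for index in range(passes):
        if index > 0:
            # Gives the user time to remove and re-insert the device
            wait("Re-insert the device and press Enter ")
        times_ms.append(time_command(["mount", "-t", "vfat", device, mount_point]))
        subprocess.run(["umount", mount_point], check=True)
    return times_ms


# Demonstration without root privileges: time a command that does nothing
demo_ms = time_command([sys.executable, "-c", "pass"])
print(f"{demo_ms:.1f} ms")
```

A call such as measure_mounts("/dev/sdb1", "/mnt/flash") would perform the ten passes. Both paths are
placeholders.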
Table 6.1: Mount times of VFAT file system filled with 30 MB of data (in ms)
6.3 Discussion
Table 6.1 shows that mount times are very reproducible for each device (maximum difference between
best and worst case is 3ms), but mount times significantly differ between the devices, depending on their
general performance. There is no linear dependency between the VFAT mount times and the device size.
VFAT mount time significantly depends on the overall device performance, e. g. a relatively slow 128
MB Secure Pendrive (USB128SEC) requires about 255ms to mount a 128 MB VFAT file system, while
a modern 1 GB HAMA card (SD1GHAMA) on average requires only 28ms to mount a 1 GB VFAT
file system when both are filled with the same 30 MB of random data. Since the file allocation table
has a fixed size and read performance from flash is very deterministic, mount times are almost constant
for VFAT, the most frequently used file system on flash cards. However, the time between the card
insertion and the mount process depends on user interaction. During this time, the card reader and the
flash card get the chance to initialize themselves without being measured, which might lead to smaller
measurement values for mount times. The time between the re-insertion and the moment when the card
becomes readable should be added to the mount time, if it can be reliably measured. Also note that flash
file systems like JFFS2 require reading more data when mounting the device, which heavily increases
the total mount time.
7 Real-Time Support for Flash File Systems
If a real-time application requires working with files and the underlying layers do not support wear
levelling (like some embedded flash chips and controller-less flash cards when used without FTL or
UBI), a suitable real-time flash file system becomes necessary. The following sections explain why most
standard flash file systems are not suitable for real-time use and propose aspects which a flash
file system developer should take care of.
mechanism must guarantee a maximum time for garbage collection actions performed on each file sys-
tem operation. Paper [60] proposes design principles for real-time garbage collection on the block device
layer. These proposals should be combined with the principles of log-structured flash file systems. LogFS
includes a garbage collection mechanism with a bounded execution time which depends on the modified
node’s depth in the hierarchical file system tree. Further information about LogFS and its garbage collec-
tion algorithm can be found in Jörn Engel’s talk [41]. The LogFS garbage collector could possibly serve
as a basis for developing a deterministic real-time garbage collector for log-structured flash file systems.
As proposed in [61], section 4, any sophisticated flash file system should reduce its memory footprint
by storing block related metadata in the out-of-band area on flash, while only caching the topmost levels
of a hierarchical file system structure in RAM for quick access. The required amount of metadata in
the device’s out-of-band area should be minimized. For example, LogFS is designed to cache only a
small portion of metadata in RAM and store the rest on the flash device. This might be a good basis
for developing a real-time flash file system which supports big flashes, provides fast mount times, has a
small memory footprint and a predictable garbage collection.
In order to enable simultaneous read and write operations and to prevent data loss, paper [61] proposes
using two flash memory chips where information is always written to both chips one after another. One
of the chips is in read mode while the other chip is being written. When the chip has finished writing, the
chip is switched to read mode and the other one enters write mode. This method allows doing garbage
collection on the write-enabled chip while the application is reading from the other one. In order to get
enough time for garbage collection, the paper proposes constraining the inter-arrival time of requests.
The redundancy provided by the two chips improves reliability because data is always available twice,
and if a memory block contains defects, data can still be read from the other chip. If a write process is
half finished during power-failure, serial numbers ensure that the latest intact version of a block can be
found and recovered easily. The paper proposes the following crash-recovery strategy: When a page is
about to be modified, the page and its serial number are read from flash, then the page is modified in a
buffer and then the modified page is written back to flash with a higher serial number (modulo 4). When
the write has finished successfully, the direct map is updated to point to the new page and the old page
is marked dirty by clearing the dirty bit in the out-of-band area of the old page. One aspect the paper
does not address is the indirect page table which maps each physical block to a logical block. It is
distributed across the out-of-band areas of all flash pages, so the whole device has to be scanned at
mount time in order
to construct a direct page map and build up a consistent file system state in RAM. LogFS addresses this
issue by storing an index on flash in dedicated journal blocks so that it can quickly locate the flash blocks
containing a certain file.
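The serial number scheme of the crash-recovery strategy can be illustrated with a short Python sketch.
The exact comparison rule used in [61] is not reproduced here. The sketch assumes that at most two
versions of a page exist at any time and that their serial numbers differ by exactly one step of a
counter which wraps around at 4. The function names are hypothetical.

```python
def next_serial(serial):
    """Serial number for the rewritten copy of a page (counter wraps around at 4)."""
    return (serial + 1) % 4


def newer_copy(serial_a, serial_b):
    """Return 'a' or 'b' for the more recent of two copies of the same page.

    Assumes the two serial numbers differ by exactly one step modulo 4.
    """
    if serial_b == next_serial(serial_a):
        return "b"
    if serial_a == next_serial(serial_b):
        return "a"
    raise ValueError("serial numbers are not adjacent")


print(newer_copy(2, 3))   # plain increment
print(newer_copy(3, 0))   # wrap-around: 0 follows 3
print(newer_copy(1, 0))   # copy a is the successor of copy b
```

Because the counter wraps around, a plain numerical comparison would pick the wrong copy in the second
case, which is why the successor relation is tested instead.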
8 Further Work
case. Besides, it should still maintain most of the advantages of modern log-structured flash file systems
like automatic wear-levelling, deadlock-free garbage collection, O(1) mount times and a small memory
footprint. If it turns out that it is not possible to achieve all these aims by changing an existing file system,
a new open source real-time flash file system needs to be designed which above all takes care of real-time
aspects, but also pays attention to flash specifics.
9 Conclusion
This thesis summarized existing flash technologies as well as tools and file systems in the flash memory
area. It described methods to analyse the real-time behavior and mount times of removable flash devices.
Flash file systems were discussed with regard to real-time aspects and proposals for further improvement
of file system modules were made. The author is not aware of any open source flash file system which
supports garbage collection with real-time guarantees. Among others, a predictable garbage collection
mechanism as proposed in [60] needs to be implemented as an integral part of a real-time flash file
system.
The tested flash memories have proven to be very predictable on read operations with fixed block
sizes while write performance tests showed a lot of anomalies. Each device is different and write access
time strongly depends on the technology around and inside the flash memory. The consequence is that
each series of removable flash memory devices intended for hard real-time systems needs to undergo
extensive performance and latency testing before going into production. This helps to detect anomalies
and to ensure that the device complies with the data sheet specification even in worst-case situations. In
comparison to hard disk drives, flash memories are clearly more suitable for real-time systems because
they are more robust and provide far better worst-case access latencies. For raw flash devices which do
not include a controller [56], vendors even guarantee maximum access times for program operations in
their data sheets. Access times measured in experiments with the M29W320DB flash reliably met the
maximum access times provided in the data sheet. Sensitivity to data patterns as described in section
5.3.4 is negligible as long as the worst case data pattern meets the data sheet specification. Further
research into real-time file systems for embedded flash memories is therefore worthwhile.
List of Tables
6.1 Mount times of VFAT file system filled with 30 MB of data (in ms)
List of Figures
5.1 SanDisk 2 GB Compact Flash: Requires block size 64 KB to achieve full performance
5.2 Toshiba 256 MB Compact Flash: Writing 0xFF is on average faster than other data
5.3 ST M29W320DB: Writing is clearly faster when there was an 0xFF pattern before
5.4 Sony 16 MB Memory Stick: Writing 0xFF is clearly faster
5.5 Mitsuca 32 MB Secure Digital card: Writing 0xFF is clearly faster
5.6 SanDisk 512 MB Secure Digital Card: Writing 0xFF blocks requires 282-284ms
5.7 ExtreMemory 512 MB MMC mobile card: Worse read performance above 256 MB
A Glossary
CF Compact Flash
DV Dual Voltage
MS Memory Stick
MSPD Memory Stick Pro Duo (high speed and small dimensions)
MSP Memory Stick Pro (high speed)
NAND A type of flash where cell compounds are addressed through negated AND logic
NOR A type of flash where each cell is addressed through negated OR logic and supports XIP
OOB Out Of Band Area (16 Bytes additional space for each NAND flash page)
RS Reduced Size
Bibliography
[1] Booting Linux Really Fast, Daniel Parthey, University of Technology Chemnitz, April 2006
http://archiv.tu-chemnitz.de/pub/2006/0066
[3] JFFS: The Journalling Flash File System, David Woodhouse, Red Hat Inc.
http://sources.redhat.com/jffs2/jffs2.pdf
[6] New Memory Stick Platform Strategy for The Broadband Era, Sony Press, January 2003
http://www.sony.net/SonyInfo/News/Press_Archive/200301/03-0110aE
[7] Memory Stick Pro Specification Summary, Memory Stick Developers’ Site Office
http://www.memorystick.org/eng/simplefmt/memorystick_pro_specification_summary_non-licensee_e.pdf
[9] Comparison of Memory Cards, Memory Stick Developers’ Site Office, September 2004
http://www.memorystick.org/eng/aboutms/detail/outline_spec.html
[12] Samsung MMC data sheet for MC56U032NCFA, Samsung, June 2004
http://www.samsung.com/Products/Semiconductor/FlashCard/MMC/NormalMMC/FullSize/MC56U032NCFA/ds_mc56u032ncfa_rev09.pdf
[13] SD Specifications Part E1, Simplified SDIO Specification
SD Card Association, September 2006
http://www.sdcard.org/sdio/Simplified%20SDIO%20Card%20Specification.pdf
[15] Compact Flash 4.0 Specification, Compact Flash Association, May 2006
http://www.compactflash.org/cfspc4_0.pdf
[21] USB 2.0 Specification, USB Implementers Forum Inc., April 2006
http://www.usb.org/developers/docs
[25] Mount costs too long, Charles Manning, Linux MTD mailing list, November 2006
http://lists.infradead.org/pipermail/linux-mtd/2006-November/016778.html
[26] Common Flash Memory Interface Specification 2.0, Advanced Micro Devices, December 2001
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/cfi_r20.pdf
[27] Eraseblocks torture: OneNAND results, Artem Bityutskiy, December 2006
http://lists.infradead.org/pipermail/linux-mtd/2006-December/017017.html
[29] Understanding the Flash Translation Layer Specification, Intel Corporation, December 1998
http://www.intel.com/design/flcomp/applnots/29781602.pdf
[30] David Woodhouse: JFFS2, Richard Ibbotson, Linux-Magazine Issue 17, 2002
http://www.linux-magazine.com/issue/17/DavidWoodhouse_JFFS2.pdf
[31] UBI - Unsorted Block Images, Thomas Gleixner, Frank Haverkamp, Artem Bityutskiy, 2006
International Business Machines Corp.
http://www.linux-mtd.infradead.org/doc/ubidesign/ubidesign.pdf
[34] Extended FAT file system, MSDN Library, Microsoft Corporation, February 2007
http://msdn2.microsoft.com/en-us/library/aa914353.aspx
[35] OEM Parameter Definition with exFAT, MSDN Library, Microsoft Corporation, February 2007
http://msdn2.microsoft.com/en-us/library/aa914663.aspx
[37] GC does not handle big syslog files, David Woodhouse, Linux MTD mailing list, September 2003
http://www.infradead.org/pipermail/linux-mtd/2003-September/008545.html
[40] JFFS3 Plan Extension, Ferenc Havasi, Zoltán Sógor, Mátyás Majzik, University of Szeged
http://www.inf.u-szeged.hu/jffs2/jffs3-plan-extension-20060928.pdf
[41] Garbage Collection in LogFS (Video), Jörn Engel, IBM
http://lca2007.linux.org.au/talk/91
[43] YAFFS2 Specification and Development Notes, Aleph One Ltd., May 2005
http://www.aleph1.co.uk/node/38
[44] YAFFS patch, Comparison of JFFS2 and YAFFS2 mount times, Keisuke Yasui, Celinux developer
mailing list, April 2005
http://tree.celinuxforum.org/pipermail/celinux-dev/2005-April/000924.html
[45] Efficient Initialization and Crash Recovery for Log-based File Systems over Flash Memory,
Tei-Wei Kuo, Li-Pin Chang, Chin-Hsien Wu, 21st ACM Symposium on Applied Computing
(ACM SAC), 2006
http://www.cis.nctu.edu.tw/~lpchang/papers/SAC_wu_sac06.pdf
[47] Bonnie, Benchmark for Unix file systems, Copyright Tim Bray 1990-1996
http://www.textuality.com/bonnie
[48] Bonnie++, Improved C++ Version of Bonnie, Copyright Russell Coker 2000
http://www.coker.com.au/bonnie++
[52] Comparing High Speed USB Flash Drives, Tom’s Hardware, May 2005
http://www.tomshardware.com/2005/05/20/data_transfer_on_the_run/
[55] In die Karten geschaut, Boi Feddern, Heise, c’t 2006/23, page 142
[56] M29W320DB Flash data sheet, STMicroelectronics
http://www.st.com/stonline/products/literature/ds/7876.pdf
[58] The Journalling Flash File System, David Woodhouse, Red Hat Inc., October 2001
http://sources.redhat.com/jffs2/jffs2-slides-transformed.pdf
[59] JFFS2 as transactional FS, David Woodhouse, Linux MTD mailing list, March 2007
http://lists.infradead.org/pipermail/linux-mtd/2007-March/017654.html
[60] Real-time garbage collection for flash-memory storage systems of real-time embedded systems,
Li-Pin Chang, Tei-Wei Kuo, Shi-Wu Lo, November 2004, ACM Transactions on Embedded
Computing Systems, Volume 3, Issue 4, Pages 837-863
http://doi.acm.org/10.1145/1027794.1027801
[61] Real-time support of flash memory file system for embedded applications
Sudeep Jain & Yann-Hang Lee, April 2006, Department of Computer Science and Engineering,
Arizona State University, Tempe, AZ 85287, ISBN: 0-7695-2560-1
http://ieeexplore.ieee.org/search/wrapper.jsp?arnumber=1611716
Declaration of Authorship
I hereby declare that the whole of this diploma thesis is my own work, except where explicitly stated
otherwise in the text or in the bibliography. This work is submitted to Chemnitz University of Technology
as a requirement for being awarded a diploma in Computer Science ("Diplom-Informatik"). I declare that
it has not been submitted in whole, or in part, for any other degree.