Disk Configurations: Ideal Configuration Scenarios
While Postilion International does not count disk setup as a core competency (we should defer to EMC,
Stratus, et al. for that), it is inevitable that our clients will rely on us to provide guidance and other input. This
page is intended for sharing information about disk setups, so we can be better informed, especially as
we move further into the arena of high-volume, mission-critical systems for more prominent clients.
Ideally:
RAID
o Use RAID 10 (aka RAID 1+0) for Database data files
(Realtime/PostCard/PostOffice data) – speed, redundancy and resilience
o Use RAID 1 for Database Log files, DoubleTake buffers, and OS+Applications
(and potentially the temp_db: see below) – speed and redundancy
RAID 10 can of course be used for Log files too, if the customer
is willing to pay for the increased number of drives
o do not rule out RAID5:
RAID5 is typically slower than RAID10, especially if software RAID
is used; however some hardware RAID implementations do give
very good RAID5 performance (even better than RAID 10
potentially)
see the Internal Postilion Benchmarks, e.g. ftScalableStorage
(ftSS)
how do the rebuild/recovery times after failure compare for RAID5
vs RAID10?
RAID10 gives consistent times irrespective of the
number of drives in the volume
RAID 5 rebuilds are likely slower and more I/O-intensive, since
parity must be recomputed from every surviving drive (still to be confirmed)
RAID50 (5+0) is now available on some arrays, and has
good performance too.
See Werner's tuning documents.
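To make the trade-offs above concrete, here is some rough rule-of-thumb arithmetic comparing usable capacity and random-write cost across RAID levels. The write-penalty figures are the standard textbook values (2 physical I/Os per logical write for mirroring, 4 for RAID 5 read-modify-write); actual throughput depends heavily on the controller, as the hardware-RAID 5 caveat above notes.

```python
def raid_profile(level, drives, drive_size_gb):
    """Return (usable_gb, write_penalty) for common RAID levels.

    write_penalty = physical I/Os per logical random write:
      RAID 1/10: 2 (write both mirrors)
      RAID 5:    4 (read data + parity, write data + parity)
    """
    if level == "10":
        assert drives % 2 == 0 and drives >= 4
        return drives // 2 * drive_size_gb, 2
    if level == "5":
        assert drives >= 3
        return (drives - 1) * drive_size_gb, 4
    if level == "1":
        assert drives == 2
        return drive_size_gb, 2
    raise ValueError(level)

# Example: eight 146 GB spindles
for level in ("10", "5"):
    usable, penalty = raid_profile(level, 8, 146)
    print(f"RAID {level}: {usable} GB usable, write penalty {penalty}x")
```

This shows why RAID 5 is attractive on capacity (1022 GB usable vs 584 GB for RAID 10 on the same eight spindles) but pays double the per-write cost, which a good hardware controller can partially hide with cache.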
Allocate a single volume for a single usage, do not mix usages as different usages load
the disks differently, e.g.
o each data file (Realtime/PostCard/PostOffice) should have its own volume
o each log file (Realtime/PostCard/PostOffice) should have its own volume
o if you suspect that the Office database will have to do some heavy sorting,
consider moving the temp_db to its own RAID 1 volume
o Double-Take buffer should have its own volume, one that does not contain
files that are part of a DT replication set
don't partition a volume, i.e. a spindle should not be shared across partitions (because
you might violate the 1-volume-per-usage rule above);
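The one-volume-per-usage rule above can be encoded as a simple layout check: map each usage to its volume and flag any volume that serves more than one usage. The usage names come from the list above; the drive letters are purely illustrative, not a recommendation.

```python
from collections import Counter

# Illustrative layout: one dedicated volume per usage
layout = {
    "Realtime data":      "E:",
    "PostCard data":      "F:",
    "PostOffice data":    "G:",
    "Realtime log":       "H:",
    "PostCard log":       "I:",
    "PostOffice log":     "J:",
    "temp_db":            "K:",
    "Double-Take buffer": "L:",
}

def shared_volumes(layout):
    """Return volumes assigned to more than one usage (should be empty)."""
    counts = Counter(layout.values())
    return sorted(vol for vol, n in counts.items() if n > 1)

print(shared_volumes(layout))  # [] means no volume is double-booked
```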
SAN:
o just because SANs are fast, does not mean it's a good idea to share a volume
between servers (see ReD case study below)
o make sure that the route to the disk is not shared, e.g. sharing disk controllers
can reduce throughput and also the consistency of the throughput even if the
spindles aren't shared.
o note that the disk utilisation profile (volume and proportion of reads to writes
etc) varies considerably between Realtime, PostCard, and Office ~ hence
being able to configure different cache profiles etc for each of these DBs would
be beneficial.
does anyone have rules of thumb for this?
perhaps 75% write cache (25% read) for the Realtime volume, 75%
read cache (25% write) for the PostCard and Office volumes?
perhaps 75% write for the temp_db volume?
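The 75/25 splits above are open suggestions, not measured values. One way to ground them per site is to derive the split from each volume's observed read/write mix (e.g. PerfMon "Disk Reads/sec" and "Disk Writes/sec" averages). The sketch below rounds to 25% steps to match the granularity suggested above; the example numbers are hypothetical.

```python
def cache_split(reads_per_sec, writes_per_sec, step=25):
    """Suggest (read%, write%) cache bias rounded to the nearest `step` percent.

    Always keeps at least `step` percent of each, since no volume is
    purely reads or purely writes.
    """
    total = reads_per_sec + writes_per_sec
    read_pct = round(100 * reads_per_sec / total / step) * step
    read_pct = min(max(read_pct, step), 100 - step)
    return read_pct, 100 - read_pct

# Hypothetical counters: a write-heavy Realtime volume vs a read-heavy Office volume
print(cache_split(50, 200))   # -> (25, 75): write-biased cache
print(cache_split(300, 80))   # -> (75, 25): read-biased cache
```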
Disk types
EMC Clariion SAN
AX-range (AX100/AX150/...)
this is the low-budget entry-level SAN.
it is capable of RAID 5 and "RAID 1/0"
o RAID 5 is the default setup; it is optimised for this (EMC has its own RAID5
optimisations, which improve performance further)
o "RAID 1/0" means it can support RAID1 and RAID 1+0 (... though this may be
0+1?)
There are typically two 2Gbps Fibre Channels, with a SAN disk controller/processor per
channel:
o if the disks are configured as a single RAID volume, then only one channel will
be used (the other is a redundant backup) though the channels significantly
outperform the disk spindles so it does not bottleneck here;
o if the disks are configured as two (or more) RAID volumes, then both the
channels will activate; one for each volume in a two volume system if
everything is operating normally;
additional disks can be configured as hot-standbys, which the AX will create and bring
into service automatically should a RAIDed drive fail.
CX-range (CX300/CX500/CX700/...)
This is a high-end SAN.
Can also be equipped with SRDF, EMC's replication technology that copies whole
volumes to a remote site over a high-bandwidth connection (think hardware DoubleTake
on steroids)
o Abbey (major UK bank, but not a Postilion user) uses this to replicate whole
machines (i.e. the Primary & DR machines run as diskless servers booting
from the SAN) – it works, but makes Stratus nervous.
Stratus ftServer
internal disks
These are set up in RAID1 pairs.
o NB: the mirroring is done in software, i.e. takes CPU cycles from the server.
o the software is called RDR (Rapid Disk Resync) and is also used to remirror
disks efficiently after an outage (planned or unplanned)
at installation, the system disk (C:) gets a 16GB partition by default
o this is what Stratus has determined is sufficient for most clients;
o the rest of the volume is unformatted, but the intention is that it is reserved
for ActiveUpgrade
o if our clients are not planning on using ActiveUpgrade, then the space can be
used (hopefully the systems are defended-in-depth to reduce the number and
frequency of OS hotfixes, and hopefully there is a DR server for use during
planned outages if needs be)
ActiveUpgrade
o is a process by which both the system (CPU, memory, etc) and the system
disk's RAID1 mirror can be simplexed (brought out of fault-tolerant/redundant
mode) for upgrades;
o this leaves one half of the system running the production functionality as
normal;
o the other half runs just the OS on the system disk, but is totally isolated from
any data disks;
o software patches can be applied and tested
if approved then the system will commit the changes and bring the
new software into production and re-duplexes itself into fault-
tolerant mode;
if aborted then the changes are abandoned and the system re-
duplexes itself;
o currently this works for Microsoft Hotfixes, will in the future work for Stratus
Upgrades too
o it is of little use for Postilion upgrades since most need access to data that is
on the isolated disks, and you cannot store data on the system disk because
any changes will be lost when the system re-duplexes.
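The ActiveUpgrade flow described above can be summarised as a tiny state machine: split out of duplex mode, apply and test patches on the isolated half, then commit or abort, either way returning to fault-tolerant duplex mode. The state and event names below paraphrase the text; this is a conceptual sketch, not Stratus's actual API.

```python
# Conceptual model of the ActiveUpgrade lifecycle
TRANSITIONS = {
    ("duplex", "split"):   "simplex",   # bring one half out of fault-tolerant mode
    ("simplex", "apply"):  "testing",   # patch the isolated half (data disks isolated)
    ("testing", "commit"): "duplex",    # promote the patched half, re-duplex
    ("testing", "abort"):  "duplex",    # abandon the changes, re-duplex
}

def run(events, state="duplex"):
    """Drive the state machine through a sequence of events."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state

print(run(["split", "apply", "commit"]))  # -> duplex
print(run(["split", "apply", "abort"]))   # -> duplex
```

Note that both terminal paths end back in duplex mode, which matches the point above: any data written to the system disk during testing is lost on re-duplex, which is why this is of little use for Postilion upgrades.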
ftScalableStorage (ftSS)
Stratus ftScalableStorage aka ftSS aka SftSS aka ftStorage aka ...
Stratus's own high performance external disk enclosures
o currently (April 2007) available in non-SAN version, soon to have a SAN
version too
o see the Internal Postilion Benchmarks performed on the non-SAN version
from email with Stratus
o you should be able to use PerfMon for each volume in an array. You will not
see physical disks as these are hidden by the raid controller and the volume
appears as a single spindle
o if you enable Window's write caching (Computer Management --> Device
Manager --> DiskDrives --> (Disk) --> Policies) then this is responded to but
ignored by the drive... it does its own caching.
o Storage cache cannot be tuned for read or write bias: writes get priority unless
the drive is not experiencing amny writes, at which point it shifts its cache to a
read bias
only the read-ahead cache can be tuned, this on a volume by
volume basis
RAID
RAID is an acronym for Redundant Array of Inexpensive Disks. Depending on the configuration it can
provide a performance boost, fault tolerance, or both, and to varying degrees.