Storage and Hyper-V Part 4: Formatting and File Systems
www.altaro.com/hyper-v/storage-and-hyper-v-part-4-file-systems/


16 Oct 2013 by Eric Siron


Very little is said about file systems and formatting for Hyper-V Server deployments, but there are often a number of questions. This offering in our storage series will examine these aspects of storage preparation in detail.

Part 1 – Hyper-V storage fundamentals
Part 2 – Drive Combinations
Part 3 – Connectivity
Part 4 – Formatting and file systems
Part 5 – Practical Storage Designs
Part 6 – How To Connect Storage
Part 7 – Actual Storage Performance

Sector Alignment
This section is mostly to lay to rest some advice that was very good for the Windows Server
2003 days and earlier. It’s still floating around as current, though, which is wasting a lot of
people’s time. When a disk is formatted, a portion of the space is reserved for partitioning and formatting information. After that, the actual data storage begins. The issue can best be
seen in an image:

[Image: Sector Alignment]

For single-disk systems, it doesn’t really matter much how this is all laid out on the disk; reads and writes occur the same way no matter what. That changes for striped RAID systems. In Microsoft operating systems, any given file always occupies a whole number of clusters. So, if you have a 4 KB cluster size, even a 1 KB file will consume an entire 4 KB cluster, and a 373 KB file will use exactly 94 clusters. Two files will never share a cluster. Now, imagine you are writing a file to the pictured system. The contents of the file are placed into data cluster 2. As you’ll recall from part 2 of this series, this will trigger a read-modify-write on stripes 3 and 4. If the clusters and sectors were aligned, an entire stripe would be written instead and the RMW wouldn’t be necessary. That means only one write operation instead of two reads and two writes. In aggregate, that’s a substantial difference.
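
To make that concrete, here is a small sketch with assumed numbers (they are not taken from the image): a legacy 63-sector (31.5 KB) partition offset versus the modern 1 MB default, with 4 KB clusters and 64 KB stripes. The helper simply reports which stripes a given cluster's bytes land on.

def stripes_touched(data_start_bytes, cluster_index, cluster_bytes=4096, stripe_bytes=65536):
    """Return the set of stripe numbers that a given cluster's bytes fall into."""
    first = data_start_bytes + cluster_index * cluster_bytes
    last = first + cluster_bytes - 1
    return {first // stripe_bytes, last // stripe_bytes}

# Legacy 63-sector offset (63 x 512 = 32,256 bytes): this cluster straddles two
# stripes, so writing it forces a read-modify-write on both.
print(stripes_touched(32_256, 8))     # {0, 1}

# Modern 1 MB offset: the same cluster sits entirely inside one stripe.
print(stripes_touched(1_048_576, 8))  # {16}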

So, the advice was to modify the offset location. In the above image, that just changes
where the first data cluster begins so that it lines up with the start of a stripe. If you’re
interested, there is a Microsoft knowledgebase article that describes how to set that offset.
The commands still work. However, starting with Windows Server 2008, the default offset
should align with practically all RAID systems. If you want to verify for your system, you’ll
first need to know what the stripe size of your array is. Then, use the calculation examples
at the bottom of the linked article to determine your alignment.
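
The check itself is just a remainder calculation: the partition's starting offset must be an exact multiple of the stripe size. Here is a minimal sketch, assuming you have already pulled the starting offset in bytes (on Windows, "wmic partition get Name, StartingOffset" reports it).

def is_aligned(starting_offset_bytes, stripe_size_bytes):
    """A partition is aligned when its starting offset divides evenly by the stripe size."""
    return starting_offset_bytes % stripe_size_bytes == 0

# Windows Server 2008 and later default to a 1 MB starting offset,
# which divides evenly by the common stripe sizes.
default_offset = 1_048_576
for stripe_kb in (64, 128, 256):
    print(stripe_kb, "KB stripe aligned:", is_aligned(default_offset, stripe_kb * 1024))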

The illustrations and wording so far assume a matched cluster and stripe size. In most production systems, trying to match them perfectly is not feasible. Your goal should only be to ensure that the alignment is correct. Stripes are generally larger than clusters, and, because a cluster can contain data for only a single file, large clusters result in a lot of slack space. Slack space is the portion of a cluster that contains no data because the file ends before filling it. For instance, in our earlier example of a 373 KB file on a 4 KB cluster size, there will be 3 KB of slack space. If you managed to format your system with 256 KB clusters (the actual maximum for NTFS is 64 KB) to match a 256 KB stripe size, you’d have 139 KB of slack space for the same file. As you can imagine, this can quickly result in a lot of wasted space. That said, the files you’re most concerned with for your Hyper-V storage are large VHD files, so using a larger cluster size might be beneficial. If you choose to investigate this route, keep in mind that performance gains are likely to be minimal to the point of being undetectable.
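
If you want to see how quickly slack space grows with cluster size, a back-of-the-envelope calculation with the 373 KB example file is enough. The 256 KB row is hypothetical, since NTFS tops out at 64 KB.

import math

def allocation(file_kb, cluster_kb):
    """Whole clusters consumed by a file, and the unused tail of the last cluster."""
    clusters = math.ceil(file_kb / cluster_kb)
    slack_kb = clusters * cluster_kb - file_kb
    return clusters, slack_kb

for cluster_kb in (4, 64, 256):
    clusters, slack = allocation(373, cluster_kb)
    print(f"{cluster_kb} KB clusters: {clusters} clusters, {slack} KB slack")
# 4 KB clusters: 94 clusters, 3 KB slack
# 64 KB clusters: 6 clusters, 11 KB slack
# 256 KB clusters: 2 clusters, 139 KB slack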

Partitioning and Disk Layout


You have a finite number of disks and a number of concerns they need to address. In a
standalone system, you may be using all internal disks. A common and natural question is how best to use those disks. You need to think about where you’re going to put Hyper-V and where you’re going to put your virtual machines.

Internal and Local Storage


First, Hyper-V needs nearly no disk performance for itself. The speed of the disk(s) you
install Hyper-V on will affect nothing except startup times. Once started, Hyper-V barely
glances at its own storage. It also does not perform meaningful amounts of paging, so do
not spend time optimizing the page file. This is all a very important consideration, as more
than a few people are insisting that Hyper-V should be installed on SSDs. This is an absurd
waste of resources. Nirmal wrote an article detailing how to run Hyper-V from a USB flash
disk. This is more than sufficient horsepower.

The benefit to a USB stick deployment is that all of your drive bays are free for virtual
machines. The drawback is that USB sticks are somewhat fragile. Of course, they’re also
easily duplicated, so this is something of a manageable risk. Despite this, it may not be an
acceptable risk. In that case, you’ll need to plan where you’re going to put Hyper-V.

One option is to create a single large partition and use it for Hyper-V and virtual machines.
If you’ve already installed Hyper-V, then you know that it defaults to this configuration. It’s
generally not recommended, but there’s no direct harm in doing so. A preferable
deployment is to separate Hyper-V and its management operating system from the virtual
machines. With internal disks, there are a couple of ways to accomplish this.

One is to create a single large RAID system out of the physical disks using dedicated
hardware. Most software solutions can’t use all disks, as they need an operating system to
be present before they can place disks into an array of any kind except a mirror. On the
hardware RAID, create two logical volumes (this is Dell terminology; your hardware vendor may use something else): one of 32 GB or greater (I prefer 40) for Hyper-V, and a large logical
volume on the rest of the space for virtual machines. You can also divide the space into
smaller volumes and allocate virtual machines as you see fit. You may also use fewer
logical volumes and carve up space with Windows partitioning, if you prefer.

Another option is to dedicate one or two disks for Hyper-V and the rest for the virtual
machines. This establishes a clear distinction between the hypervisor/management
operating system and virtual machines. If you choose this route, there’s no practical benefit
in using more than two disks in a RAID-1 configuration for the hypervisor. One benefit of this approach is that the remaining disks can be used in a Storage Space in the absence of a dedicated array controller. Remember that this places a computational burden
on the CPUs that will also be tasked with running your virtual machines.

Remote Storage
The basics of remote storage aren’t substantially different from the above. Drives are
placed into arrays and then either presented as a whole or carved up into separate units.
Your exact approach will depend on your equipment and your needs.

Don’t forget that you’ll need to consider where to place the hypervisor. You can use any of
the internal storage options presented above, or you can use a boot-from-SAN approach.

Virtual Machines on Block-Level Storage


I’m not going to distinguish between FC and iSCSI here. Either connectivity method results
in the same overall outcome: your Hyper-V host(s) treat block-level storage as though it
were local. What matters here is how that storage is configured.

If your storage system supports it, you can choose to create an array out of a few disks and
present that entire array as a storage location to one or more hosts. You can then place
other disks into other arrays, and so on. The benefit of this approach is that it’s generally
simpler to manage and it’s much easier to conceptualize. You’ll be able to put a sticker on
certain physical disks and indicate which host(s) they exclusively belong to. It also helps to
reduce the amount of risk that any given host is exposed to. Let’s say that you have fifteen
disks in your storage system and you create five three-disk RAID-5 arrays. Each of these
five arrays can lose a disk without losing any data. A rebuild of one array to replace a failed
drive does not place the disks in other arrays at any risk at all. The drawback is that this is a
very space-inefficient approach. In the five-by-three example, the space of five disks is
completely unusable. Also, any given array can only operate at the speed of three drives.
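
If you want to see the arithmetic behind that inefficiency, here is a quick sketch; the per-disk capacity is just an assumed example.

# Fifteen disks carved into five independent three-disk RAID-5 arrays.
disk_tb = 2                      # hypothetical per-disk capacity
arrays, disks_per_array = 5, 3

raw_tb = arrays * disks_per_array * disk_tb
usable_tb = arrays * (disks_per_array - 1) * disk_tb   # RAID-5 gives up one disk per array to parity
print(f"raw: {raw_tb} TB, usable: {usable_tb} TB, lost to parity: {raw_tb - usable_tb} TB")
# raw: 30 TB, usable: 20 TB, lost to parity: 10 TB -- the space of five disks is unusable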

Another approach is to create a single large array across all available disks in a storage
location. Then, you can create a number of smaller storage locations that use all available
drives. You wind up with a level of data segregation, but the data is at greater risk since the
likelihood of multiple drive failures increases as you add drives. However, performance is
improved as all drives are aggregated for all connected systems.

Of course, you also have the option to create a single large storage location and connect all
hosts to it. For block-level storage, all Hyper-V hosts must be in the same cluster for this to
work. Otherwise, there is no arbitration of ownership or access and, assuming you can get
it to work at all, you’re going to run into trouble. This is not a recommended approach. I’ve
used Storage Live Migration (and the regular storage move offered in System Center
Virtual Machine Manager 2008 R2) to solve a number of minor problems and was very glad
to have distinct storage locations. One thing that I did was disconnect storage from a 2008
R2 cluster and connect it to a new 2012 cluster and import the virtual machines in place.
Having the storage divided into smaller units allowed me to do this in stages rather than all
at once.

In block-level storage, these storage locations are typically referred to as LUNs. The
acronym stands for “Logical Unit Number”, although it’s not quite as meaningful today as it
was when it was first coined since we rarely refer to them with an actual number anymore.
The “logical” portion still stands. The storage system is responsible for determining how a
LUN is physically contained on its storage, but the remote system that connects to it just
sees it as a single drive.

Virtual Machines on File-Level Storage


As discussed in part three, SMB 3.0 and later allows you to place virtual machines directly
on file shares. Connecting hosts have no concern with the allocation, layout, format, or
much of anything else about the actual storage. All they need is the contents of the files they’re interested in and the permissions to access them. This matters, because SMB is an open protocol that anyone can implement. Reports are that it will be
incorporated into the Samba project, which implements SMB on Linux operating systems.
Storage vendors such as EMC are already producing devices with SMB 3.0 support and
more are on the way. The storage location is prepared on the storage side with a file
system and then a share is established on it. Access permissions control who can connect and what they can do.

If you’re configuring the SMB 3.0 share on a Windows system, remember your NTFS/share
training. Both NTFS and share permissions have their own access control lists and each
assigns the least restrictive settings applicable to a user, with Deny overriding. However,
when combining NTFS and share permissions, the most restrictive settings are applied. As
an example, consider an account named Sue. Sue is a member of a “Storage Admins”
group and a “File Readers” group. The “Storage Admins” group is a member of the “Administrators” group on SV-FILE1, which contains a folder called “CoFiles” that is shared as “CompanyFiles”. The NTFS permissions on “CoFiles” give the built-in group “Authenticated Users” Read permissions and “Administrators” Full Control. The
“CompanyFiles” share gives “File Readers” Read permissions. When Sue accesses the
files while sitting at the console of SV-FILE1 and navigating through local storage, she has
full rights. This is because she is in both “Authenticated Users” and “Administrators”; the Full Control permission granted by her membership in “Administrators” is less restrictive than her Read permission as a member of “Authenticated Users”, and no Deny is set, so
she ends up with Full Control at the NTFS level. However, when she accesses the same
location through the “CompanyFiles” share, she only has Read permissions. She is not a
member of any group that has more than Read permissions on that share’s access control
list, so “Read” is her least restrictive permission. That she has Full Control at the NTFS
level does not matter, because the share permissions won’t even let her try. Remember
that when you’re creating shares for Hyper-V to use, you grant access permissions to
computer accounts, not user accounts. Microsoft details how to set up such a share on
TechNet.
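
If it helps to see that evaluation spelled out, here is a rough sketch of the logic using the Sue example. Deny entries are left out for brevity, and this only illustrates the least-restrictive-then-most-restrictive rule, not the actual Windows security machinery.

LEVELS = {"None": 0, "Read": 1, "Modify": 2, "Full Control": 3}

def effective(acl, memberships):
    """Least restrictive level granted to any group the user belongs to (no Deny modeled)."""
    best = "None"
    for group, level in acl.items():
        if group in memberships and LEVELS[level] > LEVELS[best]:
            best = level
    return best

ntfs_acl = {"Authenticated Users": "Read", "Administrators": "Full Control"}
share_acl = {"File Readers": "Read"}
sue_groups = {"Authenticated Users", "Administrators", "File Readers"}

ntfs_level = effective(ntfs_acl, sue_groups)    # Full Control -- what Sue gets at the console
share_level = effective(share_acl, sue_groups)  # Read -- what the share ACL grants her
combined = min(ntfs_level, share_level, key=LEVELS.get)  # most restrictive of the two wins
print("Local access:", ntfs_level)
print("Through the CompanyFiles share:", combined)       # Read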

A benefit of the SMB share approach is that a lot of storage processing is offloaded to the
storage device. Your Hyper-V hosts really only need to process basic reads and writes.
Processing for low-level I/O and allocation table maintenance falls squarely on the storage
device. With block-level access, much of that processing occurs on the Hyper-V host even
with the various offload technologies that are available. As SMB continues to mature, it will add features that bring it closer to parity with the block-level techniques. Expect it to gain many new offloading features as time goes on.

Another advantage of SMB 3.0 is administrator familiarity. I happen to work in an institution where we have dedicated storage administrators to configure host access, but a great many businesses do not. Besides, it’s not fun going into a queue in another department when you need to get more storage right now. Configuring FC and iSCSI access methods isn’t hard on its own, but they’re unfamiliar. Also, Windows administrators generally
have a good idea of the mechanisms of share and NTFS permissions. Not as many actually
understand things like RADIUS and CHAP administration or how to troubleshoot or watch
for intrusions.

Disk Formats
We’ve already talked a bit about NTFS. For now, this is the default and preferred format for
your Hyper-V storage. It’s the only option for boot volumes. This format has withstood the
test of time and is well-understood. It has robust security features and you can enhance it
with BitLocker drive encryption.

With 2012 R2, you’ll be able to use the new ReFS system for Hyper-V virtual machine
storage. ReFS adds protection against sudden drops of the storage, usually as a result of power outages. It also allows for larger volumes than NTFS, which will eventually become a necessity. It is also better at finding and repairing data corruption. The integrity
stream feature doesn’t work with VHDs yet, though.

In my opinion, ReFS is good — still a work in progress, but viable. I wouldn’t get in a hurry
to replace NTFS to implement it, but I wouldn’t be afraid to use it either. Without the
integrity streams, one of the data protection powers of ReFS is unavailable. That may be
rectified in a later version. There are a number of NTFS features that ReFS doesn’t support,
but none that should present any real problem for Hyper-V storage. In time, those may be
brought into ReFS as well. 2012 R2 only represents the second version of this technology,
so it’s likely to have quite a future.

What’s Next
This article concludes the theoretical portion of the series. The next piece will slide into
practice. You’ll see some possible ways to design and deploy a solid storage solution in a
high-level overview.


Eric Siron
I have worked in the information technology field since 1998. I have
designed, deployed, and maintained server, desktop, network, and
storage systems. I provided all levels of support for businesses ranging
from single-user through enterprises with thousands of seats. Along
the way, I have achieved a number of Microsoft certifications and was
a Microsoft Certified Trainer for four years. In 2010, I deployed a
Hyper-V Server 2008 R2 system and began writing about my
experiences. Since then, I have been writing regular blogs and
contributing what I can to the Hyper-V community through forum participation and free
scripts.
