
Technical white paper

Adaptive optimization for HP 3PAR Storage

Table of contents
Abstract
Storage tiers: Opportunity and challenge
HP 3PAR Adaptive Optimization software
Brief overview of volume mapping
Adaptive optimization implementation
Design tradeoff: Tiering vs. caching
Configuration
Tiering analysis algorithm
Design tradeoff: Granularity of data movement
Results
Customer case study
Summary
Abstract
The availability of a wide range of storage media such as SSDs, high-performance HDDs, and high-capacity HDDs
presents the opportunity for optimizing cost and performance of storage arrays. But doing so effectively and without
increasing the administrative burden presents a new challenge since the tradeoffs for storage arrays are different from
well-understood ones for CPU memory hierarchies. This white paper explains some of the tradeoffs, describes an
implementation of adaptive optimization on HP 3PAR Storage, and illustrates its effectiveness with performance results.

Storage tiers: Opportunity and challenge


Modern storage arrays support multiple tiers of storage media with a wide range of performance, cost, and capacity
characteristics—ranging from inexpensive (~$100 USD) 2 TB SATA HDDs that can only sustain about 75 IOPS to
expensive (~$500 USD) 50 GB SLC flash-memory-based SSDs that can sustain over 25,000 IOPS. Volume RAID and
layout choices enable additional performance, cost, and capacity options. This wide range of cost, capacity, and
performance characteristics is both an opportunity and a challenge.
The opportunity is that the performance and cost of the system can be optimized by appropriate placement of data on
different tiers: move the most active data to the fastest (and most expensive) tier and move the idle data to the slowest
(and cheapest) tier. The challenge, of course, is to do this in a way that minimizes the burden on storage
administrators while also providing them with appropriate controls. Currently, data placement on different tiers is a
task usually performed by storage administrators, often based not on application demands but on the price paid by the
users. Without careful analysis, they tend to allocate storage where space is available rather than based on
performance requirements. Consequently, HDDs with the largest capacity also tend to have the highest number of
accesses. But the largest HDDs are often also the slowest HDDs resulting in severe performance bottlenecks.
There is an obvious analogy with CPU memory hierarchies. Although the basic idea is the same (use the smallest,
fastest, most expensive resource for the busiest data), the implementation tradeoffs are different for storage arrays.
We note that while deep CPU memory hierarchies (first, second, and third level caches; main memory; and finally
paging store) are ubiquitous and have mature design and implementation techniques, storage arrays typically have
only a single cache level (the “cache” on disk drives usually acts more like a buffer than a cache). Automatic tiering
in storage arrays is a recent development that is not yet commonplace, and there is still much to be learned about it.
In this white paper, we discuss the tradeoffs for storage arrays and describe technology that adaptively optimizes
storage on HP 3PAR arrays.

HP 3PAR Adaptive Optimization software
Brief overview of volume mapping
In order to understand HP 3PAR Adaptive Optimization, it is necessary to understand volume mapping on HP 3PAR
Storage as illustrated in figure 1.

Figure 1. HP 3PAR Adaptive Optimization

HP 3PAR virtual volumes (VVs) are organized into volume families (or trees) consisting of a base volume at the
root and optional copy-on-write (COW) snapshot volumes of the base VV or of other snapshot VVs in the tree.
Each volume family has three distinct data storage spaces: user space for the base volume, snap space for the
copy-on-write data, and admin space for the mapping metadata for the snapshots. If the base volume is fully
provisioned, there is a direct, one-to-one mapping from the VV virtual address to the user space. If the base volume
is thin provisioned, only written space in the base volume is mapped to user space and the mapping metadata is
stored in the admin space, similar to COW snapshots. The unit of mapping for the snapshot COW or thin
provisioned VVs is a 16 KB page. Caching is done at the VV space level and at a granularity of 16 KB pages.
Physical storage in HP 3PAR Storage is allocated to the volume family spaces in units of logical disk (LD) regions.
The region size for the user and snap spaces is 128 MB and the region size for the admin space is 32 MB.
Logical disk storage is striped across multiple RAID sets built from 256 MB allocation units of physical disks (PDs)
known as chunklets. Every RAID set within one LD has the same RAID type (1, 5, or 6), set size, and disk type (SSD,
FC, or SATA Nearline (NL)). These parameters determine the LD characteristics in terms of performance, cost,
redundancy, and failure modes.
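
To make these granularities concrete, the following small sketch (illustrative Python, not HP 3PAR code) computes which 128 MB user-space region and which 16 KB page a given VV byte offset falls into.

# Illustrative sketch of the mapping granularities described above; not HP 3PAR code.
PAGE_SIZE = 16 * 1024                # 16 KB mapping unit for thin-provisioned/snapshot VVs
REGION_SIZE = 128 * 1024 * 1024      # 128 MB LD region for user and snap spaces
CHUNKLET_SIZE = 256 * 1024 * 1024    # 256 MB physical-disk allocation unit (for context)

def locate(vv_offset_bytes):
    """Return (region index, page index within that region) for a VV byte offset."""
    region = vv_offset_bytes // REGION_SIZE
    page = (vv_offset_bytes % REGION_SIZE) // PAGE_SIZE
    return region, page

# Example: a write at VV offset 1 GiB lands in region 8, page 0.
print(locate(1 * 1024**3))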
HP 3PAR Storage is a cluster of controller nodes and the chunklets for one LD are allocated only from PDs with
primary access path directly connected to the same node, known as the LD owner node. System level data striping is
achieved by striping the volume family space across regions from LDs owned by different nodes. This also achieves
overall workload partitioning as the cache and snapshot management for each particular VV block address is
performed by the LD owner node for the region corresponding to that VV block address within the family base VV user
space mapping. This ownership partitioning is one reason why thin-provisioned volumes still contain a user space
mapping where each region maps to a dummy zero LD with no physical storage.

A common provisioning group (CPG) is a collection of LDs and contains the parameters for additional LD space
creation including RAID type, set size, disk type for chunklet selection, plus total space warning and limit points.
Multiple VV family spaces may be associated with a CPG from which they get LD space on demand. The CPG is
therefore a convenient way to specify a tier for adaptive optimization since it includes all the necessary parameters
and it permits adaptive optimization to operate after the cache (there is no reason to bring busy data that is in the
controller cache into high-performance storage below the cache). An additional benefit of tiering at this level is that
all three volume spaces, not just user space, are candidates for adaptive optimization. In fact, measurements show
that admin space metadata regions are frequently chosen to be placed in the fastest tier.
Figure 1 illustrates the volume mapping for both non-tiered as well as tiered (adaptively optimized) volumes. For
non-tiered VVs, each space (user, snap, or admin) is mapped to LD regions within a single CPG and is therefore in
a single tier. For tiered (adaptively optimized) VVs, each space can be mapped to regions from different CPGs and
the rest of this white paper describes how this tiering is implemented and the resulting benefits.
Finally, it is important to keep in mind that although this mapping from VVs to VV spaces to LDs to chunklets is
complex, the user is not exposed to this complexity; the system software automatically creates the mappings.

Adaptive optimization implementation


In order to implement tiering, HP 3PAR Adaptive Optimization needs to do several things: (1) collect historical data of
accesses for all the regions in an array (this can be a lot of data), (2) analyze the data to determine the volume
regions that should be moved between tiers, (3) instruct the array to move the regions from one CPG (tier) to another,
and (4) provide the user with reports showing the impact of adaptive optimization.
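
A minimal sketch of this four-step cycle is shown below. The function names and signatures are illustrative assumptions for the sketch, not the actual System Reporter or HP 3PAR interfaces.

# Minimal sketch of the four-step adaptive optimization cycle described above.
# Names and signatures are assumptions; they are not real System Reporter APIs.
from typing import Callable, Dict, List, Tuple

RegionStats = Dict[str, float]        # region id -> access rate over the window
Moves = List[Tuple[str, str]]         # (region id, destination CPG/tier)

def run_cycle(collect: Callable[[], RegionStats],
              analyze: Callable[[RegionStats], Moves],
              move: Callable[[str, str], None],
              report: Callable[[RegionStats, Moves], None]) -> None:
    stats = collect()                 # 1. collect region-level access history
    moves = analyze(stats)            # 2. decide which regions should change tiers
    for region, dst_cpg in moves:     # 3. instruct the array to move each region
        move(region, dst_cpg)
    report(stats, moves)              # 4. report the impact of adaptive optimization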
HP 3PAR has an application called System Reporter that periodically collects detailed performance and space data
from HP 3PAR arrays, stores the data in a database, analyzes the data, and generates reports. HP implemented
adaptive optimization by enhancing System Reporter to collect region-level performance data, perform tiering
analysis, and issue region movement commands to the array as shown in figure 2.

Figure 2. Adaptive optimization implementation using System Reporter

Design tradeoff: Tiering vs. caching
An obvious choice for an algorithm to manage the different tiers of storage would be to use traditional caching, where
data is copied from slower tiers into the fastest tier whenever it is accessed, replacing older data using a simple,
real-time algorithm such as least recently used (LRU). These caching algorithms have been extensively studied in the
context of CPU-memory hierarchies. However, disk storage tiers in an array are different from a typical memory
hierarchy in several respects.
In memory hierarchies, it is almost always the case that the faster tiers are much smaller than the slower tiers; regions
that are cached in the faster tier also occupy space on the slower tier, but that duplicated space is a small fraction of
the slower tier’s total size. In contrast, on arrays it is often the case that the total space for mid-tier FC drives is a
significant fraction of the space on the slow-tier NL drives, and it would not be desirable to “lose” the duplicated space.
Memory hierarchies require very fast response times so it is not feasible to use complex analysis to figure out what
should be cached or replaced. Simple algorithms such as LRU are all that designers can afford. For storage tiers, it is
possible to devote time to more sophisticated analysis of access patterns to come up with more effective strategies
than simple LRU algorithms.
Memory hierarchies typically use different hardware resources (memory buses) for different tiers and transferring data
between tiers may not significantly impact the available bandwidth to the fastest tier. Disk tiers may often share the
same resources (FC ports) and the bandwidth used while transferring data between tiers impacts the total backend
bandwidth available to the controllers.
For these reasons, HP chose to move regions between tiers instead of caching.

Configuration
Administration simplicity is an important design goal, so it is tempting to make adaptive optimization completely
automatic, requiring the administrator to do no configuration at all. However, analysis indicated that some controls
were in fact desirable for administration simplicity. Since HP 3PAR Storage is typically used for multiple applications,
often for multiple customers, HP allowed administrators to create multiple adaptive optimization configurations so that
they can use different configurations for different applications or customers. Figure 3 shows the configuration settings
for an adaptive optimization configuration.

Figure 3. Configuration settings

You can select CPGs for each of the tiers, and also set a tier size if you want to limit the amount of space that the
algorithm will use in each tier. You can set a very large number if you do not want to limit the size available for any
given tier. Note that adaptive optimization will attempt to honor this size limit in addition to any warning or hard limit
specified in the CPG.
You should define tier 0 to be higher performance than tier 1, which in turn should be higher performance than tier 2.
For example, you may choose RAID 1 with SSDs for tier 0, RAID 5 with FC drives for tier 1 and RAID 6 with NL or
SATA drives for tier 2.
You can specify the schedule on which a configuration will execute, along with the measurement duration preceding the
execution time. This allows the administrator to schedule data movement at times when the additional overhead of
that data movement is acceptable (non-peak hours, for example).
You can also set a mode configuration parameter to one of three values: performance mode biases the tiering
algorithm (described in the next section) to move more data into faster tiers, cost mode biases the tiering algorithm to
move more data into the slower tiers, and balanced mode is a balance between performance and cost. The mode
configuration parameter does not change the basic flow of the tiering analysis algorithm, but rather it changes certain
tuning parameters that the algorithm uses.
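
As an illustration, the settings above could be captured in a structure like the following sketch. The field names and values are assumptions for the example, not the product's actual configuration syntax.

# Hypothetical representation of an adaptive optimization configuration (figure 3).
# Field names and values are illustrative only.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class AOConfig:
    tier_cpgs: Tuple[str, str, str]            # tier 0 (fastest) .. tier 2 (slowest)
    tier_size_limit_gib: Tuple[int, int, int]  # use a very large number for "no limit"
    schedule: str                              # when analysis and data movement run
    measurement_hours: int                     # measurement window preceding each run
    mode: str                                  # "performance", "balanced", or "cost"

example_cfg = AOConfig(
    tier_cpgs=("SSD_r1", "FC_r5", "NL_r6"),    # e.g. RAID 1 SSD, RAID 5 FC, RAID 6 NL
    tier_size_limit_gib=(500, 10_000, 1_000_000),
    schedule="daily 02:00",                    # run during off-peak hours
    measurement_hours=24,
    mode="balanced",
)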

Tiering analysis algorithm


The tiering analysis algorithm that selects regions to move from one tier to another considers several things described
in the following sections.

Space available in the tiers


If the space in a tier exceeds the tier size (or the CPG warning limit), the algorithm will first try to move regions out of
that tier into any other tier that has space available, to bring the tier’s size below the limit. If no other tier has
space, the algorithm logs a warning and does nothing (note that if the warning limit for any CPG was exceeded,
the array would have generated an alert). If space is available in a faster tier, it chooses the busiest regions to move
to that tier; and similarly if space is available in a slower tier, it chooses the idlest regions to move to that tier. The
average tier service times and average tier access rates are ignored when data is being moved because the size
limits of a tier have been exceeded.
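
The following sketch illustrates this space-overflow rule. The tier bookkeeping (dictionaries with 'used_gib', 'limit_gib', 'free_gib', and a 'regions' list) is an assumption made for the example.

# Sketch of the space-overflow handling described above; data layout is assumed.
REGION_GIB = 0.125   # one 128 MB user/snap region expressed in GiB

def relieve_overflow(tier, faster, slower):
    """Pick regions to move out of an over-limit tier. Service times and average
    access rate densities are deliberately ignored here, as described above."""
    excess = int((tier["used_gib"] - tier["limit_gib"]) / REGION_GIB)
    if excess <= 0:
        return []
    # 'regions' is a list of (region_id, access_rate_density), sorted idlest first.
    ordered = sorted(tier["regions"], key=lambda r: r[1])
    moves = []
    for _ in range(excess):
        if not ordered:
            break
        if faster and faster["free_gib"] >= REGION_GIB:
            faster["free_gib"] -= REGION_GIB
            moves.append((ordered.pop()[0], "up"))     # busiest region moves up
        elif slower and slower["free_gib"] >= REGION_GIB:
            slower["free_gib"] -= REGION_GIB
            moves.append((ordered.pop(0)[0], "down"))  # idlest region moves down
        else:
            break   # no tier has free space: the algorithm only logs a warning
    return moves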

Average tier service times


Normally, HP 3PAR Adaptive Optimization tries to move busier regions in a slow tier into higher performance tiers.
However, if a higher performance tier gets overloaded (too busy), performance for regions in that tier may actually
be lower than regions in a “slower” tier. In order to prevent this, the algorithm does not move any regions from a
slower to a faster tier if the faster tier’s average service time is not lower than the slower tier’s average service time by
a certain factor (a parameter called svctFactor). There is an important exception to this rule because service times are
only significant if there is sufficient IOPS load on the tier. If the IOPS load on the destination tier is below another
value (a parameter called minDstIops) then we do not compare the destination tier’s average service time with the
source tier’s average service time and instead use an absolute threshold (a parameter called maxSvctms).
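
A sketch of this safeguard follows, using the parameter names from the text (svctFactor, minDstIops, maxSvctms); the numeric defaults are placeholders, not the product's actual tuning values.

# Sketch of the service-time safeguard described above; default values are placeholders.
def may_move_up(src_svct_ms, dst_svct_ms, dst_iops,
                svctFactor=1.5, minDstIops=100, maxSvctms=20.0):
    """Return True if regions may be promoted from the slower (source) tier
    into the faster (destination) tier."""
    if dst_iops < minDstIops:
        # Too little load on the destination tier for its measured service time
        # to be meaningful, so fall back to an absolute threshold.
        return dst_svct_ms <= maxSvctms
    # Otherwise the destination must be faster than the source by svctFactor.
    return dst_svct_ms * svctFactor <= src_svct_ms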

Average tier access rate densities


When not limited, as described above, by lack of space in tiers or by high average tier service times, adaptive
optimization computes the average tier access rate densities (a measure of how busy the regions in a tier are on
average, calculated with units of IOPS per gigabyte per minute) and compares them with the access rate densities of
individual regions in each tier to decide whether to move each region to a faster or slower tier.
We first consider the algorithm for selecting regions to move from a slower to a faster tier. For a region to be
considered busy enough to move from a slower to a faster tier, its access rate density, accr(region), must satisfy
these two conditions:
First, the region must be sufficiently busy compared to other regions in the source tier.
accr(region) > srcAvgFactorUp(Mode) * accr(srcTier)
where accr(srcTier) is the average access rate density of the source (slower) tier and srcAvgFactorUp(Mode) is a
tuning parameter that depends on the mode configuration parameter. Note that by selecting different values of
srcAvgFactorUp for the performance, balanced, and cost modes, HP 3PAR Adaptive Optimization can control how
aggressive the algorithm is in moving regions up to faster tiers.

Second, the region must meet one of two conditions: it must be sufficiently busy compared with other regions in the
destination tier or it must be exceptionally busy compared with the source tier regions. This second condition is added
to cover the case where a very small number of extremely busy regions are moved to the fast tier, but then the
average access rate density of the fast tier creates too high a barrier for other busy regions to move to the fast tier.
accr(region) > minimum((dstAvgFactorUp(Mode) * accr(dstTier)), (dstAvgMaxUp(Mode) * accr(srcTier)))
The algorithm for moving idle regions down from faster to slower tiers is similar in spirit but instead of checking for
access rate densities greater than some value, the algorithm checks for access rate densities less than some value:
accr(region) < srcAvgFactorDown(Mode) * accr(srcTier)
accr(region) < maximum((dstAvgFactorDown(Mode) * accr(dstTier)), (dstAvgMinDown(Mode) * accr(srcTier)))
HP makes a special case for regions that are completely idle (accr(region) = 0). These regions are moved directly to
the lowest tier.
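
Putting the four inequalities and the idle-region special case together, a sketch of the per-region decision might look like the following. The tuning values shown are placeholders; in practice they depend on the configured mode.

# Sketch of the access-rate-density tests described above; tuning values are placeholders.
def should_move_up(accr_region, accr_src, accr_dst,
                   srcAvgFactorUp=2.0, dstAvgFactorUp=1.0, dstAvgMaxUp=10.0):
    """Promote a region from a slower tier to a faster tier?"""
    busy_vs_source = accr_region > srcAvgFactorUp * accr_src
    busy_vs_dest = accr_region > min(dstAvgFactorUp * accr_dst,
                                     dstAvgMaxUp * accr_src)
    return busy_vs_source and busy_vs_dest

def should_move_down(accr_region, accr_src, accr_dst,
                     srcAvgFactorDown=0.5, dstAvgFactorDown=1.0, dstAvgMinDown=0.1):
    """Demote a region from a faster tier to a slower tier?"""
    if accr_region == 0:
        return True   # completely idle regions move straight to the lowest tier
    idle_vs_source = accr_region < srcAvgFactorDown * accr_src
    idle_vs_dest = accr_region < max(dstAvgFactorDown * accr_dst,
                                     dstAvgMinDown * accr_src)
    return idle_vs_source and idle_vs_dest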

Design tradeoff: Granularity of data movement


The volume space to LD mapping has a granularity of either 128 MB (user and snapshot data) or 32 MB (admin
metadata) and that is naturally the granularity at which the data is moved between tiers. Is that the optimal
granularity? On the one hand, having fine-grain data movement is better since we can move a smaller region of busy
data to high-performance storage without being forced to bring along additional idle data adjacent to it. On the
other hand, having a fine-grain mapping imposes a larger overhead since HP 3PAR Adaptive Optimization needs to
track performance of a larger number of regions, maintain larger numbers of mappings, and perform more data
movement operations. Larger regions also take more advantage of spatial locality (the blocks near a busy block are
more likely to be busy in the near future than a distant block). HP results show that the choice is a good one.
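
A quick back-of-the-envelope calculation illustrates the tracking overhead at stake; the 40 TiB array size is a hypothetical example.

# Number of objects to track at each granularity for a hypothetical 40 TiB array.
TIB = 1024 ** 4
array_bytes = 40 * TIB

regions_128mb = array_bytes // (128 * 1024 ** 2)   # 327,680 regions to track
pages_16kb = array_bytes // (16 * 1024)            # ~2.7 billion pages to track
print(regions_128mb, pages_16kb)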

Results
HP measured the access rate for all regions for a number of application CPGs, sorted them by access rate and
plotted the cumulative access rate versus the cumulative space as shown in figure 4. For all the applications, most
of the accesses are concentrated in a small percentage of the regions. In several applications, this concentration of
accesses is very pronounced (more than 95 percent of the accesses to less than 3 percent of the data) but less so
for others (over 30 percent of the space is needed to capture 95 percent of the accesses). In total, just 4 percent
of the data gets 80 percent of the accesses. This indicates that the choice of region size is reasonably good at
least for some applications.
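
A curve like the one in figure 4 can be computed from per-region access counts as in the sketch below; the example input is synthetic, not measured data.

# Sketch of a cumulative-access vs. cumulative-space curve like figure 4; data is synthetic.
def cumulative_curve(region_access_counts):
    """Return (cumulative space %, cumulative access %) points, busiest regions first."""
    ordered = sorted(region_access_counts, reverse=True)
    total_accesses = sum(ordered) or 1
    total_regions = len(ordered)
    points, running = [], 0
    for i, count in enumerate(ordered, start=1):
        running += count
        points.append((100.0 * i / total_regions, 100.0 * running / total_accesses))
    return points

# Synthetic example: 4 very busy regions out of 100 capture most of the accesses.
counts = [1000] * 4 + [5] * 96
print(cumulative_curve(counts)[3])   # about (4 % of space, 89 % of accesses)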

Figure 4. Distribution of IO accesses among regions for various applications

Because SSD space is still extremely expensive relative to HDD space (10x to 15x), for SSDs to be cost-effective, a very
pronounced concentration of IO accesses in a small number of regions is needed. For applications that show less
pronounced access concentration, HP 3PAR Adaptive Optimization may still be useful between different HDD tiers.
One of the simple but important ideas in the implementation is the separation of the analysis and movement by CPGs
(or applications).
See example results in figure 5 that describe region IO density after HP 3PAR Adaptive Optimization has run for a
while. Both charts are histograms whose x-axis shows IO rate density buckets, with the busiest regions to the right and
the idlest to the left. The chart on the left shows on the y-axis the capacity of all the regions in each bucket, while the
chart on the right shows on the y-axis the total IOPS/min for the regions in each bucket. As shown in the charts, the SSD
tier (tier 0) occupies very little space but absorbs most of the IO accesses, whereas the Nearline tier (tier 2) occupies
most of the space but absorbs almost no accesses at all. This is precisely what the user wants.

Figure 5. Region IO density report after adaptive optimization

Customer case study
This section describes the real benefits that a customer derived from using HP 3PAR Adaptive Optimization. The
customer had a system with 96 × 300 GB 15k rpm FC drives and 48 × 1 TB 7.2k rpm NL drives. They had 52 physical
servers connected and running VMware with more than 250 VMs. The workload was mixed (development and QA,
databases, file servers) and they needed more space to accommodate many more VMs that were scheduled to be
moved onto the array. However, they faced a performance issue: they had difficulty managing their two tiers (FC and
NL) in a way that kept the busier workloads on their FC disks. Even though the NL disks had substantially less
performance capability (there were fewer of them and they were much slower), they had larger overall capacity; as a
result, more workloads were allocated on them, and they tended to be busier and to incur long latencies. The
customer considered two options: either they would purchase an additional 96 FC drives, or they would purchase an
additional 48 NL drives and 16 SSD drives and use HP 3PAR Adaptive Optimization to migrate busy regions onto the
SSD drives. They chose the latter and were very pleased with the results that are summarized in figure 6.

Figure 6. Improved performance after adaptive optimization

Before HP 3PAR Adaptive Optimization, as shown in the charts on the left of figure 6, even though there are
fewer NL drives, in aggregate they incur a greater IOPS load than the FC drives and consequently have very poor
latency (~40ms) compared with the FC drives (~10ms). After HP 3PAR Adaptive Optimization has executed for a little
while, as shown in the charts on the right, the IOPS load for the NL drives has dropped substantially and has been
transferred mostly to the SSD drives. HP 3PAR Adaptive Optimization moved ~33 percent of the IOPS workload to
the SSD drives even though that involved moving only 1 percent of the space. Performance improved in two ways: the
33 percent of the IOPS that were serviced by SSD drives got very good latencies (~2ms), and the latencies for the NL
drives also improved (from ~40ms to ~15ms). Moreover, the investment in the 16 SSD drives permitted them to add
even more NL drives in the future since the SSD drives have both space and performance headroom remaining.

Summary
HP 3PAR Adaptive Optimization is a powerful tool for identifying how to configure multiple tiers of storage devices
for maximum performance. Its management features can deliver results with minimal effort. As in all matters
concerning performance, “your results may vary,” but proper focus and use of HP 3PAR Adaptive Optimization
can deliver significant improvements in device utilization and total throughput.

Break the boundaries of traditional storage with HP 3PAR Storage solutions.


To learn more, visit hp.com/go/3PAR

© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services.
Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or
omissions contained herein.

4AA4-0867ENW, Created April 2012
