Dr. PLK Priyadarsini, Sastra-Information Storage Management

Lecture 13

Chapter 4
INTELLIGENT STORAGE SYSTEM

Introduction
 Business-critical applications demand high levels of
 Performance
 Availability
 Security
 Scalability
 The hard disk is the core element governing performance
 Older disk array technologies could not overcome the performance constraints imposed by the mechanical components and limitations of hard disks
 RAID alone is also insufficient for today's applications

What is an Intelligent
Storage System
Intelligent storage systems are RAID arrays that:
◦ Are highly optimized for I/O processing
◦ Have large amounts of cache for improving I/O performance
◦ Have an operating environment that controls the management, allocation, and utilization of storage resources, providing:
◦ Intelligence for managing cache
◦ Array resource allocation
◦ Connectivity for heterogeneous hosts
◦ Advanced array-based local and remote replication options

Components of an
Intelligent Storage System
Four Key components

◦ FRONT END
◦ CACHE
◦ BACK END
◦ PHYSICAL DISKS

Components of an
Intelligent Storage System
[Figure: a host connects over an FC SAN to the intelligent storage system; inside it, the front end, cache, and back end link in sequence to the physical disks]
Intelligent Storage System:
Front End
[Figure: the front end, comprising ports and controllers, sits between the host connectivity (FC SAN) and the cache of the intelligent storage system]

Front End
 The interface between the storage system and the host. Has two components:
 Front-end port
 Connects the host to the intelligent storage system
 Has processing logic that executes the appropriate transport protocol – SCSI, Fibre Channel, or iSCSI
 Redundant ports for high availability
 Front-end controller
 Routes data to and from cache via the internal data bus
 When data reaches cache, sends an acknowledgment back to the host
 Optimizes I/O processing using command queuing algorithms

Front End Command Queuing
 Determines the execution order of received commands to reduce unnecessary drive-head movement and improve disk performance
 When a command is received for execution, it is assigned a tag that defines its sequence of execution
 Multiple commands can then be executed concurrently
 Three kinds of algorithms:
 FIFO – commands execute in arrival order – no optimization, inefficient
 Seek-time optimization – reorders commands to minimize read/write head movement
 Access-time optimization – combines seek-time optimization with analysis of rotational latency
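Seek-time optimization can be sketched as a shortest-seek-time-first reorder of the pending queue. The Python below is illustrative only, not a vendor algorithm; the request tags and cylinder numbers are assumed inputs:

```python
def seek_time_order(requests, head):
    """Greedy seek-time optimization (shortest-seek-time-first):
    repeatedly service the pending request whose cylinder is closest
    to the current head position, so the head travels less than it
    would in strict FIFO order."""
    pending = list(requests)          # (tag, cylinder) pairs
    order = []
    while pending:
        nxt = min(pending, key=lambda r: abs(r[1] - head))
        pending.remove(nxt)
        head = nxt[1]                 # head moves to the serviced cylinder
        order.append(nxt[0])
    return order

# FIFO would service A, B, C, D; seek-time ordering picks by proximity.
print(seek_time_order([("A", 10), ("B", 90), ("C", 20), ("D", 80)], head=0))
# → ['A', 'C', 'D', 'B']
```

Access-time optimization would extend the cost function with rotational latency rather than using head distance alone.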

Front End Command Queuing
[Figure: four I/O requests A, B, C, D aimed at different cylinders. Without optimization (FIFO), the front-end controller processes them in arrival order A, B, C, D; with command queuing, it reorders them by cylinder position to reduce head movement]


Lecture 14

Cache
 Semiconductor memory where data is placed temporarily to reduce the time required to service host I/O requests
 Improves storage-system performance by isolating hosts from the mechanical delays of the disk (seek time, rotational latency)
 Write data is placed in cache and later written to disk, but the host is acknowledged immediately

Intelligent Storage System:
Cache
[Figure: the cache sits between the front end and the back end of the intelligent storage system]

Structure of Cache
 Cache is organized into pages (or slots) – the smallest unit of cache allocation
 The size of a cache page is configured according to application I/O size
 Cache consists of the data store and tag RAM
 The data store holds the data
 Tag RAM tracks the location of the data in the data store
 Entries in tag RAM indicate where data is found in cache and where it resides on disk
 Tag RAM includes a dirty bit flag that indicates whether the data in cache has been committed to disk
 It also contains time-based information, such as the time of last access, used to identify cached data that has not been accessed for a long time and can be freed
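The tag RAM bookkeeping above can be sketched in a few lines; the class and field names are illustrative, not an actual array's data layout:

```python
import time

class TagRAM:
    """Illustrative sketch of tag RAM: for each cache page it records
    where the data lives on disk, whether the page is dirty (not yet
    committed to disk), and when it was last accessed."""
    def __init__(self):
        self.entries = {}   # page slot -> metadata

    def update(self, slot, disk_address, dirty):
        self.entries[slot] = {
            "disk_address": disk_address,   # location on the physical disk
            "dirty": dirty,                 # committed to disk yet?
            "last_access": time.time(),     # for freeing stale pages
        }

    def stale_pages(self, max_idle_seconds):
        """Pages not accessed for long can be freed (dirty ones are
        flushed to disk first)."""
        now = time.time()
        return [slot for slot, e in self.entries.items()
                if now - e["last_access"] > max_idle_seconds]

tags = TagRAM()
tags.update(slot=0, disk_address=4096, dirty=True)   # dirty page awaiting flush
```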

Read Operation with Cache
On a host's read request:
The front-end controller accesses the tag RAM to determine whether the data is in cache
If found – called a read cache hit (or read hit) – the request is served from cache with a fast response time to the host
If not found – a cache miss – the back-end controller accesses the appropriate disk, retrieves the data, and stages it temporarily in cache
The front-end controller then sends the data to the host, with an increased I/O response time

Read with Cache: ‘Hits’ and ‘Misses’
[Figure: on a read request, data found in cache is a ‘hit’ and is served directly; data not found is a ‘miss’ and must be fetched from disk into cache first]
Read Operation with
Cache…
 Pre-fetch (read-ahead) algorithm – if requests are sequential, several contiguous blocks are staged into cache in advance, giving a significant improvement in response time
 Fixed pre-fetch – suitable when I/O sizes are uniform
 Variable pre-fetch – pre-fetches in multiples of the size of the host request
 A maximum pre-fetch is set so that pre-fetching does not come at the expense of other I/O
 Read performance is measured by the read hit ratio (hit rate), expressed as a percentage: the number of read hits divided by the total number of read requests
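The read path and the hit-ratio calculation can be sketched together; the dictionary standing in for the disk and back end is purely illustrative:

```python
class ReadCache:
    """Sketch of the read path: check for the block in cache; a hit is
    served directly, a miss goes to disk (via the back end) and the
    block is staged into cache. Tracks the read hit ratio, i.e.
    hits / total read requests."""
    def __init__(self, disk):
        self.disk = disk            # block -> data; stands in for the back end
        self.cache = {}
        self.hits = 0
        self.requests = 0

    def read(self, block):
        self.requests += 1
        if block in self.cache:     # read hit: fast response time
            self.hits += 1
            return self.cache[block]
        data = self.disk[block]     # read miss: fetch from disk,
        self.cache[block] = data    # stage into cache for reuse
        return data

    def hit_ratio(self):
        return self.hits / self.requests if self.requests else 0.0

c = ReadCache({1: "a", 2: "b"})
c.read(1); c.read(1); c.read(2)     # miss, hit, miss
print(c.hit_ratio())                # 1 hit out of 3 requests
```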

Write Operation with Cache
 Caching writes has a performance advantage: writing to cache takes far less time than writing directly to disk
 Sequential writes are also optimized: many small writes can be coalesced into larger transfers
 Two cache write policies:
 Write-back cache:
 Data is placed in cache and acknowledged to the host; several writes are later committed (de-staged) to disk
 Write response times are much smaller, due to isolation from mechanical delays
 Uncommitted data is at risk of loss in the event of cache failure
 Write-through cache:
 Data is placed in the cache and immediately written to the disk, and only then is an acknowledgment sent to the host
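The difference between the two policies can be sketched with plain dictionaries standing in for cache and disk (illustrative only):

```python
def cached_write(cache, disk, block, data, mode="write-back"):
    """Sketch of the two cache write policies. Write-back: data lands
    in cache and the host is acknowledged at once; the dirty page is
    de-staged to disk later. Write-through: the disk write completes
    before the acknowledgment."""
    cache[block] = data                 # data always goes to cache first
    if mode == "write-through":
        disk[block] = data              # commit to disk before acknowledging
    return "ack"                        # write-back defers the disk write

cache, disk = {}, {}
cached_write(cache, disk, 7, "x", mode="write-back")
# block 7 is now a dirty page: in cache but not yet on disk
cached_write(cache, disk, 8, "y", mode="write-through")
# block 8 was committed to disk before the acknowledgment
```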

Write Operation with Cache
[Figure: with write-through cache, the write request passes through cache to the disk before the acknowledgment returns to the host; with write-back cache, the acknowledgment is sent as soon as the data reaches cache]
Cache Implementation
 Dedicated cache
 Separate sets of memory locations are reserved for reads and writes
 Global cache
 Reads and writes can use any of the available memory addresses
 Cache management is more efficient, since there is only one global set of addresses
 May allow users to specify the percentages of cache reserved for reads and writes
 The read cache is typically small and should be increased if the application is read-intensive
 In some implementations, the ratio of cache available for reads versus writes is dynamically adjusted based on workloads

Cache Management
 Cache is a finite and expensive resource
 When all cache pages are filled, some pages must be freed to accommodate new data
 Cache management algorithms proactively maintain a set of free pages and a list of pages that can potentially be freed
 Least Recently Used (LRU)
 Continuously monitors access and identifies pages that have not been accessed for a long time
 Either frees them up or marks them for reuse
 Based on the assumption that data not accessed for a long time will not be requested again
 Dirty pages are written to disk before the page is reused
 Most Recently Used (MRU)
 The converse of LRU
 Based on the assumption that recently accessed data may not be required for a while
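A minimal LRU page-replacement sketch, using Python's `OrderedDict` to keep pages in access order (an MRU policy would simply evict from the opposite end of the ordering):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: when full, the least recently used page is
    evicted, on the assumption that data not accessed for a long time
    is unlikely to be requested again."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()      # insertion order == access order

    def access(self, key, value):
        if key in self.pages:
            self.pages.move_to_end(key)     # mark as most recently used
        elif len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used
        self.pages[key] = value

lru = LRUCache(2)
lru.access("A", 1); lru.access("B", 2)
lru.access("A", 1)          # A becomes the most recently used page
lru.access("C", 3)          # cache full: B (least recent) is evicted
print(list(lru.pages))      # → ['A', 'C']
```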

Cache Management: Algorithms
[Figure: cache pages ordered from new data to oldest data; LRU discards the least recently used data, MRU discards the most recently used data]

Cache Management…
 As cache fills, dirty pages (data present in cache but not yet on disk) must be flushed
 Flushing is the process of committing data from cache to disk
 Watermarks are set in cache to manage the flushing process on the basis of the I/O access rate:
 High watermark (HWM) – the cache utilization level at which the storage system starts high-speed flushing
 Low watermark (LWM) – the point at which the storage system stops the high-speed or forced flushing and returns to idle flush behavior

Cache Management…
Idle flushing
◦ Occurs continuously, at a modest rate, when the cache utilization level is between the high and low watermarks
High watermark flushing
◦ Activated when cache utilization hits the high watermark
◦ The storage system dedicates some additional resources to flushing
◦ Has minimal impact on host I/O processing
Forced flushing
◦ Occurs when cache reaches 100% capacity in the event of a large I/O burst
◦ Dirty pages are forcibly flushed to disk
◦ Significantly affects I/O response time
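The three flushing behaviors can be sketched as a utilization-driven mode selector. The watermark values below are illustrative, and a real array would also apply hysteresis (high-speed flushing continues until utilization falls back to the LWM):

```python
def flush_mode(utilization, lwm=0.6, hwm=0.8):
    """Sketch of watermark-driven flushing: between the LWM and HWM
    the system idle-flushes at a modest rate; above the HWM it
    dedicates extra resources to high watermark flushing; at 100%
    utilization it forces flushing, which hurts I/O response time."""
    if utilization >= 1.0:
        return "forced"
    if utilization >= hwm:
        return "high-watermark"
    if utilization >= lwm:
        return "idle"
    return "none"               # below the LWM: no flushing pressure

print(flush_mode(0.7))          # → idle
print(flush_mode(0.9))          # → high-watermark
```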

Cache Management: Watermarking
o Peak I/O request “bursts” are managed through flushing (de-staging): idle flushing, high watermark flushing, and forced flushing
o For maximum performance, provide headroom in the write cache for I/O bursts
[Figure: cache utilization scale from 0 to 100% with the LWM and HWM marked; idle flushing operates between the watermarks, high watermark flushing above the HWM, and forced flushing at 100%]


Cache Data Protection
Protecting cache data against failure:
◦ Cache mirroring
◦ Each write to cache is held in two different memory locations on two independent memory cards
◦ Only writes are mirrored; for reads, the data still exists on disk, so a cache failure is recoverable
◦ Mirroring only writes gives better utilization of the available cache
◦ Cache coherency – keeping the mirrored data identical at all times – is the responsibility of the array operating environment

Cache Data Protection
◦ Cache vaulting
◦ Cache is exposed to the risk of uncommitted data loss due to power failure
◦ Solutions:
◦ Use battery power to sustain the cache until AC power is restored
◦ Or use battery power to write the cache contents to disk
◦ For extended power failures in systems with large amounts of cached write data, batteries alone are not suitable
◦ Vendors therefore use a dedicated set of physical disks to dump the cache contents
◦ This is called cache vaulting, and the disks are called vault drives

Lecture 15

Intelligent Storage System:
Back End
 The interface between cache and the physical disks
 Two components: back-end ports and back-end controllers
 Physical disks are connected to ports on the back end
 The back-end controller communicates with the disks for reads and writes and provides additional, but limited, temporary data storage
 Algorithms on the back-end controllers provide error detection and correction, along with RAID functionality
 Dual controllers with multiple ports provide high data protection and availability and facilitate load balancing
 If the disks are dual-ported, each disk port can connect to a separate controller

Intelligent Storage System:
Back End
[Figure: the back end, comprising controllers and ports, sits between the cache and the physical disks]

Intelligent Storage System:
Storage
 Disks
 Connected to the back end with either a SCSI or a Fibre Channel interface
 Arrays may use a mixture of SCSI or Fibre Channel drives and IDE/ATA drives
 Physical drives are split into logical volumes – Logical Unit Numbers (LUNs) – for improved disk utilization
 The mapping of LUNs to their physical location on the drives is managed by the operating environment
 For RAID-protected drives, logical units are slices of RAID sets spread across all the physical disks in the set
 A logical partition of a RAID set is presented to a host as a physical disk
 LUNs can be aggregated to expand capacity; the resulting LUN is called a meta-LUN
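Carving LUNs out of a RAID set's usable capacity can be sketched as assigning contiguous block ranges; the block counts and mapping format below are illustrative, not how any particular operating environment stores the mapping:

```python
def carve_luns(raid_set_blocks, lun_sizes):
    """Sketch of carving logical units out of a RAID set: each LUN is
    a contiguous slice of the set's usable block range, and the
    operating environment keeps the LUN-to-physical mapping."""
    mapping, start = {}, 0
    for lun_id, size in enumerate(lun_sizes):
        if start + size > raid_set_blocks:
            raise ValueError("RAID set capacity exhausted")
        mapping[lun_id] = (start, start + size - 1)   # inclusive block range
        start += size
    return mapping

print(carve_luns(1000, [400, 300]))   # → {0: (0, 399), 1: (400, 699)}
```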

What the Host Sees – RAID
Sets and LUNs
[Figure: Host 1 sees LUN 0 and Host 2 sees LUN 1 over the FC SAN; inside the array, each LUN maps through the front end, cache, and back end to slices of the physical disks]
LUN Masking
 An access-control mechanism for hosts
 Prevents unauthorized or accidental use of LUNs in a distributed environment
 The process of masking LUNs from unauthorized access
 Implemented by the front-end controller
 A storage group is a logical entity that contains one or more LUNs and one host
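LUN masking via storage groups can be sketched as a lookup that exposes to a host only the LUNs in its group; the group layout shown is illustrative, not a vendor data model:

```python
def visible_luns(storage_groups, host):
    """Sketch of LUN masking: a storage group ties a set of LUNs to a
    host, and the front-end controller exposes a LUN only to the host
    in its group; all other LUNs are masked from that host."""
    return sorted(lun
                  for group in storage_groups
                  if group["host"] == host
                  for lun in group["luns"])

groups = [
    {"host": "host1", "luns": [0, 1]},
    {"host": "host2", "luns": [2]},
]
print(visible_luns(groups, "host1"))   # → [0, 1]; LUN 2 is masked from host1
```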

Intelligent Storage Array
Two categories:
High-end storage systems
 Active-active arrays
 Targeted at large enterprises for centralizing corporate data
Midrange storage systems
 Active-passive arrays
 Typically deployed in small- and medium-sized enterprises
 Provide optimal storage solutions at lower cost

High-end Storage Systems
Active-Active Configuration

Also referred to as active-active arrays
◦ I/Os to a LUN can be serviced through all available paths
High-end array capabilities:
◦ Large storage capacity
◦ Huge cache to service host I/Os
◦ Fault-tolerant architecture
◦ Connectivity to mainframe computers and open systems hosts
◦ Multiple front-end ports and support for multiple interface protocols
[Figure: the host reaches the same LUN through active ports on both Controller A and Controller B of the storage array]

High-end Storage Systems
Active-Active Configuration

High-end array capabilities (continued):
◦ Availability of multiple back-end Fibre Channel or SCSI RAID controllers to manage disk processing
◦ Scalability to support increased connectivity, performance, and storage-capacity requirements
◦ Ability to handle large amounts of concurrent I/Os
◦ Support for array-based local and remote replication
◦ Designed for large enterprises
[Figure: the same active-active configuration, with the host reaching the LUN through active ports on both Controller A and Controller B]
Midrange Storage Systems
 Also referred to as active-passive arrays
 A host can perform I/Os to a LUN only through the active paths (to the controller that owns that LUN)
 The other paths remain passive until the active path fails
 A midrange array has two controllers, each with cache, RAID controllers, and disk-drive interfaces
 Less storage capacity and global cache than high-end arrays, and less scalable
 Fewer front-end ports for server connections
 Ensure high redundancy and high performance for applications with predictable workloads
 Also support array-based local and remote replication

Midrange Storage Systems

[Figure: active-passive configuration – the host reaches the LUN through the active port on Controller A; the port on Controller B remains passive until the active path fails]
Thank You
