2018 IBM Systems Technical University: 22-26 Oct Rome, Italy
Trishali Nayar
Spectrum Scale, IBM
Session Objectives
— How to share data between Spectrum Scale Clusters
— Understand various Use Cases
— Learn updates from the latest release (5.0.2)
— Integrated tools designed to help organizations manage petabytes of data and millions of files.
— Active File Management (AFM) is a clustered file system cache, built on top of the underlying file system.
— Moves data on demand, periodically, and continuously, which makes it extremely flexible.
— Home Cluster/Site
The cluster or main site where data is stored.
— Cache Cluster/Site
The cluster where data is cached.
Note:
The home and cache sites are configured independently of each other in terms of storage and
network. The number of nodes in each of these sites can vary based on workload.
— Application Node
An application node is any node in the cache cluster that gets I/O requests from applications.
— Asynchronous Operations
Operations performed at the cache, such as creating directories or files, writes, renames, removes, truncates, and setting permissions or attributes.
Once such an operation completes on the local file system at the cache and is queued at the gateway (GW) node, the response is returned to the application.
The GW node maintains a queue of all these asynchronous operations that still need to be performed at the home cluster. They are replayed at the home cluster after some delay; the process is asynchronous but continuous.
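The write-back pattern described above can be sketched as follows. This is a conceptual illustration only, not the actual Spectrum Scale implementation: the operation is applied locally first, queued for the gateway, and the application returns immediately while the queue drains to home in the background.

```python
from collections import deque

class GatewayQueue:
    """Illustrative sketch of AFM-style write-back queuing at a GW node."""

    def __init__(self):
        self.pending = deque()  # operations queued for async replay at home

    def apply_async_op(self, local_fs, op, *args):
        # 1. Perform the operation on the local cache file system first.
        local_fs.setdefault(op, []).append(args)
        # 2. Queue it for the gateway node to replay at home later.
        self.pending.append((op, args))
        # 3. Return to the application immediately -- it never waits on the WAN.
        return "ok"

    def flush_to_home(self, home_fs):
        # The gateway drains the queue continuously, hiding WAN latency.
        while self.pending:
            op, args = self.pending.popleft()
            home_fs.setdefault(op, []).append(args)

cache_fs, home_fs = {}, {}
gw = GatewayQueue()
gw.apply_async_op(cache_fs, "mkdir", "/data/reports")
gw.apply_async_op(cache_fs, "write", "/data/reports/q3.csv")
# The cache already holds both operations; home sees them only after the
# queue is drained.
gw.flush_to_home(home_fs)
```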
7 © Copyright IBM Corporation 2018
Data Flow
— Pull Data
This refers to the direction of data flow when data is pulled into the AFM cache from home, e.g. on demand.
— Push Data
This refers to the direction of data flow when data is pushed from the AFM cache to home, or from the primary site to the secondary site in Disaster Recovery scenarios.
— Revalidation
The process of comparing metadata at the cache and at home to determine whether the data has changed at home; if it has, the latest contents are fetched.
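The revalidation idea can be sketched roughly as below. This is an illustrative model, not the real AFM protocol: compare a per-file timestamp at cache and home, and refetch only the entries that are stale.

```python
# Illustrative sketch of AFM-style revalidation: compare cached metadata
# with home metadata and refetch only the files that changed at home.

def revalidate(cache, home):
    """Refetch a file's contents when its modification time at home is newer."""
    refetched = []
    for path, entry in cache.items():
        home_entry = home.get(path)
        if home_entry and home_entry["mtime"] > entry["mtime"]:
            # Home changed since we cached it: pull the latest contents.
            cache[path] = dict(home_entry)
            refetched.append(path)
    return refetched

home  = {"/a.txt": {"mtime": 20, "data": "new"},
         "/b.txt": {"mtime": 5,  "data": "old"}}
cache = {"/a.txt": {"mtime": 10, "data": "stale"},
         "/b.txt": {"mtime": 5,  "data": "old"}}

changed = revalidate(cache, home)
```

Only `/a.txt` is refetched; `/b.txt` is untouched because its metadata still matches home.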
— Single-writer (SW)
When a cache is configured in this mode, only the cache site can write data. All asynchronous operations at the cache are pushed to the home site asynchronously, hiding WAN latencies. Because write-back caching is used, applications running at the cache also see better performance: an application can proceed as soon as an operation completes locally on the cache file system, while the same operation is queued on the AFM gateway node.
There is a 1:1 relationship between the AFM single-writer cache fileset and the home fileset. This implies that all data is written at the single cache site and home is used only for reading. AFM cannot detect or prevent modification of data at the home site; administrators must ensure that the data there is not modified or accidentally corrupted.
— Local-update (LU)
This mode is used to pull data from home, but changes made at the cache are not pushed back. When a cache is configured in this mode, the cached data is available for both reading and writing, but data modified at the cache site is not sent back to the home site, so this mode serves as a scratch cache. Once a data object has been modified at the cache, new updates made at home for that object are no longer pulled into the cache.
Modes Available
— Independent-writer (IW)
This mode allows multiple cache filesets, located in different cache clusters, to be associated with a single home fileset: an N:1 mapping. The important caveat is that each cache site should perform asynchronous operations (including writes) on different files; there is no inter-cluster locking for a file being modified at multiple cache clusters. Each cache makes its updates independently, and these changes in the IW caches are pushed to home. If multiple sites modify the same file and cause a conflict, the last writer wins, so it is the administrator's responsibility to control who has write access to which files and avoid such conflicts.
Once data is updated at home, all connected IW caches can fetch those changes on demand, based on the revalidation intervals set; on the next data access, each IW cache is synchronized with home. Data can also be pre-fetched into the cache.
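The "last writer wins" behavior of IW mode can be sketched as follows. This is a conceptual model only (the cache names, paths, and timestamp-based resolution are illustrative, not the actual AFM mechanism): two caches that write disjoint files never conflict, while a conflicting write to the same file is resolved in favor of the latest pusher.

```python
# Illustrative sketch of independent-writer (IW) conflict resolution:
# multiple caches push to one home fileset with no inter-cluster locking,
# so for conflicting updates the last writer wins.

def push_to_home(home, cache_name, path, data, ts):
    entry = home.get(path)
    if entry is None or ts >= entry["ts"]:
        home[path] = {"data": data, "writer": cache_name, "ts": ts}

home = {}
# Each IW cache writing to its *own* files -- the recommended pattern.
push_to_home(home, "cacheA", "/a/report.txt", "A's report", ts=1)
push_to_home(home, "cacheB", "/b/report.txt", "B's report", ts=2)

# A conflicting write to the same file from both caches: last writer wins.
push_to_home(home, "cacheA", "/shared.txt", "from A", ts=3)
push_to_home(home, "cacheB", "/shared.txt", "from B", ts=4)
```

This is why the slide stresses keeping write sets disjoint across IW caches: the conflict outcome depends only on push order, not on any application-level intent.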
— Note:
As seen in the modes above, depending on where data is created or modified, the home site is sometimes referred to as the local site and the cache site as the remote, edge, or geographically dispersed site. For example, in read-only (RO) mode the home cluster can be called the local site and the cache cluster the remote site.
The reverse also holds: in SW/IW mode the cache site is where data is generated and can be considered the local site, with home as the remote site. So the terms local and remote can apply to either the cache or the home site, based on where data is created and the direction of data flow.
— When the cache needs to be smaller than home, eviction lets you save storage costs.
— Eviction means that the data blocks of files residing in the cache are removed from the local file system, but the metadata of these files is retained at the cache.
— Automatic Eviction: triggered based on fileset quotas.
— Manual Eviction: can be done for specific files selected by an Information Lifecycle Management (ILM) policy. This adds flexibility in choosing exactly which files should stop consuming cache disk space.
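The key property of eviction, that data blocks are freed while metadata survives, can be sketched as below. This is a conceptual illustration only (the dictionary layout and field names are invented for the sketch, not AFM internals): after eviction the file's size and mtime remain known at the cache, so it can be transparently refetched from home on the next access.

```python
# Illustrative sketch of AFM-style cache eviction: data blocks are dropped
# from the local file system, but file metadata is retained at the cache.

def evict(cache, paths):
    for path in paths:
        entry = cache[path]
        entry["data"] = None     # free the local disk blocks
        entry["cached"] = False  # metadata (size, mtime, ...) is retained

cache = {
    "/logs/2017.tar": {"size": 4096, "mtime": 100, "data": b"...", "cached": True},
    "/logs/2018.tar": {"size": 8192, "mtime": 200, "data": b"...", "cached": True},
}

# Evict the older archive, e.g. one selected by an ILM policy.
evict(cache, ["/logs/2017.tar"])
```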
• Healthcare and Life Sciences
• Data Archives/Libraries
• Govt. Institutions
• Central and Branch Offices
— Peer snapshots.
— The cache can continue operating even when home is inaccessible or connectivity to it is lost.
Note: The AFM DR feature is disabled by default; customers need to review their deployment with
IBM Spectrum Scale Development for approval.
Disaster Recovery
[Diagram: a NAS client works against the AFM site configured as primary, which replicates to the AFM site configured as secondary; on failure, the client switches to the secondary.]
— https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.2/com.ibm.spectrum.scale.v5r02.doc/b1lins_quickreference_afm.htm
— https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.2/com.ibm.spectrum.scale.v5r02.doc/bl1xx_soc.htm
Trishali Nayar
Spectrum Scale
ntrishal@in.ibm.com