C6 - PureApp in Production - HADR, Monitoring, Enterprise Security

NA ISSW Technical Interchange 2014
PureApp in Production
Enterprise Security, Monitoring, and HA/DR
Session Number: C6
Bobby Woolf – PureStart Technical Account Manager
Kyle Brown – Distinguished Engineer, SOA and Emerging Technologies
© 2014 IBM Corporation

Agenda
• Enterprise security
 Security roles
 Workload access control
• Monitoring
 Hardware
 System
 Middleware
• High availability and disaster recovery
 High availability
 Backup and restore
 Disk replication
 File replication
 Shared disk

Enterprise Security

Security Roles
• Administrators
 Workload resources administration
 Cloud group administration
 Hardware administration
 Auditing
 Security administration
• Workload Management
 Deploy patterns in the cloud
 Create new patterns
 Create new environment profiles
 Create new catalog content
 IBM License Metric Tool (ILMT)
• Levels of access
 View all workload resources (Read-only)
 Manage workload resources (Full permission)
• Allow delegation when full permission is selected

Security Role Assignment
• Assigned to users and user groups

 System Console > System > Users
 System Console > System > User Groups
 Permissions section
• Authorization
 Access to console menus
 Access to CLI commands
 Access to REST objects

Workload Management Subroles
• Workload Console > Patterns
 Button to create a new pattern
• Workload Console > Cloud > Environment Profiles
 Button to create a new profile
• Workload Console > Catalog
 Create button on each artifact page
• Ability to manage licenses via the REST API

Delegate administration role
• Delegate administration role

 If a user has Manage level-of-access for an administration role, can they give that role to other users?

Workload Resources Administration
• Workload Console > Instances > Shared Services
• Workload Console > Catalog > Database Workload Standards
• Workload Console > Catalog > DB2 Fix Packs
• Workload Console > Cloud > Shared Services
• Workload Console > Cloud > System Plug-ins
• Workload Console > Cloud > Pattern Types
• Workload Console > Cloud > Default Deploy Settings
• Workload Console > System > Troubleshooting
• Workload Console > System > Storehouse Browser

Cloud Group Administration
• System Console > Cloud

• System Console > Reports
• System Console > System > Product Licenses

Hardware Administration
• System Console > Hardware
• System Console > Reports
• System Console > System > Settings
• System Console > System > Customer Network Configuration
• System Console > System > Product Licenses
• System Console > System > Job Queue
• System Console > System > Events
• System Console > System > Troubleshooting
• System Console > System > Problems

Auditing
• System Console > System > Auditing

Security Administration
• System Console > Reports

• System Console > System > Users
• System Console > System > User Groups
• System Console > System > Security

LDAP Settings
• Configure PureApp to get user groups from LDAP
 Finds users automatically

Roles Principles
• Assign security roles to user groups, not to individual users

• Do not assign administrative roles (full access) and the auditing role (full access) to the same user group
• Change the default administrator’s password
• Create a user group for every security role and every level-of-access
• Create a user group for each team that will use environment profiles to deploy patterns
 These should be mapped from LDAP
• Create a separate user for each person who will log into IPAS
 These should be mapped from LDAP

Workload Component Access Control
• Access granted to
 Component grants access to one or more users
 Creator is automatically owner, gets All access
 Other users are granted a level of access
• Access levels
 Read – Read only
 Write – Read + change
 All – Write + delete + grant access
• Read
 View in lists
 Pattern: Clone, deploy
 Environment profile: Deploy
• Write
 Pattern instance: manage its lifecycle (start, stop, store)
• Any user can deploy a pattern

 No security role for deployment
 Deploy patterns in the cloud is an implied role
 Need read access to the pattern
 Need read access to the environment profile
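
The access levels and the deployment rule above can be sketched as a small model. This is only an illustration of the rules as the slide states them, not PureApp's implementation; the `Component` class, the `NONE` level, and the user and component names are hypothetical.

```python
from enum import IntEnum

class Access(IntEnum):
    """Access levels from the slide; each level includes the ones below it."""
    NONE = 0   # hypothetical placeholder for "no access granted"
    READ = 1   # read only
    WRITE = 2  # read + change
    ALL = 3    # write + delete + grant access

class Component:
    """Hypothetical stand-in for a pattern or an environment profile."""
    def __init__(self, name, creator):
        self.name = name
        # The creator is automatically the owner and gets All access.
        self.grants = {creator: Access.ALL}

    def access_of(self, user):
        return self.grants.get(user, Access.NONE)

    def grant(self, granter, user, level):
        # Only a user with All access may grant access to other users.
        if self.access_of(granter) < Access.ALL:
            raise PermissionError(f"{granter} cannot grant access to {self.name}")
        self.grants[user] = level

def can_deploy(user, pattern, profile):
    """Deploying needs read access to both the pattern and the environment profile."""
    return (pattern.access_of(user) >= Access.READ
            and profile.access_of(user) >= Access.READ)

pattern = Component("web-app-pattern", creator="alice")
profile = Component("test-profile", creator="alice")
pattern.grant("alice", "bob", Access.READ)
print(can_deploy("bob", pattern, profile))  # False: bob cannot read the profile yet
profile.grant("alice", "bob", Access.READ)
print(can_deploy("bob", pattern, profile))  # True
```

Ordering the levels as integers keeps the "each level includes the previous one" rule to a single comparison.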

Catalog Content Access Control
Managing Patterns
• Clone, edit, and lock a pattern
 User must have the Pattern creation role (i.e. Workload Management > Create new patterns)
 User must have Write access to the pattern
• Delete a pattern
 User must have the Pattern creation role
 User must have All access to the pattern
Managing Other Catalog Content

• Environment profile
 User must have the Environment profile creation role (i.e. Workload Management > Create new environment profiles)
• Catalog content
 User must have the Catalog content creation role (i.e. Workload Management > Create new catalog content)
• Pattern instance
 To start, stop, store, and delete, a non-owner user needs to be granted the Environment profile creation role
 Any user with the Workload resources administration role can manage any pattern instance
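
Managing a pattern therefore takes two checks: a security role and an access level on the specific pattern. A minimal sketch of that combined check, using the pattern rules exactly as this slide lists them; the role string and the function name are hypothetical labels, not product identifiers.

```python
# Access levels in increasing order, as defined for workload components.
LEVELS = {"read": 1, "write": 2, "all": 3}

# (required security role, minimum access level) per pattern operation.
PATTERN_RULES = {
    "clone":  ("create_new_patterns", "write"),
    "edit":   ("create_new_patterns", "write"),
    "lock":   ("create_new_patterns", "write"),
    "delete": ("create_new_patterns", "all"),
}

def may_manage_pattern(operation, user_roles, user_access):
    """Return True only if the user holds the role AND enough access to the pattern."""
    role, needed = PATTERN_RULES[operation]
    return role in user_roles and LEVELS[user_access] >= LEVELS[needed]

print(may_manage_pattern("edit", {"create_new_patterns"}, "write"))    # True
print(may_manage_pattern("delete", {"create_new_patterns"}, "write"))  # False: needs All
print(may_manage_pattern("edit", set(), "all"))                        # False: role missing
```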

Monitoring

Monitoring
• Why monitoring?
 Every customer wants to know how to do it
 None of them ever end up doing it*
• Monitoring goals (are rather subjective)
 To be able to see what’s going on within a system
 To be notified when something’s going wrong with a system
• 19th century essayists would’ve loved monitoring
 “Lies, damned lies, and statistics” – Mark Twain
 “Everybody complains about the weather, but nobody does anything about it.” -- Charles Dudley Warner
 Monitoring is like a weather report for your system
• Three key features
1. Reports (Machine Activity; also Product Licenses)
2. Events (and Event Forwarding and external monitoring)
3. PureApplication System Monitoring Portal
* Unless/until they connect it to external monitoring, which means they already have external monitoring in their data center.

Monitoring is Role-based
• Role based visibility

 Deployers
• Can drill down only into their middleware/db metrics
• Security role: Workload Management > Deploy patterns in the cloud
 Monitor Operators
• Allowed access to all user deployments
• Security role: Cloud group administration (read only), Hardware administration (read only)
 Monitor Administrators of cloud and hardware
• Allowed to configure monitoring
• Allowed to see all hardware
• Security role: Cloud group administration (full), Hardware administration (full)
• Single sign-on (SSO)

• Seamless, role-based drilldown

Three Levels of Monitoring
• Hardware
 All components run ITM agents
 Also, each compute node has an Integrated Management Module (IMM)
 Status shown in Infrastructure Map
• Middleware
 Every VM runs an ITM OS agent
 Tracks VM lifecycle and health status
 WAS, DB2, IHS VMs include additional agents
 Other ITM agents available separately
• System
 Jobs: System management tasks
 Events: Notifications of significant situations
 Problems: Events to call support
 Reports: Graphs showing resource usage
 Product licenses: License compliance
 Auditing: Log of administrative changes

Hardware Monitoring
• Infrastructure Map shows hardware status

 System Console > Hardware > Infrastructure Map (Graphics View)
 Critical and Warning situations per component
 Summary of Events log
• Component views show monitored details
 Compute Nodes
 Management Nodes
 Flex Chassis
 Storage Devices
 Network Devices
 Virtual Networks
• Details about a component
 Events
 Jobs
 State/Status
 Capacity and usage
 Temperature
 Power status
 LEDs
 Energy information
 Health statistics

System Monitoring
• Jobs Queue
 System Console > System > Jobs Queue
 Jobs are system tasks to be performed asynchronously
 IWD tasks (deploying, deleting) are a series of jobs
 Optionally display internal jobs like configuring components
 Some jobs, like backup, are blocking, which makes them exclusive
• Events
 System Console > System > Events
 ITM situations that are significant
 Type: Different kinds of components like compute node and virtual machine
 Severity: Fatal, Critical, Major, Minor, Warning, etc.
 Category: Alert, Resolution, Call support, and Customer serviceable
• Problems
 System Console > System > Problems
 Events with the “Call support” category
 Additional details suitable for adding to a PMR

System Monitoring (cont.)
• Reports
 System Console > Reports
 Machine Activity
• Graphs that show resource allocation and consumption
• Trend shows when usage will exceed capacity
• Ex: CPU, memory, and VMs by cloud group or compute node
• Ex: IP usage by IP group
 User Activity: List of resources by user
 Metering: For customers who pay based on CPU metering
 Chargeback: Shows usage by user or group
 IP Usage: Shows the resource using each IP address
• Product licenses
 System Console > System > Product Licenses
 Set license capacity, enforcement, and notification
 View license usage
• Auditing
 System Console > System > Auditing
 Log of administrative changes and access to secure objects
 Can be used to show HIPAA and SOX compliance
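
The Machine Activity trend ("when usage will exceed capacity") can be approximated outside the console with a straight-line fit. The slides do not say how the product computes its trend, so the sketch below is a stand-in that assumes roughly linear growth; the function name and the sample data are hypothetical.

```python
def days_until_capacity(samples, capacity):
    """Fit a least-squares line to (day, usage) samples and return the day
    on which the fitted line reaches capacity, or None if usage is not growing.
    Requires samples from at least two distinct days."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    return (capacity - intercept) / slope

# Hypothetical data: CPU cores allocated in a cloud group, sampled daily.
usage = [(0, 40), (1, 44), (2, 48), (3, 52)]
print(days_until_capacity(usage, capacity=96))  # 14.0
```

Here usage grows by 4 cores per day from 40, so the fitted line reaches 96 cores on day 14.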

Event Forwarding
• SNMP traps can be set to forward events for external monitoring

 System Console > System > Settings > Event Forwarding
 System Identification describes the source of the events
 Trap Destinations specify SNMP listener clients
 Each trap can filter events by severity
 Events (from the event log) are forwarded to the traps
• External PureApplication System Agent
 Can be installed in existing ITM server
• Additional support for IBM Tivoli Monitoring (ITM)
 MIB and OMNIbus rules
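
Per-destination severity filtering can be pictured as follows. The real configuration is done in the System Console; this sketch only models the routing decision, and it assumes each trap destination is configured with the set of severities it accepts. The class, function, and host names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TrapDestination:
    host: str               # hypothetical SNMP listener address
    severities: frozenset   # severities this destination wants to receive

def destinations_for(event, destinations):
    """Return the hosts that should receive this event as an SNMP trap."""
    return [d.host for d in destinations if event["severity"] in d.severities]

destinations = [
    TrapDestination("noc.example.com", frozenset({"Fatal", "Critical"})),
    TrapDestination("ops.example.com",
                    frozenset({"Fatal", "Critical", "Major", "Minor", "Warning"})),
]

event = {"type": "compute node", "severity": "Major", "category": "Alert"}
print(destinations_for(event, destinations))  # ['ops.example.com']
```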

Virtual Machine Monitoring
• A virtual system pattern instance’s VMs show CPU and memory usage
 Workload Console > Instances > Virtual System
 Select the instance, expand the Virtual Machines section
• PureApp shows monitored details for virtual machines
 Workload Console > Instances > Virtual Machines
 Shows events, jobs, usage of CPU, memory, disk, network
 This is the VM equivalent of hardware monitoring

Virtual Application Console
• For a particular VAP

 Workload Console > Virtual Application Instances > your VAP
 Manage button
 Views shown: OS view, WAS view, DB2 view

Workload Monitoring Shared Services
• Shared services must be deployed to enable workload monitoring

 Enables the monitor link on the VM listed in the workload console
• Four workload monitoring shared services
1. System Monitoring (ITM-Hub-TEMS and OS-level monitoring)
2. System Monitoring for HTTP Servers (ITCAM for HTTP)
3. System Monitoring for WebSphere Application Server (ITCAM for WAS)
4. Database Performance Monitoring (InfoSphere Optim)
• Workload monitoring shared services
 Patterns included as part of PureApp
 Instances need to be deployed to each cloud group
• PureApplication System Monitoring Portal
 A.k.a. Tivoli Enterprise Portal (TEP)
 Part of SmartCloud Monitoring, formerly known as IBM Tivoli Monitoring (ITM)
 Java GUI that connects to the System Monitoring shared service
 Opened by the various “endpoint” and “monitoring” links

PureApplication System Monitoring Portal
[Figure: monitoring architecture showing the TEP (Tivoli Enterprise Portal), the TEPS (Tivoli Enterprise Portal Server), the Hub-TEMS (Tivoli Enterprise Monitoring Server), and the TEM agents]

Opening the Monitoring Portal
• For the whole system

 Workload Console > Shared Service Instances > System Monitoring 1.0.1.0
 Middleware perspective: Hub-TEMS
 Virtual machine perspective: Hub-TEMS
 Endpoint link

Opening the Monitoring Portal (cont.)
• For a particular VAP

 Workload Console > Virtual Application Instances > your VAP
 Virtual machines of the virtual application instance
 WAS VM’s Monitor link
 DB2 VM’s Monitor link

High Availability and Disaster Recovery

Definitions
• High availability (HA)

 A highly available system is durable and likely to operate continuously without failure for a long time
 In practice, clients may experience outages for a few seconds/minutes while failover occurs
 High availability allows for planned outages but disallows unplanned outages
 Continuous availability disallows planned and unplanned outages
• Disaster recovery (DR)
 A plan and process for reconstructing a data center’s operations in a different data center
 Goal is business continuity, which drives requirements for acceptable service levels
 Systems are recovered in priority order
• Recovery time objective (RTO)
 The maximum desired length of time allowed between a disaster and the resumption of normal operations and service levels
 If a disaster makes a system unavailable, how much time do you have to make it available again?
• Recovery point objective (RPO)
 The maximum acceptable amount of data loss measured in time due to a disaster
 If a disaster makes a system unavailable, how much data (i.e. recent changes) can be unrecoverable?
Techopedia Technology Dictionary: http://www.techopedia.com/it-dictionary
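
As a worked example of the two objectives: RPO is compared against the time between the last good copy of the data and the disaster, and RTO against the time between the disaster and restored service. The timestamps and the 24-hour/48-hour objectives below are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta

def meets_objectives(last_backup, disaster, service_restored, rpo, rto):
    """Return (rpo_met, rto_met) for one incident."""
    data_loss_window = disaster - last_backup   # changes made in this window are lost
    downtime = service_restored - disaster      # how long the system was unavailable
    return data_loss_window <= rpo, downtime <= rto

last_backup = datetime(2014, 3, 1, 2, 0)
disaster = datetime(2014, 3, 1, 20, 0)    # 18 hours after the last backup
restored = datetime(2014, 3, 3, 8, 0)     # 36 hours after the disaster

print(meets_objectives(last_backup, disaster, restored,
                       rpo=timedelta(hours=24), rto=timedelta(hours=48)))
# (True, True)
```

With backups taken every 24 hours, the data-loss window can never exceed 24 hours, which is why a daily backup schedule is enough for a 24-hour RPO.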

High Availability in PureApp
Intra-rack HA by design
• Redundant hardware, networking, and storage
• No single points of failure
• Recovery from hardware failures with workload mobility
• Additional capacity can be added and utilized without any service interruption
Additional local racks can increase HA further
• WebSphere clustering across racks
 Option 1: Single cell, Dmgr runs on one of the racks, cluster members on both racks
 Option 2: Two cells, one per rack, cluster members share a database
• DB2 can be configured across racks using HADR
• MQ and other patterns can also be clustered similarly
Inter-rack WAS/DB2 HA
• WAS and DB2 application across local racks
 WAS: Option 2: Two cells, one per rack, cluster members share a database
 DB2: Deploy "DB2 Enterprise HADR Standby" part on one rack
 DB2: Deploy "DB2 Enterprise HADR Primary" part on other rack
 DB2: Configure primary server with settings (host, port, password) from standby server

PureApp Disaster Recovery Basics
1. Replicate management data
 Cloud configuration: IP groups, cloud groups, environment profiles
 Workload components: Patterns, virtual images, pattern types, script packages, plug-ins, etc.
2. Replicate application data
 Data needed to back up each application: Databases, configuration, logs, etc.
3. Redirect network traffic
 From the primary to the backup
[Figure: management data and application data are copied from the active (primary) system to the standby (backup) system; redirecting traffic is a manual network change]

DR solution ranges
[Figure: DR solutions (Backup and Restore; File, Disk Replication; Shared File Systems) placed along an RPO time axis marked Zero, Seconds, Minutes, Hours, Days]

PureApp Backup and Restore
1. System backup
 Console feature backs up management software and system settings
 Can only be restored on the same system at the same fix pack level
2. Cloud environment
 Use CLI scripts to create IP groups, cloud groups, and environment profiles
 Use properties files for data center-specific values
3. Workload components
 Export patterns
 Export any extended virtual images
 May want to export pattern parts (e.g. script packages, add-ons, etc.) to CM separately
4. Application data
 Back up applications’ internal state, such as databases, config files, and logs
 Central coordination to develop a general backup strategy (e.g. TSM), middleware-specific strategies (e.g. Portal and ODM), and sample scripts for pattern developers to emulate

Simple DR using Backup and Restore
• Lenient business continuity requirements

 RTO: 48 hrs
 RPO: 24 hrs
1. Initial setup before a disaster occurs:

 Use one rack for prod, one for non-prod (a.k.a. development and test)
 For prod DR, prod rack is active/primary, non-prod is standby/backup
 Replicate management data: On non-prod rack, set up a Prod cloud group for DR, pre-stocked with IP groups (containing production IPs and VLANs) and a single compute node. (Use the rest of the capacity for non-prod.)
 Produce application data backups of each pattern instance every 24 hrs
2. To failover prod to the non-prod rack:

 Shut down (i.e. store or delete) workloads in the Non-prod cloud group to release resources
 Quiesce, stop, and remove compute nodes from the Non-prod cloud group
 Add compute nodes to the Prod cloud group
 Deploy patterns for the workloads to be recovered
 Replicate application data: Restore application backups into the newly deployed workloads
3. To make prod on non-prod rack available:

 Redirect network traffic: Make the workloads available for users to log in

PureApp DR Feature using Disk Replication
• Setup of Disaster Recovery feature in PureApp

 Requires two systems; must be same model (W1500 vs. W1700), size (Small vs. Large), and capacity (cores)
 Connect the two systems’ V7000 storage controllers via a fibre channel connection
 Configure each with a DR profile, configure trust, and set one to primary and the other to backup
 Primary’s disk (entire set of V7000s) is replicated to Backup’s disk, then kept synchronized
• Failover
 Backup takes over all IP addresses that were in use by the Primary
 Manually restart workloads
 Update external routers
• Advantages
 Simple set up that fails over everything: System, cloud environment, patterns, workloads, data, etc.
 RPO < 1 second; RTO is the time it takes to restart the workloads (up to 4-6 hours)
• Disadvantages
 Requires a second system on standby, unused
 IP takeover: New rack uses all the IPs that were on the old rack; IPs must be supported by both data centers
 Fibre channel connection maximum length: 8000 km (trans-Atlantic but not trans-Pacific)
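
The pairing requirement from the setup list (same model, size, and capacity) is easy to check before any cabling is done. A minimal sketch, assuming a hypothetical inventory record per system; none of these names come from the product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PureAppSystem:
    name: str    # hypothetical inventory label
    model: str   # e.g. "W1500" or "W1700"
    size: str    # e.g. "Small" or "Large"
    cores: int   # capacity in cores

def dr_pair_mismatches(primary, backup):
    """List the attributes that prevent two systems from forming a DR pair."""
    return [attr for attr in ("model", "size", "cores")
            if getattr(primary, attr) != getattr(backup, attr)]

primary = PureAppSystem("dc1-rack", "W1500", "Large", 384)
backup = PureAppSystem("dc2-rack", "W1500", "Small", 384)
print(dr_pair_mismatches(primary, backup))  # ['size']
```

An empty list means the two systems satisfy the stated requirement; anything else names what has to change first.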

Double-Take Availability for File Replication
• Double-Take Availability from Vision Solutions

 High-speed file replication agents copy changes to protected files
 The technology uses minimum bandwidth, works over standard IP networks, and is compatible with Linux, AIX and Windows
• Double-Take Availability on PureApp
 Separately licensed product that can be added to PureApp
 Applied on a per-pattern basis, not for the entire rack
• Example: Typical middleware running active/standby
 Customize a pattern to store application data in a shared (a.k.a. protected) file system
 Customize the pattern with script packages for Install DoubleTake, Configure DoubleTake, and Initiate Recovery
 Deploy the pattern on both racks, standby/target first; configure active/source with the hostname of the target
 Configure script syncs the pattern instances, replicates, and shuts down the target middleware
 When disaster occurs, manually run the Recovery script, which changes the hostname, starts the target middleware, and becomes the source

GPFS for Shared Disk
• IBM General Parallel File System (GPFS™)

 Parallel cluster file system based on a shared-disk model (provided by an underlying SAN)
 Appears to the user / application as a standard file system
 GPFS allows multiple nodes concurrent access to the same data
 No application changes required – single image file access
• GPFS on PureApp
 Can use V7000 LUNs for SAN storage
 A GPFS server VM runs on each rack; a tie-breaker VM runs on a third server
• Example: Typical middleware running active/standby
 Customize a pattern to store application data in shared file system
 Deploy pattern on both racks, stop the pattern or its middleware on the standby rack
 On failure, start pattern/middleware on standby rack and redirect network
• Example: Highly Available MQ and Message Broker Configuration
 Deploy two queue managers, one on each rack
 Configure queue managers as a multi-instance queue manager in a cluster
 Configure queue manager storage in shared file system
 Deploy Message Broker on both queue managers
 Configure brokers as active-active

Resources
• “Managing administrative access in IBM PureApplication System” by Bobby Woolf

 http://www.ibm.com/developerworks/websphere/library/techarticles/1211_woolf/1211_woolf.html
 http://cattail.boulder.ibm.com/cattail/#view=bwoolf@us.ibm.com
• “IBM PureApplication System v1.0: Monitoring Overview” by Thomas Blattmann
 ISSW PureApplication Systems Community of Practice
 https://w3-connections.ibm.com/files/app#/file/853e0378-d6d5-42ff-bef8-c9addaf99d4b
• “Monitoring architecture” in IBM Tivoli Monitoring Information Center
 http://publib.boulder.ibm.com/infocenter/tivihelp/v15r1/topic/com.ibm.itm.doc_6.2.3fp1/welcome.htm
• “PureApp: Implementing HA and DR” by Kyle Brown and Lin Overby
 http://cattail.boulder.ibm.com/cattail/#view=bwoolf@us.ibm.com
• “Achieving business continuity in IBM PureApplication System” by Kyle Brown and Lin Overby
 http://www.ibm.com/developerworks/websphere/techjournal/1309_brown1/1309_brown1.html

Legal Disclaimer
• © IBM Corporation 2014. All Rights Reserved.
• The information contained in this publication is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information contained
in this publication, it is provided AS IS without warranty of any kind, express or implied. In addition, this information is based on IBM’s current product plans and strategy, which are
subject to change by IBM without notice. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this publication or any other materials. Nothing
contained in this publication is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and
conditions of the applicable license agreement governing the use of IBM software.
• References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or
capabilities referenced in this presentation may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment to
future product or feature availability in any way. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by
you will result in any specific sales, revenue growth or other results.
