
Session Title

Bob Johnston / IPS Grid & HA Solutions

Session Number 1234


Agenda
 High Availability / Fault Tolerance Solutions
 What is Grid?
 Why use Grid?
 Software Architecture
 Hardware Architecture
 Administration
 Workshops

2
What is High Availability (HA)?
 Automated failover of applications
– When one server fails, another server can take over and run the application
 Not “fault tolerant”
– A fault-tolerant solution is something that is always running
– Applications never die because of a hardware failure
– Requires special, “expensive” hardware
 HA requires redundant hardware
– E.g., two servers that can run DataStage/QualityStage with the minimum job requirements
 Single install of Information Server software that is shared between the servers
 Two types
– Active / Passive: one server used at a time
– Active / Active: all servers utilized concurrently (via APT_CONFIG_FILE; see the sketch below)
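As a minimal sketch (host names and paths are illustrative, not from this deck), an active/active APT_CONFIG_FILE simply defines logical nodes on both servers, so the parallel engine spreads each job across them:

{
  node "node1"
  {
    fastname "etlserver1"
    pools ""
    resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
    resource scratchdisk "/tmp" {pools ""}
  }
  node "node2"
  {
    fastname "etlserver2"
    pools ""
    resource disk "/opt/IBM/InformationServer/Server/Datasets" {pools ""}
    resource scratchdisk "/tmp" {pools ""}
  }
}

If one server is down, this static file must be edited (or swapped for a single-server version) so jobs run only on the surviving host.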
3
What is Fault Tolerant?
Higher availability than an “HA” solution
Almost “NO” failure
Only supported for WISD or SOA type applications
Requires at least two installations (full installs)
–Requires duplication of all components
–Double license fee
–Deployment to each environment required (imports / WISD
configuration)
Application needs to call services on multiple servers
–If the first request fails, call the service again on another node
–Could use an HTTP spreader

4
What is Grid Computing?
Grid Computing doesn’t mean the same thing to all
people.
 Definitions include:
– Using Idle machines on the internet
– Using Idle desktop machines within the company
– Using any server that’s not currently in use
• Regardless of OS, Physical Location, CPU speed …
– Anything running more than 1 Linux box in any manner
– Anything running on any computer when you don’t care
which computer it runs on
5
What is Grid Computing within IP&S?
 Low-cost solution that provides high-throughput processing
 Can be implemented for DataStage, QualityStage, and Information Analyzer (ProfileStage)
 Enables a single parallel job to run across multiple servers
 Typically implemented using four or more Intel servers running Red Hat or SUSE Linux

6
Why Information Integration Grid?
1. Commodity Hardware
 Low Cost
 High Availability
2. Software Scalability
 Larger Data sources
 Faster Run times
3. Utility Usage / Consolidation
 Move away from dedicated servers by project
 Data Analysis / Data Cleansing / Data Integration on a shared pool of resources

7
Benefits of Grid Computing
 Low cost hardware
 High-throughput processing
 Significant ROI (Return on Investment) for data management
solutions
 Supports a high-availability (HA) solution
 SLA (Service Level Agreement)
–Consistent runtimes
–Isolates concurrent job executions
 Shared resource pool
–Not typical silo-ed environment
–Hardware shared across multiple environments and
departments
8
Before Grid

[Diagram: each project runs on its own dedicated SMP server: ProfileStage Project 1 on SMP 1, DataStage Project 2 on SMP 2, QualityStage Project 3 on SMP 3, DataStage Project 4 on SMP 4, ... IBM Software Project N on SMP N]

“Silo-ed” architecture & proliferation of SMP servers:
• Higher capital costs through limited pooling of IT assets across silos
• Higher operational costs
• Limited responsiveness due to more manual scheduling and provisioning
• Inherently more vulnerable to failure
9
After Grid

[Diagram: ProfileStage, DataStage, QualityStage, and other IBM Software projects (Project 1 ... Project N) all run on the Information Server multi-process grid framework over a shared pool of nodes (Node 1 ... Node N), eliminating the SMP nightmare and allowing unlimited scalability]

“Virtualized” infrastructure:
– Creates a virtual data integration collaboration environment
– Virtualizes application services execution
– Dynamically fulfills requests over a virtual pool of system resources (nodes)
– Offers an adaptive, self-managed operating environment that guarantees high availability
10
GRID V. Big Machine

11
Grid v. Cluster Configuration
 DataStage parallel job framework supports both clustered and grid configurations
 Both configurations enable parallel jobs to run across multiple servers
 Clustered configuration
– Configuration files are statically defined
• Require modification when a single node fails
• Management of node utilization is more difficult because of the static nature of the configuration files
– Same configuration file tends to be used many times for the same server while other nodes remain idle
– Known servers for the application
 Grid configuration
– Configuration files are dynamically created at runtime (see the sketch below)
– Resource manager
• Used to identify idle servers and to create a configuration file that utilizes them
• Prevents multiple jobs from using the same nodes and resources
• Queues submitted jobs until required resources are available
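A hedged sketch of what a generated grid configuration might look like for a job granted two compute hosts (host names and paths here are placeholders, not toolkit output): the file has the same shape as a static cluster configuration, but it is written at job submission using whichever idle nodes the resource manager assigns.

{
  node "partition1"
  {
    fastname "compute07"
    pools ""
    resource disk "/opt/IBM/InfSrv/DataStage/Datasets" {pools ""}
    resource scratchdisk "/scratch" {pools ""}
  }
  node "partition2"
  {
    fastname "compute12"
    pools ""
    resource disk "/opt/IBM/InfSrv/DataStage/Datasets" {pools ""}
    resource scratchdisk "/scratch" {pools ""}
  }
}

The next run of the same job may be handed different hosts; nothing in the job design changes, only the generated file.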

12
Parallel Engine Runtime Architecture
 Configuration file maps a job to a runtime infrastructure of processing nodes and resources
 Uses a multi-process architecture to facilitate scalability across server boundaries
 Processes exist in a hierarchical framework, illustrated on a subsequent slide
– Conductor node (one)
• Section leaders (one per degree of parallelism)
– Players (stages/operators within a job flow)
 Communication takes place between processes using channels, illustrated on a subsequent slide
– Shared memory between processes on the same node
– TCP/IP sockets across nodes

13
Parallel Engine Runtime Architecture

[Diagram: one Conductor process (C) on the conductor node; one Section Leader (SL) per processing node, each managing Player processes (P)]

• Conductor node
– Main process used to start up jobs
– Determines resource assignments
– Composes the Score
– Creates Section Leader processes (one per degree of parallelism)
– Consolidates messages to the DataStage log
– Manages orderly shutdown
• Section Leader (one per degree of parallelism)
– Creates and manages Player processes
– Manages communication between Player processes and the Conductor node
• Players
– Group of processes used to execute framework operator logic
– Created on the same system as their Section Leader process
14
Inter-Process Communication Channels

[Diagram: the Conductor communicates with each Section Leader (0, 1, 2) over a control channel (TCP) and stdout/stderr channels (pipes), via the APT_Communicator; each Section Leader manages its local Player processes, here generator,0-2 and copy,0-2]
15
Example DataStage Job

16
Framework Architecture

[Diagram: conductor node (C) plus two processing nodes, each with a Section Leader (SL) and Player processes (P), as on the preceding slides]

{
  node "Conductor"
  {
    fastname "ProdInfSrv"
    pools "conductor"
    resource disk "/home/dsadm/Ascential/DataStage/Datasets" {pools ""}
    resource scratchdisk "/tmp" {pools ""}
  }
  node "partition1"
  {
    fastname "InfSrv-Compute1"
    pools ""
    resource disk "/opt/IBM/InfSrv/DataStage/Datasets" {pools ""}
    resource scratchdisk "/tmp" {pools ""}
  }
  node "partition2"
  {
    fastname "InfSrv-Compute2"
    pools ""
    resource disk "/opt/IBM/InfSrv/DataStage/Datasets" {pools ""}
    resource scratchdisk "/tmp" {pools ""}
  }
}

17
Grid Environment
 Each Server must run the same operating system
 Server roles
– One Server is primary (Head node)
• Provides software and services to the compute nodes
• DataStage clients connect to this Server
• Sized based on need
– Compute nodes
• Smaller machines
• Typically 2 CPUs with 4 GB of memory
 Servers are connected by a TCP/IP network
– Ports >= 10,000
• Firewall ports must be open between nodes
– 1 Gb (gigabit) Ethernet switches
18
File Storage Solutions NAS or SAN?
 NAS – Network Attached Storage (Filer)
– Inexpensive solution for Grid computing
– Designed specifically for Data Storage
– Connections via Ethernet
– All nodes see same image via NFS mounts
– Runs on embedded operating system
– Easy to use
– Compute nodes do all I/O
 SAN – Storage Area Network
– Typical external Array like Shark or EMC
– Connected via Fibre Channel
– Global File System (GFS) allows all nodes to see same Image

19
Sizing the Grid
 Head node
– Size based on:
• Number of concurrent users and/or jobs
• Number of projects
– Typical example
• 8G memory
• Two dual core CPUs (3 GHz)
 Fail-over node (mostly compute, failover for Head node)
– Same memory requirement
 Compute nodes
– Number
• Minimum of three (fewer than three is a cluster rather than a grid)
• Every job needs one node
– Number of jobs or job sequences that will run concurrently = number of nodes needed
– Additional node or two for failover
– Hardware requirements
• Two 1 Gb NICs
– Private network: parallel framework traffic and node builds only
– Public network: to outside sources, e.g., databases
• “Bare-bones” machines
– No HA hardware

20
Grid Administration
 NIS
– Single login administration
– Files administered by NIS
• /etc/passwd, shadow, group
• /etc/hosts, services
– Managed by /etc/nsswitch.conf
 PXE boot
– Preboot Execution Environment
– Power-on compute node OS install
– Post-install procedures include installing:
• Resource Manager
• NFS configuration
• NIS configuration
• Kernel changes
• Resource monitoring configuration
 NFS/GFS mounts
– Software
– User home directories
• ssh keys
• Trusted hosts
21
Resource Monitoring
 Monitoring a single machine typically involves using commands such as top, sar, and vmstat
– Shows only a single machine
– Does not show what is occurring across machines
– Requires connecting to the individual machine
 Monitoring a Grid involves more
– How are the machines in the Grid performing?
– View a single node
– View the entire grid
– A web interface is convenient
 Monitoring
– CPU
– Network
– Memory
 Things that are important
– How a job utilizes nodes
– How all nodes are being utilized

22
Resource Monitor - Ganglia

[Screenshot: Ganglia web interface showing aggregate (grid-wide) information and individual node information]
23
Resource Managers

Supported resource managers:
– Commercial: IBM LoadLeveler, Platform LSF, PBSpro, DataSynapse
– Open Source: SGE, Torque, openPBS, Condor
24
Resource Manager Responsibility
Node allocation / holding of resources until the task completes
 Nodes that are in use cannot be used again until the previously assigned task completes
 If no additional nodes are available, jobs are queued and released using a FIFO method
 Must have enough nodes to run the maximum number of concurrent job streams
Smart usage/allocation by type of job
Define queues based on priority
 Down or offline servers are not assigned

25
Resource Manager Responsibility
 License restriction (nodes allocated by type of task)
– Information Analyzer: 4 nodes
– QualityStage: 6 nodes
– DataStage: 10 nodes
• Based on time of day
– From a shared pool of nodes
 Licensed for 8 nodes, but 48 nodes are available (shared with other applications)
 Assigned to logical, not physical, nodes
 Each resource manager handles these tasks differently
 Real time during the day, batch nightly or on weekends
26
Correct Node Allocation
 Each job sequence or job should maximize utilization on a single node before using multiple nodes
– Maximize on a single node first
– Then add more nodes
 Most jobs or job sequences can use a single node and partition (see the sketch below), based on:
 Run time (i.e., less than 5 minutes)
 Data volume (fewer than a few thousand rows)
 Concurrency of jobs from a single job sequence
 Others can use 2 or more nodes based on runtime / data volume requirements
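A minimal sketch of the single-node case (host name and paths are illustrative): several logical partitions point at the same compute host, so the job parallelizes on that machine before any additional hosts are used.

{
  node "part1"
  {
    fastname "compute01"
    pools ""
    resource disk "/opt/IBM/InfSrv/DataStage/Datasets" {pools ""}
    resource scratchdisk "/scratch" {pools ""}
  }
  node "part2"
  {
    fastname "compute01"
    pools ""
    resource disk "/opt/IBM/InfSrv/DataStage/Datasets" {pools ""}
    resource scratchdisk "/scratch" {pools ""}
  }
}

Adding further entries with the same fastname raises the degree of parallelism on that host; only when runtime or data volume demands it should entries for a second host be added.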
27
Information Server With Grid
 Must be Enterprise Edition
 Installation (grid enablement toolkit)
• Modifies dsenv and Administrator environment variables to include parameters for grid-enabled components
• Engine must be restarted to enable jobs to be submitted
• A test script is created for quick validation
 Jobs can be run from the GUI as normal
 Jobs can be run from the command line
 All job statistics and logs are the same as on SMP

28
Grid Computing Discovery, Architecture
and Planning Workshop
 Grid Readiness Assessment
– Customer requirements and goals
– Existing standards and infrastructure
– Administrative and Developer skill sets
 Technical Overview of the IBM IIS Grid Enablement Toolkit
– Architecture
– Hardware and Software Requirements
• including Operating System and Grid Resource Manager requirements
– Administrative and Support Considerations
 Deliverable: Grid Deployment Plan
– Overview of IPS Grid offering
– Hardware and O/S recommendations
– Software requirements
– Infrastructure and support recommendations
 Duration:
– 2 Days On-site, 1 week offsite to prepare report
29
Installation and Deployment Workshop
– Installation of software components
• IBM IP&S Grid Enablement Toolkit
• (optional) hardware and Operating System configuration
• (optional) “Build Your Own Grid” Toolkit
for NIS configuration, PXE boot processing, Resource Manager
– Configuration and Testing of components
• Grid Resource Manager
• IPS Grid Enablement Toolkit
• IPS Information Server
– Job design review
• Adapting and optimizing jobs to run in the Grid environment
– Grid Toolkit mentoring

• Management and Administration


• Performance Considerations
– Duration: 2 to 4 weeks on-site, depending on scope

30
Disclaimer
© Copyright IBM Corporation [current year]. All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule
Contract with IBM Corp.

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS” WITHOUT WARRANTY OF
ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT
PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE.  IBM
SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE
RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS
PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR
REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND
CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR
SOFTWARE.

IBM, the IBM logo, ibm.com, InfoSphere, and DataStage are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked
on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or
common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered
or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and
trademark information” at www.ibm.com/legal/copytrade.shtml

Other company, product, or service names may be trademarks or service marks of others.

31
