KM510 Course Guide PDF
November, 2015
NOTICES
This information was developed for products and services offered in the USA.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for
information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to
state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any
non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive, MD-NC119
Armonk, NY 10504-1785
United States of America
The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND,
EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in
certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these
changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the
program(s) described in this publication at any time without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of
those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Information
concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available
sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the
examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and
addresses used by an actual business enterprise is entirely coincidental.
TRADEMARKS
IBM, the IBM logo, ibm.com and InfoSphere® are trademarks or registered trademarks of International Business Machines Corp., registered in
many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is
available on the web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.
Adobe, and the Adobe logo, are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other
countries.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
© Copyright International Business Machines Corporation 2015.
This document may not be reproduced in whole or in part without the prior written permission of IBM.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents
Preface
Contents
Course overview
Document conventions
Additional training resources
IBM product help
Information Server technical overview
Unit objectives
Information Server functional categories
IS products support these functional categories
Role-based tools with integrated metadata
Topic: Data quality products and components
Information Analyzer
QualityStage functionality
Why data cleansing with QualityStage is needed
Topic: Data governance
Information Governance Catalog
The Information Governance Catalog (IGC)
Stewardship Center
Topic: Transformation
Using Information Server to transform data
DataStage
FastTrack
Topic: Delivery
Information Services Director
Topic: Information Server Architecture
Information Server architecture
Information Server backbone
Parallel processing engine
Information Server architectural tiers
Architecture diagram
Platform topologies
Client tier
Course overview
Preface overview
This course introduces those charged with administering Information Server v11.5,
and its suite of products and components, to the basic administrative tasks
necessary to support Information Server users and developers. The course begins with
a functional overview of Information Server and the products and components that
support these functions. It then focuses on the basic administrative tasks an
Information Server administrator needs to perform, including user management,
session management, and reporting management. The course covers both the
Information Server administrative clients, such as the Administration Console and
Metadata Asset Manager, and command-line tools such as istool and
encrypt.
Intended audience
Those who will be administering Information Server and/or its product components.
Topics covered
Topics covered in this course include:
• Information Server Technical Overview
• Working with Information Server Clients
• Authentication and Suite Security
• Session Management
• Managing Reports
• Administrative Tools
• Managing Information Server Repository Assets
Course prerequisites
There are no prerequisites for this course.
Document conventions
Conventions used in this guide follow Microsoft Windows application standards, where
applicable. As well, the following conventions are observed:
• Bold: Bold style is used in demonstration and exercise step-by-step solutions to
indicate a user interface element that is actively selected or text that must be
typed by the participant.
• Italic: Used to reference book titles.
• CAPITALIZATION: All file names, table names, column names, and folder names
appear in this guide exactly as they appear in the application.
To keep capitalization consistent with this guide, type text exactly as shown.
• Task-oriented: You are working in the product and you need specific task-oriented
help. Use the IBM Product - Help link.
Unit objectives
• List the Information Server functional categories
• List the Information Server products and components that support the
Information Server functional categories
• List the Information Server software architectural tiers
Topic: Data quality products and components
Information Analyzer
• In-depth analysis of existing data systems
Analysis of application, database, and file-based source data for content, quality, and structure
Figure labels: Subject Matter Experts, Data Analysts
In order to ensure data quality you need to measure the quality of data in data resource
systems. This is accomplished using Information Analyzer through data-centric profiling
and analysis of source systems, including column analysis, table analysis, and cross-
table analysis, which provide detailed profiling of the data in each column (cardinality,
nullability, range, scale, length, precision). This activity is typically conducted by data
analysts and subject matter experts.
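As a rough illustration of what column analysis measures, the following sketch computes cardinality, nullability, value range, and maximum length for a single column in plain Python. It does not show how Information Analyzer is implemented or invoked; the function name `profile_column` and the sample values are hypothetical.

```python
from typing import Any, Optional


def profile_column(values: list[Optional[Any]]) -> dict[str, Any]:
    """Compute simple profile statistics for one column of data.

    Hypothetical helper for illustration only; it is not part of
    Information Analyzer.
    """
    non_null = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "nullable": len(non_null) < len(values),     # True if any null was seen
        "cardinality": len(set(non_null)),           # number of distinct values
        "min": min(non_null) if non_null else None,  # lower bound of the range
        "max": max(non_null) if non_null else None,  # upper bound of the range
        "max_length": max((len(str(v)) for v in non_null), default=0),
    }


# Made-up sample column: customer ages with one missing value.
ages = [34, 51, None, 34, 27]
profile = profile_column(ages)
print(profile)
```

Running the same function against a column at regular intervals and comparing the results with a stored baseline is one simple way to picture ongoing profiling.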
Information Analyzer provides insight into the quality and usage characteristics of the
information. It can also help uncover data relationships across systems, through foreign
key affinity mapping. Profiling is designed to become an ongoing process that compares
current quality against a baseline, to understand how data quality changes over time
and to ensure that the underlying assumptions still hold true.
QualityStage functionality
• Provides specialized data quality processing
Ensures clean, standardized, de-duplicated information
Enables a single version of the truth
Supports global postal verification
• Provides visual tools for designing quality rules and matching logic
Precisely calibrates matching rules
• Allows quality logic to be deployed seamlessly within DataStage Extraction, Transformation, Load (ETL) jobs
Figure labels: Subject Matter Experts, Data Analysts; Standardize and correct source data fields, and match records together across sources to create a single view; Seamlessly integrated into DataStage; Visual Match Rule Design
Information Server technical overview © Copyright IBM Corporation 2015
QualityStage is a product that helps to identify and resolve the data cleansing issues
previously discussed. It provides data quality functions on an easy-to-use, design-as-
you-think flow diagram. This allows data quality to be embedded in any information
integration process.
QualityStage data quality functions include:
• Free-form text investigation: Enables you to recognize and parse out individual
fields of data from free-form text
• Standardization: Enables individual fields to be made uniform according to your
standards
• Address verification and correction: Uses postal information to standardize,
validate, and enrich address data
• Matching: Enables duplicates to be removed from individual sources, and
common records across sources to be identified and linked
• Survivorship: Enables the best data from across different systems to be merged
into a consolidated record
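The standardization, matching, and survivorship functions listed above can be pictured with a small Python sketch. It is a simplification under stated assumptions: the abbreviation table, the exact-key matching, and the "first non-empty value wins" survivorship rule are all hypothetical stand-ins, and QualityStage itself uses rule sets and calibrated matching rules rather than this logic.

```python
from collections import defaultdict

# Hypothetical standardization table; real rule sets are far richer.
ABBREVIATIONS = {"st": "street", "st.": "street", "ave": "avenue", "ave.": "avenue"}


def standardize(text: str) -> str:
    """Lowercase, collapse whitespace, and expand known abbreviations."""
    words = text.lower().split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def match(records: list[dict]) -> list[list[dict]]:
    """Group records whose standardized name and address are identical."""
    groups: dict[tuple[str, str], list[dict]] = defaultdict(list)
    for record in records:
        key = (standardize(record["name"]), standardize(record["address"]))
        groups[key].append(record)
    return list(groups.values())


def survive(group: list[dict]) -> dict:
    """Build one consolidated record: the first non-empty value per field wins."""
    merged: dict = {}
    for record in group:
        for field, value in record.items():
            if value and not merged.get(field):
                merged[field] = value
    return merged


# Made-up records from two source systems.
records = [
    {"name": "Ann  Lee", "address": "12 Main St.", "phone": ""},
    {"name": "ann lee", "address": "12 main street", "phone": "555-0100"},
    {"name": "Bob Ray", "address": "9 Oak Ave", "phone": "555-0199"},
]
consolidated = [survive(group) for group in match(records)]
print(consolidated)
```

The first two records standardize to the same key, so they are linked and merged into one record that carries the phone number from the second source.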
There are several types of problems within enterprise data stores that QualityStage is
designed to handle:
• The first is a lack of information standards. Names, addresses, part numbers, and
other data are entered in inconsistent ways, particularly across different systems.
• Another common issue involves data surprises in individual fields. Data in the
database is often misplaced, or fields are used for multiple purposes – as where
a name field contains company and address information, a tax ID field contains
telephone numbers, and the telephone field has a variety of mistakes.
• A third common problem is information buried in free-form fields. In this case
valuable information is hidden away in text fields. Since these fields are difficult to
query using SQL, this information is often not leveraged, although it likely has
value to the business. This type of problem is common in product information and
Customer Support case records.
• The fourth problem is data myopia – a term for the lack of consistent identifiers
across different systems. Without adequate foreign-key relationships, it is
impossible to get a complete view of information across systems. This example
shows three products that look very different, but are actually the same.
• The final problem is redundancy within individual tables. This is extremely
common, where data is re-entered into systems because the data entry
mechanism is not aware that the original record is already there.
Topic: Data governance
Stewardship Center
• Business process workflow application to resolve data quality issues
Data quality exceptions can be generated in Information Analyzer,
DataStage, and QualityStage
• Business process workflow application to support the development of a
governance catalog
Governance events can be generated in the information governance catalog
− E.g., a catalog term has been submitted for approval
• Stewardship Center application sends an email to notify the reviewers
Using the Stewardship Center you can implement business process workflows that
facilitate data quality and governance applications.
Data quality exceptions can be generated in several Information Server products and
components including DataStage, QualityStage, and Information Analyzer. With the
Stewardship Center you can specify workflows to handle these exceptions. These
workflows include email notifications and signoffs.
The Stewardship Center also supports the development of a governance catalog in the
Information Governance Catalog. For example, the Stewardship Center can send
notifications to Catalog term reviewers for their approvals.
Topic: Transformation
DataStage
• Create codeless, visual designs of ETL data flows using built-in transformation components (stages) and links
Use stages to extract data from and load data to data resources, including database tables, sequential files, and enterprise resources
Links specify the flow of data from one stage to another
Can create reusable sets of components (shared containers) that can be shared across jobs, projects, and developers
• Complete ETL functionality with metadata-driven productivity
• Supports team-based development and collaboration
Figure labels: Developers, Architects; Transform and aggregate any volume of information in batch or real time through visually designed logic
DataStage is the main Information Server product that is focused on transformation and
movement of information. DataStage enables codeless visual design of data flows, and
includes built-in transformation components (stages) and connectors.
DataStage is built around team collaboration and reuse. Everything from individual
stages, to connections, to entire data flows can be reused across different jobs and
projects. In addition, DataStage leverages the shared platform services for parallel
processing, administration, deployment, and connectivity.
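DataStage flows are designed visually rather than coded, but the idea of stages connected by links can be sketched in Python for readers who think in code. In this sketch each function plays the role of a stage and each hand-off of the row stream plays the role of a link. All names (`source_stage`, `uppercase_country`, `only_positive`, `run_flow`) and the sample rows are hypothetical, and nothing here reflects how DataStage jobs are actually represented or executed.

```python
from typing import Callable, Iterable, Iterator

Row = dict
Stage = Callable[[Iterable[Row]], Iterator[Row]]


def source_stage(rows: list[Row]) -> Iterator[Row]:
    """Extract stage: emits rows from an in-memory stand-in for a table."""
    yield from rows


def uppercase_country(rows: Iterable[Row]) -> Iterator[Row]:
    """Transform stage: normalizes the country code without mutating the input."""
    for row in rows:
        yield {**row, "country": row["country"].upper()}


def only_positive(rows: Iterable[Row]) -> Iterator[Row]:
    """Filter stage: drops rows with a non-positive amount."""
    return (row for row in rows if row["amount"] > 0)


def run_flow(rows: list[Row], stages: list[Stage]) -> list[Row]:
    """Connect the stages in order; each hand-off acts as a link."""
    stream: Iterable[Row] = source_stage(rows)
    for stage in stages:
        stream = stage(stream)
    return list(stream)  # the returned list stands in for the load target


# Made-up input rows.
orders = [
    {"id": 1, "country": "us", "amount": 10.0},
    {"id": 2, "country": "de", "amount": -5.0},
    {"id": 3, "country": "fr", "amount": 7.5},
]
loaded = run_flow(orders, [uppercase_country, only_positive])
print(loaded)
```

Because each stage only depends on the stream it receives, the same stage functions can be reused in other flows, which loosely mirrors the reuse of stages across jobs and projects described above.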
FastTrack
• Used in conjunction with DataStage
• Build mapping specifications that describe and document DataStage ETL jobs
• Generate DataStage jobs from the mapping specifications
• Reverse-engineer DataStage jobs into mapping specifications
Figure labels: Business Users; FastTrack mapping specification; Generated DataStage job
Mapping specifications specify how data is mapped and transformed from source fields
to target fields. Business analysts create mapping specifications, leveraging source
analysis, target models, and metadata to facilitate the mapping process. Prototype
DataStage ETL jobs can be generated from these FastTrack mapping specifications.
These mapping specifications guide the DataStage developers' work, and give
them a head start in designing and building their DataStage jobs.
DataStage jobs can also be "reverse-engineered" back into mapping specifications that
document their mappings and transformations.
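To make the idea of a mapping specification concrete, the sketch below models one as plain data: each target field names its source field and a transformation. The same structure is used both to transform a row and to produce documentation lines, which loosely mirrors the two roles described above. The field names, the `MappingSpec` type, and both helper functions are hypothetical; this is not the FastTrack specification format.

```python
from typing import Any, Callable

# target field -> (source field, transformation); all names are made up.
MappingSpec = dict[str, tuple[str, Callable[[Any], Any]]]

customer_mapping: MappingSpec = {
    "CUSTOMER_NAME": ("cust_nm", str.strip),
    "COUNTRY_CODE": ("ctry", str.upper),
    "ORDER_TOTAL": ("total", float),
}


def apply_mapping(spec: MappingSpec, source_row: dict[str, Any]) -> dict[str, Any]:
    """Produce a target row by applying each mapping rule to the source row."""
    return {
        target: transform(source_row[source])
        for target, (source, transform) in spec.items()
    }


def describe_mapping(spec: MappingSpec) -> list[str]:
    """Render the specification as documentation lines, one per target field."""
    return [
        f"{source} -> {target} via {transform.__name__}"
        for target, (source, transform) in spec.items()
    ]


source_row = {"cust_nm": "  Ann Lee ", "ctry": "us", "total": "42.50"}
target_row = apply_mapping(customer_mapping, source_row)
print(target_row)
print("\n".join(describe_mapping(customer_mapping)))
```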
Topic: Delivery
Topic: Information Server architecture
Figure labels (Information Server backbone): Metadata Access Services, Metadata Analysis Services, Services Domain
Parallel processing engine
• Scale up by adding processors or nodes with no design change or re-compilation
• External configuration file specifies hardware configuration and resources
Figure labels: Single processor; SMP System; MPP, GRID, and Clustered Systems
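The scaling idea above, where the job design stays the same and only an external configuration changes, can be sketched as follows. The `config` dictionaries are a hypothetical stand-in and do not reproduce the real configuration file format; the partitioning rule (key modulo node count) is likewise just one simple choice made for the example, and the partitions are processed sequentially here rather than in parallel.

```python
def run_job(rows: list[dict], config: dict) -> dict[str, float]:
    """Run the same job logic for any node configuration.

    Rows are partitioned across the configured nodes by key, then each
    partition is summed. A real engine would process partitions in parallel.
    """
    nodes = config["nodes"]
    partitions: dict[str, list[dict]] = {node: [] for node in nodes}
    for row in rows:
        node = nodes[row["id"] % len(nodes)]  # modulus partitioning on the key
        partitions[node].append(row)
    return {node: sum(r["amount"] for r in part) for node, part in partitions.items()}


# Two hypothetical configurations; the job function above is unchanged.
single_node = {"nodes": ["node1"]}
four_nodes = {"nodes": ["node1", "node2", "node3", "node4"]}

rows = [{"id": i, "amount": 10} for i in range(8)]
totals_single = run_job(rows, single_node)
totals_four = run_job(rows, four_nodes)
print(totals_single)
print(totals_four)
```

Both runs produce the same overall total; only the distribution of work across nodes differs, which is the point of keeping hardware details out of the job design.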
Architecture diagram
Figure labels: Information Server; Information Server Platform Services (1); Repository (1); Client (1 .. N); Platform Services; Working Areas; Engine (1 .. N); DataStage/QualityStage; Scratch and Dataset; Information Server Engine; Information Analyzer data
Information Server clients include:
• Information Server Web Console (IS administration/reporting)
• DataStage/QualityStage clients (Administrator, Designer and Director)
• FastTrack client
• Metadata Workbench client
• Information Server Console: hosts Information Analyzer and Information Services
Director
• WebSphere Application Server (WAS) client
• Information Server Manager
• Multi-Client Manager
• Information Server Command Line Interface (istool)
Services:
• Uses IBM WebSphere Application Server (WAS) to implement the J2EE services
functionality
Repository:
• DB2, Oracle, SQL Server, etc.
Parallel engine:
• A C++ compiler is required to compile DataStage, QualityStage, and Information
Analyzer jobs into an executable form capable of being run by the parallel engine.
Platform topologies
Figure: Two-systems deployment and three-systems deployment, each showing the Services tier
The diagram shows DB2 as the Repository database server, but Oracle and SQL
Server are also supported, as previously noted. Although only one Engine is shown for
each topology, Information Server supports multiple parallel engines on the same or
separate systems.
All tiers should be installed in the same physical LAN, connected by high-speed
network connections.
The Services and Engine platform types must match. The Repository database need
not match the platform type of the Services and Engine.
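The platform-matching rule above can be expressed as a small check. This is only a sketch for reasoning about a planned topology: the `TierHost` class, the `check_topology` function, the host names, and the platform labels are all hypothetical and do not correspond to any Information Server installer option or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierHost:
    """One planned tier installation; all values used below are made up."""
    tier: str      # "services", "engine", or "repository"
    host: str
    platform: str  # hypothetical platform label, e.g. "linux"


def check_topology(
    services: TierHost, engines: list[TierHost], repository: TierHost
) -> list[str]:
    """Return a list of problems; an empty list means the rule is satisfied."""
    problems: list[str] = []
    for engine in engines:
        if engine.platform != services.platform:
            problems.append(
                f"engine {engine.host} ({engine.platform}) does not match "
                f"services {services.host} ({services.platform})"
            )
    # The repository platform is deliberately not compared: it need not match.
    return problems


services = TierHost("services", "svc1", "linux")
repository = TierHost("repository", "db1", "aix")
ok = check_topology(services, [TierHost("engine", "eng1", "linux")], repository)
bad = check_topology(services, [TierHost("engine", "eng2", "windows")], repository)
print(ok)
print(bad)
```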
Client tier
• Provides access to both Suite clients and Suite product and component clients
• Administrative clients include Information Server clients as well as clients specific
to Information Server hosted products:
Information Server Administration Console
− Security
− Session maintenance
− Logging and reporting management
DataStage Administrator client
− DataStage global and project configuration and defaults
Other Information Server products have a single client used for both administration and
user tasks
− Administrative tasks require product administrator authorization
• Clients for specific Information Server products and functional components:
Appropriate interfaces for the type of user (business or technical)
Facilitate the Information Server data quality, governance, integration, and delivery
functions
Information Server products and components can be accessed through client
components. The client tier contains both administrative clients and user clients.
Some products and functionality are accessed through a web browser. These are
called "thin clients", because the functional components exist on the server but are
delivered to the web browser.
Other clients are called "thick clients", because functional components are installed and
exist on the client computer system as well as the server computer system.
Services tier
• Set of shared services that centralize core tasks across the platform
Administrative tasks such as security, user administration, logging, and
reporting
Repository services
Shared services allow these tasks to be managed and controlled in one
place, regardless of which product is using the service
• Various product components add additional product-specific services
to those that are deployed
• Deployed by IBM WebSphere Application Server (WAS)
The Services tier consists of a set of shared services that centralize core tasks across
the platform.
Some services address functionality that is unique to a specific Information Server
product or component. Other services, such as security services, are used across
multiple products and components.
The services tier is deployed within an IBM WebSphere Application Server (WAS)
instance. The computer system running the WAS instance is referred to as the domain
or services host system.
Engine tier
• Components
Engine: The high-performance, parallel engine that performs analysis,
cleansing, and transformation
Connectors: Provide common connectivity to external resources such as
DB2, Teradata, Oracle, Sybase, InfoSphere MQ, and others
Packs: provide high-speed connectivity to packaged enterprise applications
QualityStage modules: a set of integrated modules for accomplishing data
cleansing and re-engineering tasks such as Investigating, Standardizing,
Matching and Survivorship
Service Agents: manage bidirectional communication between the engine
processes and the Repository
• To deploy the Engine tier to multiple machines, the Information Server
engine installation software is copied or NFS mounted to each engine
server
The engine tier consists of the following pieces:
• Information Server Parallel Engine: The high-performance, parallel engine that
performs analysis, cleansing and transformation processing
• Connectors: Provide common connectivity to external resources such as DB2,
Teradata, Oracle, Sybase, InfoSphere MQ, and others
• Packs: provide high-speed connectivity to packaged enterprise applications
• QualityStage Modules: a set of integrated modules for accomplishing data
cleansing and re-engineering tasks such as Investigating, Standardizing,
Matching and Survivorship
• Service Agents: manage bidirectional communication between the engine
processes and the Metadata Repository
To deploy the engine tier to multiple computers, the Information Server engine software
is copied or NFS mounted to each server.
Repository tier
• Stores objects and metadata for Information Server and each of its
hosted products
• Enables Information Server products to share metadata with each
other throughout the data integration lifecycle
• For the Repository database (named XMETA by default), the
Information Server installation package comes with DB2
An existing DB2 instance can also be configured
If another DBMS is used (for example, Oracle), scripts must be run before
the installation to configure the Repository
The Information Server Repository stores the objects and metadata produced and
consumed by Information Server hosted products and components. The Repository is
implemented as a database, named XMETA by default. Since all the products hosted
by Information Server use the same XMETA database, metadata produced by one
product can be shared with other Information Server products.
For the XMETA database, DB2 is supported. DB2 can be installed as part of the
Information Server installation or an existing DB2 instance can be used. Other
database systems, such as Oracle, are also supported.
Tier interaction
Figure: Tier interaction among Client, Services, and Repository. Step 2: Authentication Service retrieves credential information.
DataStage clients log in to the IS server and retrieve the DataStage credentials that the
users are mapped to. The DataStage client, using the IS Authentication Service, logs
in to the IS server as follows:
• The host name and port number provided in the DataStage login window are
used to make an HTTP request to the IS server.
• The HTTP request returns the JNDI properties needed to establish a
remote EJB session between the client and the IS server. One of these JNDI
properties is the provider URL, which includes the host name and port number
(from the InfoSphere serverindex.xml file). The client uses JNDI lookups to call
and work with IS services using the retrieved JNDI properties.
• The IS server returns the mapped credentials for the user to the client. Even if
credential mapping is turned off (shared user registry mode), the credentials
needed to log in to the DataStage server are returned from the IS server (in this
case, they are the same as the credentials used to log in to the IS server).
These credentials allow the client to log on to the various DataStage servers installed.
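The credential-mapping behavior in the last step can be modeled with a toy sketch. It only illustrates the decision described above: return the mapped engine credentials, or return the login credentials unchanged when a shared user registry is in use. The `AuthenticationService` class, its constructor arguments, and the user names and passwords are hypothetical; the sketch does not use HTTP, JNDI, or EJB, and it is not the real IS Authentication Service API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credentials:
    user: str
    password: str


class AuthenticationService:
    """Toy stand-in for the IS Authentication Service (not the real API)."""

    def __init__(
        self,
        is_users: dict[str, str],
        credential_map: dict[str, Credentials],
        shared_registry: bool,
    ) -> None:
        self._is_users = is_users              # IS user name -> password
        self._credential_map = credential_map  # IS user name -> engine credentials
        self._shared_registry = shared_registry

    def login(self, supplied: Credentials) -> Credentials:
        """Authenticate against IS, then return the credentials for the engine."""
        if self._is_users.get(supplied.user) != supplied.password:
            raise PermissionError(f"IS login failed for {supplied.user}")
        if self._shared_registry:
            # Credential mapping is off: the same credentials are returned.
            return supplied
        return self._credential_map[supplied.user]


# Made-up users and passwords.
is_users = {"dsdev1": "is-secret"}
mapping = {"dsdev1": Credentials("dsadm", "engine-secret")}
login = Credentials("dsdev1", "is-secret")

mapped = AuthenticationService(is_users, mapping, shared_registry=False).login(login)
shared = AuthenticationService(is_users, {}, shared_registry=True).login(login)
print(mapped.user, shared.user)
```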
Purpose:
Check your understanding of the different products, components, and
functions of the Information Server product suite. As you take this test feel
free to review the unit content.
Question 1: Which Information Server product performs data cleansing? Choose one.
A. Information Analyzer
B. QualityStage
C. DataStage
D. FastTrack
E. Information Governance Catalog
Question 2: Which Information Server product performs data transformation? Choose
one.
A. Information Analyzer
B. QualityStage
C. DataStage
D. FastTrack
E. Information Governance Catalog
Question 3: Information Governance Catalog profiles source data.
A. True
B. False
Question 4: Which Information Server layer is comprised of design and operational
databases? Choose one.
A. Services (Domain)
B. XMETA Repository
C. Server
D. Client
Question 5: Which function does Information Server FastTrack perform? Choose two.
A. Creates business terms
B. Generates DataStage jobs
C. Runs metadata analysis reports
D. Creates data models
E. Creates mapping specifications for ETL jobs
Question 6: Which functions does Information Governance Catalog perform? Choose
two.
A. Browse Information Server information assets
B. Data cleansing
C. Cross-tool reporting
D. Data transformation
E. Data governance
Question 7: The DataStage Multi-Client manager allows you to do which of the
following? Choose one.
A. Connect to multiple repositories
B. Connect to multiple servers
C. Install multiple DataStage client versions on a workstation
D. Install multiple repositories
Question 8: QualityStage cleanses data by using logic provided by which object?
Choose one.
A. Rule Sets
B. Services
C. Connectors
D. Packs
Unit summary
• List the Information Server functional categories
• List the Information Server products and components that support the
Information Server functional categories
• List the Information Server software architectural tiers
Unit objectives
• Using the Information Server Launch Pad to access Information Server
thin clients
Administrative Console
Information Governance Catalog
Metadata Asset Manager
• Accessing Engine clients
DataStage / QualityStage clients
FastTrack
Information Server Manager
• Information Server Console clients
Accessing Information Analyzer
Accessing Information Services Director
Figure labels: Services; Information Server Console; Multi-Client Manager; DataStage clients; FastTrack
Topic: Thin clients
Administration Console
• Log in using Information Server administrator user ID
By default, isadmin
• Perform general Information Server administrative tasks, including:
Session management
User management
Reporting management
Here is the login window for the Administration Console. The login windows for other
thin clients are similar. Enter your user ID and password, with appropriate
authorization, in the User name and Password boxes. Then click the Login button.
To log in to the Administration Console you require the Suite Administration
authorization role. When Information Server is installed a user ID named isadmin is
created by default with this authorization.
The Administration Console is specifically designed for Information Server
administration tasks including session management, user management, and reporting
management.
Screenshot labels: Manage reports; Manage sessions; Manage users; Clean up Repository assets; Create connections and import information assets; Category name; Category description; Category terms; Category hierarchy
Topic: Engine clients
Engine clients
• The DataStage parallel engine is used not only by DataStage but also by several
other Information Server products and components, in particular Information
Analyzer
QualityStage is embedded within DataStage and also uses the DataStage parallel engine
• DataStage clients
Administrator client: Used for DataStage, QualityStage development and runtime
administration
Designer client: Build and run DataStage jobs
Director client: Primarily used by DataStage operators to run DataStage jobs
− A DataStage operator is a user who can run jobs but cannot build or administer them
• Operations Console
Monitor DataStage jobs as they run
Monitor Engine resource usage
• FastTrack
Create mapping specifications for DataStage jobs
• Multi-Client Manager
Switch between different DataStage client versions
The Information Server Engine system refers to a computer system where DataStage is
installed. For this reason it is also called the DataStage Engine. It is called the Engine
because this is the system where jobs are run that perform various Information Server
tasks. Within an Information Server domain there can be multiple engine systems.
DataStage actually has two engines: the parallel engine and the server engine. These
refer to two types of DataStage jobs that can be run: parallel jobs and server jobs.
When the word engine is used without qualification, it refers to the parallel engine.
Engine clients refers to the DataStage product clients (Designer, Administrator,
Director) as well as the clients for other products and components associated with
DataStage. The Operations Console is a client used to monitor running DataStage jobs.
This client is discussed in a later unit. The Multi-Client Manager is a client used to
switch between different versions of DataStage.
Screenshot labels: DataStage administrator user ID; DataStage server system; DataStage projects; Add/Delete projects; Specify project properties; DataStage project Repository; DataStage parallel job; DataStage components palette
Operations Console
Screenshot labels: Job activity; Engine status; System resources
In this graphic, you see the Dashboard tab of the Operations Console. The Operations
Console opens to the Dashboard tab, which contains three sections of information.
The Job Activity section shows which jobs are currently running and their statuses
within a time range, for example, last 3 days. In this example, no jobs are currently
running or have been running for the last few days.
The Operating System Resources section displays the CPU usage and free memory
that is currently available within a time range for DataStage jobs running on the
selected Engine.
The Engine Status section displays the current status of engine services, including the
Operations Console services and WLM (Workload Management).
FastTrack
• Fat client
• Logon procedure same as for other fat clients
• Used to create mapping specifications
Defines mappings, filters, and transformations between source and target
columns
DataStage jobs can be generated from mapping specifications
• Administrative tasks
Can define data source connections within FastTrack that are available
throughout Information Server
− Connections defined in Metadata Asset Manager are also available within
FastTrack
Import metadata of mapping specification sources and targets
FastTrack projects configuration
FastTrack
Logging into FastTrack is similar to logging into other fat clients. You specify the
services system and the port used to communicate with it, and you specify a user ID
and password with FastTrack credentials.
FastTrack is a product designed to work with DataStage. With FastTrack you can
create mapping specifications that document the mappings and transformations of a
DataStage job. This mapping specification can be used to document a DataStage job,
as well as to provide a DataStage developer with specifications for building it.
From mapping specifications, prototype DataStage jobs can be generated, which
implement the mappings and transformations specified in the mapping specification.
[Figure callout: New connection]
Multi-Client Manager
• DataStage clients for different versions of DataStage can be installed
on a single computer system
But only one version can be active at a time
• Use the DataStage Multi-Client Manager to switch between different
versions of DataStage clients
Multi-Client Manager
From a single client system you can connect to multiple DataStage server systems, but
the DataStage clients work with only one version of DataStage. In order to connect to
DataStage servers running different versions of DataStage you need to install clients for
both versions and use Multi-Client Manager to switch between them.
Topic:
Information Server console
clients
[Figure callouts: Engine credentials; ODBC connection to IADB; Validate database settings]
Demonstration 1
Working with Information Server Clients
Purpose:
This exercise helps familiarize you with the Information Server clients used
for administration. Information Server hosts a complex array of products.
Although you do not need to acquire in-depth knowledge of all of these
products, you need to have some understanding of what they do and how
they are used.
7. When you are done exploring the Administration tab, click and explore the
Reporting tab. In particular, expand the Report Templates folder to view the
templates for the different types of reports that can be created and run.
3. When you are finished exploring, click Logout at the top right to log out of
Metadata Asset Manager.
Task 3. Log into and explore the Information Governance
Catalog.
The Information Governance Catalog can be used to browse information assets
in the repository and to develop and maintain a governance catalog of
categories, terms, policies and rules. A sample governance catalog has been
imported into the Repository on the course lab image for you to explore.
1. From the Information Server Launch Pad, click the Information Governance
Catalog link.
2. If necessary, log in as student/student.
6. Explore, for instance, the database asset named SAMPLE. Click on the
SAMPLE link.
7. Click on the STUDENT schema in the Database Schemas folder.
Notice the database table metadata that is stored in the Information Server
Repository.
8. Click on, for example, the EMPLOYEE table link to display the columns of this
database table.
10. Explore some of the terms in the Physical Address category, for example,
StreetAddress.
2. Click the Home pillar icon (blue sphere in the upper left portion of the window)
and then click Configuration>Analysis Settings.
There are three tabs for configuration.
5. When Information Server is installed a database named IADB is created for use
by Information Analyzer to store profiling information. On the Analysis
Database tab, the connections used to access this database are specified.
In this example the non-JDBC connection is named IADB. This connection was
created in Metadata Asset Manager.
In this example the JDBC connection is named jdbc/IADB. This connection
was created during Information Server installation.
7. When a data profiling job is run, data in columns in a table are profiled. On the
Analysis Settings tab profiling defaults are specified. Shown are the default
settings configured when Information Analyzer was installed.
3. Select the DSProject DataStage project. Then click Properties to open the
Administrator Project Properties window. Explore each of the tabs.
4. The Environment button is used to specify environment variables that control the
DataStage development and runtime environments.
5. The Permissions tab is used by a DataStage Administrator to specify
DataStage project user roles.
6. Close DataStage Administrator.
7. Optionally, use the same procedure to log into and explore DataStage
Designer. When you log into Designer you log into a specific project. Here, log
into the project named DSProject.
The top left panel displays jobs that have run and are currently running on the
Engine. In this example, no DataStage jobs are running or have been running
over the last several days.
The top right panel displays the status of the Engine services. In this example,
all Engine services are running OK.
The bottom two panels display the CPU and memory resources on the system
running the Engine.
Unit summary
• Using the Information Server Launch Pad to access Information Server
thin clients
Administrative Console
Information Governance Catalog
Metadata Asset Manager
• Accessing Engine clients
DataStage / QualityStage clients
FastTrack
Information Server Manager
• Information Server Console clients
Accessing Information Analyzer
Accessing Information Services Director
Unit objectives
• Configure Suite Users and Groups
• Configure DataStage credentials for DataStage Engine users
Topic:
Information Server user
configuration
Suite roles
• Suite Administrator: Maximum privileges
• Suite User: Minimum requirement to access any IS suite component
or product
• Common Metadata Administrator
Full functionality within Metadata Asset Manager to browse and manage
metadata assets
• Common Metadata Importer
Required to import metadata assets into the Repository in Metadata Asset
Manager
• Common Metadata User
Required to browse metadata assets in Metadata Asset Manager
Suite roles
There are five Suite roles. The bottom three roles apply to the Metadata
Asset Manager product and related tasks using istool.
There are two standard Suite roles: Suite Administrator, Suite User. A Suite
Administrator can log into the Information Server Administration Console and perform
any task, including creating user IDs. A Suite User has limited authority within the
Information Server Administration Console and other IS products. A Suite User can, for
instance, log into the Administration Console and view reports, but cannot create user
IDs.
The Suite User role is the minimum role required to do anything (requiring
authorization) within Information Server.
Suite roles
[Figure callouts: Suite Component roles; Group ID and other attributes]
Suite roles
[Figure callout: Suite Component roles]
Topic:
Credential mappings
Credential mappings
• By default the Information Server (DataStage) Engine uses a different
user registry than Information Server, namely the user registry of the
Engine operating system
• To use and access the Engine, a user ID must be mapped to an
Engine operating system user ID
• A default can be specified so that all (non-mapped) Information Server
user IDs get mapped to the same, default operating system user
• Credential mappings are specified in the Information Server
Administration Console
Click Domain Management > Engine Credentials
Credential mappings
If Information Server and DataStage do not share the same user registry, which is the
case in a default installation, then mappings must be created between Information Server user
IDs that have the DataStage Administrator or DataStage User role and user IDs that exist
locally in the operating system registry where DataStage is installed.
Assume that DataStage is using the operating system user registry, which is what it
uses by default. A credential mapping maps an Information Server user
ID (and password) that has a DataStage User or Administrator role attached to it to an
operating system user ID (and password).
Alternatively, a single operating system user ID and password can be specified as the
default operating system user ID that all Information Server user IDs are mapped to.
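The lookup order described above (an explicit mapping first, then the optional default) can be sketched in a few lines of Python. This is only an illustrative model, not how Information Server implements credential mapping; the function name and the sample data are assumptions, and passwords are left out for brevity.

```python
from typing import Dict, Optional


def resolve_engine_user(
    is_user: str,
    explicit_mappings: Dict[str, str],
    default_os_user: Optional[str] = None,
) -> Optional[str]:
    """Return the Engine OS user ID an Information Server user maps to.

    An explicit mapping wins; otherwise the default mapping (if one is
    configured) applies; otherwise the user has no Engine credentials.
    """
    if is_user in explicit_mappings:
        return explicit_mappings[is_user]
    return default_os_user


# Illustrative data only: one explicit mapping.
mappings = {"dev1": "dsadm"}
print(resolve_engine_user("dev1", mappings))           # explicit mapping
print(resolve_engine_user("dev2", mappings))           # no mapping, no default
print(resolve_engine_user("dev2", mappings, "dsadm"))  # falls back to the default
```

In this model a user with neither an explicit mapping nor a default resolves to None, which corresponds to a user who cannot access the Engine.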
Engine credentials
[Figure callout: Engine operating system user ID]
Engine credentials
[Figure callouts: IS user ID; Engine OS user ID]
Demonstration 1
Authentication and Suite Security
Purpose:
In this unit, you learn how to create users and groups in the Information
Server Administration Console.
Notice that there are several users already created, including student, which
you have been using to log into clients; isadmin, the default Information Server
administrator ID created during Information Server installation, which you have
been using to log into the Administration Console; and wasadmin, which is the
default WebSphere Application Server (WAS) user ID.
4. Click Cancel, and then open user student and examine its Suite and Suite
Component roles.
Notice it has all roles!
5. Define a new user named dev1. For simplicity here, type dev1 in all required
fields (the fields with the asterisks).
6. Expand the Suite folder, and then select the Suite User role (which by default
is selected).
This role is required by every Information Server user.
7. Expand the Suite Component folder, and then select the Information
Governance Catalog User role.
This will allow dev1 to browse information assets in the Information
Governance Catalog.
8. Click Browse in the Groups panel, select the DEV group, and then click OK.
This gives dev1 the developer roles the DEV group possesses.
9. Click Save and Close, and then verify in the Users list that the new user, dev1,
has been created.
Task 3. Provide Engine credentials to a user.
1. Click Domain Management > Engine Credentials.
2. Select the Engine (edserver), and then click Open user Credentials.
3. Browse for user dev1 and add dev1 to the Map User Credentials list.
4. With dev1 selected, type in credentials for this user in the Assign User
Credentials area. Map dev1 to dsadm / dsadm, which is a valid OS user on
the DataStage Engine system.
5. Click Apply.
You should see dev1 mapped to dsadm.
You may be wondering whether you can now log into DataStage using the dev1
ID, since you have both created the user ID and given it DataStage credentials.
The answer is that you cannot, because dev1 still needs to be given
authorization to log into a DataStage project by a DataStage administrator.
This is done in the DataStage Administrator client on the Permissions tab.
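The point of this note is that several independent pieces of configuration must all be in place before a user can log into a DataStage project. The sketch below models that checklist in Python. It is a teaching aid under stated assumptions: the class, field names, and messages are hypothetical, and the DataStage product roles that dev1 inherits from the DEV group are not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class UserSetup:
    """Illustrative record of what has been configured for one user."""
    suite_roles: Set[str] = field(default_factory=set)
    engine_os_user: Optional[str] = None          # Engine credential mapping
    permitted_projects: Set[str] = field(default_factory=set)


def missing_prerequisites(user: UserSetup, project: str) -> List[str]:
    """List what is still missing before the user can log into the project."""
    missing = []
    if "Suite User" not in user.suite_roles:
        missing.append("Suite User role")
    if user.engine_os_user is None:
        missing.append("Engine credential mapping")
    if project not in user.permitted_projects:
        missing.append("permission on project " + project)
    return missing


# dev1 as it stands at the end of Task 3: created and mapped to dsadm,
# but not yet authorized on any project.
dev1 = UserSetup(suite_roles={"Suite User"}, engine_os_user="dsadm")
print(missing_prerequisites(dev1, "DSProject"))
```

Once a DataStage administrator grants the project permission, the list of missing prerequisites in this model becomes empty.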
Results:
You learned how to create Information Server users and groups.
Unit summary
• Configure Suite Users and Groups
• Configure DataStage credentials for DataStage Engine users
Session management
Unit objectives
• View a list of active sessions
• View session properties
• Disconnect sessions
• Configure global session properties
Session details
[Figure callouts: Session properties; User attributes]
Session details
Select a session and then click Open to view details about it and the user logged into
the session. In this example, a user named student is logged into the session.
Information about that user, including the authorization roles the user possesses is
displayed.
Some information about the session is also displayed, including its duration and the
number of cached objects, which indicates how many resources the session is
consuming.
Disconnecting sessions
• To disconnect specific sessions:
From the Active Sessions tab, select the connections you want to
disconnect
Click Disconnect
• To disconnect all sessions (including your own session):
Select Disconnect All
[Figure callouts: Disconnect all users; Disconnect selected users]
Session management © Copyright IBM Corporation 2015
Disconnecting sessions
You can disconnect active sessions by selecting the sessions and then clicking
Disconnect. You can also disconnect all sessions by clicking Disconnect All. Note
that this disconnects your own Administration Console session along with all the
others.
Demonstration 1
Session management
Purpose:
You will learn how to manage sessions using the Information Server
Administrative Console.
Notice the type displayed for each session. The Administration Console type is
listed as "Web Console". The Information Server Console type is listed as
"Console".
Notice session details such as the session duration as well as details about the
user, including the user’s authorization roles and user attributes.
7. Click Close to close and return to the Active Sessions list.
8. With the same session selected, click Disconnect and complete the process of
disconnecting the session.
Outside of the Administration Console, notice that the Information Server
Console session has been terminated (stopped working).
9. Back in the Administration Console, click Global Session Properties, increase
the inactive session timeout period to 180000 seconds, and then click Save and
Close.
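To get a feel for how long the timeout in step 9 is, the value can be converted to hours. The snippet below is plain arithmetic; the variable names are illustrative.

```python
from datetime import timedelta

# Inactive session timeout entered in step 9, in seconds.
timeout_seconds = 180000

hours = timedelta(seconds=timeout_seconds).total_seconds() / 3600
print(hours)  # 50.0
```

A value of 180000 seconds therefore keeps idle sessions alive for a little over two days.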
Results:
You learned how to manage sessions using the Information Server
Administrative Console.
Unit summary
• View a list of active sessions
• View session properties
• Disconnect sessions
• Configure global session properties
Managing reports
Unit objectives
• Create and manage report folders
• Create a report
• Run a report
• View report results
• Control report access
Reporting administration
• Managed on the Information Server Administration Console Reporting
tab
• Reports can be created about Suite component activities and
administrative functions
• Report formats include: HTML, PDF, RTF, TXT, XML
• Access to reports, report templates, and report results can be
restricted
• Reports are organized into folders
Root folder is named Reports
Additional folders can be created
− Folders can only be created by Information Server administrators
Reporting administration
Information Server reporting is managed through the Information Server Administration
Console Reporting tab. The Reporting tab contains a folder of templates to build your
reports, and a set of folders you can use to store your reports. Access to reports, report
templates, and report results can be restricted.
Reports are stored and organized in folders. Folders can only be created by Information
Server administrators.
Creating a report
• Select a report template
Report templates are organized by Suite product or component
Example for Administration: "List of users"
• Click New Report
• Browse for report folder
• Report settings
Name
Parameters
− Vary depending on report type
− Example: DataStage project name
Format: HTML, PDF
Settings include: Expiration, History policy
Creating a report
There are a number of pre-built reports that can be run from within Information Server
products.
New reports can also be created on the Reporting tab. You begin by selecting a report
template. Information Server administrators have access to all of the report templates,
but not all templates are available to all users. Then you specify the report settings in
the new report.
When you create a report you specify the folder to store the report in. The folder must
already exist at the time you create the report.
Several output formats are supported, including HTML and PDF.
[Figure callouts: Report folder; Report name; Add report to favorites folder; Report parameters; View results; Selected report; Run report]
Report results
Report results
The graphic shows an example of a List of Users report results. This is available after
the report is run. In this example, the users and their user attributes are listed.
The criteria by which this list of users was chosen are described in the bottom half of the
upper panel. In this case, this report selects users who have one or more DataStage
product roles.
When you create a report you optionally specify report parameters. When you run the
report you are given the opportunity to specify additional report parameters that apply to
just this report run.
Demonstration 1
Managing reports
Purpose:
You will learn how to manage reports.
6. Click OK.
The new report folders are displayed.
10. Click Finish > Save and Close from the Finish menu in the lower right portion of
the window.
You will be returned to the Report Template to Work With window.
4. Click Report Result Status, which displays a list of the latest report executions.
5. Select your report execution, and then click View Report Result.
6. Close the report execution browser tab to return to the report execution list.
Task 4. Specify report access control.
1. Select Favorites.
2. Select your IS Users report and then click Open Access Control.
3. Click Browse, and then add student to the access control list. Afterwards click
OK.
When a user is added, the user automatically has read access to the report.
4. In addition, give student access to run the report.
Unit summary
• Create and manage report folders
• Create a report
• Run a report
• View report results
• Control report access
Administrative tools
Unit objectives
• Run the SessionAdmin tool
• Run the DirectoryCommand tool
• Run the Encrypt tool
Topic:
Session management
using SessionAdmin
SessionAdmin tool
• Used to manage and monitor active Information Server sessions
• Command line tool
Duplicates session management functionality performed in the
Administration Console
• To run the tool:
Open a command window
Change to the \InformationServer\ASBServer\bin directory
Run the command with necessary parameters
− Most parameters require a value following the parameter keyword
• Parameter keywords are preceded by a dash
SessionAdmin tool
The SessionAdmin tool is the command line equivalent of the session management
functionality available in the Administration Console. This command line tool is available
in the \InformationServer\ASBServer\bin directory. To run the command first open an
operating system command window.
[Figure callout: -ks parameter]
Topic:
User management using
DirectoryCommand
Creating users
• Use one or more instances of the -add_user parameter followed by a
user information string with values separated by the tilde (~):
userID~password~firstName~lastName…
Example: -add_user dev2~dev2~Nelsen~Rock
• To add users to a group use the -add_users_group parameter
Example: -add_users_group dev1~dev2$DEV
− Here users dev1 and dev2 are added to the DEV group
− The dollar sign ($) separates the list of users from the list of groups
Creating users
In the top graphic, the -add_user parameter has been used to create a new user. The
values in the user information string are separated by tildes. In this example the user information
includes the user ID, password, first name, and last name. Additional information can
be added.
In the bottom graphic, the -add_users_group parameter has been used to add the
users just created (dev1 and dev2) to a group named DEV. The dollar sign ($) is used to
separate the string of user names from the group name. The user names are separated
with the tilde.
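The two string formats described above are easy to get wrong when typed by hand, so a small helper can be useful when preparing many users. The following Python sketch only assembles the parameter values shown on the slide; the function names are hypothetical, and only a single group name is handled because the slide shows only that case.

```python
from typing import Sequence


def user_info_string(user_id: str, password: str, first_name: str, last_name: str) -> str:
    """Build the value for -add_user: fields joined by the tilde (~)."""
    return "~".join([user_id, password, first_name, last_name])


def users_group_string(user_ids: Sequence[str], group_name: str) -> str:
    """Build the value for -add_users_group: users joined by ~, then $, then the group."""
    return "~".join(user_ids) + "$" + group_name


print("-add_user " + user_info_string("dev2", "dev2", "Nelsen", "Rock"))
print("-add_users_group " + users_group_string(["dev1", "dev2"], "DEV"))
```

Both printed lines reproduce the examples from the slide. Depending on the shell, a value containing a dollar sign may need quoting.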
Topic:
Encrypt passwords using
the Encrypt command
Encrypt command
• Provides a method to encrypt user credentials
• Command line tool
• To run the tool:
Open a command window
Change to the \InformationServer\ASBServer\bin directory
Run the command with or without the text to be encrypted
Copy the encrypted string as a value to the password parameter of any
commands that support it
Encrypt command
The encrypt command provides a method for encrypting a string. Generally it is used
to encrypt passwords. Like the other tools we have discussed the encrypt command is
found in the \InformationServer\ASBServer\bin directory.
[Figure callout: Encrypted password]
Credentials file
[Figure: Example credentials file with encrypted password]
Demonstration 1
Administrative tools
Purpose:
You will use command line tools to perform administrative tasks.
Notice the type displayed for each session. The Administration Console type is
listed as "Web Console". The Information Server Console type is listed as
"Console."
5. Open up a Windows command window. To do this click Start > Run and then
open with cmd.
7. Each time you run the SessionAdmin command you should use the following
parameters:
• -url https://edserver:9443 to connect to Information Server
• -user isadmin -password isadmin to authenticate with Information Server
8. Enter the command to list the user sessions. This uses the -lus (-list-user-
sessions) parameter.
Note: This command and others are contained in the Commands.txt file in
your ISAdmin1_Files folder.
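The pieces from steps 7 and 8 combine into a single command line. The Python sketch below assembles that line from the values used in this course (edserver, port 9443, isadmin). The helper name is hypothetical, the executable is assumed to be invoked as SessionAdmin from the bin directory, and the -ks form with a session ID argument is an assumption based on the slide callout. Note that parameter keywords are typed with a plain hyphen.

```python
from typing import List


def session_admin_args(url: str, user: str, password: str, *action: str) -> List[str]:
    """Return the argument list: connection and authentication first, then the action."""
    return ["SessionAdmin", "-url", url, "-user", user, "-password", password, *action]


# List the active user sessions (step 8).
list_cmd = session_admin_args("https://edserver:9443", "isadmin", "isadmin", "-lus")
print(" ".join(list_cmd))

# Assumed form for killing one session; the ID is a placeholder to be
# pasted from the -lus output.
kill_cmd = session_admin_args(
    "https://edserver:9443", "isadmin", "isadmin", "-ks", "<session-id>"
)
print(" ".join(kill_cmd))
```

Keeping the connection and authentication parameters in one helper avoids retyping them for every SessionAdmin call.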
• The most difficult part of doing this is typing the session ID. One way to make
this a little easier is to first redirect the output of the -lus command to a temporary file. Then you
can copy and paste the session ID into the command to kill the session.
3. Each time you run the DirectoryCommand command you should use the
following three parameters:
• -url https://edserver:9443 to connect to Information Server
• -user isadmin -password isadmin to authenticate with Information Server
4. Enter the command to list users and groups.
• Use the -list parameter followed by the types of objects to list. The USERS
value indicates user IDs. The GROUPS value indicates groups. Use ALL to
list everything. Separate multiple object names using the tilde (~).
5. Compare the list you get from the command with the list displayed in the
Administration Console. My lists were the same.
Notice that in addition to the users and groups, you get a list of all the user roles
and a list of all the DataStage projects.
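As a sketch of steps 3 and 4, the following Python snippet assembles a DirectoryCommand line that lists users and groups. The helper names are hypothetical and the executable is assumed to be invoked as DirectoryCommand from the bin directory; the parameters and values are the ones given in the steps above.

```python
from typing import List


def list_value(*object_types: str) -> str:
    """Join the object types to list (USERS, GROUPS, or ALL) with the tilde."""
    return "~".join(object_types)


def directory_command_args(url: str, user: str, password: str, *action: str) -> List[str]:
    """Connection and authentication parameters first, then the action."""
    return ["DirectoryCommand", "-url", url, "-user", user, "-password", password, *action]


cmd = directory_command_args(
    "https://edserver:9443", "isadmin", "isadmin",
    "-list", list_value("USERS", "GROUPS"),
)
print(" ".join(cmd))
```

Replacing the list value with ALL corresponds to listing everything, as described in step 4.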
Task 3. Use the DirectoryCommand tool to create a user and
add the user to a group.
1. Run the DirectoryCommand command to create a user ID named dev2,
password pass2, first name Nelsen, last name Rock.
2. Use the -add_user parameter followed by a user information string with values
separated by the tilde (~):
• userID~password~firstName~lastName
3. Verify in the Administration Console that the user was created. (You may need
to refresh the Administration Console to see the new user.)
5. Verify in the Administration Console that the user has been added to the group.
Here group DEV has been opened in the Administration Console. Notice that
dev2 is listed as a user in the group in the right panel.
Task 4. Create a credentials file with an encrypted password.
In this task you will create a credentials file for user isadmin whose password is also
isadmin.
1. In a command window change to the
c:\IBM\InformationServer\ASBServer\bin directory.
2. Run the encrypt command to encrypt the password isadmin. Write the result
to a temporary file named ISCredentials.txt in the c:\Temp directory.
4. Run the DirectoryCommand command to list users and groups using the
-authfile parameter to invoke the credentials file you created in the previous
step.
Results:
You have used command line tools to perform administrative tasks.
Unit summary
• Run the SessionAdmin tool
• Run the DirectoryCommand tool
• Run the Encrypt tool
Unit objectives
• Use istool to export common metadata assets
• Use istool to query information assets
• Use istool to export security assets
• Use istool to export reporting assets
• Use istool to export individual product assets
istool commands
• Engine management:
Build and deploy DataStage packages
− Packages contain DataStage jobs and supporting assets
Purge DataStage operational metadata
• IS Repository asset management
Delete common metadata assets
Export and import information assets to and from a file
− Individual product assets, e.g.:
• DataStage jobs
• Governance catalog assets
− Common Repository information assets, e.g.:
• Database assets
− Reporting assets
− Security assets
istool commands
Our focus in this unit is on using istool for asset management. This includes deleting
information assets from the Information Server repository as well as the import and
export of information assets.
The Information Server Repository contains a number of different types of information
assets, including common repository assets such as database assets, reporting assets,
security assets, and individual product assets such as DataStage jobs and Information
Governance Catalog governance assets. Different parameters are used to handle
these different types of assets.
Invoking istool
• Command-line interface
Syntax of the istool command is:
− istool <command> <authentication_parameters> <archive> [ archive parameters ]
[ generic_params ] [ command specific_parameters ]
Generic parameters: -help, -verbose, -silent
Authentication parameters: -domain, -username, -password
− Alternatively you can use the -authfile parameter to specify a credentials file
− If -domain is not specified, the primary domain server is assumed
Invoking istool
The istool utility is very powerful. It supports four basic commands: export, import,
build package, and deploy package. The build package and deploy package
functionality has been captured into the Information Server Manager tool. It is part of
DataStage administration and is not covered in this course.
There are two common parameters in the istool command. You will always need to
specify authentication, that is, the services domain you are logging into and the user ID
and password you are using to do so. The istool command supports the -authfile
parameter so that a credentials file can be used for authentication.
Secondly, you will always be specifying a path to the archive file. The archive file is
where the exported assets are or will be stored on the file system, during an import or
export.
Topic:
Exporting common
repository metadata
[Figure callouts: Archive path; Output file; Identity strings]
Topic:
Exporting security assets
[Figure callouts: -security parameter; Preview parameter]
Topic:
Exporting reporting assets
[Figure callout: -reportName parameter]
Topic:
Exporting individual
product assets
[Figure callouts: -ia parameter; -projects sub-parameter]
Topic:
Importing information
assets
Import example
• Import all the common metadata information assets in the
CommonMeta.isx archive
• First use the -preview option to preview the import
• Afterwards, use the -replace option to overwrite existing assets
[Figure callouts: -cm parameter; Archive file]
Import example
In this example the -cm parameter is used to select common metadata assets from the
archive file for import.
The -replace option is used to replace existing assets with the same names.
Demonstration 1
Managing Information Server repository assets
Purpose:
You will learn how to import and export information assets using istool.
2. Open the Information Server Command Line Interface by clicking on its icon on
the desktop.
Note: This is not the same as the Windows Command window which you used
in the previous exercise.
Notice the istool> prompt.
3. At the prompt, type the word export, and then type the -authfile parameter
followed by its value, namely the path to the credentials file:
c:\Temp\ISCredentials.txt.
4. Now add the -ar parameter followed by a path to the archive file in double
quotes. Name the file CommonMeta.isx and store it in the c:\Temp directory.
5. Now add the -cm parameter followed by the identity string to the CUSTOMERS
table in single quotes.
6. Press Enter.
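Steps 3 through 5 build the export command one parameter at a time. The Python sketch below shows how the pieces fit together on one line, including the double quotes around the archive path and the single quotes around the -cm value. The function name is hypothetical, and the identity string is deliberately left as a placeholder because its exact value is not given in the steps.

```python
def export_common_metadata(authfile: str, archive: str, identity: str) -> str:
    """Assemble the export command typed at the istool> prompt."""
    return (
        "export -authfile " + authfile
        + ' -ar "' + archive + '"'      # archive path in double quotes
        + " -cm '" + identity + "'"     # identity string in single quotes
    )


command = export_common_metadata(
    r"c:\Temp\ISCredentials.txt",
    r"c:\Temp\CommonMeta.isx",
    "<identity string of the CUSTOMERS table>",  # placeholder, not a real identity string
)
print(command)
```

Note that the parameters are typed with a plain hyphen, and the command is entered on a single line at the istool> prompt.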
2. Now add the -cm parameter and its value in single quotes. Retrieve a list of all
schemas in all databases. The *.sch extension indicates the schema type.
3. Press Enter.
Notice that the command appended “commonmetadata” to the name of the
output file.
4. Open the output file to view the list of schemas. (Your results may differ from
what you see here.)
2. Now add the -security parameter and its value in single quotes.
• Include the -securityUser parameter to export user security assets.
• Include the -userident parameter followed by an identity string in double
quotes to identify the users. In this case, specify all users.
• Use the -includeRoles and -includeUserGroupMemberships parameters to include
user role and group membership information for the users exported.
3. Add the -preview parameter to preview what will be exported before the export
runs.
4. Press Enter.
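The security export is the first command in this demonstration that nests quotes: the whole -security value is in single quotes, and the -userident identity string inside it is in double quotes. The Python sketch below illustrates only that nesting. It assumes the sub-parameters listed in step 2 all go inside the single-quoted value; the function names, the archive file name, and the user pattern placeholder are hypothetical.

```python
def security_value(user_pattern: str) -> str:
    """Build the single-quoted value of -security, with -userident in double quotes."""
    return (
        '-securityUser -userident "' + user_pattern + '"'
        + " -includeRoles -includeUserGroupMemberships"
    )


def export_security_command(authfile: str, archive: str, user_pattern: str,
                            preview: bool = True) -> str:
    parts = [
        "export", "-authfile", authfile,
        "-ar", '"' + archive + '"',
        "-security", "'" + security_value(user_pattern) + "'",
    ]
    if preview:
        parts.append("-preview")  # step 3: preview before the real export
    return " ".join(parts)


# Hypothetical archive name and a placeholder for the "all users" pattern.
print(export_security_command(
    r"c:\Temp\ISCredentials.txt", r"c:\Temp\Security.isx", "<all-users-pattern>"
))
```

Dropping the preview flag in the call produces the command for the actual export run.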
2. Now add the -report parameter and its value in single quotes.
• Include the -reportName parameter followed by an identity string in double
quotes to identify the reports. In this case, specify all reports.
• Use the -includeAllReportResults parameter to include the results of executed reports.
3. Optionally add the -preview parameter to preview what will be exported before
the export runs.
4. Press Enter.
2. Now add the -datastage parameter and its value in single quotes.
• Specify in your identity string that you want to export all parallel jobs in the
Jobs folder of the DataStage project named DSProject.
• The *.pjb extension stands for the parallel job type.
• Your identity string should have the following format:
'hostName/projectName/folderName/*.pjb'
3. Add the -preview parameter to preview what will be exported before the export
runs, and then press Enter.
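The identity string format from step 2 can be filled in mechanically. The short Python sketch below does this for the values named in the step (project DSProject, folder Jobs); the host name edserver is taken from elsewhere in this course and is an assumption here, as is the function name.

```python
def datastage_identity(host: str, project: str, folder: str, pattern: str = "*.pjb") -> str:
    """Fill in hostName/projectName/folderName/pattern for the -datastage value."""
    return "/".join([host, project, folder, pattern])


identity = datastage_identity("edserver", "DSProject", "Jobs")
print("-datastage '" + identity + "'")
```

The default pattern selects parallel jobs only, matching the *.pjb extension described in step 2.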
2. Now add the -datastage parameter and its value in single quotes.
• Specify in your identity string where you want the jobs to be written. In this case it
is to a project named dstage1 on the edserver host.
3. Add the -preview parameter to preview what will be imported before the import
runs, and then press Enter.
In this case, these assets already exist in the project, but when you run the
command this may not be the case.
4. Remove the -preview parameter, and then add the -replace parameter. This
will overwrite assets if they already exist. Then press Enter.
5. Optionally, if you feel comfortable using DataStage, you can log into the
dstage1 project in DataStage Designer and verify that the jobs were imported
into the Jobs folder.
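The import steps follow a two-pass pattern: run once with -preview, inspect the result, then run again with -replace. The Python sketch below models that pattern under several assumptions: the function name is hypothetical, the archive file name is invented for illustration, and the target string edserver/dstage1 assumes the same host/project format used for export identity strings.

```python
def import_datastage(authfile: str, archive: str, target: str,
                     preview: bool = False, replace: bool = False) -> str:
    """Assemble an istool import command for DataStage assets."""
    if preview and replace:
        # The demonstration uses the two options in separate runs.
        raise ValueError("use -preview and -replace in separate runs")
    parts = [
        "import", "-authfile", authfile,
        "-ar", '"' + archive + '"',
        "-datastage", "'" + target + "'",
    ]
    if preview:
        parts.append("-preview")
    if replace:
        parts.append("-replace")
    return " ".join(parts)


authfile = r"c:\Temp\ISCredentials.txt"
archive = r"c:\Temp\DSJobs.isx"   # hypothetical archive name
target = "edserver/dstage1"       # assumed host/project format

print(import_datastage(authfile, archive, target, preview=True))   # first pass
print(import_datastage(authfile, archive, target, replace=True))   # second pass
```

Running the preview first makes it clear which existing assets the second pass will overwrite.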
Results:
You have learned how to import and export information assets using istool.
Unit summary
• Use istool to export common metadata assets
• Use istool to query information assets
• Use istool to export security assets
• Use istool to export reporting assets
• Use istool to export individual product assets