Blue Print 1
01 November 2012
Introduction
IBM InfoSphere Blueprint Director is a member of the IBM InfoSphere Foundation Tools portfolio.
Blueprint Director is aimed at the information architect designing solution architectures for
information-intensive projects. A thorough introduction to Blueprint Director is in an introductory
tutorial titled "Planning an Integration Landscape." Based on understanding the functionality of
Blueprint Director, we provided a first best-practices article working with blueprint templates here:
"Best practices for IBM InfoSphere Blueprint Director, Part 1: Working within a project lifecycle."
We assume for the purpose of this article that you are familiar with the content of the introductory
tutorial and have hands-on experience with Blueprint Director.
Our purpose here is to provide best practices that help you, as a Blueprint Director user, create your own
blueprints from scratch. In addition, we provide guidance on how to use Blueprint Director to
incorporate metadata landscapes as well as physical Information Server landscapes as part of the
specific information landscape. The best practices provided in this article are needed for:
Copyright IBM Corporation 2012
Best practices for IBM InfoSphere Blueprint Director, Part 2:
Designing information blueprints from the ground up
developerWorks
ibm.com/developerWorks/
Creation of legible blueprints from scratch: Since one of the main purposes of using
Blueprint Director is to effectively communicate a specific solution architecture, alternate
solution architectures and their various advantages and disadvantages or the impact of
change requests to a solution architecture, clarity in the blueprints is critical. We provide a
comprehensive list of best practices to create legible and effective blueprints from scratch. We
also include best practices on adding palette extensions.
Metadata landscapes: Information-intense projects are usually complex and involve
business, technical, and operational metadata, which is interlinked and interconnected. A
challenge from a project management perspective is to understand how a feature change
request on a process level affects the information structures supporting the business
processes. Without the ability to understand, when a change request comes in, which
data models, ETL jobs, etc. would be affected by the change, it's hard to control and manage
change effectively over time. Using Blueprint Director to visualize the metadata landscape
(as an information landscape itself) helps you, the information architect, to quickly understand
where you need to look to understand the impact of such change. We provide best practices
on metadata landscapes.
Physical deployment landscapes: Platforms including IBM InfoSphere Information Server
or InfoSphere Master Data Management (MDM) are often deployed on multiple physical
nodes (see Resources) for a single environment used for sandbox, development, test,
preproduction and production. In addition, in more complex scenarios like SAP consolidation
projects, managing code artifacts for Information Server also requires management of the
corresponding ABAP code artifacts and SAP configurations consistently across SAP and
Information Server development, test, and production environments (see SAP Packs in
Resources). To ensure proper code propagation, backup/restore, migration, and fixpack
deployment best practices across such infrastructures, it is helpful to communicate the layout
of the infrastructure to administrators, developers, and project managers. We provide best
practices on how Blueprint Director can be used to develop such blueprints.
This article will help you:
Learn how to create effective blueprints from scratch.
See how to apply Blueprint Director to understand and manage metadata landscapes.
See how to use Blueprint Director to visualize the physical deployment landscape.
Data replication captures changes on one or multiple sources and replicates them to one or
multiple targets.
For capturing the changes, at least two techniques exist at the database level with advantages and
disadvantages:
Trigger-based capture of changes
Transactional log-based capture of changes
There are at least four recognized data replication topologies:
Unidirectional replication between a source and a target system
Bidirectional replication between a source and a target system requiring considerations on
possible conflicts and their resolution
Roll-up replication is a typical scenario in data warehousing environments where data from
multiple data sources is replicated to a single target: the data warehousing system
Distributed replication occurs in typical scenarios like replication from a central data
warehouse to local data marts or from a central master data management system into
multiple targets consuming master data
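The four topologies above differ mainly in how many sources and targets are involved and in whether changes flow in both directions. As a minimal sketch of that distinction (the `Topology` enum, the `classify_topology` function, and its parameters are illustrative names, not part of Blueprint Director or any replication product):

```python
from enum import Enum


class Topology(Enum):
    UNIDIRECTIONAL = "unidirectional"
    BIDIRECTIONAL = "bidirectional"
    ROLL_UP = "roll-up"
    DISTRIBUTED = "distributed"


def classify_topology(num_sources: int, num_targets: int, two_way: bool = False) -> Topology:
    """Map a source/target layout to one of the four topologies listed above."""
    if num_sources < 1 or num_targets < 1:
        raise ValueError("replication needs at least one source and one target")
    if num_sources == 1 and num_targets == 1:
        # One source and one target: only the direction of change flow differs.
        return Topology.BIDIRECTIONAL if two_way else Topology.UNIDIRECTIONAL
    if two_way:
        # Assumption of this sketch: two-way flow is only modeled for 1:1 layouts.
        raise ValueError("two-way replication is only modeled between one source and one target")
    if num_targets == 1:
        return Topology.ROLL_UP       # many sources feed a single target
    if num_sources == 1:
        return Topology.DISTRIBUTED   # one source feeds many targets
    raise ValueError("many-to-many layouts combine several topologies; split them first")
```

For example, `classify_topology(3, 1)` yields the roll-up case typical of a data warehouse fed by several sources, while `classify_topology(1, 4)` yields the distributed case of a central system feeding local data marts.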
Data volume: This metric determines how many capture and apply agents might need to operate
in parallel on sources and targets to achieve the throughput required and might also affect the
physical structure and location of the systems involved.
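The data volume consideration can be approximated with simple arithmetic. The sketch below assumes, purely for illustration, that throughput scales linearly with the number of agents and that a per-agent rate is known from measurement; the function name and both parameters are hypothetical.

```python
import math


def agents_needed(required_rows_per_sec: float, rows_per_sec_per_agent: float) -> int:
    """Estimate how many parallel capture (or apply) agents a throughput target implies.

    Assumes linear scaling across agents, which real systems only approximate.
    """
    if required_rows_per_sec < 0 or rows_per_sec_per_agent <= 0:
        raise ValueError("throughput figures must be positive")
    # Round up: a fractional agent still means one more running agent.
    return max(1, math.ceil(required_rows_per_sec / rows_per_sec_per_agent))
```

With a required rate of 50,000 rows per second and a measured rate of 12,000 rows per second per agent, this estimate gives five agents. Treat the result as a starting point for sizing discussions, not as a capacity guarantee.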
Figures 1 and 2 show one possible way to decompose the replication architecture pattern. As
you can see, not all of the above-noted characteristics have been placed into a single diagram.
Instead, just the core idea of moving data from one or multiple sources to one or multiple targets is
shown at the high level.
Then, by creating a sub-diagram related to the replication function, we proceed into the next
diagram (see Figure 2). Here, a decision has been made to show that the replication topologies
might look different depending on the use case. However, details on which capture and apply
techniques, which concrete systems are involved, etc. are not yet included, so the diagram
focuses clearly on the topology idea. With these two diagrams, a reusable structure for the data
replication architecture pattern is created, which can be tailored in any concrete project with
additional diagrams to concrete systems and physical considerations of the solution landscape.
So, for the information architect, the question arises as to how to put all these characteristics into
one or multiple diagrams in a blueprint within Blueprint Director. For the root diagram, the key
design point is to keep it simple. It is the first impression of the solution you propose and excessive
detail prevents others from getting the core idea of the solution.
Following this example, let us consider the best practices applied here, which we will introduce in
turn:
Managing detail
What belongs on a blueprint?
Using grouping containers
Reusing elements through references
Creating a legible layout
Abstracting from the runtime
Considerations for metadata landscapes
Considerations for physical landscapes
Managing detail
When you start to create a blueprint, the first diagram (the root diagram) shown creates the first
impression. As noted, the key for this first diagram is to really keep it simple. In the example with
the replication architecture pattern, as shown in Figure 1, this best practice has been applied
because the solution architecture shown has been reduced to its minimal number of constituents:
One or multiple source and target systems between which a data replication pattern is applied.
Thus, on the root diagram, you should apply the following considerations:
Avoid excessive details: Instead, reduce to the core constituents (see Figure 3).
Identify the core components detailed on subsequent sub-diagrams.
Think about decomposition points: Where in your root sketch can you add a sub-diagram
entry point which you can unfold to the next more detailed conceptual level?
Note that the advice to avoid excessive detail should be applied also to sub-diagrams. If a diagram
on any level is too busy, it is not yet structured well and simplification with a decomposition into
multiple layered diagrams may still be pending.
Information processing flows may be lengthy. Just consider the steps needed to extract data
from multiple sources into a staging environment, where data analysis, structural alignment, data
standardization, matching, de-duplication, and final transformations are applied before the data
is loaded to the target.
Generally, we consider it best practice to avoid lengthy and complex data flows across multiple
functional concerns (extraction, profiling, cleansing, etc.) in a single diagram. They are cluttered
and difficult to read and understand. Instead, if practical, abstract from the details by identifying
coarse-granular functional areas, which provide a high-level overview of the overall information
processing flow and decompose each functional area with sub-diagrams. Conceptually, consider
this a divide-and-conquer approach in the design phase. This best practice is illustrated with
Figures 4 and 5.
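The divide-and-conquer idea can be pictured as a tree of diagrams in which each element may unfold into a sub-diagram. The following sketch is not a Blueprint Director API; the `Element` and `Diagram` classes, the `busy_diagrams` helper, and the threshold of seven elements are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """One box on a diagram; it may unfold into a more detailed sub-diagram."""
    name: str
    sub_diagram: Optional["Diagram"] = None


@dataclass
class Diagram:
    name: str
    elements: List[Element] = field(default_factory=list)


def busy_diagrams(root: Diagram, max_elements: int = 7) -> List[str]:
    """Return names of diagrams, at any level, holding more than max_elements."""
    busy: List[str] = []
    stack = [root]
    while stack:
        diagram = stack.pop()
        if len(diagram.elements) > max_elements:
            busy.append(diagram.name)
        # Descend into every sub-diagram so the check applies at all levels.
        for element in diagram.elements:
            if element.sub_diagram is not None:
                stack.append(element.sub_diagram)
    return busy


# Hypothetical blueprint: four coarse functional areas on the root diagram,
# one of which decomposes into a sub-diagram with nine steps.
cleansing = Diagram("Cleansing", [Element(f"step {i}") for i in range(1, 10)])
root = Diagram("Information flow", [
    Element("Extraction"),
    Element("Profiling"),
    Element("Cleansing", sub_diagram=cleansing),
    Element("Load"),
])

print(busy_diagrams(root))
```

In this example only the Cleansing sub-diagram exceeds the threshold, which suggests it is the next candidate for further decomposition.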
Figure 5. Design the base diagram with logical function groups and
decompose them into sub-diagrams
Aid the consumer of the blueprint with visual guidance by exploiting the presence or absence
of links on elements:
Lack of a sub-diagram indicator on an element might indicate that the design work in this area
is not yet done.
Lack of asset link(s) may indicate that no requirements, design, or development work (e.g.,
DataStage jobs) has been done so far.
Use the notes feature to add checklists and annotations on what still needs to be done to the
diagrams.
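The same reading of missing links can be applied mechanically if the elements of a blueprint are available as data. The sketch below assumes a hypothetical in-memory representation (`BlueprintElement` and `pending_work` are illustrative names, not an export format or API of Blueprint Director).

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class BlueprintElement:
    name: str
    has_sub_diagram: bool = False
    asset_links: List[str] = field(default_factory=list)


def pending_work(elements: Iterable[BlueprintElement]) -> Dict[str, List[str]]:
    """List, per element, which missing links hint at work that is still open."""
    report: Dict[str, List[str]] = {}
    for element in elements:
        gaps: List[str] = []
        if not element.has_sub_diagram:
            gaps.append("no sub-diagram: design may not be done yet")
        if not element.asset_links:
            gaps.append("no asset links: no requirements, design, or development linked")
        if gaps:
            report[element.name] = gaps
    return report
```

An element with both a sub-diagram and at least one asset link does not appear in the report, so the result doubles as a simple checklist of what still needs attention.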
Once the blueprint has been developed and a number of diagrams have been created, the search
function of Blueprint Director enables you to identify information throughout the blueprint. This is
particularly useful if you are not sure anymore where a specific detail is located.
Avoid placing tools used to construct the solution into the blueprint, as shown in Figure 7.
Basically, they are irrelevant from an information-flow perspective, adding complexity and
confusion. Also note that the same architecture shown in a solution might be able to be built with
tools from different software vendors. Therefore, particularly for blueprints capturing reusable
architectures, it is usually best practice to avoid tying the blueprints to particular software. The
discussion of applicable tools should be handled through a method describing best practices on
realization, and artifacts created by the tools mentioned in the method should be linked using the
asset-linking feature.
You should not place method or process steps describing the sequence of tasks to construct the solution
in a blueprint, as shown in Figure 8. Method and process steps are best captured in the linked
method. Again, this mistake violates the principle of focusing on information flows in the blueprints and
just adds cluttering detail, which distracts the user from understanding how the information flows
by mixing in details on how to construct the solution.
The elements in the Groups Palette are Asset Set, Domain, and Project. If you need to select
among them, consider the following aspects:
Visibility of the contained elements:
Domain: All elements are visible within the domain grouping. Note that sub-diagrams
cannot be based on the Domain itself.
Asset Set: The constituents of the Asset Set are defined in a sub-diagram and are not
visible within the diagram containing the Asset Set, but sub-diagrams may be connected to
the Asset Set. Example: In the data replication example in Figures 1 and 2, the constituents
of the sources and targets were not explicitly named. That's still a task on a sub-diagram,
which is fine because these two diagrams provide the baseline concept where the concrete
individual sources and targets are not yet relevant.
Project: All elements are visible within the project grouping. Sub-diagrams cannot be based
on the Project itself.
Recommended elements that should be placed into this grouping container:
Domain: Any
Asset Set: Only assets such as databases, applications, and similar entities should be
placed in this grouping container.
Project: Any
Primary use case for the grouping container:
Domain: This grouping container is ideal for identifying architectural layers, major functional
areas, or a grouping of related elements.
Asset Set: Abstraction for a list of assets in a diagram, such as lists of source or target
systems.
Project: This grouping container is ideal for identifying projects or project phases that make
up a changing information landscape. Where milestones are used, it may make sense to
associate specific milestones with specific project containers.
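The selection criteria for the three grouping containers can be summarized as a small lookup table. This is only a sketch of the rules described above; the dictionary layout, the `ASSET_KINDS` set, and the `can_contain` function are illustrative and do not correspond to any Blueprint Director interface.

```python
# Properties of each grouping container, as described in the text above.
GROUPING_RULES = {
    "Domain":    {"contents_visible": True,  "sub_diagram": False, "accepts": "any"},
    "Asset Set": {"contents_visible": False, "sub_diagram": True,  "accepts": "assets"},
    "Project":   {"contents_visible": True,  "sub_diagram": False, "accepts": "any"},
}

# Assumption: a rough stand-in for "databases, applications, and similar entities".
ASSET_KINDS = {"database", "application"}


def can_contain(container: str, element_kind: str) -> bool:
    """Check whether an element kind fits the recommendation for a container."""
    rule = GROUPING_RULES[container]
    if rule["accepts"] == "any":
        return True
    return element_kind in ASSET_KINDS
```

For instance, `can_contain("Asset Set", "database")` holds, whereas a process element would be better placed in a Domain or Project container.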
From a best-practices perspective, there are certain guidelines you should follow when using
conceptual views:
The decomposition for conceptual views is a free-form approach for the schemas and
is neither intended nor capable of replacing the necessary data model creation using
appropriate object or entity-relationship modeling techniques. Such modeling should continue
to happen in tools like IBM InfoSphere Data Architect. If you have Blueprint Director installed
and shell-shared with InfoSphere Data Architect, you can start the actual data modeling work
in InfoSphere Data Architect once a conceptual view of the schema has been created in
Blueprint Director. You can also seamlessly switch back to Blueprint Director if the
data modeling work reveals a need to change the conceptual view.
The decomposition and creation of conceptual views for a schema is ideally driven and
governed by standard glossary definitions. Thus, entities in a conceptual view should be
linked to the appropriate terms in InfoSphere Business Glossary.
As a result of creating conceptual views, you create and capture the key concepts of this part of
the solution and are able to communicate them to appropriate audiences.
From a best-practices perspective, we advise not to extend the palette arbitrarily. The key decision
point is to weigh visual clarity against too much information (i.e., too
many visual images for people to keep track of). When adding elements to the palette, the aims are to:
Provide distinct icons that help differentiate among elements that are not the same.
Avoid having too many icons, as that defeats the purpose because the user will be unable to
recognize all of them.
You need to strike a balance between adding for clarity and adding for the sake of having
something different.
This information blueprint can provide limited value simply as a diagram. A true information
blueprint must be both linked and actionable. Being linked and actionable means that the blueprint
is connected to and interacting with the tooling being used to construct the components of the
information-intensive solution.
Theoretically, an information blueprint can be linked to and interacting with any artifacts or tools,
but for simplicity, we can consider two main forms of linking:
Linking to the metadata artifacts of the solution
Linking to the development and deployment tooling and artifacts supporting the solution
It is worth noting that there can be some degree of overlap here. Metadata-aware tooling, for
example, will typically contain the metadata of components of the solution (for any of the three
metadata forms discussed earlier) and the development artifacts being used to construct specific
components of the solution.
specific type of information like employee, for example); technical metadata (the logical and
physical structure of data in terms of logical and physical data models, for example); and
operational metadata (runtime details of an ETL job, such as the runtime and duration and number
of rows processed, for example). It is important to note that this metadata should not be seen as a
set of interesting, but disconnected structures. Rather, this metadata becomes most useful when it
is seen as an interconnected web of artifacts where the edges between these artifacts have clearly
understood semantics depending on the artifacts involved and the relationship type considered.
For example, customer information exists in many enterprises in many IT systems with different
logical and physical data models, which can be linked to the same business term "customer"
in an enterprise glossary. A central MDM system might act as the trusted source of information
from which the customer information is fed to the numerous consuming systems with different
transformations mapping it from the customer data model in MDM to the various data models of
its consumers. Deciding to change the MDM data model in such an environment is something that
cannot be done without considering the impact on the related artifacts in the environment. Thus, when
looking at the metadata landscape, the following aspects need to be considered separately:
Lifecycle management: Where change management is an important aspect, capabilities
such as data lineage (where is the data I am looking at coming from?) and impact analysis
(how many artifacts are affected if a certain change is applied?) exploiting metadata are key
techniques for governing change management.
Design and development aspect: Metadata is obviously a critical part in any data
design or data development project. Data models (the data about the data of the system
to be built or changed) are developed during the design phase and are a key input for
the development work of ETL jobs, etc.
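Data lineage and impact analysis are both reachability questions on the web of interconnected metadata artifacts: impact analysis follows the edges forward, lineage follows them backward. The sketch below uses a hypothetical dependency graph loosely modeled on the MDM customer example; the artifact names and the functions are illustrative and are not taken from any metadata repository.

```python
from collections import deque
from typing import Dict, List, Set

# Edges point from an artifact to the artifacts derived from it (hypothetical names).
DEPENDENTS: Dict[str, List[str]] = {
    "glossary:customer": ["model:mdm_customer"],
    "model:mdm_customer": ["job:mdm_to_crm", "job:mdm_to_dwh"],
    "job:mdm_to_crm": ["model:crm_customer"],
    "job:mdm_to_dwh": ["model:dwh_customer"],
}


def reachable(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Breadth-first search: every artifact reachable from start, excluding start."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        for neighbor in graph.get(queue.popleft(), []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def invert(graph: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Reverse every edge so the graph can be walked upstream."""
    inverted: Dict[str, List[str]] = {}
    for source, targets in graph.items():
        for target in targets:
            inverted.setdefault(target, []).append(source)
    return inverted


def impact(graph: Dict[str, List[str]], artifact: str) -> Set[str]:
    """Impact analysis: which artifacts are affected if this one changes?"""
    return reachable(graph, artifact)


def lineage(graph: Dict[str, List[str]], artifact: str) -> Set[str]:
    """Data lineage: where does this artifact come from?"""
    return reachable(invert(graph), artifact)
```

In this toy graph, changing `model:mdm_customer` affects both ETL jobs and both consuming data models, which is the kind of answer an information architect needs before approving a change request.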
With that in mind, the metadata landscape can be characterized as an information landscape itself,
below the level of a common business view because it represents insight into how information
is grouped and managed. From a best-practices perspective, we advise that you incorporate
a metadata landscape as a sub-diagram under a given information landscape. This helps to
understand how glossaries and models support a given data structure and where change/lifecycle
management needs to be applied in the target view. With such a metadata landscape diagram
available, you also get the following advantages:
You have a place to link directly to assets representing these metadata components.
You can build out conceptual maps under this metadata landscape.
Note that such a structure might require encapsulation into a reference object if the metadata
landscape supports multiple components. An example of a metadata landscape is shown in Figure
19.
In the metadata landscape shown above, it is not the individual glossary elements, logical entities,
or physical models that are included. It is the store of such items, treated as data, that is included,
which makes this metadata landscape another instance of an information blueprint. Modeling
remains the province of modeling tools, but the understanding of how the metadata as information
flows or moves from point to point is captured and understood.
artifacts are read-only, and that any change must be done in the development environment and
propagated through the appropriate test cycle again to the SAP production system. As illustrated
with this example, you can use Blueprint Director to establish an architectural view supporting
how code (or parameter configuration, patches, etc.) has to be propagated to test and production
environments.
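The propagation rule described here can be stated compactly. The following sketch assumes a simple three-stage path and assumes that only the development environment is editable; the environment names, `PROMOTION_PATH`, and both functions are illustrative and do not reflect an SAP or Information Server configuration.

```python
from typing import List

# Assumed promotion order; real landscapes may include more stages.
PROMOTION_PATH: List[str] = ["development", "test", "production"]


def can_edit(environment: str) -> bool:
    """Assumption: artifacts are changed only in development and read-only elsewhere."""
    return environment == "development"


def promotion_route(target: str) -> List[str]:
    """Environments a change must pass through, in order, to reach the target."""
    index = PROMOTION_PATH.index(target)  # raises ValueError for unknown environments
    return PROMOTION_PATH[: index + 1]
```

A fix needed in production therefore starts in development and follows `promotion_route("production")` through test before it arrives.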
As a side note, the SAP application icon shown in Figure 20 is a palette extension developed
for an SAP consolidation use case and gives you a concrete example of how you can use palette
extensions in your own blueprints.
In Figure 20, for each environment, the actual physical deployment topology is not shown yet to
avoid excessive detail in a blueprint. For each environment, you can see the yellow +, indicating
we added sub-diagrams for more detail. Figure 21 shows the detail for the InfoSphere Information
Server (IIS) Development system. As you can see, Information Server is installed across five
physical nodes:
One application node (ISD)
Three nodes for three instances of the DataStage parallel engine (PX1 to PX3)
One node for the metadata repository (XMETA)
All five nodes of the primary system are in one data center (IT Location 1). For high availability
and disaster recovery (HADR) reasons, a secondary system in a different data
center (IT Location 2) is configured. Such a configuration for an IIS development system might be
required if the data migration development team consists of several dozen developers (we have
seen projects with 50-150 developers on an IIS development system) distributed across various
geographies demanding an environment with 24-hour availability at least from Monday through
Friday.
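The primary development system described above can also be written down as data, which makes simple consistency checks possible. In this sketch the node names and locations come from the description of Figure 21, while the `Node` class, the role labels, and the helper functions are illustrative assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    name: str
    role: str
    location: str


# Primary IIS development system: five physical nodes in one data center.
IIS_DEV_PRIMARY: List[Node] = [
    Node("ISD", "application", "IT Location 1"),
    Node("PX1", "parallel engine", "IT Location 1"),
    Node("PX2", "parallel engine", "IT Location 1"),
    Node("PX3", "parallel engine", "IT Location 1"),
    Node("XMETA", "metadata repository", "IT Location 1"),
]


def roles(nodes: List[Node]) -> Counter:
    """Count how many nodes serve each role."""
    return Counter(node.role for node in nodes)


def single_site(nodes: List[Node]) -> bool:
    """True if every node sits in the same data center."""
    return len({node.location for node in nodes}) == 1
```

The secondary system in IT Location 2 could be described with a second list of the same shape, which would let a check confirm that primary and secondary systems never share a location.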
Conclusion
IBM InfoSphere Blueprint Director provides the capability for you to communicate a vision of your
information landscape to your organization and broader team. In this article, you have:
Reviewed best practices in creating your own effective blueprints from scratch.
Learned about the benefits of creating and managing blueprints, visualizing the metadata
landscape based on best practices.
Learned that you can incorporate the physical deployment landscape into your information
blueprint based on best practices.
By following these best practices, you can focus more closely on the critical aspect of
communication to drive your projects forward. Clarity in presentation, consistent understanding,
and an ability to view all aspects of the information landscape for your project at varying levels all
help facilitate this process.
Resources
Learn
For best practices in using an information blueprint in a project context, see "Best practices
for IBM InfoSphere Blueprint Director, Part 1: Working within a project lifecycle."
Learn about examples of Information Server SAP Packs landscapes in "Security and
deployment best practices for InfoSphere Information Server Pack for SAP applications, Part
2: Deployment."
Check out Designing a topology for InfoSphere Information Server.
For a tutorial on Blueprint Director, see "Designing an integration landscape with IBM
InfoSphere Foundation Tools and Information Server, Part 1: Planning an integration
landscape."
For basic information about using Blueprint Director, see IBM InfoSphere Blueprint Director.
Learn more about Information Management at the developerWorks Information Management
zone. Find technical documentation, how-to articles, education, downloads, product
information, and more.
Stay current with developerWorks technical events and webcasts.
Follow developerWorks on Twitter.
Get products and technologies
Build your next development project with IBM trial software, available for download directly
from developerWorks.
Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2
Express Edition for the community that offers the same core data features as DB2 Express
Edition and provides a solid base to build and deploy applications.
Discuss
Check out the developerWorks blogs and get involved in the developerWorks community.
Martin Oberhofer
Martin Oberhofer works as senior IT architect in Enterprise Information Architecture
with large clients worldwide. He helps customers to define their enterprise information
strategies and architectures, solving information-intense business problems. His
areas of expertise include master data management based on an SOA, data
warehousing, information integration and database technologies. He especially likes
to work with enterprises running SAP applications. In a lab advocate role, Martin provides expert advice on information management to large IBM clients. He started his
career at IBM in the IBM Silicon Valley Lab at the beginning of 2002 and is currently
based in the IBM Research & Development Lab in Germany. He co-authored the
books Enterprise Master Data Management: An SOA Approach to Managing Core
Information and The Art of Enterprise Information Architecture: A Systems-Based
Approach for Unlocking Business Insight, as well as numerous research articles
and developerWorks articles. As inventor, he contributed to more than 30 patent
applications for IBM. He is also an Open Group Master Certified IT Architect and
holds a master's degree in mathematics from the University of Constance/Germany.
Harald Smith
With nearly 30 years in IT and software development, Harald Smith has focused on
the design and delivery of information integration and information quality solutions
and products, including methods and best practices.