SnapLogic User Guide
SnapLogic, Inc.
2 W 5th Ave, Fourth Floor
San Mateo, California 94402
U.S.A.
www.snaplogic.com
Copyright Information
© 2011-2013 SnapLogic, Inc. All Rights Reserved. Terms and conditions, pricing, and other information are subject to change without notice. "SnapLogic" and "SnapStore" are among the trademarks of SnapLogic, Inc. All other product and company names and marks mentioned are the property of their respective owners and are mentioned for identification purposes only. "SnapLogic" is registered with the U.S. Patent and Trademark Office.
Table of Contents

Preface
  About SnapLogic
Concepts
  SnapLogic Architecture
User Interfaces
  Introducing SnAPI
  Sidebar: Foundry
  Sidebar: Library
  Canvas
  Slider
Components
  Component Parameters
  Validating a Component
Pipelines
  Executing Pipelines
Snaps
  Accessing Snaps
  Installing Snaps
  Configuring Snaps
Administration
  Clustering Servers
  SiteMinder Support
  Security Overview
  Enabling SSL
  Management Console
  Log Files
  Snapshots
  Sidekick
General
  Prerequisites
  Triggers
Glossary
1. Preface
The SnapLogic User Guide contains instructions for performing data integration using two SnapLogic interfaces: its graphical user interface, Designer, and its application program interface, SnAPI.
About SnapLogic
SnapLogic is a data integration platform with an innovative, open, and extensible data flow
architecture and straightforward subscription model. SnapLogic connects to almost any SaaS,
Cloud, Web, or enterprise application or data source through Components and Pipelines, pro-
viding information as a utility to business users and applications. SnapLogic is an alternative
to closed, proprietary, client-server-based integration solutions and the massive amount of
hand-coding still being performed to accomplish data integration in the marketplace today.
2. Concepts
This section of the document provides an overview of the architecture and concepts of the SnapLogic application.
SnapLogic Architecture
SnapLogic consists of a number of architectural constituents, including the SnapLogic Server with its two interfaces and metadata repository, and the Component Servers with their APIs. The SnapLogic Server manages the instantiation and execution of Components and Pipelines. The SnapLogic Server's open API enables anyone to create read, write, and operator Components and simply snap them into the Server, limitlessly extending the Server's connectivity and functionality. The SnapLogic Server features:
- Integration of data from any source (including Web, SaaS, and on-premise data)
- Universally extensible SnapLogic Component API
- SnAPI scripting and command-line API
- Deployment on-premise or in the cloud
- SnapLogic Designer, a browser-based drag-and-drop GUI
- Enterprise ETL functionality
- Searchable metadata repository of Pipelines and Components, including a robust set of Component connectors, functions, and extensions
- Access to downloadable and reusable integration extensions
- SnapLogic Designer: The SnapLogic Designer is a browser-based visual configuration tool for creating and executing data integration solutions. The Designer provides a simple, drag-and-drop visual interface for combining Components and Pipelines and defining their execution and results. The SnapLogic Designer application runs inside your Web browser, so no client software installation is required. This enables you to access your SnapLogic data flow server from anywhere. The Designer enables you to:
  - Create Components from the SnapLogic Server Foundry
  - Assemble data integration Pipelines
  - Create data services from your data sources
  - Preview and execute Pipelines
  - Schedule unattended Pipeline runs
- SnAPI: The SnapLogic Application Program Interface, SnAPI, is the programmatic interface to the SnapLogic Server that enables you to create and use Components and Pipelines from your application or development environment. SnAPI is ideal for users who do not need the visual interface of the SnapLogic Designer, or for those who wish to create Components and Pipelines through code generation. SnAPI supports the following languages:
  - Python
  - Java

Most of the actions you can perform in SnAPI can also be performed in the Designer.
- Components: A Component is an object used to perform a simple subtask, such as read, write, or act on data. Strung together, Components are the building blocks of Pipelines, or data flows. Components are generally classified as Connectors (Components that read or write data) and Operators (Components that perform an action, such as a join or filter, on data). Basic templates for Components are included in your SnapLogic installation (refer to the Component Reference for the list of Component templates that ship as part of SnapLogic), and reside in the Designer's Foundry panel. These generic templates, once configured, become configured Components that are stored in the SnapLogic Server repository, and can be found in the Designer's Library panel.
- Pipelines: A Pipeline is a collection of Components linked together to orchestrate a flow of data between end points. For example, a simple Pipeline may read data from an RSS feed, reformat it, and write it to a database.
- Snaps: A Snap is the encapsulation of an integration task or subtask that performs a complete, and usually high-level, function. It is any "pluggable" piece of code that has been conveniently packaged to run seamlessly in the SnapLogic Server. Specifically, a Snap can take various shapes:
  - A Snap can be a collection of Components that are functionally related, such as the Salesforce Snap, which contains Components for inserting contacts into and deleting contacts from Salesforce.
  - A Snap can also consist of a single low-level building block, such as a filter.
  - A Snap can comprise a complete Pipeline packaged as a simple Component, for example to insert an item.

The definition of "Snap" is therefore recursive: a complex Snap can contain multiple Pipelines; a simple Snap can stand alone or participate in a Pipeline.

A Snap in SnapLogic is comparable to a smartphone app, a browser add-on, or an application plug-in. A Snap can perform as simple a task as reading data from a file, or as complex an operation as connecting to an instance of Microsoft® Dynamics CRM, analyzing the source data, and providing full access (data and functionality) to all standard and custom objects within Microsoft Dynamics CRM. When changes have been made to standard or custom objects, the Snap adapts and provides you access that takes this change into account.
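The Component-and-Pipeline model described above can be sketched as a chain of processing stages. The following Python sketch is an analogy only, not SnapLogic's actual implementation or API: each function plays the role of a Component (a producer, an operator, and a consumer), and composing them forms a Pipeline.

```python
# Illustrative analogy: Components as small processing stages,
# a Pipeline as their composition. Not SnapLogic's real API.

def read_source(records):
    """A producer Component: yields raw records from a source."""
    for record in records:
        yield record

def reformat(records):
    """An operator Component: reformats each record (here, uppercasing)."""
    for record in records:
        yield {**record, "title": record["title"].upper()}

def write_sink(records, sink):
    """A consumer Component: writes records to a destination."""
    for record in records:
        sink.append(record)

# Assemble the Pipeline: read -> reformat -> write.
feed = [{"title": "hello"}, {"title": "world"}]
out = []
write_sink(reformat(read_source(feed)), out)
print(out)  # [{'title': 'HELLO'}, {'title': 'WORLD'}]
```

Because each stage only consumes and produces records, stages can be rearranged or swapped independently, which is the property that makes Components reusable building blocks.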
(Refer to "Mapping Components with Field Linker" for details.)
3. User Interfaces
SnapLogic has an intuitive graphical user interface, the SnapLogic Designer, where you can visually create data integration scenarios. SnapLogic also has an Application Program Interface, SnAPI, through which you can access all SnapLogic functionality using Python or Java.

This guide provides instructions for completing tasks using both Designer and SnAPI where possible.
Introducing SnAPI

The SnapLogic Application Program Interface, SnAPI, enables you to create and use Components and Pipelines from your application or development environment. SnAPI is ideal for users who do not need the visual interface of the SnapLogic Designer, or for those who wish to create Components and Pipelines by way of code generation.

You can use SnAPI for tighter integration with various relational database management systems. For example, refer to Using SnAPI within PostgreSQL for an introduction to SnAPI. SnapLogic provides SnAPI support for the following languages:

- Python
- Java

Before using SnAPI, configure your environment with the scripts provided in your installation:

- Linux: bin/snaplogic_env.sh and bin/snaplogic_env.csh

Source the appropriate script for your Linux shell to configure the environment.
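Since SnapLogic addresses server-side resources by URI (as described in the Library Folders section), a SnAPI-style program ultimately works with URLs like the ones shown throughout this guide. The sketch below only composes such a URL; the host, port, resource path, and query-parameter names are assumptions for illustration, so consult the SnAPI reference for the actual endpoints and parameters.

```python
# Hedged sketch: composing the URL used to address a server-side
# resource. All names here (host, port, path, "max_rows") are
# hypothetical illustrations, not documented SnAPI endpoints.
from urllib.parse import urlencode

def build_resource_url(host, port, resource_path, params=None):
    """Compose the URL that addresses a resource on the server."""
    url = f"http://{host}:{port}{resource_path}"
    if params:
        url += "?" + urlencode(sorted(params.items()))
    return url

url = build_resource_url("localhost", 8088, "/feed/leads", {"max_rows": "10"})
print(url)  # http://localhost:8088/feed/leads?max_rows=10
```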
Launching Designer

Follow these instructions to launch Designer from your computer desktop:

1. Begin by starting the server. See "Starting and Stopping Servers" for information on how to do this.
2. Launch Designer: in a browser, go to http://hostname:443/designer, where hostname is the name of the machine where SnapLogic is installed.

Designer is a Flash/Flex application that runs in your web browser. You can also load it by pointing your web browser to your server host and port number; for example, http://hostname:443/designer.
The sidebar hosts the Library and Foundry panels:

- Library: The Library stores the projects you are building: your Pipelines and configured Components. Refer to the Sidebar: Library section for more details.
- Foundry: The Foundry stores the building blocks from which you can build projects. These building blocks are either generic templates provided with SnapLogic, or more specialized Snaps you have purchased from SnapStore. Refer to the Sidebar: Foundry section for more details.
  - Access these sidebar objects by dragging them onto the canvas to sketch, or design, data flows.
  - You can toggle the visibility of the sidebar by clicking the sideways arrow to the left of the canvas, or by selecting View > Sidebar.
- Canvas: The canvas is your work area for sketching and configuring. Drag select items from the sidebar onto the canvas to perform your data integration tasks. The canvas is equipped with a slider that displays in detail the properties and, if available, previews of any highlighted object. The canvas and slider are each discussed in detail in this section.
- Server: Use the Server menu to connect to, disconnect from, and manage SnapLogic servers.
- Library: Use the Library menu to create new Pipelines and to copy and paste highlighted Library objects.
- View: Use the View menu to dictate what elements appear in your Designer screen, and to access scheduler, server, and log information. For information on configuring view settings, see "Settings".
- SnapStore: Click SnapStore to open SnapLogic's online marketplace in a new browser window. You can purchase Snaps from SnapStore and open them immediately in Designer.
- Help: Use the Help menu to access SnapLogic resources, check for software updates, and manage your SnapLogic licensing.
Settings
To invoke the Settings dialog box, select View > Settings from the Designer menu bar.
Note that you can use the Reset to Defaults button if you do not want to keep your changes.
This list describes the settings available and their default values:

- General: General settings include display preferences for the welcome page, animations, and feedback with sound.
  - Show animations: Select this option to display animations in Designer. Default setting: Yes
- Registered Servers: This setting displays a list of servers and their default connection settings.
  - Remove: Highlight the server you want to remove and click Remove to delete the server from SnapLogic's registry. You must have at least one server registered in order to use SnapLogic. Default setting: None. When you first install SnapLogic, you are prompted to register a server; that server displays when you open this dialog box. If you click Reset to Defaults, you are asked to confirm whether, by restoring all default settings, you also want all servers to be removed from this list. You can answer "No," and your servers will be left in the registry while all the remaining setting defaults are restored.
- Component Options: This setting enables you to control preview and pass-through options.
  - Limit preview to (specify number) of records: This setting affects the preview feature available in the canvas slider for Components that support preview. The maximum number of records SnapLogic can preview is 100. If desired, use this setting to lower the maximum. Default setting: 100
- Pipeline Options: This setting enables you to control which prompts and aids are automatically enabled when you work with Pipelines:
The Settings box provides a Reset to Defaults button should you choose to restore all the
options to their default values. Click Close to exit the Settings box.
Sidebar: Foundry
The Foundry is the bottom panel in the Designer's sidebar and contains Component templates
for your use.
Components are the building blocks of Snaps and Pipelines: they perform a simple subtask, such as connecting to a database, filtering rows of data, or performing a join. A Component has properties; each property has a type (for example, binary or string) and a value (the value can either be fixed or made a parameter).
The Component templates in the Foundry are generic. They are designed to perform specific
functions, but without configuration, they remain mere templates. To create Components, you
must configure generic Component templates found in your Foundry to your specific needs
and connect them to data sources, other Components, or targets.
The templates available in the Foundry include:

- Component templates provided with your purchase of SnapLogic, in their original version (prior to any configuration you have performed).
- Snaps you have purchased from SnapStore, prior to any configuration you have performed. These Snaps consist of Component templates.
To configure a Component, drag it from the Foundry to the canvas and edit its properties. At this point, any work you have done on the object is saved in the Library, and the object, whether complete or in progress, is no longer considered a Component template, but rather a Component. The Foundry is akin to a hardware store where you procure your materials. The moment you begin to manipulate an object, its new home (with the work you have applied to it) becomes the Library. Refer to the Library to locate any object which you have begun to configure.
Foundry Toolbar
The Foundry toolbar has buttons for the following commands:
Foundry Categories
The Foundry panel separates its Component templates into categories, which you can filter
using the drop-down list. This organization is helpful when you have many Component tem-
plates and want to look at them by type.
- The ALL category displays all objects available to you in the Foundry in a tree structure.
- The SnapLogic category displays only Component templates that are shipped as part of SnapLogic.
- Additional categories appear when you purchase Snaps from SnapStore. These Snaps consist of Component templates and install into the Foundry under their own category. For example, if you purchase the Salesforce Snap, a Salesforce category is added to the drop-down list.
- The Foundry view tab displays all objects available to you in the Foundry, organized by object type and name.
Sidebar: Library
The Library is the top panel in the Designer's sidebar and stores all of your customized solu-
tions and projects in the form of Components and Pipelines.
The Library contains all the objects that you are in the process of configuring, or have finished configuring: configured Components and Pipelines, as well as Components from the Foundry that you have begun to configure, all reside in the Library.

If the Foundry is akin to a hardware store where you procure tools and materials, the Library plays the role of your workshop, where your projects come to life. The moment you begin to configure a Component template, the configured Component appears in the Library. You can create any number of Components from a Component template.
Sidekick
As of release 3.7, if you have Sidekick configured, it displays under the corresponding SnapLogic Server.
Data Folder
For each server in the Library pane, there is a "data" folder. You can upload files to this folder, or to its sub-folders, from within the Designer by right-clicking the destination folder and selecting Upload.
Library Folders
Notice that in the Library view, Library objects appear to be organized by folder. When you configure a Component or create a new Pipeline in SnapLogic, Designer automatically assigns it a URI and a location in this folder structure. This visual organization helps you conceptually sort the Components and Pipelines you build, and makes them easier to locate. While you cannot create folders or alter the folder structure outside of the URI definition, you can drag Components in and out of these folders as necessary.

However, despite the convenient appearance of this folder structure, Designer references all the Components and Pipelines you create by uniform resource identifiers (URIs), and not by a traditional folder structure. The resources in the URIs it assigns are fully qualified "long pointers," which can point to objects located anywhere: in the cloud just as easily as on premise. Designer thus gives the appearance of a folder structure for your convenience, but uses the URI approach to referencing objects in order to build and execute bi-directional data flows between applications, hiding the complexity of your system architecture where needed.
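Because every resource is tracked by URI rather than by a real folder hierarchy, a resource reference can be unpacked like any other URI; the trailing path segment is what the Library's folder view shows as the object's "name". A small illustrative sketch (the example URI below is hypothetical, standing in for one Designer would assign):

```python
# Illustrative only: the URI is hypothetical; real URIs are assigned
# by Designer when you save a Component or Pipeline.
from urllib.parse import urlsplit

uri = "http://hostname:8088/SnapLogic/Tutorial/LeadsReader"

parts = urlsplit(uri)
server = parts.netloc                    # where the object lives
path = parts.path                        # the "folder" path shown in the Library
name = parts.path.rsplit("/", 1)[-1]     # the object's display name
print(server, path, name)
```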
- The Library view tab displays all objects available to you in the Library, organized in folders.
Canvas
The canvas is your main workspace in Designer. Create data integration solutions in the can-
vas by sketching, connecting, and then configuring Components and Pipelines. Sketching
refers to the process of dragging Component templates to the canvas and linking them to
others. Configuring refers to the Component and Pipeline properties you can edit in the slider.
The following are high-level instructions for using the canvas, which are later discussed in detail:

- Begin by creating Components; do this by selecting one or more Component templates provided in the Foundry, dragging them to the canvas, and renaming them. See "Creating a Component in Designer" for more information.
- Connect multiple Components to form Pipelines. See "Creating a Pipeline in Designer" for more information.
- Highlight a Component to configure its properties in the slider that occupies the lower portion of the canvas. For Components that support data preview, you can preview the results in the slider's Preview tab. See "Configuring a Component in Designer" for more information.
- Edit the field links, or connection properties (that is, configure how fields from each Component map to their downstream Component), by clicking the link between them to display the Field Linker in the slider below. See "Mapping Components with Field Linker" for more information.
- Execute Pipelines directly from the canvas by clicking the Run Pipeline button in the canvas toolbar. See "Executing Pipelines" for more information.
The canvas occupies the entire work area until you select an object or a link in the canvas, or
run a Pipeline. Any one of these three actions displays the slider just below the canvas. Use
the canvas for sketching: bringing appropriate building blocks into your data flow and con-
necting them. Use the slider for configuring the building blocks, their connections, and the
resulting Pipelines. The slider is further discussed in the Slider section.
Designer creates a tab for every Pipeline that is open, or any new Pipeline you create. Each
tab displays the name of its Pipeline. You can have multiple tabs open, and drag building
blocks into the tab of your choice.
The upper right corner of each tab enables you to control the zoom level for the selected area
in that tab. Zooming in and out is especially useful when you have a long or cluttered Pipeline.
You can change the default zoom percentage in the Pipeline Settings' Pipeline Options screen.
A toolbar is located on the left side of the canvas. Note that, depending on where the canvas
and slider are separated, you may not be able to see the entire canvas toolbar without first
dragging down the divider to make the slider smaller and allow the entire toolbar to appear on
the canvas.
The canvas toolbar contains the following commands:
- Run Pipeline: Use this command to execute the Pipeline in the active tab.
- Pipeline Properties: Use this command to display the properties of the entire Pipeline in the slider below.
- Rearrange Components: Use this command to arrange the Components in an orderly manner.
- Show Grid: Use this command to display a subtle grid to help you visually follow or align the layout of the objects in the tab.
- Snap to Grid: Use this command to have Designer automatically align the objects in the tab.
- Search Pipeline: Use this command to search for specific Components in a Pipeline. This is particularly useful in large, complex Pipelines.
Slider
The slider resides below the canvas. Its appearance is dynamic: it displays in a frame under
the canvas when you select a Component or a link on the canvas, or when you run a Pipeline.
When you first launch Designer and start to work on the Canvas, the slider is not visible. Once
you select a Component or link, or run a Pipeline, the slider appears.
You can also maximize the slider to occupy the entire area of the canvas, or open it in its own
tab alongside the other tabs topping the canvas. The content of the slider varies with the
object selected:
- If a Component is highlighted on the canvas, the slider is in Component mode, and displays Component properties you can configure. The properties presented vary with the type of Component and the functions it supports.
- If a connection, or link, is highlighted on the canvas, the slider displays the Field Linker.
- If you execute a Pipeline from the canvas, the slider goes into Pipeline mode and displays Pipeline properties and execution data.
The slider's title bar displays the name and type of object highlighted in the canvas above.
Slider Commands
Regardless of which page the slider displays, the following commands are available:

- Save: Use this command to save the work you are doing in Designer to the SnapLogic Server repository.
- Suggest: Use this command to invoke automatic fill-in suggestions. This button is only enabled when you are viewing Components that are eligible for suggestions. Refer to the "Suggestions for Component Properties, Inputs, and Outputs" section for more details about this function.
- Validate: Use this command to validate Components and Pipelines. Refer to the "Validating a Component" section for more details about this function.
Library Toolbar

The Library toolbar has buttons for the following commands:

- New Pipeline: Use this command to create a new Pipeline. A URI dialog box asks you to name the new Pipeline, and then the canvas is primed for you to begin sketching.
The slider's Component information is divided into a series of pages. Navigate between them
by clicking the oval-shaped page names. Note that the pages available vary with the type of
Component selected in the canvas. For example, if you have selected a Component that
accepts inputs but produces no outputs, the slider menu displays an Input page, but neither an
Output nor a Preview page.
For more information on using the slider in this mode, refer to the Components section.
To connect two Components to each other, select the bottom (output) frame of the first Component and drag your mouse to join it to the top (input) panel of the downstream Component. For example, the output frame of the Leads Component appears purple, and is joined to the green input frame of the Prospects Component. The line between them indicating their connection has a ring in its center; this ring represents the link between the Components. Selecting the link displays the Field Linker in the slider. The Field Linker displays output fields in a column alongside the input fields of the downstream Component.

The Field Linker automatically suggests field-to-field mappings when field names are the same (for example, the "Address" output field of the Leads Component is automatically linked to the "Address" input field of the Prospects Component). You do not have to accept these suggestions. Change individual links by clicking the output field in question and selecting an alternative from the drop-down list.
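The name-based suggestion behavior described above can be sketched in a few lines of Python. This is an illustration of the matching idea, not the Field Linker's actual implementation, and the field names are hypothetical:

```python
# Illustrative sketch of name-based link suggestion: propose a mapping
# wherever an upstream output field name exactly matches a downstream
# input field name. Field names here are hypothetical.
def suggest_links(output_fields, input_fields):
    """Return {output_field: input_field} for exact name matches."""
    inputs = set(input_fields)
    return {field: field for field in output_fields if field in inputs}

links = suggest_links(["Address", "Phone", "LeadScore"],
                      ["Address", "Phone", "Owner"])
print(links)  # {'Address': 'Address', 'Phone': 'Phone'}
```

Fields without a matching name ("LeadScore" and "Owner" above) are left unmapped, just as the Field Linker leaves them for you to link manually.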
For detailed information on working with Pipelines, refer to the Pipelines section.
4. Components
A Component is an object used to perform a simple subtask, such as read, write, or act on data. Strung together, Components are the building blocks of Pipelines, or data flows. Components are generally classified as connectors (these are further divided into producers and consumers: Components that read or write data, respectively) and operators (Components that perform an action, such as a join or filter, on data).
Basic templates for Components are included in your SnapLogic installation (refer to the Component Reference for the list of Component templates that ship as part of SnapLogic), and reside in the Designer's Foundry panel. These generic templates, once configured, become configured Components that are stored in the SnapLogic Server repository, and can be found in the Designer's Library panel.
To create a Component, drag an unconfigured Component template from the Designer's
Foundry panel onto the canvas.
If you are in configuration mode, continue with the instructions in this section. If you are in sketching mode and prefer to configure afterward, connect the Components to data sources, other Components, or data targets to form Pipelines, as described in the Pipelines section, and then return to this section for configuration instructions. Configured Components are stored in the SnapLogic Server repository and can be found in the Designer's Library panel.
1. Double-click the desired Component template in the Foundry.
2. Specify a URI in the New [Component Type] prompt that displays.
3. Click OK.
   The Component you just created now displays in the Library. The canvas displays Component properties for you to configure.
4. Configure the Component properties in the canvas, referring to the list of Component Configuration Properties if required.
5. If you wish to validate your work, click Validate, and refer to the "Validating a Component" section for details.
6. Click Save to save your Component (whether complete or not) in the SnapLogic Server repository. You can also return to the configuration step later: save the Component as it is now, and at some later point, drag it from the Library to the canvas to edit its properties in the slider.

Alternatively, if you wish to accept the default URI, you can create a Component by dragging the Component template onto the canvas.
- General: General information about the selected Component. Refer to the "Validating a Component" section for details.
- Properties: Properties vary for every Component. Some properties can be edited. Refer to the "Configuring Component Properties" section for details.
- Output: The Output tab only displays for Components that produce outputs, or data that can serve as inputs to downstream Components. Refer to the "Specifying Component Outputs" section for details.
- Input: The Input tab only displays for Components that consume input data for the purpose of performing functions on them or for passing them through to another downstream Component. Refer to the "Specifying Component Inputs" section for details.
- Parameters: Parameters are variables that can be used for runtime substitution in the properties of Components. You can use parameters to avoid hard-coding property values that are likely to change. Using parameters enables you to use a single Component or a single Pipeline for multiple purposes. The Parameters tab displays Component parameters and their default values. You can edit this information, and add or remove parameters. Refer to the "Component Parameters" section for more information on parameters.
- Preview: The Preview tab only displays for Components that support preview, and enables you to examine a subset of the Component output rows without actually executing the data integration scenario. Refer to the "Previewing Component Execution" section for details.
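The runtime substitution that Parameters provide can be illustrated with Python's string.Template. The `$NAME` placeholder syntax below is Python's, used purely to show the idea; it is not necessarily SnapLogic's parameter syntax, and the property names and values are hypothetical:

```python
# Illustration of the Parameters concept: property values hold
# placeholders instead of hard-coded settings, and values supplied
# at run time are substituted in. Syntax and names are hypothetical.
from string import Template

properties = {
    "File name": Template("/data/$REGION/leads.csv"),
    "Delimiter": Template("$DELIM"),
}

# Supplied at run time, so one Component serves multiple purposes.
params = {"REGION": "emea", "DELIM": ","}

resolved = {name: t.substitute(params) for name, t in properties.items()}
print(resolved["File name"])  # /data/emea/leads.csv
```

Running the same Component with `{"REGION": "apac", ...}` would read a different file without any change to the Component definition, which is exactly the reuse the Parameters tab is for.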
- URI: The unique URI that SnapLogic assigns to the Component.
- Component: The Component used. Once you drag a Component to the canvas, save it with a name of your choice, and configure it, the type of function it performs may not be immediately obvious. This field displays the original Component template used in your Component.
- Version: The version of the Component used in the Pipeline. This may be important when verifying the functionality available in that Component.
- Created by: The username that defined this Component. This is only visible on newly placed Components in release 3.7 or later, not on Components upgraded from a previous release.
- Created on: When the Component was defined in the Library. This is only visible on newly placed Components in release 3.7 or later.
- Modified by: The username that last modified this Component. This is only visible on newly placed Components in release 3.7 or later.
- Modified on: When the Component was last modified. This is only visible on newly placed Components in release 3.7 or later.
- Author: Optionally enter the username of the Component creator.
- Description: Optionally enter a description for the Component.
You can display properties in a list or a treeview. Pausing your cursor over a property name displays an explanation of the property. Use the Value column to edit properties that allow editing. Properties vary for every Component. Refer to the Component Reference Guide for details about each Component's specific properties.
The following list is a small sampling of properties common to many Components:
- Credentials: Username
- Credentials: Password
- Delimiter
- File name
- Is input a list
- Input field
- Output field
- Error Handling (refer to the "Error Handling to Address Connection Problems and Data Errors" section for details)
Depending on the Component, a message may appear to the right of the properties list, explaining that the Component being edited does not support pass-through. You can enable or disable pass-through messages by editing the Resource Options in your Pipeline Settings.

You can also add error outputs to Components so that erroneous records are collected in a specified destination file. This practice enables you to address the failed records while the remaining records in the Pipeline execute. Refer to the "Error Handling to Address Connection Problems and Data Errors" section for details.
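The error-output idea above, routing failed records aside while the rest of the flow continues, can be sketched as follows. This is an illustration of the pattern, not SnapLogic's implementation, and the validation rule is hypothetical:

```python
# Illustrative sketch of an error output: records failing a check are
# routed to an error destination; valid records continue downstream.
# The records and the validation rule are hypothetical.
def split_errors(records, is_valid):
    """Partition records into (good, errors) by a validation predicate."""
    good, errors = [], []
    for record in records:
        (good if is_valid(record) else errors).append(record)
    return good, errors

records = [{"email": "a@example.com"}, {"email": ""}, {"email": "b@example.com"}]
good, errors = split_errors(records, lambda r: bool(r["email"]))
print(len(good), len(errors))  # 2 1
```

The erroneous records can then be written to a destination file for later repair, while the good records flow on through the Pipeline.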
- Run: Begin with this tab. If your Component has runtime parameters, enter their values here to run the preview. Click Run when you are ready.
- Preview Data: Here is where you can view the output rows generated as the preview. The default maximum number of rows generated is 100. If the data source has fewer records than the maximum setting, the preview shows all the records available. If you want faster previews or simply need smaller samples, you can change the maximum from 100 to any lower number in the Resource Options of your Pipeline Settings. When the data is too large to be displayed, the data is truncated by an ellipsis hyperlink. Clicking that link shows the complete data in another window. Use the Print and Copy buttons to print the rows generated in the preview, or to copy them to your Microsoft Office Clipboard in a tab-delimited format that is ready to be pasted into a spreadsheet application.
- Runtime Information: Visit this tab to view runtime results. You can see the start and end times of your preview run, and its status.
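The tab-delimited format mentioned for the Copy button is the standard layout spreadsheet applications accept on paste: one row per line, one tab between columns, with a header row. A minimal sketch of producing it (the column names and rows are hypothetical):

```python
# Illustrative sketch of the tab-delimited layout used when copying
# preview rows for pasting into a spreadsheet. Data is hypothetical.
import csv
import io

rows = [{"Name": "Ada", "City": "London"},
        {"Name": "Grace", "City": "Arlington"}]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["Name", "City"], delimiter="\t")
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```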
Components are not required to suggest all fields, and in some cases, meaningful suggestions
are impossible. You can choose to accept or reject any suggestions made.
For example, setting the filename property to the name of a .csv file produces suggestions from the CSV Read Component regarding the output fields and how many header rows to skip.
Suggestions are followed by a Confirm Action prompt, which you can opt out of seeing.
- DB Reader: Use this Component to select data from a database.
- DB Writer: Use this Component to insert, update, or delete data in a database.
- DB Lookup: Use this Component to perform per-record lookups, sometimes referred to as probes.
- DB Upsert: Use this Component for "upsert" (merge) functionality, updating existing rows or inserting new rows into a database.
Creating a Component from these templates requires that you also create a database-specific
connection Component using the appropriate DB Connection Component template. SnapLogic
provides DB Connection Components for a variety of databases, including:
- DB Connection - DB2
- DB Connection - JDBC
- DB Connection - MySQL
- DB Connection - Oracle
- DB Connection - PostgreSQL
- DB Connection - SQL Server
- DB Connection - SQLite
With a DB Connection Component, you specify the connection details of the target database, providing such information as database name, host name, and port number. Every DB Reader/Writer/Lookup Component contains a DB Connection Component property field in which you specify the URI of the DB Connection Component containing the information about the database to which the Reader/Writer/Lookup Resource is to connect. This allows connection-specific details to be centralized in a single Component that can be shared by multiple DB Reader/Writer/Lookup Components.
See the Component Reference for detailed information about Components.
The DB Reader Component specifies the URI of the DB Connection Component to use, in addition to the actual SQL statement used to read from the database.
When building a Pipeline with DB Reader/Writer/Lookup Components, drag the DB Connection Component onto the canvas. Note that the DB Components have drop-down lists where you can select which DB Connection Component to use. This selection overrides the connection property you specified in the Component itself and allows parameter replacement to be performed.
The following illustration shows an example Pipeline that accesses three distinct databases
with two DB Readers that read sales and account history data, and one DB Writer that updates
the HR database with commission information.
The following diagram illustrates the concept of pass-through fields.
With pass-through, the inputs of Components in a Pipeline need not be explicitly designed to handle all the incoming fields from upstream Components. Component inputs only need to specify the fields that the Component requires for its computations. This reduces field linking to the absolute minimum.
Not all Components support pass-through. Each Component description in the Component Reference declares whether it supports pass-through.
Examples
The following scenarios provide examples of how pass-through works.
Scenario 1:
[Component I] (Input: N/A; Output: A,B,C; Pass-through: No) -> [Component II] (Input: A; Output: A; Pass-through: Yes)
- Component I is linked to Component II and field A is mapped from I's output view to II's input view.
- With pass-through enabled, fields A, B, and C are available in the output of Component II.
Scenario 2:
[Component I] (Input: N/A; Output: A,B,C; Pass-through: No) -> [Component II] (Input: A,B; Output: A; Pass-through: Yes)
- Component I is linked to Component II and field A is mapped from I's output view to II's input view.
- Field B from Component I is mapped to field B in Component II.
- With pass-through enabled, fields A and C are available in the output of Component II. Note that field B is not available in the output of Component II because it is only defined in the input view of the Component.
Scenario 3:
[Component I] (Input: N/A; Output: A,B,C; Pass-through: No) -> [Component II] (Input: A,B; Output: A; Pass-through: No)
- Component I is linked to Component II and field A is mapped from I's output view to II's input view.
- Field B from Component I is mapped to field B in Component II.
- With pass-through disabled, only field A is available in the output of Component II. Field C is not visible since pass-through is not enabled.
Scenario 4:
[Component I] (Input: N/A; Output: A,B,C; Pass-through: No) -> [Component II] (Input: A,B; Output: A,B; Pass-through: Yes)
- Component I is linked to Component II and field A is mapped from I's output view to II's input view.
- Field B in Component II is mapped to NULL.
- With pass-through enabled, fields A, B, and C are available in the output of Component II.
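The rule behind the four scenarios can be sketched in Python. This is a hypothetical model of the pass-through semantics described above, not SnapLogic's implementation: a Component's output contains its declared output fields, plus (when pass-through is enabled) any incoming fields its input view did not consume.

```python
def component_output(incoming, input_view, output_view, pass_through):
    """Model pass-through: the declared output fields, plus any
    incoming fields the input view did not consume (if enabled)."""
    result = set(output_view)
    if pass_through:
        result |= set(incoming) - set(input_view)
    return result

# Scenario 1: Input A; Output A; pass-through on
print(sorted(component_output({"A", "B", "C"}, {"A"}, {"A"}, True)))
# → ['A', 'B', 'C']

# Scenario 2: Input A,B; Output A; pass-through on (B is consumed)
print(sorted(component_output({"A", "B", "C"}, {"A", "B"}, {"A"}, True)))
# → ['A', 'C']
```

Scenarios 3 and 4 follow from the same function: with pass-through disabled only the declared output fields remain, and re-declaring a consumed field in the output view keeps it available downstream.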
The following figure provides an example of the input fields accepted by the Filter Component
("FilterLeads").
"FilterLeads" is the second Component in the Filtered_Qual_CA_Prospects Pipeline. Its
upstream Component is a CSV Read Component named "Leads." The Leads Component is read-
ing data from a comma-separated file and passing it through a filter. The Filter Component,
"FilterLeads" is selective on which rows it accepts as inputs, but does not operate on these
rows, and outputs them directly to the next downstream Component.
The previous figure displays the Input side of the filter Component, "FilterLeads." These are
the inputs it is accepting from the upstream "Leads" Component.
The next figure displays the Output side of the Filter Component. Notice that the pass-through
option for fields coming from Input1 is enabled; that is, the Component is specified to accept
pass-through fields.
The ifnull(X,Y) operator returns Y if X is NULL; otherwise, it returns X. You can specify con-
stant values for the default value, or reference other fields as shown in the example below.
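The ifnull rule stated above can be modeled in a few lines of Python. This is an illustration of the semantics only, not SnapLogic's implementation:

```python
def ifnull(x, y):
    """Return y if x is NULL (modeled as None); otherwise return x."""
    return y if x is None else x

print(ifnull(None, "default"))   # → default
print(ifnull("value", "default"))  # → value
```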
A simple example is changing True and False strings to Yes and No strings. The following is an
example of the expression in this case:
CASE WHEN (${Field001} = 'True') THEN 'Yes' WHEN (${Field001} =
'False') THEN 'No' ELSE NULL END
You can write more complex expressions to calculate the replacement values, or reference additional fields.
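The CASE expression above behaves like the following Python sketch (a hypothetical model used only to illustrate the branching; the field name Field001 comes from the example expression):

```python
def true_false_to_yes_no(field001):
    # CASE WHEN (${Field001} = 'True')  THEN 'Yes'
    #      WHEN (${Field001} = 'False') THEN 'No'
    #      ELSE NULL END
    if field001 == "True":
        return "Yes"
    if field001 == "False":
        return "No"
    return None  # NULL

print(true_false_to_yes_no("True"))   # → Yes
print(true_false_to_yes_no("maybe"))  # → None
```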
Component Parameters
Parameters are variables you can set for run-time substitution in Component and Pipeline
properties. Use parameters to avoid hard-coding property values that are likely to change;
this enables you to use a single Component for multiple purposes.
For example, to create a Component that reads data files created on a daily basis, use a
parameter for the filename property, as follows:
File name $?{INPUTFILE}
You can also use parameters to specify only part of a property. In the following example, the user need only specify the date, rather than the entire path to the data file:
File name file://data/logs/revenue_$?{DATE}.csv
A property can contain any number of parameters. In the following example, the user can
specify the report type and date:
File name file://data/logs/$?{REPORT}_$?{DATE}.csv
When defining parameters, the Component author has the option to specify a default value, or
to require that the Component user provide a value at runtime. If the author specifies a
default value, the parameter is called an Optional Parameter. If no default value is specified,
the parameter is referred to as a Required Parameter. A Component with a required
parameter cannot be executed without a value specified at runtime. Refer to the following
table for a breakdown of this concept.
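The $?{NAME} substitution described above can be modeled with a short sketch. The token syntax is taken from the examples in this section; the helper function itself is hypothetical, not a SnapLogic API:

```python
import re

def substitute(template, values):
    """Replace each $?{NAME} token with its runtime value.
    A required parameter with no value raises KeyError."""
    return re.sub(r"\$\?\{(\w+)\}", lambda m: values[m.group(1)], template)

print(substitute("file://data/logs/$?{REPORT}_$?{DATE}.csv",
                 {"REPORT": "revenue", "DATE": "20130131"}))
# → file://data/logs/revenue_20130131.csv
```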
Validating a Component
When you create a Component, you can invoke the validation process at any point. Validation
can perform basic checks, such as:
- Ensure that all the required properties have values specified
- Confirm that property values with constraints are set to valid values
- Check that all required input and output views are defined
- Verify that field and view name references are correct
In addition, each Component can perform more advanced validation tasks that are specific to
the Component's function. For example, validation can ensure that a specified file exists or
that a user name and password are valid.
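The basic checks listed above might look like this in outline. This is a hypothetical sketch (including the dictionary shape used for a component definition), not SnapLogic's validation code:

```python
def validate(component):
    """Collect basic validation errors for a component definition.
    `component` is a dict with 'properties' (name -> value) and
    'required' (names that must have values) -- an assumed shape."""
    errors = []
    for name in component["required"]:
        if not component["properties"].get(name):
            errors.append(f"required property '{name}' has no value")
    return errors

comp = {"required": ["File name"], "properties": {"File name": ""}}
print(validate(comp))  # → ["required property 'File name' has no value"]
```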
5
Pipelines
A Pipeline is a collection of Components linked together to orchestrate a flow of data between
end points. For example, a simple Pipeline may read data from an RSS feed, reformat it, and
write it to a database.
The main modes of action you can take on Pipelines in Designer are:
- Sketch: Drag and drop Components onto the canvas and link them, leaving configuration for another time. (Refer to the Field Linking section for details.)
- Configure: Click the Pipeline's Components and links displayed on the canvas, and use the slider to configure them. (Refer to the Component Configuration and Field Linking sections for detailed configuration instructions.)
- Run: Execute Pipelines from the canvas and examine their runtime information in the slider. (Refer to the Executing Pipelines in Designer and Scheduling Pipeline Execution sections for detailed execution instructions.)
Follow these high-level steps to work with Pipelines in Designer:
1. Drag the desired Component templates from the Foundry, or Components from the Library, and drop them onto the canvas.
2. Configure the Components in the slider (refer to the section Configuring a Component in Designer for more details). You can also reverse the order of this step with the next; that is, you can link the Components and then configure them.
3. Link the Components together in the desired order by clicking on each Component and dragging the connecting arrow to the desired downstream Component.
4. Configure the field links by clicking on each link that connects two Components to display the slider's field linking dialog.
5. Click Run to edit Pipeline properties and execute the Pipeline.
- Component properties: This refers to configuring the properties of each Component participating in the Pipeline. This is discussed in detail in the "Configuring a Component in Designer" section.
- Connection properties (field links): This refers to configuring the properties of each link between Components that participate in the Pipeline. This is discussed in detail in the "Mapping Components with Field Linker" section.
- Pipeline properties: This refers to configuring the properties of the Pipeline as a whole. It is discussed in this section.
Pipeline Properties
To access Pipeline Properties, click Run on the canvas toolbar next to the open Pipeline you
are viewing. The slider displays Pipeline Properties.
Pipeline Properties include a number of pages. Navigate between them by clicking the oval-
shaped page names, as described:
- General: This page contains general information about the Pipeline. Refer to the "General Pipeline Information" section for details.
- Input: This page displays inputs defined at the Pipeline level. Refer to the "Specifying Pipeline Inputs" section for details.
- Output: This page displays outputs defined at the Pipeline level. Refer to the "Specifying Pipeline Outputs" section for details.
- Parameters: Use this page to define and set values for parameters at the Pipeline level and to ensure that all Components with required parameters are mapped to Pipeline parameters. Refer to the "Pipeline Parameters" section for details.
- Run: Use this page for tasks associated with executing Pipelines, such as entering runtime parameter values, selecting a data tracing option, previewing data, executing the Pipeline, and monitoring its status. Refer to the "Executing Pipelines" section for detailed information.
- URI: The unique URI that SnapLogic assigns to the Pipeline.
- Component: The Component used. The value here is "Pipeline."
- Author: Optionally enter the username of the Pipeline creator.
- Description: Optionally enter a description for the Pipeline.
- Related Pipelines: This button enables you to enter optional metadata about other Pipelines that together provide useful data streams; this option facilitates "data serendipity." You can describe the correlation between a field from the Pipeline in question and a parameter in the target Pipeline. This information is metadata only; the
SnapLogic Server does not act on this information. Click Related Pipelines and follow these steps to add a relation:
1. In the Related Pipelines screen, enter the URI of a Pipeline or click the browse button to locate a Pipeline on any server for which your Designer is configured. If you use the browse option, the Select Pipeline screen displays.
2. Optionally enter a Display name for this entry and click Add.
3. Click OK to finish.
- Scheduler: This button launches the Scheduler, which enables you to execute Pipelines automatically at designated times. Refer to the "Scheduling Unattended Pipeline Execution" section for details.
You can add and remove inputs by using the + and - buttons. Clicking the Add (+) button displays the Add Input Mapping dialog. Enter a name for the input and select the Component to which your Pipeline-level input maps. Click Finish, and find your new input displayed in an additional tab on the Input page.
You can add and remove outputs by using the + and - buttons. Clicking the Add (+) button displays the Add Output Mapping dialog. Enter a name for the Pipeline-level output and select the Component to which it maps. Click Finish, and find your new output displayed in an additional tab on the Output page.
Pipeline Parameters
Use this page to define Pipeline-level parameters. Use it also to ensure that all Components
with required parameters are mapped to Pipeline parameters, and that all required Pipeline
parameters have values.
Parameters are variables you can set for runtime substitution in Component and Pipeline properties. Use parameters to avoid hard-coding property values that are likely to change; this enables you to use a single Component for multiple purposes. Pipelines can define parameters which are then mapped to Component parameters. Like Component parameters, Pipeline parameters can be either Required Parameters or Optional Parameters. Refer to the following table for a list of outcomes for each combination of Pipeline and Component parameters.
A Pipeline must meet the following requirements in order to execute:
- All required Pipeline parameters must have values provided by runtime.
- All of the Pipeline's Components with required parameters must be mapped to Pipeline parameters.
Consult the following figure for an example of a parameter mapping.
The first Pipeline Parameter (that is, a parameter at the Pipeline level) is LEADS, which, in the
Mapped To column, is mapped to the Leads.INPUTFILE Component parameter, with a Default
Value that specifies the path of the .csv input file containing information on sales leads.
If you open a Pipeline in Designer, a background check verifies that its Components are up to date. If any Components have changed, a message informs you that the Components within the Pipeline have been automatically updated to reflect those changes.
If you have completed a Pipeline that the Scheduler now executes, and then you modify the
interface of one of its Components, future scheduled runs of that Pipeline may fail. To update
the Pipeline with the new version of the changed Component, simply open the Pipeline in Designer, where the background check runs automatically. Alternatively, you can delete the Component from the Pipeline and add it again.
2. Select the new location for the Pipeline and click OK.
The Pipeline and its related resources are copied over.
The output frame of the "from" Component appears purple, and is joined to the input frame of the "to" Component, usually green. The line between them indicating their connection has a ring in its center; this ring represents the link between the Components. Selecting the link displays the Field Linker in the slider. The Field Linker displays output fields in a column alongside the input fields of the downstream Component.
The Field Linker automatically suggests field-to-field mappings when field names are the same (for example, the "Address" output field of the upstream Component is automatically linked to the "Address" input field of the downstream Component). You do not have to accept these suggestions. Change individual links by clicking the output field in question and selecting an alternative from the drop-down list.
Note: If you do not want automatic Smart Link suggestions, disable the option in
the Pipeline Options of your Pipeline Settings.
When the Field Linker finds no obvious correspondence between fields, it displays only a drop-down list in the output column, from which you can select the appropriate field. For example, perhaps SnapLogic did not suggest an output field corresponding to the "Work_Phone" input. When you click Select Field next to "Work_Phone," the other Component's "Phone_w" field can be highlighted as the appropriate output. This link is manual, as opposed to the automatic suggestions provided by SnapLogic.
On the right of the input column is a set of tools. The Field Linker's tools include:
- Smart Link: Use this command to generate field linking suggestions. The Field Linker automatically suggests field-to-field mappings when field names are the same (for example, the "Address" output field of a CSV Read Component is automatically linked to the "Address" input field of the CSV Write Component). By default, these suggestions already display when the Field Linker first opens. However, if you have changed this default behavior in your settings, or if you have cleared the auto-generated suggestions, click Smart Link to regenerate them. A confirmation prompt warns you that any fields you have manually linked will be unlinked should you continue.
- Clear All: Use this command to clear all field linkings, manual and auto-generated alike. A confirmation prompt warns you that any fields you have manually linked will be unlinked should you continue. The result is that all of the rows in the output Component column prompt you to select a field, as shown in the figure "Field Linker: Clear All Command".
- Null All: Use this command to set all output fields to NULL. This means that the fields in question provide no output values for the downstream Component. A confirmation prompt warns you that any fields you have manually linked will be unlinked should you continue. The result is that all of the rows in the output Component column display "NULL."
- Null Remaining: Use this command to set any unlinked fields to NULL. All fields that have been auto-linked or that you have linked manually retain their links, while any fields that remain unconnected provide no output values for the downstream Component. The result is that all unconnected fields in the output Component column display "NULL."
Pipelines fall into two categories:
- Pipelines that are self-contained, with no inputs or outputs.
- Pipelines that have inputs, outputs, or both.
A Data Service Pipeline is a special form of the second category: a Pipeline that has only one output view and no input views. SnapLogic automatically makes the output of a Data Service Pipeline available as a data service endpoint. Data Service Pipelines provide data in response to an HTTP GET request. These data service endpoints provide a simplified interface to SnapLogic data streams and are easily consumed by any programming language that supports a basic HTTP library. In particular, they are very Ajax-friendly, and can be readily consumed in JavaScript using XMLHttpRequest().
The following figure is an example of a Data Service Pipeline defined in Designer. In this example, C2's output C2_Output1 has been assigned to the Pipeline output P_Output1. This assignment is performed when adding an output to the Pipeline in the SnapLogic Designer.
Note that the URI of the service endpoint is preceded by the /feed token and appended with the output name of the Pipeline. The /feed prefix is necessary to indicate to the server that a GET to this URI triggers Pipeline execution. The output name suffix specifies the output you are requesting.
You can pass parameters to the Pipeline in the HTTP URI, as follows:
http://host:port/feed/P/P_Output1?P_PARAM1=value&P_PARAM2=someothervalue
The representation of data from a data service endpoint is negotiated between the application and the SnapLogic Server. You can explicitly request a certain representation by specifying the sn.content_type parameter appended to the URI, as in the following example:
http://host:port/feed/P/P_Output1?sn.content_type=text/html
Refer to the Output Representation Formats section for a list of the supported representations.
Components that produce structured or user-defined output, such as the HTML Formatter and XML Write Components, output data in a representation that cannot be modified in a meaningful way by specifying the sn.content_type parameter. Refer to the following diagram.
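Because a data service endpoint needs only a basic HTTP library, invoking one can be sketched in a few lines of Python. The host, port, Pipeline name, and parameter name below are placeholders taken from the examples above, not a live server:

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder values -- substitute your own server and Pipeline details.
base = "http://host:port/feed/P/P_Output1"
params = {"P_PARAM1": "value", "sn.content_type": "text/html"}
url = base + "?" + urlencode(params)
print(url)

# Against a live SnapLogic Server, a GET triggers Pipeline execution:
# with urlopen(url) as response:
#     print(response.read().decode())
```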
Follow these instructions to configure error handling for the sample Pipeline shown in the diagram of a Pipeline with an Error Output:
1. Create a new Pipeline, starting with a CSV Read Component. Add two CSV Write Components.
2. In the CSV Read Component Properties, click the View/Edit link for the Error Handling property. If you are in tree view, you can expand the Error Handling folder and edit the properties there.
3. Specify your error handling preferences in the Error Handling Options dialog.
Error Handling Options address both data errors and connectivity problems. The first option addresses data errors; the rest address connection failures:
- Retry strategy: There are two retry strategies:
  - Exponential backoff: With the exponential backoff strategy, the previous wait time is doubled between retries. The initial interval is specified in the Wait time before retrying option. For example, if Wait time before retrying is set to 15 seconds, this strategy waits 15 seconds before the first retry, 30 seconds before the second retry, 60 seconds before the third retry, and continues to double the interval until it exceeds either the Maximum wait between retries, the Maximum number of retries, or the Timeout in seconds.
- Timeout in seconds: Time limit on how long the Component keeps retrying the connection in the case of a network connection failure. This value covers the period starting when SnapLogic initiates the first connection attempt; attempts continue until this timeout is reached.
4. In the CSV Read Component Output screen, click the create error outputs link. SnapLogic presents field suggestions. You can accept them by clicking Apply All Changes, or reject the suggestions by clicking Close. The Component Output screen displays the new Error Output you have added.
5. Link the CSV Read Component to the CSV Write Component you have designated to catch the bad data, as shown in the diagram at the beginning of this section. Because the CSV Read Component has two outputs, a Select Outputs to Link dialog prompts you to specify the output that provides data to the error file. Select your Error output.
When you execute the Pipeline, intermittent connection problems or data errors do not have to prevent its successful execution. If the Pipeline encounters network connection problems, automatic retries often remove the obstacle without causing a failure. If the Pipeline contains a subset of bad records, they can be written to an error file without preventing the successful records from loading. You can inspect the error file and address the erroneous records separately.
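The exponential backoff strategy from step 3 can be sketched as follows. The option names mirror the Error Handling Options dialog; the loop itself is a hypothetical illustration, not SnapLogic's retry code:

```python
def retry_intervals(wait, max_wait, max_retries, timeout):
    """Yield the wait (in seconds) before each retry, doubling each
    time, until any of the configured limits would be exceeded."""
    elapsed, interval = 0, wait
    for _ in range(max_retries):
        if interval > max_wait or elapsed + interval > timeout:
            break
        yield interval
        elapsed += interval
        interval *= 2

# Wait time before retrying = 15 s: retries wait 15 s, 30 s, 60 s, ...
print(list(retry_intervals(wait=15, max_wait=120, max_retries=5, timeout=300)))
# → [15, 30, 60, 120]
```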
Executing Pipelines
You can execute Pipelines using several different interfaces, starting with the Designer. Using Designer is easy and intuitive, but it can be impractical if you prefer to execute Pipelines on a periodic schedule, or trigger them from other applications. For this reason, SnapLogic provides other interfaces. Any of the remaining methods can be invoked from cron or other scheduling software.
- SnapLogic Designer
- Management Console
- SnapAdmin Utility
- SnAPI
- HTTP directly, for Pipelines that have mapped output views; that is, for Data Service Pipelines.
If you have configured your servers to run in a cluster, the process of executing Pipelines in a
cluster environment is identical to executing them on an independent server. If you submit a
Pipeline execution request to a cluster, it is performed by the cluster; if you submit a request
to an independent server, it is performed by an individual server.
1. Open the desired Pipeline on the canvas, and click Run. The slider displays Pipeline Properties.
2. Go to the Run tab.
The Run tab has the following sub-tabs:
- Run: Use this tab to specify runtime parameters by entering values directly into the Value column, select a data tracing option if desired (refer to the "Tracing Data to Debug Pipeline Execution" section for details), and execute your Pipeline by clicking the Run button.
- Preview Data: Use this tab to preview data for Pipelines with outputs defined at the Pipeline level. Pipelines that do not produce outputs at the Pipeline level do not support data preview. For Pipelines that do have defined outputs and support data preview, the preview feature is identical to the preview feature at the Component level.
- Runtime Information: Use this tab to view Pipeline status updates and execution logs. The figure "Pipeline Runtime Information" displays a list of completed Pipeline runs. You can view the log and statistics for each run by clicking the link provided in its status.
Clicking the View log/statistics link provided with a Pipeline's status update displays the information for that run in a separate tab, Run1, next to the Runtime Information tab. In the Run1 tab, the log file displays first. Use the Less and Full buttons to toggle between summary and detailed views of the log. You can also use the Print and Copy icons to send the file content to a printer or to your clipboard, respectively.
Access the Pipeline statistics report by clicking Stats. A sample of the statistics screen is shown in the figure "Pipeline Run Stats."
The Pipeline statistics report includes the following sections:
- Summary: This section reports the execution time, number of Components, total number of inputs, and total number of outputs in the Pipeline.
- Breakdown By Component: This section reports the execution time of each Component, its number of inputs, and its number of outputs.
- Record Breakdown: This graph plots the number of records extracted and loaded between Components, with Components listed along the Y axis and Record Count along the X axis.
Aborting a Pipeline
As of 3.7, you can abort a Pipeline from within Designer by clicking the abort link on the Run
status of that Pipeline.
Select the Pipeline you want to execute by using the checkbox in the left column of any of these screens and click Run in the upper right corner. (The Pipelines screen is the only exception: it displays only one Pipeline at a time, so there is no need to specify which Pipeline you want to run. Simply click Run in that screen.)
For more information, refer to the Management Console section.
The /feed prefix tells SnapLogic that you wish to invoke the specified Pipeline and read from the output view you specified, in this case Output001. The argument sn.content_type=text/html tells SnapLogic which output representation to use. You can specify Pipeline parameters in the request using standard HTTP syntax. Refer to this example of a request that specifies a value for the CENSUS parameter of the CensusFeed Pipeline:
http://servername:443/feed/SnapLogic/User/Exercise_3/CensusFeed/Output001?CENSUS=file://tutorial/data/alt_census.csv&sn.content_type=text/html
You can invoke Pipelines by this method from cron or other scheduling software.
To view execution logs for Pipelines run by this method, open the Management Console by entering its URI into your browser address bar: http://<hostname>:<port>/console.
In a clustered environment, the Scheduler ensures that only one instance of a scheduled Pipeline is executed at a time in a cluster. The head node checks whether the same Pipeline is waiting to run or is already running before scheduling a job. If it is already running, the scheduled job is skipped.
Use the "Scheduled Events For Selected Pipeline" screen to add, modify, delete, and execute
Pipeline schedules as follows:
- New: Click New to add a new Scheduler entry for the Pipeline. The Scheduler Properties screens display. Complete the properties pages as described in the "Scheduler Properties" section.
- Edit: Select a Scheduler entry on the screen and click Edit to change the runtime specifications for the entry. The Scheduler Properties screen displays. Modify the properties pages as described in the "Scheduler Properties" section.
- Delete: Select a Scheduler entry on the screen and click Delete to remove it.
- Close: Click Close to leave the Scheduler.
Scheduler Properties
Access the Scheduler Properties pages by clicking the Scheduler button in the General page of Pipeline Properties, and then clicking New to create an entry, or Edit to modify one.
Scheduler Properties consists of several pages, through which you can navigate using the sidebar menu on the left:
- General: The General properties page is shown in the figure "Scheduler Properties: General Page." It includes the descriptive name of the Pipeline to run and the Pipeline's URI.
- Schedule: The Schedule page is where you specify the Pipeline's run dates and times. You can select the desired attributes in each scheduling column and view the resulting summary description in the Summary field.
You can specify the Month, Day, Date, Hour, and Minute on which the Pipeline should execute. For example, you can schedule a Pipeline to run weekly at 3:00 a.m. on Tuesdays.
If necessary, select multiple non-consecutive entries in any column by holding the Ctrl
key while highlighting multiple fields.
Select a range of consecutive entries by holding the Shift key.
To run a Pipeline once an hour, select the Minute at which you want the Pipeline to execute; for example, selecting :00 runs the Pipeline at the top of the hour, whereas selecting :15 runs the Pipeline at quarter past the hour.
If you want to run the Pipeline every 15 minutes, make multiple selections in the Minute column by holding down the Ctrl key and highlighting :00, :15, :30, and :45.
- Parameters: The Parameters page displays runtime parameters defined for this Pipeline. The values you enter here are used at the scheduled runtime.
- Exclusions: The Exclusions page enables you to specify exceptions to the execution schedule you defined. The Exclusions page works exactly like the Schedule page, but the settings you select here specify times not to run the Pipeline. Toggle the Exclusions enabled check box to suspend or enforce your exception.
- Notifications: The Notifications page enables you to receive notifications of execution successes and failures. You can receive notifications by way of additions to a specified text file, or through email. Support for email notifications is enabled only if your SnapLogic Server has been configured with information about your outgoing SMTP server. The default snapserver.conf file is configured for logging only.
1. Edit the snapserver.conf file.
2. Uncomment and specify the appropriate values for the [[email]] section, as shown in
the following example:
# Notifications
[notification]
# [[email]]
# smtp_server = smtp.gmail.com:587
# smtp_use_tls = yes
# smtp_login = some_account@gmail.com
# smtp_password = some_password
# to = some_target_email
# from = some_account@gmail.com
# subject_prefix = NOTIFICATION
# success_template = email_success_notification.tmpl
# failure_template = email_failure_notification.tmpl
[[file_write]]
filename = notification.txt
root_directory = $SNAP_HOME/../logs
# success_template = file_success_notification.tmpl
# failure_template = file_failure_notification.tmpl
3. Restart the SnapLogic Server and reconnect to the Server in the Designer.
Notification Templates
Templates have been provided that allow you to send custom notifications for successes and
failures, as defined by the success_template and failure_template settings in the Noti-
fication section of the snapserver.conf file. The parameters available for use in these tem-
plates include:
l $name : Name of the Pipeline execution event
l $uri : Pipeline URI
l $status : Result of the Pipeline execution: 'Completed' or 'Failed'
l $hostname : Hostname of the server that ran the Pipeline
l $start_time : Time the Pipeline run began
l $end_time : Time the Pipeline execution ended
l $status_uri : URI to get status information on this run
l $log_uri : URI to get log information related to this run
l $err_msg : If status is 'Failed', this contains the error message
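These substitution parameters use $-style placeholders. As a minimal sketch only — assuming Python string.Template-compatible substitution, which is an assumption rather than documented product behavior, and with all values invented for illustration — a failure template could be rendered like this:

```python
from string import Template

# Hypothetical failure-notification template body using the documented parameters
failure_tmpl = Template(
    "Pipeline $name ($uri) $status on $hostname\n"
    "started: $start_time, ended: $end_time\n"
    "error: $err_msg\n"
    "status: $status_uri, log: $log_uri\n"
)

# Example values are made up for demonstration
message = failure_tmpl.substitute(
    name="nightly_load",
    uri="/SnapLogic/pipelines/nightly_load",
    status="Failed",
    hostname="snap05",
    start_time="2013-04-01 03:00:00",
    end_time="2013-04-01 03:02:17",
    err_msg="connection refused",
    status_uri="http://snap05:8088/status/42",
    log_uri="http://snap05:8088/log/42",
)
print(message)
```

The real .tmpl files shipped with the server (email_failure_notification.tmpl and friends) define the actual format.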
Data Tracing
Pipeline data can be written out to trace files using a comma-separated format, with each
record terminated by a newline. The data traced can include:
l Component inputs: All Component inputs in the Pipeline dump their data into trace files.
l Component outputs: All Component outputs in the Pipeline dump their data into trace
files.
l Component inputs and outputs: All Component inputs and outputs are traced.
You can turn data tracing on and off within the Designer, in SnAPI, or through the SnapLogic
Server Configuration file.
Follow these instructions to enable data tracing from the Designer:
1. With your Pipeline open in the Canvas, click Run from the canvas toolbar.
2. Select the Run menu in the slider's Pipeline Properties. From this screen, select the Run
tab.
3. Set the parameters and make other adjustments to your Pipeline as required.
4. Click the down arrow on the Run button that resides under the list of Parameters to display a drop-down list of data tracing options.
5. Select a tracing option from the Run drop list. The options are:
l Run, No Trace: This is the default setting. No trace files are created.
l Trace INPUT: Use this setting to force all Component inputs in the Pipeline to
dump their data into trace files.
l Trace OUTPUT: Use this setting to force all Component outputs in the Pipeline to
dump their data into trace files.
l Trace ALL: Use this setting to force all Component inputs and outputs to be
traced.
6. Click the Run button.
7. Examine the trace files, as described in the Data Trace Files section.
In the snapserver.conf file, set the trace_data parameter to input, output, or input,output.
Note that data tracing settings you specify when executing a Pipeline in Designer or SnAPI
override any settings in the snapserver.conf file.
Refer to the Configuring SnapLogic Server section for more information on the configuration
file.
Trace file names are composed of:
l the Component's name
l the input or output name
l a .in or a .out suffix denoting whether the trace contains input or output data, respectively
For example, a CSV Read Component called "FileReader," used in a Pipeline whose output is
named "Output1," results in the trace file name: FileReader.output1.out.
If the Pipeline has a DB Write Component named "DataWriter," whose input is named "Input1,
" then the resulting file name is: DataWriter.input1.in.
If the Pipeline itself resides in another Pipeline named "Pipe1" and is then executed, the
names of the files are: Pipe1.FileReader.output1.out and Pipe1.DataWriter.input1.in.
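The naming convention above can be summarized in a small helper. This function is illustrative only — it is not part of the SnapLogic product, just a restatement of the documented pattern:

```python
def trace_file_name(component, endpoint, direction, parent_pipeline=None):
    """Compose a trace file name per the documented convention:
    [ParentPipeline.]Component.endpoint.{in|out}
    """
    suffix = {"input": "in", "output": "out"}[direction]
    parts = [component, endpoint.lower(), suffix]
    if parent_pipeline:
        parts.insert(0, parent_pipeline)
    return ".".join(parts)

print(trace_file_name("FileReader", "Output1", "output"))         # FileReader.output1.out
print(trace_file_name("DataWriter", "Input1", "input", "Pipe1"))  # Pipe1.DataWriter.input1.in
```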
6
Snaps
A Snap is a SnapLogic software package that adds to the functionality and connectivity pro-
vided by the SnapLogic Server. For example, a Snap may add connectivity to SalesForce, or
add functionality such as Data Cleansing. Snaps are typically purchased from the SnapStore
and installed on a SnapLogic server via Designer. A Snap usually installs one or more of the
following SnapLogic objects:
l A collection of Component templates that are functionally related, such as the Sales-
force Snap.
l A Wizard that helps create Components from component templates.
l A collection of Components and Pipelines.
A Snap can perform as simple a task as to read data from a file, or as complex an operation
as to connect to an instance of Microsoft Dynamics CRM, analyze the source data, and provide
full access (data and functionality) to all standard and custom objects within Microsoft®
Dynamics CRM. When changes are made to standard or custom objects, the Snap adapts and
provides you with access that takes those changes into account.
A Snap in SnapLogic is comparable to a smart phone app, a browser add-on, or an application
plug-in. A Snap can perform a simple task, such as reading data from a file, or a more
involved grouping of tasks, such as adding a comprehensive set of Insert, Update, Delete,
Upsert, and Search capabilities to all Microsoft Dynamics CRM objects. You can build your own
Snap, or download Snaps built by the SnapStore community.
Accessing Snaps
Snaps are add-on solutions that can be downloaded and installed to enhance the functionality
of your SnapLogic Data Server. Snaps can be contributed by the SnapLogic Community to
solve a specific integration problem. A Snap may include new Components and Pipelines.
The SnapLogic Server comes prepackaged with a library of commonly used Components.
These prepackaged Components are at your disposal as soon as you install SnapLogic. They
include:
l field-level operations: for example: arithmetic, string, dates, and type conversion
l complex operations: for example: join, lookup, filter, and sort
l advanced operations: for example: compute, regex, and DB analytics
Additionally, SnapStore, SnapLogic's online marketplace, serves many of your specific inte-
gration needs by providing additional, specialized Snaps. Some of the Snaps in the SnapStore
are free of charge; others must be purchased.
You can access SnapStore directly from the Designer in the following ways:
l In the Designer menu bar, click SnapStore.
l When searching for a Snap in the Foundry, if you cannot locate your desired Snap, you
can click a link that takes you to the SnapStore.
To download a Snap from the SnapStore, browse the product listing using categories, tags, or
searches until you find the Snaps you want. Shortly after you complete the checkout and pay-
ment process, you receive two emails: one to confirm your purchase; the other containing
any download links to your Snaps. SnapLogic must be installed before you can install your
newly purchased Snap.
Installing Snaps
You can integrate with the SnapStore (that is, install Snaps purchased from the SnapStore)
directly from the Designer. After you select a Snap from the SnapStore, you receive
an email containing a URI to access the Snap. You can, but are not required to, download the
Snap to a temporary directory on your system. When you install the Snap, you are prompted
for either the URI you received (if you have not yet downloaded the Snap), or for the Snap's
location (if you have already downloaded it).
Follow these instructions to install a Snap purchased from the SnapStore:
l At the installation screen, you can specify the URI you received in your SnapStore email
when you selected the Snap, and then select the Component template from the drop-
down list. If a subfolder specified in the URI name does not already exist, it will be
created automatically. Or, if you have already downloaded the Snap, click Browse to
locate it on your system.
l If you have enabled sandboxing, which creates a security sandbox for Java Snaps by lim-
iting access to resources such as network destinations, file system locations, and execut-
ing processes, then SnapLogic prompts you for permission to use each Component that
the Snap requires. If you deny access to any Component, the Snap installation is can-
celed. (For more information on this feature, refer to "Sandboxing to Protect Your Snap-
Logic Environment.")
The installer decompresses the required files and installs them, prompting you when the instal-
lation is complete. After the Snap installation is complete, the Foundry displays a new cat-
egory tab with the Snap's name. This tab contains the available Component templates in the
Snap.
Configuring Snaps
Configuring a Snap is also a simple process from within the Designer. For your new Snap, the
Foundry contains a folder with the Component templates available for the Snap. For more
complex Snaps, a wizard is also provided. You can either use the wizard, or edit the Com-
ponents directly in the Designer, referring to the "Components" chapter for instructions. Snap-
Logic strongly recommends using the wizard when it is available. A wizard is intentionally
provided for any Snap whose complexity makes editing the Components directly an involved
process. The wizard guides you through a series of questions that vary with each Snap, and
that dramatically simplify the configuration process.
Follow these instructions to configure a Snap using the SnapMaker wizard:
l In the Foundry, click on the Category tab displaying the Snap's name and expand it.
l Launch the Wizard within the Snap's folder. The SnapMaker screen appears.
l Enter the Snap-related information requested in the SnapMaker screens. The wizard
begins by collecting source connectivity information.
l Select which records to generate from the Snap if you need only specific records; other-
wise, select all records.
l Review the Summary screen and click Finish.
The SnapMaker wizard displays status messages as it completes the configuration. When the
wizard finishes, Components for your configured Snap appear in the Library.
Developing Snaps
Developing Snaps is a straightforward process. Although Snaps vary in complexity and func-
tionality, all Snaps follow the same pattern, use the same APIs, and take advantage of the
common functions provided by the SnapLogic platform. However, developing Snaps is not the
same as using SnapLogic. As a SnapLogic user, you create Components, and assemble Pipe-
lines from existing Component templates and the Components you have already created. You
focus on data flow and manipulation instead of finer details. By contrast, as a Snap developer,
you approach your task from the perspective of how to create Components that others can
use, instead of focusing on how to use existing capabilities.
Concepts to Grasp. A good Snap developer must first be comfortable with using SnapLogic,
and must understand these main elements of SnapLogic:
l Component templates and Components
l Pipelines
l Data Services
l Data types
Types of Snaps. There are two general categories of Snaps:
l Connectivity Snaps: Connectivity Snaps add connectivity to an application or data
source. The SalesForce.com, NetSuite, and SAP Snaps are examples of connectivity
Snaps. These Snaps normally include new Components that access the application API
and translate it to the SnapLogic API.
l Solution Snaps: Solution Snaps are higher-level Snaps that implement the business
logic for a specific integration scenario, such as "quote to bill" between a CRM system
and a financial system. Solution Snaps normally include Pipeline and Component def-
initions that implement business logic. They often also include connectivity Com-
ponents, or depend on connectivity Components provided by another Snap.
The Twitter Snap in the Snap Developer Tutorial is a connectivity Snap; it adds Twitter
read and write capabilities to SnapLogic, but does not provide any predefined Pipeline
logic to process Tweets for a specific purpose.
Parts of a Snap. A Snap is composed of three parts:
l Component templates
l Pipelines and Components
l An installation program
All Snaps include an installation program. Connectivity Snaps include primarily Com-
ponent templates and Components, while solution Snaps are oriented toward Pipelines
and Components. Because the Snap Developer Tutorial profiles a connectivity Snap, it
includes an installer, Component templates, and some Components to get its users
started.
Designing a Snap. Before you model a Snap, answer questions such as:
l Which application objects must the Snap expose as data?
l What data access does the Snap require?
l What data must the Snap read?
l What data must the Snap write or update?
l Which application functions may the Snap need to call?
l Which transformations are likely to be used on the data?
l Which utility functions from the application should the Snap expose?
The answers to these questions guide your creation of a model of the Snap. Some of its
capabilities may require new Components, while others can be addressed by creating Pipe-
lines or using existing Components. In general, connectivity Snaps include a reader Com-
ponent and a writer Component. They occasionally include one or two function Components.
For example, in CRM integration, the "convert" functionality is neither a data source nor tar-
get; rather, it is implemented as a transformation or utility function.
For more information about developing Snaps, visit SnapLogic's Snap Developer Doc-
umentation page.
7
Administration
This section of the document covers administration functionality of your SnapLogic envi-
ronment, such as:
l Starting and Stopping SnapLogic and Component Servers
l Configuring SnapLogic Server
l Configuring Authentication
l Enabling SSL
l Using the Management Console
l Sandboxing to Protect Your SnapLogic Environment
l Importing and Exporting Components
l Running SnapLogic Behind a Proxy
l Understanding SnapLogic Data Types and Output Representation Formats
l Using the SnapAdmin Utility
l Using the SnapLogic Sidekick
User names can contain only ASCII alphanumeric characters and must be lowercase.
Using SnapAdmin
1. Run SnapAdmin as described in "SnapAdmin Utility".
3. Set the credentials used for requests to the server to the default set for the admin, as:
credential set default admin. Enter the password for the admin when prompted to
do so.
l <username> is the name for the user
l <password> is the password.
5. Restart the SnapLogic Server for the changes to take effect.
Within Designer
You can create users within Designer provided you are not using an external authentication
method.
3. Type a name for the user and supply a password, then click OK.
Using SnapAdmin
1. Run SnapAdmin as described in "SnapAdmin Utility".
l <groupname> is the name of the group you are creating.
3. Restart the SnapLogic Server for the changes to take effect.
Within Designer
You can create groups within Designer provided you are not using an external authentication
method.
3. Type a name for the group and click OK.
Using SnapAdmin
1. Run SnapAdmin as described in "SnapAdmin Utility".
l <groupname> is the name of the group you are creating
l <username> is the name of the user you are adding to the group
3. Restart the SnapLogic Server for the changes to take effect.
Within Designer
You can add users to groups within Designer provided you are not using an external authen-
tication method.
3. Select the user, then click OK.
You can invoke the snapctl.sh script with start/stop/restart arguments.
l snapctl.sh start: Start the server processes.
l snapctl.sh stop: Stop the server processes.
l snapctl.sh restart: Stop and restart the server processes.
Note: Invoking this script with --admin_mode enforces single-user mode. It is rec-
ommended to run in that mode for installation and upgrade of Snaps to ensure con-
sistency.
You configure the SnapLogic Server through its configuration file, snapserver.conf. Some
options that must be enabled or disabled in the configuration file
also require additional configuration steps; these are broken out into separate topics within
this chapter.
l SnapLogic Server Configuration File: snapserver.conf
l General SnapLogic Server Settings
l Component Container Configuration Parameters
l Data Cache Configuration Parameters
l Notification Instructions
l Management Console Configuration
For information on clustering servers, see "Clustering Servers".
The snapserver.conf file is located at:
l Mac: /Applications/snaplogic/config/snapserver.conf
l Linux: /opt/snaplogic/config/snapserver.conf
l log_dir: The location of the logging directory.
Example: log_dir = /opt/snaplogic/logs.
l log_level: Sets the SnapLogic Server log level (possible values are ERR, INFO,
DEBUG). Note: the log level is set separately for the server and the Component Con-
tainers.
l server_hostname: The hostname for this server.
Example: server_hostname = hostname.
l server_proxy_uri: If the public hostname for this server differs from the one reported
by `hostname`, set server_proxy_uri to the external URI.
Example: server_proxy_uri = http://HOSTNAME:DATAPORT
l server_address: The address to which this server binds.
Example: server_address = 0.0.0.0.
l server_port, server_secure_port: The port numbers used by the server.
Example: server_port = 80 and server_secure_port = 443.
l server_secure_cert: Location of the server certificate.
Example: server_secure_cert = /opt/snaplogic/config/host.pem.
l server_secure_ignore = no: This setting tells SnapLogic to enable SSL. To disable
SSL, change the setting to yes.
Example: server_secure_ignore = yes.
l polling_interval: The interval, in seconds, at which a Pipeline receives status updates
regarding the Components inside it.
Example: polling_interval = 60.
l static_dir: Specifies the directory from which static content is served. All requests for
the /__snap__/__static__ URI space are served from within this directory. In the file,
SnapLogic calls this directory __static__, but you can assign any other name. If the
directory name starts with a forward slash (/), for example /tmp/static, it is
interpreted as an absolute path name.
Example: static_dir = /opt/snaplogic/static.
l state_dir: Directory to store state data needed across server restarts.
Example: state_dir = /opt/snaplogic/repository.
l explorer_uri: Location of the explorer (as a fully qualified URI).
Example: explorer_uri = http://Snap05:443/__snap__/static/designer/index.html?mode=explorer
l pipe_to_http_uri_prefix: Specifies the prefix that should be added to a resource URI
in order to have it executed via pipe_to_http.
Example: pipe_to_http_uri_prefix = /feed.
l auth_file_config, auth_file_passwords: If the auth_file option is present, then all
authentication information is read from the specified file. In the absence of this option,
SnapLogic Server does not start.
Examples: auth_file_config = "/opt/snaplogic/config/snapaccess.conf" and
auth_file_passwords = "/opt/snaplogic/config/passwords".
l license_file: Location of the license key file.
Example: license_file = "/opt/snaplogic/config/license.txt".
l allow_proxying_to: To enable the management console, comment out this line. For
tighter security, use a comma delimited list. For more information, refer to the "Con-
figuring the Management Console" section.
Example: allow_proxying_to = server1.somedomain.com:443, server2.another.domain:443.
l auth_plugin, auth_plugin_args, and proxy_auth_header: these properties are
used to implement custom authentication.
l LDAP_address: To enable LDAP authentication, uncomment this line and update it to
point to your LDAP URL. This option is no longer in the file by default, but is supported
for legacy reasons; the recommendation is to use auth_plugin and auth_plugin_args
instead.
l log_backup_count: Sets the maximum number of backups to create if log rotation has
been enabled. Each backed-up log is suffixed with .1, .2, and so on, up to the number of
backups specified. The default value is 5.
l max_log_size: Sets a maximum size that applies to each SnapLogic log file. By default
there is no limit on the size. When a file reaches the upper limit, its contents are rotated
to a backup log name (suffixed with .1, .2, and so on) as long as log_backup_count has
a nonzero value. The size can be specified in bytes, MB, or GB.
l client_token_cache_limit, client_token_timeout: If token-based authentication is
used by the client, client_token_cache_limit is the maximum number of client tokens
cached by the server (default 10000 entries), and client_token_timeout is the time a
token remains valid, in minutes (default 1440 minutes, that is, 24 hours). If the account
password is changed or the account is deleted, the next request with that token fails
with a 401 error. To renew a token, call the get_token API with the old token; this
returns a new token valid for another 24 hours. At most client_token_cache_limit
tokens are cached; beyond that, expired tokens or the oldest token are removed.
l max_job_limit: Sets a limit on the maximum number of jobs a data server will accept.
It can be used on a worker node and also on a standalone server. Setting it on a head
node has no effect, because job execution happens on workers. If one worker hits the
limit and fails, the job is tried on the other workers one by one; if it fails on all workers,
the job fails.
l log_level: Sets the Python Component Container log level. Possible values are CRIT-
ICAL, ERROR, WARNING, INFO, DEBUG.
l heartbeat_seconds: Server pings the CC every heartbeat_seconds (set to 0 to dis-
able).
l filesystem_root: Changes the filesystem root. By default, the root is the SnapLogic
install directory. For information on the use of this feature in clustered deployments, see
"Clustering Servers".
l disable_sandbox: This parameter controls sandboxing. Sandboxing enables you to run
Components in a restricted environment provided by the JVM. By default, sandboxing is
enabled; you can disable it by setting the disable_sandbox property to true. Refer to
the section on Sandboxing to Protect Your SnapLogic Environment for details.
l trace_data: To enable data tracing, add this line to the [component_container] sec-
tion of the snapserver.conf file, and set the parameter to either input, output, or
input,output, as follows: trace_data=input,output. To turn data tracing off, comment
out or remove the line from the file. The data tracing settings you specify when execut-
ing a Pipeline in Designer or SnAPI override this parameter in the configuration file.
Refer to the "Data Tracing via the SnapLogic Server Configuration File" section for
details.
l cc_resdef_cache_enabled and cc_resdef_cache_max_entries: control resdef
caching.
l cache_dir: The location of the cache directory.
l cache_timeout: The amount of time, in seconds, that data should be cached.
Example: cache_timeout = 300.
l cache_size: The maximum allowed size of the cache. You can specify the size in bytes
(for example, 10000), in kilobytes (use a KB/kb suffix, for example 10KB or 10kb), in
megabytes (MB/mb suffix), in gigabytes (GB/gb suffix), or in terabytes (TB/tb suffix).
Example: cache_size = 10MB.
When caching large data output (greater than 2 GB) from a single resource, verify that the
operating system can support large files. The Python interpreter might also need to be
compiled with special flags to handle large files (see
http://docs.python.org/lib/posix-large-files.html).
l high_water_mark: An integer percentage value (0 - 100) specifying the percentage of
the maximum size at which cache cleanup is initiated. The cache can temporarily
exceed the maximum size, so SnapLogic recommends specifying a value less than
100% to allow for these temporary overruns.
Example: high_water_mark = 90.
l low_water_mark: An integer percentage value (0 - 100) specifying the percentage of
the maximum size to which the cache is reduced once cleanup is initiated.
Example: low_water_mark = 60.
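To make the interaction of these cache settings concrete, here is an illustrative sketch (the helper is not part of the product; it merely mirrors the suffixes and percentages described above):

```python
def parse_cache_size(value):
    """Parse a cache_size setting such as '10MB' into bytes, per the
    documented suffixes (KB/kb, MB/mb, GB/gb, TB/tb); a bare number is bytes."""
    units = {"kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}
    v = value.strip().lower()
    for suffix, mult in units.items():
        if v.endswith(suffix):
            return int(v[: -len(suffix)]) * mult
    return int(v)

max_size = parse_cache_size("10MB")
high = max_size * 90 // 100  # high_water_mark = 90: cleanup starts here
low = max_size * 60 // 100   # low_water_mark = 60: cleanup reduces the cache to here
print(max_size, high, low)
```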
Repository Configuration
The parameters in the [repository] section of the snapserver.conf file describe your meta-
data repository.
l type: The type of database system hosting your metadata repository.
Example: type = sqlite.
l path: The location of your metadata repository database file.
Example: path = /opt/snaplogic/repository/repository.db.
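Put together, a [repository] section using the example values above reads:

```ini
[repository]
type = sqlite
path = /opt/snaplogic/repository/repository.db
```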
Notification Instructions
Configure the parameters in the [notification] section of the snapserver.conf file to
specify how the Scheduler notifies you about Pipeline execution. For details on configuring
these parameters, refer to the section on "Enabling Email Notifications".
Additional setup involves logging into the console on your primary server to add extra servers
as required. Refer to the section on "Configuring the Management Console" for detailed infor-
mation.
l node_type: Designates whether a node is a head or worker node. See "Clustering
Servers" for more information.
l head_node: Used on a worker node to designate the head node in the cluster.
l workers: Used on a head node of a cluster to list the worker nodes.
l jobs_per_worker: The maximum number of jobs a worker can run at a time. Set to -1
to disable it. Both jobs_per_worker and max_job_limit can be used together. In that
case, jobs_per_worker is applied on the head node (with queuing of jobs) and max_
job_limit is applied on the worker (with no queuing). See "Configuring Job Distribution
Across Workers" for more information.
Prerequisites:
l Private key used to create a certificate signing request (CSR)
l Signed SSL certificate
l CA bundle certificate
To replace the certificates:
1. (Optional) Backup existing keystore and certificate files: host.cert, host.jks,
host.pem.
2. Change the current directory:
cd /opt/snaplogic/config
3. Create a PKCS12 file using the CA signed certificate:
where:
l <BUNDLE_CERT> refers to the CA bundle certificate
l <PRIVATE_KEY> refers to the private key used to create the signed SSL certificate
CSR
l <SIGNED_CERT> refers to the signed SSL certificate
4. Convert the PKCS12 file created in Step 3 into a keystore:
5. Create the new server certificate:
cp <SIGNED_CERT> host.cert
where:
l <SIGNED_CERT> refers to the signed SSL certificate
6. Create a new PEM file:
where:
l <PRIVATE_KEY> refers to the private key used to create the signed SSL certificate
CSR
l <SIGNED_CERT> refers to the signed SSL certificate
7. Restart SnapLogic services:
/opt/snaplogic/bin/snapctl.sh restart
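The commands for Steps 3, 4, and 6 are not reproduced in this copy of the document. As an illustration only — these are assumed, typical openssl/keytool invocations, not the official procedure, and a throwaway self-signed key and certificate stand in for your CA-issued files — the sequence might look like:

```shell
set -e

# Stand-ins for the real <PRIVATE_KEY> and CA-signed <SIGNED_CERT>
openssl req -x509 -newkey rsa:2048 -nodes -keyout private.key \
    -out signed.cert -days 1 -subj "/CN=snaplogic-test"

# Step 3 (assumed): bundle the key and certificate into a PKCS12 file
openssl pkcs12 -export -in signed.cert -inkey private.key \
    -out host.p12 -name snaplogic -passout pass:changeit

# Step 4 (assumed): convert the PKCS12 file into a Java keystore.
# Requires a JDK; shown commented out here.
# keytool -importkeystore -srckeystore host.p12 -srcstoretype PKCS12 \
#     -destkeystore host.jks -srcstorepass changeit -deststorepass changeit

# Step 5: the signed certificate becomes the server certificate
cp signed.cert host.cert

# Step 6 (assumed): the PEM file is the private key and certificate concatenated
cat private.key signed.cert > host.pem
```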
Clustering Servers
Add a [cluster] section to your snapserver.conf file to run your SnapLogic Servers in a
cluster. Clustering is a method of improving performance by allowing multiple Pipeline
execution requests to proceed in parallel across a cluster of worker nodes. In clustering, data
servers are assigned the role of either head node or worker node. The head node maintains
a queue of job submissions. As jobs come in, they are scheduled to the worker nodes,
with an emphasis on load balancing.
A SnapLogic clustering configuration consists of a data server designated as the head node
and one or more data servers designated as worker nodes:
l Head node: Jobs get queued by the head node and then de-queued and distributed
when a worker node is available to run them. The head node interacts directly with Snap-
Logic Designer and SnAPI, and provides cluster execution and reporting data to the man-
agement console and to Designer.
New Snaps are installed in the head node, which automatically repeats the Snap instal-
lation on each of the worker nodes, provided that the worker nodes are online during
installation.
l Worker nodes: The worker data servers accept pipeline execution requests and report
completion back to the head node. The worker nodes do not access the repository
directly; they point to the head node repository to retrieve Component and Pipeline
parameter values. The status of Pipeline jobs on worker nodes can be polled using a
status interface. Worker nodes report job completion to the head node.
If you submit a Pipeline execution request to a cluster, it is performed by the cluster; if you
submit a request to an independent server, it is performed by the individual server. The proc-
ess of executing Pipelines in a cluster environment is identical to executing them on an inde-
pendent data server. Follow the process described in the "Executing Pipelines" section
regardless of your server environment.
Configuring Clusters
To establish a SnapLogic cluster, designate a data server as a head node and one or more
data servers as worker nodes by modifying the SnapLogic Server Configuration file (snap-
server.conf) on each node. Follow these instructions to configure your cluster:
l Identify your head node and worker nodes, and note the URI or IP address of each node
(for example, http://head:443, http://worker1:443, http://worker2:443).
l Install SnapLogic onto each node. The nodes in a cluster configuration do not have to
run the same operating system.
l If not using LDAP authentication, ensure the SnapLogic passwords are synced up on all
machines of the cluster. One way to do this is to copy the passwords, cc1_creds and
cc2_creds files from the config directory of the head node to all the worker nodes. All
accounts (including user accounts created after initial configuration/deployment) need
to be synchronized between all cluster nodes. In addition, ACLs must be synchronized
across all nodes if custom ACLs are implemented.
l Modify the SnapLogic Server Configuration file (snapserver.conf) on each node as fol-
lows:
To designate a data server as the cluster head node, add the following [cluster] sec-
tion to its snapserver.conf file:
[cluster]
node_type = head
workers = http://worker1:443, http://worker2:443
To designate a data server as a cluster worker node, add the following [cluster]
section to its snapserver.conf file:
[cluster]
node_type = worker
head_node = http://head:443
l Restart the SnapLogic Servers on the head node and all the worker nodes.
Test the cluster by using the Designer to run a pipeline or using the test interface on the head
node: http://head:443/__snap__/__static__/console/cluster-info.html. (You can
also navigate to this page from the head node data server page: http://head:8088.) Go to
server_cluster, and then to cluster-info.html. From this page, you can submit Pipeline
jobs to the cluster.
Job execution requests which are directed to the head node are load balanced across the work-
ers. If a job execution request is sent directly to a worker, the job runs on the same worker
and does not apply towards the worker's job limit. If a Pipeline has an Execute Resource
Component, the target Pipeline should be changed to point to the head node to ensure that the jobs
spawned by the Execute Resource are load balanced across the cluster. The status of the job
queue can be monitored by using the URI http://head:8088/__snap__/cluster/info.htm.
This shows the queued jobs, the running jobs and the last few completed jobs.
To add a worker node to the cluster, the new worker node has to be configured appropriately
with the right credentials and [cluster] section. The required Snaps have to be installed on
the new worker by connecting the Designer directly to the worker node. Then the new worker
should be added to the head node's [cluster] section and the servers restarted. See the
Worker section of "SnapAdmin Commands" for information on how to add or delete worker
nodes.
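As a sketch — with a hypothetical third worker, worker3 — the configuration changes look like this:

```ini
# Head node snapserver.conf: append the new worker to the list
[cluster]
node_type = head
workers = http://worker1:443, http://worker2:443, http://worker3:443
```

```ini
# New worker3 snapserver.conf: point it at the head node
[cluster]
node_type = worker
head_node = http://head:443
```

After editing both files, restart the servers as described above.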
Input/Output Files
Pipeline execution requests sent to the head node are forwarded to one of the available worker nodes. The job executes on the worker node, so any input files it reads and any output files it creates live on the worker node where the job ran. To ensure that an output file can be read without knowing which node the job ran on, create a shared filesystem mount and perform all filesystem operations on the shared mount directory. One way to do this is to change the filesystem_root property to point to the shared mount directory. The filesystem browse option in the Designer shows the files on the head node by default. Using a shared mount point ensures that the files created as Pipeline output can be used for Suggest and previewed using the filesystem browser.
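For example (the mount path /mnt/snapshared is an assumption for illustration), the property in snapserver.conf on the head node and on every worker could point at the same shared mount:

```
# snapserver.conf on every node; /mnt/snapshared is a shared (e.g., NFS) mount
filesystem_root = /mnt/snapshared
```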
Troubleshooting
- Since Pipeline execution happens on one of the worker nodes, the execution logs are created on the worker node. If a Component has to be debugged, the debugger should be connected to the worker node.
- Previewing the output of a Pipeline or a Component from the Designer, or reading the output of a SnAPI program, requires the client to be able to talk to the worker node. If firewall rules prevent the client from talking directly to the worker nodes, these operations will fail; change the firewall rules appropriately to fix this.
- The jobs_per_worker property can be used to increase the level of parallelism by increasing the number of jobs that run on a worker. Setting this to a very high value can cause excessive memory usage on the worker node, possibly leading to out-of-memory errors.
- A SnapLogic installation uses self-signed certificates by default. If the Designer uses an SSL connection to the head node, then for Pipeline or Component preview to work, the browser must trust the self-signed certificates used by the worker nodes. To do this, open each of the nodes' SSL URIs (https://head:8091, https://worker1:8091, https://worker2:8091) from the browser and accept the prompt that asks whether the certificate should be trusted.
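As an illustration of the jobs_per_worker setting mentioned above (the value of 4 and the property's placement in the head node's [cluster] section are assumptions, not confirmed by this guide):

```
# snapserver.conf on the head node (placement assumed)
[cluster]
node_type = head
workers = http://worker1:443, http://worker2:443
# Allow up to 4 concurrent jobs on each worker; too high a value risks
# excessive memory usage on the worker nodes.
jobs_per_worker = 4
```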
Note: Installing and configuring keepalived is operating system dependent.
Depending on how the network is configured, assigning a virtual IP might require
changes in the network configuration. If virtual machines are being used, then the
VM settings might have to be changed to enable virtual IP addresses.
The following assumes that the master head node is named headmaster, the backup head is
headbackup, headvirtual is the name for the virtual IP address and worker1 and worker2 are
the worker nodes. The steps to configure the failover are:
1. Install keepalived on both the head nodes. keepalived has installables and keepalived
docs has the user guide.
A sample keepalived.conf file for the headmaster is
global_defs {
router_id my_router
}
vrrp_instance app_master {
state MASTER
interface eth1
virtual_router_id 36
priority 150
advert_int 1
preempt
garp_master_delay 5
virtual_ipaddress {
60.57.189.127 dev eth0
}
}
A sample keepalived.conf file for the headbackup is
global_defs {
router_id my_router
}
vrrp_instance app_master {
state BACKUP
interface eth1
virtual_router_id 36
priority 100
advert_int 1
preempt
garp_master_delay 5
virtual_ipaddress {
60.57.189.127 dev eth0
}
}
2. Install SnapLogic as usual on the cluster machines. Configure the cluster as documented in the Clustering Servers section. On the master and backup head nodes, update all references to the local hostname to headvirtual. On the worker nodes, also update the head_node property to point to headvirtual. On the master head node, add a backup_head_node property pointing to the backup head node:
[cluster]
node_type = head
workers = http://worker1:443, http://worker2:443
backup_head_node = http://headbackup:443
3. Restart keepalived and SnapLogic on all the machines. Change the Designer to point to
http://headvirtual:443.
4. Test the failover by shutting down the headmaster machine and trying to run jobs on the
cluster. The client requests should go to the headbackup and the jobs should run suc-
cessfully.
If failover is configured by some other means, such as HTTP failover or DNS failover, the SnapLogic configuration changes remain the same as described above: the master head node gets a new backup_head_node entry, and all configuration entries pointing to the head node use the virtual name. Because the head node maintains a job queue and keeps track of job distribution, multiple head nodes cannot be enabled at the same time; if HTTP failover is used, only one of the head nodes should be enabled at a time.
Repository Synchronization
In a failover configuration, both the master and backup head nodes maintain a copy of the SnapLogic repository database. Any repository changes are replicated from the master to the backup. If the backup head node is offline for some time, its repository can go out of sync with the repository on the master. To resynchronize, copy the repository database from the master head node onto the backup head node. If the master machine goes down and repository changes are made on the backup head node, the repository from the backup head node has to be synced back to the master repository after the master node comes back up.
During a Pipeline execution, any output files created by the Pipeline are created on the machine that ran the Pipeline. A shared filesystem can be used to ensure that input and output files are available on every machine. The filesystem_root setting controls the location of the input and output files used by the Components. The log file and trace data for each Pipeline execution are also available only on the instance that ran the Pipeline.
Configuration
Each SnapLogic instance is installed using the installer. The user accounts on each instance need to be synced up. One way to do this is to:
1. Copy the passwords, cc1_creds, and cc2_creds files from the /config directory of the first instance to the /config directory of all other instances.
2. Add the failover-related entries to the snapserver.conf file.
Two entries need to be added to the [main] section of snapserver.conf. The first is backup_servers, a comma-separated list of the backup instances; the second, failover_proxy_uri, is the URI of the load balancer. For example, if the two instances are running at https://instance1.mydomain.com:443 and https://instance2.mydomain.com:443 and the load balancer is running at https://loadbalancer.mydomain.com:443, then the entries on instance1 are:
backup_servers=https://instance2.mydomain.com:443
failover_proxy_uri=https://loadbalancer.mydomain.com:443
The entries on instance2 are:
backup_servers=https://instance1.mydomain.com:443
failover_proxy_uri=https://loadbalancer.mydomain.com:443
3. Install and configure the load balancer in front of the SnapLogic instances. The load balancer configuration details depend on the type of load balancer being used.
Sidekick Configuration
Each of the instances can be configured with a Sidekick. In this configuration there are two SnapLogic Servers and two Sidekicks, one for each server, with a single load balancer in front of the two SnapLogic Servers. The steps to configure the servers are similar to those above. Since the user accounts need to be synced up between the SnapLogic Servers, ensure that each Sidekick receives the updated server credentials. The sequence of steps is:
- Install SnapLogic on the two server instances. Sync the user credentials by copying the passwords, cc1_creds, and cc2_creds files from the config directory of the first instance to the config directory of the second instance.
- Run /opt/snaplogic/bin/snaplogic_sidekick_generate_config.sh on both instances to ensure that the updated credentials are available to the Sidekick instances for download.
- Install the Sidekick machines, and run /opt/snaplogic/bin/snaplogic_sidekick_download_config.sh to download the Sidekick configs.
- Start all the instances and enable the Sidekick for each instance individually. Each instance should be functional with its Sidekick.
- Add the backup_servers and failover_proxy_uri entries to the server instances. Install and configure the load balancer and restart the servers.
Troubleshooting
- If the load balancer uses self-signed certificates, the load balancer's SSL certificate needs to be added to the trust keystore of the Java CC on all the SnapLogic instances. This can be done using the command
$SNAP_HOME/pkg/java/jre1.6.0_20/bin/keytool -importcert -alias proxy -file /etc/ssl/certs/myssl.crt -keystore $SNAP_HOME/../config/host.jks -storepass changeit
where $SNAP_HOME is the install location of the SnapLogic version and /etc/ssl/certs/myssl.crt is the certificate being used by the load balancer.
- Configuration changes are not currently synced between the instances, so changes such as creating new user accounts need to be made on each instance separately.
- In a failover configuration, each instance maintains a copy of the SnapLogic repository database, and any repository changes are replicated to every instance. If one instance is down for some time, its repository can go out of sync with the other instances. To resynchronize, copy the repository database from a working instance onto the failed instance.
Memory Configuration
In some cases, Pipeline execution might require you to increase the memory allocation of the Java process. The Java Component Container memory allocation can be defined in the cc2_java.sh/.bat files, located in the installdir/product/version/bin/init.d directory. You can set the option using set SNAP_JAVA_MEMORY=-Xmx'MEM_VALUE' to allocate 'MEM_VALUE' memory for the Java CC process. The default value is 256 MB; other typical values are -Xmx512m, -Xmx1024m, or -Xmx2048m. If the setting exceeds the physical memory of the machine, the Java CC process will fail to come up during startup and will write its log into installdir/logs/javacc_stderr. For example, some Linux kernels only provide 1.7 GB of available process memory; if you set 'MEM_VALUE' to 2048m, the process will fail to start. The option should be set to no more than the maximum available physical memory.
The memory configuration allows you to increase performance for certain Components, such as the Sort or the Aggregate Component; both allow you to configure their memory usage during execution. For example, the Sort Component lets you configure how much memory the Component can allocate to sort records in memory before they are written to disk. Setting this to, say, 200 MB allows you to sort 200 MB of records in memory. If the whole input record set fits into the allocated memory, the sort is much faster, since it does not need to write records to disk during the sort operation.
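As a sketch of the memory setting described above (the 1024m value is just an example; the guide documents the Windows `set` form, and the .sh file is assumed to use the plain shell-variable equivalent):

```
rem In installdir/product/version/bin/init.d/cc2_java.bat (Windows):
rem allocate 1 GB of heap to the Java CC process.
set SNAP_JAVA_MEMORY=-Xmx1024m
```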
Buffer Configuration
During Pipeline execution (in a Java-only execution environment), a buffer is kept between two Java Components that are linked to each other. The first (the upstream Component) links to the second (the downstream Component). In between resides a buffer with a default size of 1000, meaning up to 1000 records are kept in memory between the two Components. If the downstream Component is slower than the upstream Component in terms of throughput, the buffer will fill up over time, leaving 1000 records in memory until the downstream Component consumes all remaining records. The assumption is that the upstream Component is either very fast or the process is long running, allowing the buffer to fill up.
Let's say you have a larger Pipeline with ten Java Components (meaning nine buffers) and the last downstream Component is the slowest Component in the Pipeline. In this case all nine buffers will fill up over time, leading to 9000 records in memory, which might exceed the allocated memory depending on the record size. For this scenario it is advised to decrease the buffer size by defining the -Dsnap.pipe.size='BUFFER_SIZE' option as a Java argument in the cc2_java.sh/.bat files. The 'BUFFER_SIZE' setting should not be lower than the actual throughput of the slowest Component; for example, if the slowest Component produces 200 records/sec, a buffer size of 200 should be defined.
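The worked example above is just arithmetic; this sketch encodes it so you can estimate worst-case buffered records for your own Pipeline sizes (it does not query any SnapLogic setting):

```shell
# Worst-case number of records held in inter-Component buffers for a
# linear Pipeline: (components - 1) buffers, each up to BUFFER_SIZE records.
COMPONENTS=10
BUFFER_SIZE=1000
BUFFERED=$(( (COMPONENTS - 1) * BUFFER_SIZE ))
echo "$BUFFERED"    # prints 9000 for the ten-Component example
```

Multiply the result by your typical record size to estimate the memory these buffers can consume.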
By default, the SnapLogic Server is installed with a basic authentication configuration that
allows "public" users to perform all operations, with the exception of modifying the Tutorial
examples located in the /SnapLogic/Tutorial namespace. Depending on your requirements,
you can further modify the authentication configuration to enhance the security configuration.
You can perform Active Directory-based authentication by configuring SnapLogic for your LDAP database, or file-based authentication by configuring the Snap Access Configuration (snapaccess.conf) file.
Note: If a user account name is defined both locally within SnapLogic as well as
through the authentication service, the credentials are checked against the built-
in authentication first. If it is not a valid built-in credential, it is checked against
the plug-in. If the same user name is defined in both places, it is a valid account
as long as the password matches with either definition.
The SnapLogic Data Server logs all interactions in the access log files: main_process_access.log, cc1_access.log, and cc2_access.log. Each access request logs the time, originating IP address, username, operation, and URI.
Note: If you are in a token-based authentication environment, see client_token_cache_limit and client_token_timeout in the "General SnapLogic Server Settings" section.
- Read: Allows access to basic metadata including description, inputs and outputs, required arguments, and similar objects.
- Write: Allows users to save Components they create.
- Execute: Allows execution of the Component or Pipeline.
A SnapLogic user has an identity comprised of a username and set of groups to which the user
belongs. There are two default groups: public and known. All users belong to the public group,
but only users who have authenticated by providing a username and password belong to the
known group as well.
Note: The administrator can create users and assign passwords with the Sna-
pAdmin utility users command.
The SnapLogic Designer prompts for authentication when you initially connect to a Data
Server, or when you add a new Data Server. SnAPI users can provide their credentials
through the appropriate interface routines.
- Users: Enumeration of individual users.
- Groups: Enumeration of logical groups. These groups are optional and in addition to the system public and known groups.
- UserGroups: Specifies to which groups a user belongs. This is optional, because a user is not required to belong to any administrator-defined groups.
- ACLs: Specifies, per URI, which user or group has which role or privileges.
Understanding ACLs
The SnapLogic Server reads the Access Control Lists (ACLs) from the snapaccess.conf file on server startup. These ACL rules apply to every REST request to the server. ACLs can be defined at a user level or at a group level. Every group should have an entry in the <Groups> section of snapaccess.conf. Users are assigned to groups by adding entries in the <UserGroups> section.
Access rules are configured by adding entries in the <ACLs> section. Each entry specifies a rule for a particular URI, and the default SnapLogic installation comes with a set of predefined rules.
Note: Rules for URIs beginning with '/__snap__/' are defined as required by the
product and usually should not be changed by the user.
You can add entries for Pipeline URIs. For example, if a Pipeline is defined at URI '/Test/MyPipeline', then an ACL can be added as follows:
<Location name="/Test/MyPipeline">
DENY USER joe
ALLOW GROUP dev_group PERMISSION READ WRITE EXECUTE NONRECURSIVE
</Location>
Some things to note:
- All rules are recursive by default, unless the NONRECURSIVE directive is specified.
- The directive names are case-insensitive, but user and group names are case-sensitive.
- Permissions cannot be specified for DENY rules; a DENY rule denies all access to the specified URI.
- One or more permissions must be specified for ALLOW rules. Permissions are separated by spaces, and the order in which they are listed does not matter.
The syntax for a rule is:
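Reconstructed from the examples in this section (a sketch, not an authoritative grammar):

```
ALLOW|DENY USER|GROUP <name> [<name> ...]
    [PERMISSION READ|WRITE|EXECUTE ...] [NONRECURSIVE]
```

Based on the notes above, the PERMISSION clause applies only to ALLOW rules, and each rule appears inside a <Location name="URI"> ... </Location> block.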
The amount of information returned depends on what permissions the user has for this data:
- READ: Can access anything within the resdef.
- WRITE: Only has permission to see that the resource exists. No resdef data.
- EXECUTE: Can see that the resource exists and any resdef data related to execution (known as the DESCRIBE view).
In general:
- GET requests require READ access.
- POST/PUT requests require EXECUTE.
- DELETE requests require WRITE.
However, there are variations.
Matching locations are evaluated with the longest prefix first. So, you can specify:
<Location name="/foo">
deny user joe
</Location>
<Location name="/foo/bar">
allow user joe permission read
</Location>
This gives 'joe' read access to '/foo/bar' (and everything below), but denies all access to '/foo'
(and everything below, except 'bar').
If a specific match for a user is found, traversal of the path stops and the permissions are taken from that match. Permissions through group matches (a user can be a member of multiple groups), however, are cumulative: if a user is a member of multiple groups and any of them allows access to a path, access is granted.
Note: Denying read access to the root namespace / may disable some features.
Read access to the / (root) namespace and to /__snap__/meta/info is required by clients to whom you want to allow dynamic discovery of the SnapLogic Data Server capabilities. Without it, a client must know the exact URI, instead of being able to discover it.
Note: To access files outside of SNAP_HOME (which is the install directory; an example on Linux is /opt/snaplogic/[release number]), the filesystem root property needs to be configured. For example, filesystem_root = /tmp will allow access to all files under /tmp from the Components. This can be done for each Component Container individually.
Example
As an introduction to the flexibility of the SnapLogic authentication model, consider the following scenario:
- Harry and Sally are both members of the Finance department. They have been given permission to execute the Pipelines in the /dept/finance/ space, but are not allowed to modify them.
- Jane is also a member of the Finance department, but she is the maintainer of the Pipelines in the /dept/finance/ space, so she has been given full permissions for that space.
- No one outside of the Finance department is allowed access to that space.
The Access Configuration file should appear as follows:
<AccessConfig>
<Users>
#Username Description
harry Harry Smith - Senior Analyst
sally Sally Bell - Senior Analyst
jane Jane Burton - Finance Manager
</Users>
<Groups>
#Groupname Description
finance Finance Department
</Groups>
<UserGroups>
#Username Group1 Group2 ...
harry finance
sally finance
jane finance
</UserGroups>
<ACLs>
<Location name="/">
deny group public
allow group public permission read NONRECURSIVE
allow group finance permission read write execute
</Location>
<Location name="/dept/finance">
deny group public known
allow group finance permission read execute
allow user jane permission read write execute
</Location>
</ACLs>
</AccessConfig>
Note: Privileges are assigned to namespaces. They can be applied to a single
resource, or to a group of resources with a common prefix.
The server checks the permissions based on the longest matching prefix first, as follows:
<Location name="/dept/finance/">
...
</Location>
<Location name="/dept/finance/sales_report">
...
</Location>
<Location name="/dept/finance/users/jane/">
...
</Location>
- If a user accesses the URI /dept/finance/budget or /dept/finance/audit/results, the longest matching prefix rule applies the rules for location /dept/finance/.
- If a user accesses /dept/finance/sales_report, the rule for /dept/finance/sales_report is used.
- If a user accesses the URI /dept/finance/users/jane/Q1_budget, the rules from the location /dept/finance/users/jane/ are applied.
By default, the user "admin" is the administrative user. If you define an "admin_group" group
in the snapaccess.conf file, and add one or more users to it, then only this group's users will
have admin privileges. The "admin" user has no special privileges.
Follow these instructions to configure and use Active Directory-based authentication:
1. Edit the snapserver.conf file's [main] section to add an LDAP_address entry. By default, this line is commented out in the snapserver.conf file. To enable Active Directory authentication, uncomment this line and update it to point to the LDAP server URL. Example: LDAP_address=ldap://myserver.mydomain.com:389.
2. If the user name for the Active Directory instance is complex, a user name template can be configured to simplify the user name to be entered by the user. This is usually required when connecting to an OpenLDAP server. To configure the user name template, uncomment the LDAP_user_template entry. Example: LDAP_user_template="uid=%USER%,ou=Users,dc=mycompany,dc=com". %USER% is a keyword that gets replaced by the user name supplied by the user. If the user logs on as "abc", the user name sent to the LDAP server would be "uid=abc,ou=Users,dc=mycompany,dc=com".
3. Create a group named admin_group in the snapaccess.conf file. Add users that should have administrative privileges to this group in the UserGroups section. For example, if admin_user@ad_domain.mycompany.com is the account in LDAP for the admin user, the entries in the snapaccess.conf file are:
<Groups>
test_group test group
admin_group Admin users group
</Groups>
<UserGroups>
admin_user@ad_domain.mycompany.com admin_group
test1@ad_domain.mycompany.com test_group
</UserGroups>
4. Restart the SnapLogic Server and Component Containers.
- auth_plugin: The name of the plug-in to use.
- auth_plugin_args: A comma-separated list of arguments for the plug-in.
- proxy_auth_header: For use with a proxy-generated header as authentication. If this is enabled, the server must be behind a proxy that protects against unauthenticated access.
SiteMinder Support
As of 3.7, SnapLogic offers support for Single Sign-On with CA SiteMinder®.
a. Change the default port from HTTPS port 443 to HTTP port 80.
Uncomment this line:
# The port number used by server.
# To enable HTTP, uncomment this line.
server_port = 80
And comment out:
# The secure port number used by the server.
#server_secure_port = 443
b. Set the proxy URI to be the SiteMinder proxy, with the appropriate DNS name and port.
# If the public hostname for this server is different than reported
# by `hostname` then set the server_proxy_uri to the external URI.
#server_proxy_uri = http://<whatever the siteminder proxy address is>
c. Set the proxy_auth_header to SM_USER:
# To use a proxy-generated header as authentication, uncomment and update property
# below.
# NOTE: If this is enabled, the server MUST be behind a proxy that
# protects against unauthenticated access
# proxy_auth_header = "SM_USER"
1. Start the server:
/opt/snaplogic/bin/snapctl.sh restart
2. Verify it is up and running, ideally by connecting to the server machine on port 80 with a web browser. If you only have shell access, run:
netstat -a | grep -i http
- Configure the SiteMinder proxy to forward all requests from a proxy URI to SnapLogic on port 443.
For example, configure an Apache proxy (using mod_proxy) and modify httpd.conf with:
ProxyPass / http://hostname.com/
ProxyPassReverse / http://hostname.com/
That forwards HTTP port 80 from the proxy to the SnapLogic Server at hostname.com.
If you make the proxy_auth_header X_FORWARDED_FOR (which is added by mod_proxy), you should be able to log in with Designer with no credentials, with your IP address as the user name.
1. Configure SiteMinder Policy Server to allow access to the SnapLogic Server. You only
need to allow access to the main server port (80 by default) and support
GET/POST/PUT.
2. Once the proxy is set up, and SnapLogic is configured and running, try going to the Snap-
Logic root URI on the proxy. It should re-direct you to the SiteMinder login screen. Once
you enter your credentials, it should re-direct you to the SnapLogic landing page (the
one with all the API links).
3. If that is successful, launch Designer. You should see Designer automatically log in as
the SiteMinder user (identified by a numeric id). You should be able to build Pipelines
and run them.
Note: You may not be able to import/export resources or install Snaps because you are logged in as a non-admin user.
1. To enable a SiteMinder user to have admin credentials, edit the /opt/snaplogic/config/snapaccess.conf file and add one group:
<Groups>
# groupname description
...
admin_group admins
</Groups>
and then add the numeric user ID that you want to have admin privileges to the UserGroups section:
<UserGroups>
# username groupname1 groupname2 ...
...
1234455678345 admin_group
</UserGroups>
Restart the server for the change to take effect.
Security Overview
SnapLogic's security points consist of the following:
- The SnapLogic Designer and SnapLogic Server communicate with each other through a secure HTTPS connection.
- The SnapLogic Server and the Component Containers communicate with each other through a secure HTTPS connection.
- How the Component Containers communicate with the various data sources depends on the following:
  - Assuming the data source supports secure communication generally, it is at the Snap developer's discretion whether or not to support secure communications. Many existing Snaps support secure communication.
Enabling SSL
The SnapLogic Server, Python Component Container, and Java Component Container have SSL listeners on ports 8091, 8092, and 8093, respectively. To enable SSL, modify the snapserver.conf file's component_container section. A Component Container is a process that runs a Component. Include the name of each Component Container in brackets, followed by parameters describing the Component Container, as shown:
# Configuration of component containers (CC)
[component_container]
# Name of CC1
[[cc1]]
cc_hostname = YourHost-PC
# Name of CC2
SSL is enabled if server_secure_port is defined in the main section and/or cc_secure_port is defined in the CC section. By default, SSL is enabled. Conversely, to disable SSL, comment out the server_secure_port and cc_secure_port properties in snapserver.conf.
Note: The servers need to be restarted after any change in the
snapserver.conf.
Parameters
The parameters in the component_container section of the snapserver.conf file include:
- CC Name: Enter the name of each Component Container before specifying its parameters. Example: [[cc1]].
- log_dir: The location of the log directory for this Component Container. Example: log_dir = /opt/snaplogic/logs.
- component_dirs: The location of the component directories. Example: component_dirs = "$SNAP_HOME","/opt/snaplogic/extensions/components".
- component_conf_dir: The location of the component configuration directory. Example: component_conf_dir = "/opt/snaplogic/component_config".
- cc_port: The port number used by the Component Container process. Example: cc_port = 8089.
- cc_secure_port: Number of the port on which secure communication is available via SSL. Comment out this parameter to disable HTTPS. Example: cc_secure_port = 8092.
- cc_secure_cert: The location of the certificate file required for SSL communication. Example: cc_secure_cert = /opt/snaplogic/config/host.pem.
- cc_hostname: Name of the host running the Component Container. Example: cc_hostname = YourHost-PC.
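Putting the parameters together, a [component_container] section might look like the following sketch (hostname, paths, and ports are illustrative, drawn from the examples above):

```
[component_container]
[[cc1]]
cc_hostname = YourHost-PC
log_dir = /opt/snaplogic/logs
cc_port = 8089
# Comment out cc_secure_port to disable HTTPS for this CC.
cc_secure_port = 8092
cc_secure_cert = /opt/snaplogic/config/host.pem
```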
SSL Usage
To use SSL, enable SSL in snapserver.conf and restart the server. The default SSL port for the server is 8091. To connect to the Designer over SSL, use the URL https://machinename.domainname.com:8091/designer. By default, self-signed certificates are installed on the server; the browser will prompt you to decide whether the certificate should be trusted. Trusting the certificate opens the Designer over HTTPS. The machine name has to be specified when installing the product, since the SSL certificate carries the machine name and the browser validates it.
Note: Accessing SSL URLs through the IP address does not work, since SSL certificates do hostname validation.
If the client (Designer) connects to the SnapLogic Server through HTTPS, all subsequent operations for that connection are done with SSL. If SSL is not desired for Pipeline execution, SSL can be disabled for the CCs, keeping SSL for the server only.
Prior to 3.5, to disable non-SSL access (enable only SSL), comment out the server_port property in the [main] section and the cc_port property in the CC section of snapserver.conf. A few updates to the startup scripts are also needed:
On Linux or OSX
Update the snaplogic_include.sh script in INSTALL_VERSION_DIR/bin. Change DATAPORT to the port used for server_secure_port in snapserver.conf. Change SERVER_URI to start with https instead of http. Restart the SnapLogic servers.
Management Console
The browser-based Management Console provides details about the performance of executed
Pipelines, whether in a cluster or on individual SnapLogic Servers. The Management Console
draws on comprehensive log message access, Pipeline- and Component-level statistics, and
analysis of Pipeline run history to enable quick drill-down to the root causes of Pipeline
execution failures. Use the Management Console to view data for multiple distributed Snap-
Logic instances, including:
- A dashboard view of Pipeline executions
- Historic reports of all Pipeline executions
- A detailed drill-down on each Pipeline's execution history, results, and contents
- An overview of your SnapLogic Servers at both the cluster and individual server levels, cluster configuration and server designations, as well as jobs run by the cluster
- Access the Management Console by entering its URI into your browser's address bar: http://<hostname>:<port>/console. A Login screen displays.
- Log in to your primary server; that is, the SnapLogic instance on the same domain as the console into which you are logging. The Wall page displays, with an overview of recent statistics for your SnapLogic instances. Because you are only logged in to your primary server, the information summarizes only that instance.
- To register additional servers, go to the Setup screen.
  - Under the Server URI heading, add each server by entering its hostname, port, username, and password. Click Add Server.
  - Enter the SnapLogic instance as host:port, without specifying the protocol (that is, without http://), as follows: some_snaplogic_instance:443. The server you added now appears in the Setup screen.
  - Repeat this step for each of your SnapLogic Servers. You can also remove servers from the list by clicking Remove Server to the right of the server's credentials. Clicking the URI of a server on this page navigates you to the Servers page for detailed server information.
Click the tab of the screen you wish to access. The screens of the Management Console are listed in a horizontal menu panel across the top of the console:
l The Wall: Use this screen as the dashboard view of your most recent Pipeline
executions. From it, you can drill down to server, Pipeline, or execution details.
l Events: Use this screen to examine historical information for every Pipeline executed, from an event-based perspective.
l Pipelines: Use this screen to drill down to the details of a single Pipeline. Access the
Pipelines screen from the Wall, History, Events, or Servers screen by clicking on the
name of a Pipeline.
l Server Info: Use this screen to monitor server information, configuration, and activity
for every server that you registered in the Setup screen. Access the Servers screen by
clicking a server name in the Wall, History, Events, or Pipelines screens.
The Wall
The Wall is a dashboard view of your latest SnapLogic runs. It displays each Pipeline executed
in the last 48 hours, the server on which it was run, and the result of its last run, which is
easily discerned by its color. Amber-colored Pipelines indicate Pipelines that have failed in the
past but are now running successfully.
l Time Range: Along the top of The Wall are links to control the time range in display. The Wall initially displays the Pipelines run in the last 48 hours. Click Last Week to display all Pipelines run within the last week, or specify a custom range using the calendars that pop out of the Date Range from and to fields.
l Paging: Use the left and right arrows on the sides of the screen to page through the
Pipelines displayed in your specified time range.
l Drilling Down: Each Pipeline is presented in overview format on The Wall. Click the
Pipeline's Server name to display the Servers screen where you can examine server
details. Click the Pipeline's Last Run time to display the Pipelines screen, where you can
examine the log of that run in the bottom panel.
Events
The Events screen also contains historical information for every Pipeline executed, but takes
an event-based focus. The top panel of the page displays Pipeline names, the servers on
which they executed, their status, their start and end times, the number of records written,
and the number of errors encountered. The bottom panel of the Events page initially displays
the main server log, but is used to display Pipeline Components when you select a Pipeline to
examine.
You can manipulate the event-driven display as follows:
l Time Range: Along the top of the screen are links to control the time range in display. The screen initially displays the Pipelines run in the last 48 hours. Click Last Week to display all Pipelines run within the last week, or specify a custom range using the calendars that pop out of the Date Range from and to fields.
l Sorting: Click on any of the column headers (Pipeline, Server, Stats, Started, Ended,
Records, and Errors) to sort the data by that criterion. Click again to alternate between
ascending and descending order.
l Pipeline Components: Drill down to a single Pipeline's contents by selecting a Pipeline
using the checkbox in the left column. The bottom panel of the Events screen displays a
list of the Pipeline's Components and their execution status.
l Pipeline Details: Drill down to Pipeline-level historical and content details by clicking
on a Pipeline's name to navigate to the Pipelines screen.
l Server Details: Drill down to server details by clicking the Pipeline's Server name to
display the Servers screen.
Pipelines
The Pipelines screen displays when you want to drill down to the details of a single Pipeline.
This screen accesses all log information related to Pipelines. Access the Pipelines screen from
the Wall, History, Events, or Servers screen by clicking on the name of a Pipeline.
The top of the screen identifies the Pipeline-server combination you are examining. The tabular display reports the Pipeline's runs on that server: each run's Status, the time each run Started and Ended, the number of Records processed, and the number of Errors encountered. The graphic display includes a chart of data records processed by date. The bottom panel of the screen breaks down the Pipeline's Components and reports on their individual performance for the execution selected in the top panel.
You can manipulate the detailed Pipeline display as follows:
l Time Range: Along the top of the screen are links to control the time range in display.
The screen initially displays executions that occurred within the last 48 hours. Click Last
Week to display all executions occurring within the last week, or specify a custom range
using the calendars that pop out of the Date Range from and to fields.
l Sorting: You can sort the tabular data by clicking on any of the column headers (Status,
Started, Ended, Records, and Errors) to specify the sort criterion. Click again to toggle
between ascending and descending order.
l Pipeline Components: The bottom panel of the Pipelines screen displays a list of the
Pipeline's Components and their execution status.
l Pipeline Log: Click the Logs view in the bottom panel to view the execution log for the
server on display.
Server Info
The Server Info screen enables you to monitor server information, configuration, and activity for every server that you registered in the Setup screen, as instructed in the "Registering Servers in the Management Console" section. Access the Servers screen by clicking a server name in the Wall, History, Events, Pipelines, or Setup screens. If you are in a cluster environment, a Cluster pane displays, providing a real-time health status of the cluster.
The left panel of the screen displays static Summary information about the server you are monitoring: the edition, version, installation date, license type, and expiration date of the SnapLogic software it is running, as well as the operating system and architecture of the machine. It also displays information about the Cluster to which this server belongs, if you are using that configuration. The main portion of the screen is devoted to the activity of the server you are examining: which Pipelines it has executed, their Status, time Started and Ended, and the number of Records processed and Errors encountered. The bottom of the panel displays the server Log file content.
You can manipulate the server information display as follows:
l Time Range: Along the top of the screen are links to control the time range in display.
The screen initially displays the Pipelines run on the server in the last 48 hours. Click
Last Week to display all Pipelines run within the last week, or specify a custom range
using the calendars that pop out of the Date Range from and to fields.
l Sorting: Click on any of the column headers pertaining to Pipeline executions (Pipeline,
Status, Started, Ended, Records, and Errors) to sort the data by that criterion. Click
again to toggle between ascending and descending order.
l Pipeline Components: Drill down to a single Pipeline's contents by selecting a Pipeline
using the checkbox in the left column. The bottom panel of the Servers screen displays,
in its Details view, a list of the Pipeline's Components and their execution status.
l Pipeline Details: Drill down to Pipeline-level historical and content details by clicking
on a Pipeline's name to navigate to the Pipelines screen.
Log Files
Log files provide a behind-the-scenes look into how your SnapLogic system is running.
Every Snap includes a permissions file that dictates the permissions required by each Component in the Snap. With sandboxing, when you install a Snap, you are prompted to grant or deny each of its requests to use JVM Components. Upon approval of the Snap's declared permissions, the server puts the approved permissions into the repository for use during execution.
Sandboxing is enabled by default in the Java Component Container. You can disable it by setting the disable_sandbox property to true in the snapserver.conf file. Refer to the "Configuring SnapLogic Server" section for more information on the snapserver.conf file, and to its "Component Container Configuration" topic for details.
Note: Authentication to the SnapLogic Server being connected to is required
before running the commands documented.
The following is a list of additional import options:
l Recursive import: To recursively import everything in a location, use the -r option, as
follows:
snapadmin> resource import -r /Samples/Demo1
Imported /Samples/Demo1/Pipelines/Emp_Dept_Pipeline
Imported /Samples/Demo1/Resources/Emp
Imported /Samples/Demo1/Resources/Dept
Imported 3 resources to file 'snaplogic.dmp'.
You can export resources by:
l right-clicking a resource in the Library and selecting Export Resource.
l selecting Export All from the Server menu.
The following is a list of additional export options:
l Recursive export: To recursively export everything in a location, use the -r option, as
follows:
snapadmin> resource export -r /SnapLogic/Tutorial/Exercise_1
Exported /SnapLogic/Tutorial/Exercise_1/Resources/Prospects
Exported /SnapLogic/Tutorial/Exercise_1/Pipelines/Leads_to_Prospects
Exported /SnapLogic/Tutorial/Exercise_1/Resources/Leads
Exported 3 resources to file 'snaplogic.dmp'.
Snapshots
The Server menu includes a Snapshots option, which lets you create a dump file of all the Pipelines on a specific SnapLogic Server. Importing the dump file back on the Server restores all the Pipelines to the state they were in when the dump was created.
The difference between the Snapshots and Export All functionalities is that, while the contents of the dump file generated by either action are the same, Snapshots creates the dump file on the server while Export All provides the dump file as a local download. This makes Snapshots faster and more convenient for recovery and rollback.
In the case of nginx, configuring proxy forwarding is straightforward. Add a server configuration for each process to your nginx.conf file, as follows:
...
http {
    ...
    underscores_in_headers on;
    server {
        # For the SnapLogic main server process
        listen 80;    # The port on which nginx should listen for requests
                      # for this server (can be anything you want, of course)
        location / {
            # Replace the example URI with your actual Server URI.
            proxy_pass http://myhost.mycompany.com:443;
            proxy_redirect off;
            proxy_buffering off;
        }
    }
    server {
        # One of these entries for each CC (component container) process.
        # If multiple CCs are running on the same host, please configure
        # a different port for each of these entries in the nginx
        # config file.
        listen 81;
        location / {
            # Replace the example URI with your actual CC URI.
            proxy_pass http://myhost.mycompany.com:8089;
            proxy_redirect off;
            proxy_buffering off;
        }
    }
    ...
}
Please note that it is currently NOT possible to separate requests to the various SnapLogic servers by path (that is, by simply using separate 'location' directives within a single 'server' definition). Instead, a different front-end port must be chosen in the proxy for each SnapLogic server, and the location must be '/', as shown in the preceding config file example.
The underscores_in_headers property is required to ensure that all headers used by SnapLogic are forwarded by nginx. 'proxy_buffering' has to be disabled to ensure that output view reads do not fail with network read errors.
Configuring SnapLogic
To ensure correct operation of the SnapLogic Server processes, you must edit the SnapLogic configuration file. Add proxy-URI definitions for each server process, as follows:
server_hostname = myhost.mycompany.com
server_port = 443
server_proxy_uri = http://myproxy.mycompany.com:80
Likewise, in the CC section of the config file, add proxy-URI definitions for each CC:
cc_hostname = myhost.mycompany.com
cc_port = 8089
cc_proxy_uri = http://myproxy.mycompany.com:81
The server_proxy_uri and cc_proxy_uri parameters define the complete URI of the proxy
front end for that process. Your proxy may run on a different host than your SnapLogic
servers, or it may run on the same host, in which case the hostnames in the examples above
would be the same.
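As a sketch of how these key = value settings fit together, the following Python snippet (illustrative only, not SnapLogic code) parses the configuration fragments shown above into a dictionary and checks that each process has a matching proxy URI:

```python
# Illustrative parse of a snapserver.conf-style fragment.
# The values are the example hosts and ports from the text above.
conf_text = """
server_hostname = myhost.mycompany.com
server_port = 443
server_proxy_uri = http://myproxy.mycompany.com:80
cc_hostname = myhost.mycompany.com
cc_port = 8089
cc_proxy_uri = http://myproxy.mycompany.com:81
"""

conf = {}
for line in conf_text.strip().splitlines():
    key, _, value = line.partition("=")
    conf[key.strip()] = value.strip()

# Both processes point at the same proxy host, on different front-end ports.
assert conf["server_proxy_uri"].rsplit(":", 1)[1] == "80"
assert conf["cc_proxy_uri"].rsplit(":", 1)[1] == "81"
```
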
Extending the example described in the "Running SnapLogic Behind a Proxy" section, the following example shows how this can be done using nginx as a proxy front end:
...
http {
    ...
    server {
        # SSL proxy for the SnapLogic Server
        listen 443;
        ssl on;
        ssl_certificate /etc/ssl/certs/myssl.crt;
        ssl_certificate_key /etc/ssl/private/myssl.key;
        location / {
            ...
        }
    }
    server {
        # SSL proxy for the SnapLogic CC
        listen 444;
        ssl on;
        ssl_certificate /etc/ssl/certs/myssl.crt;
        ssl_certificate_key /etc/ssl/private/myssl.key;
        location / {
            ...
        }
    }
    ...
}
...
Note: Replace the example paths to the CRT and KEY files with the correct paths to your CRT and KEY files.
When using the proxy front end to handle SSL, SSL support can be disabled in the SnapLogic configuration. Comment out the server_secure_port and cc_secure_port properties in snapserver.conf. The proxy URI has to be modified to express the new scheme and, possibly, port, as follows:
server_proxy_uri = https://myproxy.mycompany.com:443
...
cc_proxy_uri = https://myproxy.mycompany.com:444
Self-signed Certificates
If self-signed certificates are being used, the certificate used by the proxy should be added to the trust store used by the Java CC. The keytool command, which is part of the JRE, can be used for this. For example:
keytool -import -alias proxy -file /etc/ssl/certs/myssl.crt -keystore /opt/snaplogic/3.4.1.19583PE/pkg/java/jre1.6.0_20/lib/security/cacerts
If the SnapLogic Server is being accessed through a Java SnAPI program, the proxy's certificate would have to be added to the trust store of the JRE running the Java client process.
l String: This is a Unicode string.
l Number: This is a decimal number, with a precision of 28 digits.
l Datetime: This is a combined date and time data type, which stores time with microsecond resolution.
l SnapLogic Identifier: This is an ASCII string which follows the same rules as Python identifiers. A SnapLogic identifier is a string beginning with a letter or underscore, followed by a sequence of letters, digits, or underscores.
l Filename: Most file read components support a URL specification for their input using the format 'scheme://input_path'. The valid schemes and their meanings are:
l file: Specifies a file local to the SnapLogic Server. input_path is the local filename. If input_path begins with a /, the filename is treated as absolute; otherwise the path is relative to the server root.
l http: Specifies data is read via HTTP. input_path is the HTTP URL.
l https: Specifies data is read via secure HTTP. input_path is the HTTP URL.
l ftp: Specifies data is read from an ftp data source.
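The scheme handling described above can be sketched as follows. This is an illustrative Python fragment, not SnapLogic source; the resolve_input name and the /opt/snaplogic server root are assumptions for the example:

```python
# Hypothetical sketch of the 'scheme://input_path' convention.
def resolve_input(url, server_root="/opt/snaplogic"):
    # Split the specification into its scheme and input_path parts.
    scheme, sep, path = url.partition("://")
    if not sep:
        raise ValueError("expected scheme://input_path, got %r" % url)
    # For the file scheme, a relative path is resolved against the
    # (assumed) server root; an absolute path is used as-is.
    if scheme == "file" and not path.startswith("/"):
        return scheme, server_root + "/" + path
    return scheme, path

print(resolve_input("file://demo/data/emp_csv.txt"))
# ('file', '/opt/snaplogic/demo/data/emp_csv.txt')
```
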
l ASN.1: information encoded in Abstract Syntax Notation One format.
Mime type: 'application/x-snap-asn1'
l JSON: information encoded in JavaScript Object Notation format. (SnapLogic Version 1.0 had a JSON Component for explicitly returning output records from Pipelines to the application. This Component is obsolete as of Version 2.0.)
Mime type: 'application/json'
l HTML: text in the form of Hypertext Markup Language
Mime type: 'text/html'
l CSV: text with comma-separated values; introduced in version 2.0.3
Mime type: 'text/csv'
l TSV: text with tab-separated-values; introduced in version 2.0.3
Mime type: 'text/tab-separated-values'
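For reference, the format-to-MIME-type pairs listed above can be collected into a simple lookup table. This is illustrative Python only; the dictionary name is an assumption, and the server performs this mapping itself:

```python
# The output formats and MIME types documented above.
SNAP_CONTENT_TYPES = {
    "ASN.1": "application/x-snap-asn1",
    "JSON": "application/json",
    "HTML": "text/html",
    "CSV": "text/csv",
    "TSV": "text/tab-separated-values",
}

print(SNAP_CONTENT_TYPES["CSV"])  # text/csv
```
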
Applications can request a specific format by specifying the sn.content_type parameter in the HTTP request. For example, a web browser can read the output from Output1 of a Component whose URI is http://server:443/SnapLogic/Tutorial/Exercise_1/Resources/Leads using the following /feed URI:
http://server:443/feed/SnapLogic/Tutorial/Exercise_1/Resources/Leads/Output1?sn.content_type=text/html
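A sketch of how an application might assemble such a /feed URI follows. This is illustrative Python; feed_uri is a hypothetical helper, and the server and resource path are the tutorial examples from the text, not live endpoints:

```python
# Build a /feed URI: server + '/feed' + resource path + '/' + output view,
# with sn.content_type appended as a query parameter.
def feed_uri(server, resource_path, view, content_type):
    return "%s/feed%s/%s?sn.content_type=%s" % (
        server, resource_path, view, content_type)

uri = feed_uri("http://server:443",
               "/SnapLogic/Tutorial/Exercise_1/Resources/Leads",
               "Output1", "text/html")
print(uri)
```
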
SnapAdmin Utility
SnapAdmin is a simple command line interface that provides basic SnapLogic Server administration functions. By default, SnapAdmin starts an interactive command session. Alternatively, by using the -c filename option, you can give SnapAdmin a file containing commands to execute.
Starting SnapAdmin
The SnapAdmin utility is located in the /bin directory of the SnapLogic Server installation. To
start SnapAdmin:
l Linux: Execute snapadmin.sh at the command prompt.
SnapAdmin Commands
The SnapAdmin command library consists of the following commands.
acl
l acl addrule: Add a new ACL rule. Admin credentials must be used.
l acl create: Create a new ACL. Admin credentials must be used.
l acl delete: Delete an existing ACL from the username/password file. Admin credentials must be used.
l acl get: Print info for a particular ACL in the username/password file. SnapAdmin must be connected to a server. Admin credentials must be used.
l acl list: Print the list of ACLs in the username/password file. SnapAdmin must be connected to a server. Admin credentials must be used.
bye
l bye: Exit the SnapAdmin utility.
cluster
l cluster set: Sets the specified parameter to the new value. The only parameter
allowed currently is jobs_per_worker. Example: cluster set jobs_per_worker 5
l cluster workers: Prints all the worker nodes configured on the head node. Returns an error if not connected to a cluster-configured server.
component
l component list: List the Components available to this server.
connect
l connect server <url>: Connect to a server by specifying its URL.
l connect list: List the current server connections and their index. You can use the
index to switch the active connection with the connect switch command.
credential
l credential set {default | current | <connection-index>} <user>: Set the credentials used for requests to the server. Use default to specify the default credentials used for any connection that does not have its own credentials set. Specifying current sets credentials only for the current connection. Lastly, a connection index is used to set the credentials for a specific connection. Use the connect list command to view each connection and its index.
The command prompts for a password, and the resulting user/password pair is used for all server requests for the relevant connections. Note that if you are using a command script (for example, snapadmin -c command_file) and you don't want the prompt, you can place the password after the user credential, as follows:
connect server http://localhost:443
credential set default admin yourpassword
resource import -r -R -f -i ./demo.dmp /path/to/...
...more commands that follow...
Note: You may need to edit your ACL configurations for this command to work. For example, you may need to disable ALLOW GROUP known PERMISSION read write execute and enable ALLOW GROUP public PERMISSION read write execute.
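A command script like the one above can also be generated programmatically before being passed to snapadmin -c. The following Python sketch is illustrative only; the commands and credentials are placeholders, and nothing is actually sent to a server:

```python
import os
import tempfile

# Placeholder SnapAdmin commands (server URI, user, and password
# are examples, not real credentials).
commands = [
    "connect server http://localhost:443",
    "credential set default admin yourpassword",
    "resource list",
]

# Write the commands to a temporary script file, one per line,
# as snapadmin -c expects.
fd, path = tempfile.mkstemp(suffix=".snapadmin")
with os.fdopen(fd, "w") as f:
    f.write("\n".join(commands) + "\n")

# Read the script back to show its contents, then clean up.
with open(path) as f:
    script = f.read()
print(script)
os.remove(path)
```
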
disconnect
l disconnect: Disconnect the current connection.
exit
l exit: Exit the SnapAdmin utility.
group
l group adduser <groupname> <username>: Add a user to a group. Admin credentials must be used.
l group list: Print the list of groups in the username/password file. SnapAdmin must be connected to a server. Admin credentials must be used.
help
l help: Print help for SnapAdmin commands.
log
l log search [-l limit] [-o offset] pipeline_rid [regex_search_string]: Displays the Pipeline logs for the given Pipeline runtime ID. You can find the Pipeline runtime ID by using the pipeline list command.
If the optional regex search string is specified, only the log lines matching the search string are displayed. If the search string consists of several words, enclose it in quotes. Lines are shown in reverse chronological order; that is, the most recent records are returned first.
Options
l -l LIMIT, --limit=LIMIT: The number of log lines to show. The default is 0 (show all).
l -o OFFSET, --offset=OFFSET: The number of log lines at the beginning to skip.
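The search semantics described above (reverse chronological order, an optional regex filter, then offset and limit) can be sketched in Python. This is an illustrative model of the behavior, not SnapAdmin source, and the sample log lines are invented:

```python
import re

def log_search(lines, pattern=None, limit=0, offset=0):
    # Most recent records first (reverse chronological order).
    result = list(reversed(lines))
    # Optional regex filter on the log lines.
    if pattern:
        rx = re.compile(pattern)
        result = [ln for ln in result if rx.search(ln)]
    # Skip 'offset' lines at the beginning, then apply the limit.
    result = result[offset:]
    if limit:  # a limit of 0 means show all
        result = result[:limit]
    return result

lines = ["09:00 started", "09:01 read 10 records", "09:02 error: timeout"]
print(log_search(lines, pattern="error", limit=1))
# ['09:02 error: timeout']
```
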
pipeline
l pipeline list: List the Pipeline execution history.
repository
l repository create -t sqlite <filename>: Create a new SQLite repository database
at the given path. The path must either point to a nonexistent file to create, or to a
SQLite database that does not already contain a SnapLogic repository.
l repository create -t mysql [-H <host>] [-P <port>] -u <username> [-p
<password>] <database>: Create a new MySQL repository. The optional host and port
specify the hostname and port of the MySQL server. The username is the MySQL user. If
the password option is not supplied, the command prompts the user for the password.
The last argument specifies the name of the database in the MySQL server in which to
create the repository.
l Options:
-n, --no_save: Do not save the new repository encryption password on the server. If this option is used, the repository password has to be entered every time the server is restarted, and the user needs to ensure that the repository password is saved in a secure manner. If the repository password is lost, the repository contents CANNOT be recovered. The repository contents can be backed up using export before using this command.
-p PASSWORD, --password=PASSWORD: The password to use to encrypt the repository database. Use an empty password to disable encryption.
-c CIPHER, --cipher=CIPHER: The cipher to use to encrypt the repository database. The supported options are aes-128-ecb, aes-128-cbc, aes-128-cfb, aes-256-ecb, aes-256-cbc, and aes-256-cfb. The default is aes-256-cbc.
l Options:
-n, --no_save: Do not save the new repository encryption password on the server. If this option is used, the repository password has to be entered every time the server is restarted, and the user needs to ensure that the repository password is saved in a secure manner. If the repository password is lost, the repository contents CANNOT be recovered. The repository contents can be backed up using export before using this command.
-p PASSWORD, --password=PASSWORD: The password to use to encrypt the repository database. Use an empty password to disable encryption. Using this option takes you out of the --no_save mode.
-c CIPHER, --cipher=CIPHER: The cipher to use to encrypt the repository database. The supported options are aes-128-ecb, aes-128-cbc, aes-128-cfb, aes-256-ecb, aes-256-cbc, and aes-256-cfb. The default is aes-256-cbc.
-o CURRENT_PASSWORD, --current_password=CURRENT_PASSWORD: The current password to use to decrypt the repository database.
l repository wait_on_upgrade: Wait for the startup repository upgrade to complete.
resource
l resource delete [-f] {<uri-list> | *}: Delete Components from the repository.
If supplied, uri-list is a list of URIs of Components to delete, separated by whitespace.
If * is supplied instead of uri-list, all Components are deleted. The -f flag forces the
delete without requiring user interaction. Otherwise, the command confirms whether to
proceed for each Component.
l resource export: Refer to the "Import and Export" section for details.
l resource import: Refer to the "Import and Export" section for details.
l resource list: List the Components in the repository.
l Options: -r, --recursive: descend recursively into matched subfolders
server
l server shutdown: Shut down the SnapLogic Server gracefully. The command waits for all currently running jobs to finish. No new job executions are permitted while the server is shutting down.
shell
l shell: Execute an operating system shell command.
source
l source: Execute SnapAdmin commands contained in a file.
users
l users create <username> <password>: Add a user.
l users list: Print a list of users in the username/password file.
verbose
l verbose lasterr: Print additional information about the last error, if available.
l verbose off: Turn off verbose mode. Succinct error reporting.
l verbose on: Verbose mode. Print additional error information.
worker
l worker add: Add a worker node to the cluster configuration.
Example: worker add http://worker.mydomain.com:443
l worker delete: Delete a worker node from the cluster configuration.
Example: worker delete http://worker.mydomain.com:443
l worker list: List all the worker nodes configured on the head node. Returns an error if
not connected to a cluster-configured server.
Sidekick
The SnapLogic Sidekick is a locally installed service that lets you access data both on site, behind a firewall, and in the cloud. The Sidekick can be installed anywhere on your network that has access to the data that you want to use. The advantage of using Sidekick instead of an on-premises installation of the Server is a lighter installation and footprint on the ground. Sidekick's lighter footprint means that all Pipelines and associated metadata are stored in the cloud instance of SnapLogic Server.
Sidekick is essentially a Java Component Container (Java CC). A Server running in the cloud
stores the metadata and controls the Sidekick. When a Pipeline needs to run, the server tells
the Sidekick to run it; actual Java code runs in the Sidekick on the ground.
Snap output streams are made available to the Server in such a way that there is no difference whether the Java CC is running on the ground (as Sidekick) or in the cloud. Snap logs are stored with the Sidekick, but logs can be shown in the Designer. If communication between the Server and the Sidekick is interrupted, the server will no longer be able to talk to the Sidekick, so Pipelines will not be executed. Scheduled Pipelines have an option to be executed if they miss their window, so those Pipelines can still be run.
Appendix A
To create a Component in SnAPI:
l Create an instance of a Component.
l Set the Component properties.
l Define the inputs and/or outputs.
l Validate and save the Component to the server.
Refer to the following example for Component creation in SnAPI.
from snaplogic import snapi
server_uri = 'http://localhost:443'
emp_reader_resource.set_property_value('filename',
    'file://demo/data/emp_csv.txt')
emp_reader_resource.add_record_output_view("output1", output_fields,
    "output view")
validate_error = emp_reader_resource.validate()
emp_reader_resource.save(server_uri + '/SnapLogic/Demo/Resources/Emp')
print emp_reader_resource
print emp_suggest_resource
#...
Appendix: Completing Tasks in SnAPI
raise.set_output_view_pass_through("Output1",["Input1"])
SERVER_URI = 'http://myhost.example.com:443'
PIPELINE_NAME = '/SnapLogic/Tutorial/Exercise_1/Pipelines/Leads_to_Prospects'
# You can specify values for any of the parameters declared by the Pipeline
server_uri = 'http://localhost:443'
# URL of the SnapLogic data server to which we connect
emp_reader_resource = snapi.create_resource_object(server_uri,
    'snaplogic.components.FixedWidthRead')
emp_reader_resource.add_record_output_view("output1", output_fields,
    "output view")
# Validate and save it to the server
validate_error = emp_reader_resource.validate()
if validate_error:
    print validate_error
else:
    emp_reader_resource.save(server_uri + '/SnapLogic/Demo/Resources/Emp')
.
.
.
p = snapi.create_resource_object(server, snapi.PIPELINE)
# Add the resources to the pipeline.
p.add(leads_res_def, "Leads")
p.add(prospects_res_def, "Prospects")
# Specify the field linkage
field_links = (('First', 'First_Name'),
('Last', 'Last_Name'),
('Address', 'Address'),
('City', 'City'),
('State', 'State'),
('Zip', 'Zip_Code'),
('Phone_w', 'Work_Phone'))
p.link_views('Leads','output1', 'Prospects','input1', field_links)
# Save the pipeline
p.save(p_uri)
.
.
.
When using the SnAPI programmatic interface, you can specify parameter values and pass
them into the snapi.exec_resource call.
import time
from snaplogic import snapi
SERVER_URI = 'http://myhost.example.com:443'
PIPELINE_NAME = '/SnapLogic/Tutorial/Exercise_1/Pipelines/Leads_to_Prospects'
# You can specify values for any of the parameters declared by the Pipeline
parameters = {'LEADS' : 'file://tutorial/data/leads.csv',
              'INPUT_DELIMITER' : ',' }
# Define the resource linkage and the mapping used for each link
p.link_views('Emp Read', 'output1', 'Emp Write', 'input1', emp_record_link)
SERVER_URI = 'http://myhost.example.com:443'
PIPELINE_NAME = '/SnapLogic/Tutorial/Exercise_1/Leads_to_Prospects'
# You can specify values for any of the parameters declared by the Pipeline
parameters = {'LEADS' : 'file://tutorial/data/leads.csv',
              'INPUT_DELIMITER' : ',' }
To view execution logs with this method, use the Management Console by entering its URI into your browser's address bar: http://<hostname>:<port>/console.
l input: Use this setting to force all Component inputs in the Pipeline to dump their data
into trace files.
l output: Use this setting to force all Component outputs in the Pipeline to dump their
data into trace files.
l input,output: Use this setting to force all Component inputs and outputs to be traced.
The following is an example of the data tracing command to trace both inputs and outputs:
from snaplogic import snapi
SERVER_URI = 'http://snaplogic1.snaplogic.org:443'
TUTORIAL_1_PIPE = '/SnapLogic/Tutorial/Exercise_1/Pipelines/Leads_to_Prospects'
Appendix B
The examples in this appendix were developed on the following environment:
l Ubuntu 10.04
l PostgreSQL 8.4 (installed with apt-get install postgresql-plpython-8.4)
Prerequisites
PL/Python
This assumes PL/Python has been installed and CREATE LANGUAGE plpythonu; has been run on the appropriate databases. For more details on that, see http://www.postgresonline.com/journal/index.php?/archives/99-Quick-Intro-to-PLPython.html.
First, check the version of Python that PL/Python will be using. It has to be 2.5 or 2.6. This can be done by executing the following at the psql prompt:
CREATE FUNCTION get_plpython_ver() RETURNS text
AS $$
import sys
return str(sys.version)
$$ LANGUAGE plpythonu;
SELECT get_plpython_ver();
If the version returned from the query above is not 2.5 or 2.6, you may have to rebuild lib/postgresql/plpython.so.
Log Level
The sample snippets below assume SET client_min_messages='LOG'; has been run at the psql prompt. For more information on that, see the "Database Access" chapter (Section 41.3) of the PostgreSQL manual.
Triggers
This example presents integration with triggers as a useful case for change data capture. Here we will write new data to a CSV file as it gets inserted into a table. Triggers make a convenient example, but because triggers are implemented as functions, ordinary functions can be used to integrate with SnapLogic as well. We will assume, for the purposes of this example, that every time a row is inserted into a table, we want to capture the changed data and write it out to a CSV file. A more advanced scenario could, of course, set up the trigger on both insert and update, and use the DBUpsert Component to propagate the change to another database. We will, then, go through the following steps:
Create Function
Create a function that will be used as a trigger:
CREATE OR REPLACE FUNCTION write_to_csv() RETURNS trigger
AS $$
import sys
$$ LANGUAGE plpythonu;
Appendix: SnAPI and PostgreSQL
Try it
Now, every insert into the snap_cdc_example table will result in writing the inserted data to the SnapLogic CSV Write resource created above. Try it for yourself. Execute
INSERT INTO snap_cdc_example VALUES ('New York', 'NY');
and verify that the line New York,NY is written to the postgres.csv file.
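What the trigger ultimately produces is one plain CSV line per inserted row. The following sketch (ordinary Python, not PL/Python, and independent of SnapLogic) shows the expected formatting for the row above:

```python
import csv
import io

# Format the inserted row ('New York', 'NY') the way a CSV writer
# would emit it into postgres.csv.
buf = io.StringIO()
csv.writer(buf).writerow(["New York", "NY"])
print(buf.getvalue().strip())  # New York,NY
```
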
Appendix: ACLs
The access control lists, stored in the snapaccess.conf file, let you configure which users or groups have which roles or privileges. Information on configuring ACLs can be found in "Understanding ACLs".
The following is the list of predefined ACLs in the snapaccess.conf file.
l Location name: /
l Description: Root directory. Default deny, except for root, which is needed for the landing page (http://<hostname>:<port>).
l Default Permissions:
l DENY GROUP public
l ALLOW GROUP public PERMISSION read NONRECURSIVE
l ALLOW GROUP known PERMISSION read write execute
l Location Name: /__snap__
l Description: Allow all Snap handlers, most of which protect themselves further with
admin restrictions or secret keys. Required by the product and usually should not be
changed by the user.
l Default Permissions: ALLOW GROUP known PERMISSION read write execute
l Location Name: /__snap__/__static__
l Description: Allow static handlers, needed by anonymous users to login. Required by
the product and usually should not be changed by the user.
l Default Permissions: ALLOW GROUP public PERMISSION read
l Location Name: /__snap__/__static__/protected
l Description: Only admin can access it. Required by the product and usually should not
be changed by the user.
l Default Permissions:
l DENY GROUP public
l DENY GROUP known
l Location Name: /__snap__/auth
l Description: Allow auth entry point. Required by the product and usually should not be
changed by the user.
l Default Permissions: ALLOW GROUP public PERMISSION read NONRECURSIVE
- 120 -
Appendix: ACLs
l Location Name: /__snap__/uri_check
l Description: Allow uri checks. Required by the product and usually should not be
changed by the user.
l Default Permissions: ALLOW GROUP public PERMISSION read write execute NON-
RECURSIV
l Location Name: /__snap__/auth/acl/list
l Description: Allow auth entry point. Required by the product and usually should not be
changed by the user.
l Default Permissions: ALLOW GROUP known PERMISSION read NONRECURSIVE
l Location Name: /__snap__/cluster/status
l Description: Allow cluster status check. Required by the product and usually should not
be changed by the user.
l Default Permissions: ALLOW GROUP public PERMISSION read NONRECURSIVE
l Location Name: /__snap__/runtime/status
l Description: Allow runtime status check. Required by the product and usually should
not be changed by the user.
l Default Permissions: ALLOW GROUP public PERMISSION read write execute
l Location Name: /__snap__/auth/check
l Description: Allow auth login point. Required by the product and usually should not be
changed by the user.
l Default Permissions: ALLOW GROUP public PERMISSION read
l Location Name: /__snap__/cc/register
l Description: Allow cc register uri, which is further protected by tokens. Required by
the product and usually should not be changed by the user.
l Default Permissions: ALLOW GROUP public PERMISSION write
l Location Name: /__snap__/meta/info
l Description: Allow info page, which is needed by snapadmin. Required by the product
and usually should not be changed by the user.
l Default Permissions: ALLOW GROUP public PERMISSION read NONRECURSIVE
- 121 -
SnapLogic® User Guide
l Location Name: /__snap__/resources/upgrade/status
l Description: Allow polling of repository upgrade status. Required by the product and
should not usually be changed by the user.
l Default Permissions: ALLOW GROUP public PERMISSION read
l Location Name: /__snap__/self_check
l Description: Used for determining whether resource references are remote or local.
Required by the product and usually should not be changed by the user.
l Default Permissions: ALLOW GROUP public PERMISSION read write execute
l Location Name: /console
l Description: Runs the SnapLogic Management Console.
l Default Permissions: ALLOW GROUP public PERMISSION read
l Location Name: /crossdomain.xml
l Description: Serve /crossdomain.xml
l Default Permissions: ALLOW GROUP public PERMISSION read
l Location Name: /designer
l Description: Runs the SnapLogic Designer.
l Default Permissions: ALLOW GROUP public PERMISSION read
l Location Name: /extensions
l Description: By default, Snaps will instantiate resources here.
l Default Permissions: allow group public permission read write execute.
l Location Name: /favicon.ico
l Description: Access to the SnapLogic icon (from the root directory) for the browser.
l Default Permissions:ALLOW GROUP public PERMISSION read
l Location Name: /__snap__/cc_proxy/__snap__/runtime/
l Description: Allow proxying for querying runtime info
l Default Permissions: ALLOW GROUP public PERMISSION read
l Location Name: /robots.txt
l Description: Serve /robots.txt
- 122 -
Appendix: ACLs
l Default Permissions: ALLOW GROUP public PERMISSION read
l Location Name: /public
l Description: Public folder
l Default Permissions: ALLOW GROUP public PERMISSION read write execute
l Location Name: /SnapLogic
l Description: Allows access to the SnapLogic directory.
l Default Permissions: ALLOW GROUP public PERMISSION read write execute
l Location Name: /SnapLogic/Tutorial
l Description: Allows access to the SnapLogic Tutorial directory.
l Default Permissions: ALLOW GROUP public PERMISSION read execute
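The permission directives above share a small, regular grammar: ALLOW or DENY, a group, an optional permission list, and an optional NONRECURSIVE flag. Purely as an illustration — this is not SnapLogic code, and the real snapaccess.conf parser may differ — a directive of that shape can be decomposed like this:

```python
import re

# Illustrative only: a tiny parser for permission directives of the form
# shown above (e.g. "ALLOW GROUP public PERMISSION read NONRECURSIVE").
# The actual snapaccess.conf syntax may differ; treat this as a sketch.
PATTERN = re.compile(
    r"^(ALLOW|DENY)\s+GROUP\s+(\S+)"
    r"(?:\s+PERMISSION\s+((?:read|write|execute)(?:\s+(?:read|write|execute))*))?"
    r"(\s+NONRECURSIVE)?$"
)

def parse_acl(line):
    """Return a dict describing one ACL directive, or None if it does not match."""
    m = PATTERN.match(line.strip())
    if not m:
        return None
    action, group, perms, nonrec = m.groups()
    return {
        "action": action,
        "group": group,
        "permissions": perms.split() if perms else [],
        "recursive": nonrec is None,
    }

print(parse_acl("ALLOW GROUP public PERMISSION read NONRECURSIVE"))
```

A DENY line carries no permission list, which the sketch represents as an empty permissions list.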
Glossary
canvas
The canvas is your main workspace in Designer. Create data integration solutions in the canvas by sketching, connecting, and then configuring Components and Pipelines. Drag generic Component templates from the Foundry or configured Components from the Library in the sidebar to the canvas. Connect these objects to each other, configure, and execute them all from the canvas. Refer to the section about the canvas for greater detail.
Component
A Component is an object used to perform a simple subtask, such as read, write, or act on data. Strung together, Components are the building blocks of Pipelines, or data flows. Components are generally classified as Connectors (Components that read or write data) and Operators (Components that perform an action, such as a join or filter, on data). Basic templates for Components are included in your SnapLogic installation (refer to the Component Reference for the list of Component templates that ship as part of SnapLogic), and reside in the Designer's Foundry panel. These generic templates, once configured, become configured Components that are stored in the SnapLogic Server repository, and can be found in the Designer's Library panel.
Component Container
A process that runs a Component.
Component template
An unconfigured Component in the Foundry. Component templates are generic objects used to perform a simple subtask, such as read, write, or act on data. They are included in the SnapLogic installation, and reside in the SnapLogic Designer Foundry panel. To create a Component, you must configure a Component template to your specific needs. To create a Pipeline, you must configure one or more Component templates, and connect them to data sources, other Components, or data targets.
Connectivity Snaps
A Snap that adds connectivity to an application or data source.
Designer
The SnapLogic graphic user interface where you can visually
create data integration scenarios.
Foundry
The bottom panel in the Designer's sidebar that stores the building
blocks from which you can build projects.
input
Inputs are pieces of data that a Component consumes for the purpose of performing functions on them, or (when supported) passing them through to another downstream Component. Not all Components accept input data.
Library
The top panel in the Designer's sidebar that stores the projects
you are building: your Pipelines and configured Components.
link
A link defines a mapping between the fields of any two Components.
Management Console
The SnapLogic browser-based management console provides details about the performance of executed Pipelines, whether in a cluster or on individual SnapLogic servers. The management console draws on comprehensive log message access, Pipeline- and Component-level statistics, and analysis of Pipeline run history.
mapping
The identification of data relationships between different entities.
In SnapLogic, field linking is used to specify how fields or columns
from one Component map to those of a downstream Component.
output
Outputs are pieces of data a Component produces that can serve
as inputs to downstream Components. Not all Components
produce outputs.
parameter
Parameters are variables that can be used for run-time substitution in the properties of Components or Pipelines. You can use parameters to avoid hard-coding property values that are likely to change. Using parameters enables you to use a single Component or a single Pipeline for multiple purposes. Parameters defined at the Pipeline level must be mapped to corresponding parameters at the Component level within the Pipeline. Refer to the Component Parameters section for more information on parameters.
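As a simple illustration of run-time substitution (the $region placeholder and the property value here are hypothetical, not SnapLogic's actual parameter syntax), a property can be resolved against a set of parameter values when the Pipeline runs:

```python
from string import Template

# Hypothetical illustration of run-time parameter substitution: a property
# value contains placeholders that are resolved at execution time, so the
# Component or Pipeline itself stays reusable.
def resolve(property_value, params):
    """Substitute $name placeholders in a property string."""
    return Template(property_value).substitute(params)

prop = resolve("/data/$region/output.csv", {"region": "emea"})
print(prop)  # /data/emea/output.csv
```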
pass-through
The pass-through capability allows Components to accept fields
that are not specified as inputs, and pass these fields directly as
their outputs. When you link two Components, only those fields
specified as inputs of the downstream Component must be linked.
All the remaining unlinked outputs of the upstream Component
are passed through to the downstream Component's output. With
pass-through, the inputs of Components in a Pipeline need not be
explicitly designed to handle all of the incoming fields from
upstream Components. Component inputs only need to specify
fields that the Component requires for its computations. This
reduces the field linking to the absolute minimum.
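The behavior can be sketched (in plain Python, not SnapLogic code; the record and field names are made up for illustration) as a merge of a Component's computed outputs with the upstream fields it did not consume:

```python
# Illustrative sketch of pass-through, not SnapLogic's implementation:
# fields the downstream Component does not declare as inputs are copied
# unchanged into its output record.
def run_component(record, declared_inputs, compute):
    consumed = {k: record[k] for k in declared_inputs}
    passed_through = {k: v for k, v in record.items() if k not in declared_inputs}
    out = dict(passed_through)
    out.update(compute(consumed))
    return out

upstream = {"city": "San Mateo", "state": "CA", "zip": "94402"}
# This Component only needs "city"; "state" and "zip" pass through untouched.
result = run_component(upstream, ["city"],
                       lambda r: {"city_upper": r["city"].upper()})
print(result)
```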
Pipeline
A collection of one or more Components linked together to orchestrate a flow of data between end points.
Scheduler
The SnapLogic utility for scheduling automatic, periodic executions
of a Pipeline. The SnapLogic Server runs the Pipeline unattended
at the dates and times you specify, and using any parameter
values you specify.
slider
A feature in the SnapLogic Designer that enables you to navigate through Component, connection, and Pipeline properties below the canvas, while still viewing the corresponding object in the canvas above.
Snap
An object that performs a complete, and usually high-level, function. A Snap can be a collection of Components that are functionally related, such as the Salesforce Snap, which contains Components for inserting contacts into and deleting contacts from Salesforce. A Snap can also consist of a single low-level building block, such as a filter. A Snap can comprise a complete Pipeline packaged as a simple Component to insert an item. The definition of "Snap" is therefore a recursive one: a complex Snap can contain multiple Pipelines; a simple Snap can stand alone or participate in a Pipeline. You can purchase a Snap in SnapStore and install it into the Foundry. After you configure its Component templates to your specific sources and targets, the Snap resides in the Library.
SnapAdmin
A simple command line interface that provides basic SnapLogic
Server administration functions. Refer to the "SnapAdmin Utility"
section for more information.
SnAPI
SnapLogic Application Program Interface is the programmatic interface to the SnapLogic Server that enables you to create and manage Components and Pipelines.
SnapStore
SnapLogic's online marketplace, SnapStore, enables developers,
SIs, and ISVs to develop and monetize custom Snaps, extending
the SnapLogic Server's connectivity and functionality.
Index

A
ACLs
    in snapaccess.conf 81
    syntax 82
    understanding 81
admin
    password 65
administration
    buffer configuration 79
    memory configuration 78
    overview 63
architecture 9
authentication
    Active Directory 84
    overview 80
    permissions 80
    snapaccess.conf 81
    user credentials 80

B
buffer
    configuration 79

C
Component
    configuring 26
    creating in Designer 25
    creating in SnAPI 109
    creation overview 25
    database related 29
    definition 11
    overview 25
    parameters 36
        optional 36
        required 36
        syntax 37
    pass-through fields
        configuring in Designer 33
        configuring in SnAPI 110
        overview 32
    properties 27
    suggestions
        Designer 29
        overview 28
        SnAPI 110
Components
    trace files 55

D
data types; output representation formats 100
Designer
    launching 13
    menu bar 14
    overview 13

H
high availability 77
http requests
    credentials 85

L
Library
    data folder 19
    folders 19
    overview 18
    toolbar 22
    view tabs 19

M
management console 90
    Events 91
    Pipelines 92
    registering servers 90
    Servers 93
    using 90
    Wall 91
memory
    configuration 78

P
pass-through
    overview 32
passwords
    admin 65
    changing users 65
permissions
    execute 80
    read 80
    write 80
Pipeline
    concurrent execution 79
    configuring 39
    dataservice
        creating in SnAPI 113
    definition 11
    properties 40
Pipelines
    aborting 51
    creating in Designer 39
    creating in SnAPI 112
    parameters 41
    running 49

R
repository
    encrypt 105
    password 106
requests
    anonymous 85
    http and credentials 85

S
sandboxing 94
Scheduler
    email notifications 53
    properties 52
security
    overview 87
    server configuration 77
servers
    clustering 71
    head node 72
    head node failover 74
    worker node 73
slider
    commands 22
Smart Link 43
Snap
    definition 11
SnapAdmin
    commands 102
    overview 102
SnAPI
    configure 13
    configuring pass-through fields 110
    overview 13
    procedures 109
SnapLogic
    about 7
    Component servers 10
    design process 11
    user interfaces 10
SnapLogic Server
    architecture 9
    Component Container
        parameters 68
    management console 70
    notification 69
    repository configuration 69
    starting 65
    stopping 65
Snaps
    accessing 57
    configuring 59
    defined 57
    developing 59
    installing 58
    installing in a cluster 58
    overview 57
SnapStore
    defined 57
SSL
    enabling 88
    proxy configuration 99
    usage 89

T
token-based authentication
    server setting 68

U
user interfaces
    overview 13
usergroups
    in snapaccess.conf 81
users
    changing password 65
    creating 63
    in snapaccess.conf 81