SSIS SQL Server 2005


Level 200

SQL Server Integration Services (SSIS) SQL Server 2005


This note describes the usage and configuration of Linked Servers in SQL Server 2005.

Dinesh Asanka MVP SQL Server


www.dbfriend.net dineshasanka@dbfriend.net

SSIS in SQL Server 2005


Contents
Introduction
Design Environment
  Connection Managers Pane
  Toolbox
  Control Flow Task
  Data Flow Task
  Event Handlers
  Package Explorer
Connection Managers
  Database
  File
  Special
Control Flow Elements
  Containers
  Control Flow Tasks
Data Flow Components
  Sources
  Destinations
  Transformations
Package Configurations
Check Point
Error Handling
Executing
Further Reading

Accessing Data in Remote Servers

Introduction
SQL Server Integration Services (SSIS) is most commonly described as an extract-transform-load (ETL) tool. ETL tools are traditionally associated with preparing data for warehousing, analysis and reporting, but SSIS represents a step beyond that traditional role. It is really a robust programming environment that happens to be good at data and database related tasks.

Design Environment
SQL Server Business Intelligence Development Studio (SSBIDS) is used to develop SSIS packages. The following are the panes you need to know in order to develop SSIS packages.

Connection Managers Pane


Connection managers are pointers to files, databases and so on that provide context for the execution of tasks placed on the design surface.

Toolbox
Provides a list of tasks that can be dragged onto the design surface. The list of available tasks varies depending on which tab is selected.

Control Flow Task


This is the primary design surface on which tasks are placed, configured and ordered by connecting tasks with precedence arrows.

Data Flow Task


One of the tasks that can be configured on the Control Flow tab is the Data Flow task, which is used to move and transform data. The Data Flow tab is used to configure Data Flow tasks.

Event Handlers
Events are exposed for the overall package and for each task within it. Tasks are placed here to execute when an event such as OnError is raised.

Package Explorer
Lists all of the package's elements in a single tree view. This can be helpful for discovering configured elements that are not always obvious in other views, such as event handlers and variables.

Connection Managers
A connection manager is a wrapper for the connection string and properties required to make a connection at runtime. Once the connection is defined, it can be referenced by other elements in the package without duplicating the connection definition. This simplifies the management of this information and its configuration for alternate environments.

Database
Defining database connections through one of the available connection managers requires setting a few key properties such as Provider, Server, Initial Catalog and Security.

The first choice for accessing databases is generally an OLE DB connection manager using one of the many native providers, including SQL Server, Oracle etc.
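As a rough illustration of what a database connection manager wraps, the sketch below assembles an OLE DB style connection string from the key properties named above. The function name, the server and database names, and the choice of SQLNCLI (SQL Native Client) as the provider are assumptions made for the example; SSIS builds and stores this string itself.

```python
def build_oledb_connection_string(server, initial_catalog,
                                  provider="SQLNCLI", integrated_security=True,
                                  user_id=None, password=None):
    """Assemble an OLE DB style connection string (illustrative helper, not an SSIS API)."""
    parts = [
        ("Provider", provider),
        ("Data Source", server),
        ("Initial Catalog", initial_catalog),
    ]
    if integrated_security:
        # Windows authentication: no credentials are stored in the string.
        parts.append(("Integrated Security", "SSPI"))
    else:
        # SQL Server authentication: credentials travel with the string.
        parts.append(("User ID", user_id))
        parts.append(("Password", password))
    return ";".join(f"{key}={value}" for key, value in parts) + ";"


# Hypothetical development server and database names.
dev = build_oledb_connection_string("DEVSERVER01", "StagingDB")
print(dev)
```

Because every task refers to the connection manager rather than to the string, moving to another environment only requires changing the server name in one place.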

File
Remember that every file or folder referenced needs to be available not only at design time but also after the package is deployed. Consider using UNC paths for file connections. Several of the file connection managers are listed here.
Flat File: Presents a text file as if it were a table, with header options. The file can be in one of three formats.
  Delimited: File data is separated by column and row delimiters.
  Fixed Width: File data has known column sizes, without either column or row delimiters.
  Ragged Right: File data is interpreted using a fixed width for every column except the last, which is terminated by the row delimiter.
Excel: Identifies a file containing a group of cells that can be interpreted as a table.
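The difference between the three flat file formats is easiest to see on sample data. The following is a minimal sketch in plain Python, not SSIS code; the function names, column widths and sample rows are invented for the illustration.

```python
def _slice(row, widths):
    """Cut one row into fields of the given fixed widths."""
    fields, position = [], 0
    for width in widths:
        fields.append(row[position:position + width].strip())
        position += width
    return fields


def parse_delimited(text, column_delimiter=",", row_delimiter="\n"):
    """Delimited: both columns and rows are marked by delimiters."""
    rows = [row for row in text.split(row_delimiter) if row]
    return [row.split(column_delimiter) for row in rows]


def parse_fixed_width(text, widths):
    """Fixed width: no delimiters at all, every row is sum(widths) characters."""
    row_length = sum(widths)
    return [_slice(text[start:start + row_length], widths)
            for start in range(0, len(text), row_length)]


def parse_ragged_right(text, widths, row_delimiter="\n"):
    """Ragged right: fixed widths, except the last column runs to the row delimiter."""
    rows = []
    for line in text.split(row_delimiter):
        if not line:
            continue
        fields = _slice(line, widths)
        fields.append(line[sum(widths):].strip())
        rows.append(fields)
    return rows


delimited = parse_delimited("1,Bolt\n2,Nut\n")
fixed = parse_fixed_width("001Bolt002Nut ", [3, 4])
ragged = parse_ragged_right("001Bolt, steel\n002Nut\n", [3])
print(delimited, fixed, ragged)
```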

Special
There are some other, non-traditional connection managers.
FTP: Defines a connection to an FTP server. For most situations, entering the server name and credentials is sufficient to define the connection. This is used with the FTP task to move and remove files or to create and remove directories using FTP.
MSMQ: Defines a connection to a Microsoft Message Queue, used in conjunction with the Message Queue task to send or receive queued messages.
SMTP: Specifies the name of the Simple Mail Transfer Protocol (SMTP) server for use with the Send Mail task.

Control Flow Elements


The control flow tab provides an environment for defining the overall work flow of the package.

Containers
Containers provide important features for an SSIS package, including iteration over a group of tasks and isolation for error and event handling. The containers available are as follows.
Sequence: Simply contains a number of tasks without any iteration feature, but provides a shared event and error-handling context, allows variables to be scoped to the container level instead of the package level, and enables the entire container to be disabled at once during debugging.
For Loop: Provides the advantages of a Sequence container and, in addition, runs the contained tasks repeatedly for as long as a loop condition evaluates to true.
Foreach Loop: Provides iteration over the contents of the container, based on various lists of items:
  File: Each file in a directory that matches a wildcard specification.
  Item: Each item in a manually entered list.
  ADO: Each row in a variable containing an ADO recordset or ADO.NET data set.
  SMO: A list of server objects, such as jobs and databases.
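The behaviour of the Foreach Loop with the File enumerator can be pictured with a short Python sketch: the loop body runs once for each file that matches a wildcard in a folder. The function names and file names are hypothetical, and the "task" is just a Python callable standing in for the tasks placed inside the container.

```python
import fnmatch
import os
import tempfile


def foreach_file(folder, pattern):
    """Yield the full path of each file in folder whose name matches the wildcard."""
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        if os.path.isfile(full_path) and fnmatch.fnmatch(name, pattern):
            yield full_path


def run_loop(folder, pattern, task):
    """Run the task once per matching file, as the container would."""
    return [task(path) for path in foreach_file(folder, pattern)]


# Build a throwaway folder with two matching files and one that is ignored.
with tempfile.TemporaryDirectory() as folder:
    for name in ("sales_01.csv", "sales_02.csv", "readme.txt"):
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("demo")
    processed = run_loop(folder, "sales_*.csv", os.path.basename)

print(processed)
```

In a real package the current file name is mapped to a variable on each iteration, and the tasks inside the container read that variable.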

Control Flow Tasks


ActiveX Script: Allows VBScript and JScript scripts to be included in SSIS packages. New scripts should use the Script task instead.
Data Flow: Provides a flexible structure for loading, transforming and storing data, as configured on the Data Flow tab.
Execute Package: Executes the specified SSIS package, allowing packages to be broken down into smaller, reusable pieces. Invoking child packages does require substantial overhead.
Execute Process: Executes an external program or batch file.
Execute SQL: Runs a SQL script or query, optionally returning results into variables.
File System: Provides a number of file operations such as copy, delete and move.
Script: Allows Visual Basic .NET code to be embedded in a task.
Send Mail: Sends a text-only SMTP e-mail message.
XML: Performs operations on XML documents, including comparing two documents.

Data Flow Components


This section describes the individual components that can be configured within a Data Flow task.

Sources
OLE DB: The preferred method of reading database data. It requires an OLE DB connection manager.
Data Reader: Uses an ADO.NET connection manager to read database data.
Flat File: Requires a Flat File connection manager.
Excel: Uses an Excel connection manager and either worksheets or named ranges as tables.
Raw: Reads a file written by an SSIS Raw File destination.
XML: Reads a simple XML file and presents it to the data flow as a table, using either an inline schema or a separate schema file.

Destinations
OLE DB: Writes rows to a table or view for which an OLE DB driver exists.
SQL Server: Uses the same fast loading mechanism as the Bulk Insert task, but is restricted in that the package must execute on the SQL Server that contains the target table or view.
Flat File: Writes the data flow to a file specified by a Flat File connection manager.
Excel: Sends rows from the data flow to a sheet or range in a workbook using an Excel connection manager. However, a worksheet can hold at most 65,536 rows of data.

Transformations
Between the source and destination, transformations provide functionality to change the data from what was read into what is needed. Each transformation requires one or more data flows as input and provides one or more data flows as output.
Aggregate: Functions rather like a GROUP BY query in SQL, generating Min, Max, Average and so on over the input data flow.
Conditional Split: Enables rows of a data flow to be split between different outputs depending upon the contents of the row. Configure it by entering output names and expressions in the editor.
Derived Column: Uses expressions to generate values that can either be added to the data flow or replace existing columns.
Lookup: Finds rows in a database table that match the data flow and includes selected columns in the data flow. For example, a ProductID could be added to the data flow by looking up the product name in the master table.
Merge Join: Provides SQL JOIN functionality between data flows sorted on the join columns.
Pivot: De-normalizes a data flow, similar to the way an Excel pivot table operates, making attribute values into columns.
Script: The Script component introduces scripting into the data flow.
Slowly Changing Dimension: Compares the data in a data flow to a dimension table and, based on the roles assigned to particular columns, maintains the dimension.
Sort: Sorts the rows in a data flow by selected columns.
Union All: Combines the rows of two or more data flows into a single data flow.
Unpivot: Makes a data flow more normalised by making columns into attribute values.
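To make the idea of chained transformations concrete, here is a small plain-Python sketch that imitates Lookup, Derived Column, Conditional Split and Aggregate on an in-memory data flow. It is not SSIS code; the column names, sample rows and product master are invented, and giving unmatched rows an empty product_id is a simplification of this sketch rather than a description of how the Lookup transformation treats missing matches.

```python
from collections import defaultdict

# Hypothetical source rows and reference (master) table.
source_rows = [
    {"product_name": "Bolt", "qty": 10, "unit_price": 0.5},
    {"product_name": "Nut", "qty": 4, "unit_price": 0.25},
    {"product_name": "Bolt", "qty": 6, "unit_price": 0.5},
    {"product_name": "Gear", "qty": 1, "unit_price": 8.0},
]
product_master = {"Bolt": 101, "Nut": 102}  # product name -> product id


def lookup(rows, reference):
    """Lookup: add product_id by matching the product name in the reference table."""
    for row in rows:
        yield {**row, "product_id": reference.get(row["product_name"])}


def derived_column(rows):
    """Derived Column: add a new column computed from an expression."""
    for row in rows:
        yield {**row, "line_total": row["qty"] * row["unit_price"]}


def conditional_split(rows):
    """Conditional Split: route each row to one of two outputs."""
    matched, unmatched = [], []
    for row in rows:
        (matched if row["product_id"] is not None else unmatched).append(row)
    return matched, unmatched


def aggregate(rows):
    """Aggregate: SUM(line_total) grouped by product_id."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["product_id"]] += row["line_total"]
    return dict(totals)


matched, unmatched = conditional_split(derived_column(lookup(source_rows, product_master)))
totals = aggregate(matched)
print(totals)
print([row["product_name"] for row in unmatched])
```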

Package Configurations
Package configurations make it easier to move packages between servers and environments by providing a way to set properties within the package based on environment-specific configurations. For example, the server names and input directories might change between the development and production environments. There are several types of package configuration available:
Registry
Environment Variable
XML File
SQL Server Table
Parent Package Variable
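The idea behind an environment variable configuration can be sketched in a few lines of Python: values designed into the package are replaced at load time by whatever the target environment supplies. The property names, variable names and server names below are hypothetical, and the helper is an illustration only, not an SSIS API.

```python
import os

# Values baked into the package at design time (hypothetical).
package_properties = {"ServerName": "DEVSERVER01", "InputDirectory": r"C:\dev\input"}


def apply_environment_configuration(properties, mapping, environ=os.environ):
    """Return a copy of properties, overridden by any mapped environment variable that is set."""
    configured = dict(properties)
    for property_name, variable_name in mapping.items():
        if variable_name in environ:
            configured[property_name] = environ[variable_name]
    return configured


# Which environment variable feeds which package property.
mapping = {"ServerName": "ETL_SERVER_NAME", "InputDirectory": "ETL_INPUT_DIR"}

# Simulate a production machine where only the server name is overridden.
production_like = {"ETL_SERVER_NAME": "PRODSERVER01"}
configured = apply_environment_configuration(package_properties, mapping, production_like)
print(configured)
```

Properties without a matching variable keep their design-time values, so the same package file can be deployed unchanged to each environment.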

Check Point
Enabling checkpoint restart allows a package to restart without rerunning tasks that have already completed successfully. The following are the basic rules for checkpoint restart.
Only Control Flow tasks define restart points. A Data Flow task is viewed as a single unit of work regardless of the number of components it contains.
Any transaction in progress is rolled back on failure, so the restart point must retreat to the beginning of the transaction. Thus, if the entire package executes as a single transaction, it will always restart at the beginning of the package.
Any loop containers are started over from the beginning of the current loop.
The configuration used on restart is the one saved in the checkpoint file, not the current configuration file.
Enable checkpoints by setting the following package properties.
CheckpointFileName: Name of the file in which to save checkpoint information.
CheckpointUsage: Set to either IfExists (starts at the beginning of the package if there is no checkpoint file, or at the restart point if the checkpoint file exists) or Always (fails if the checkpoint file does not exist).
SaveCheckpoints: Set to True.
In addition, the FailPackageOnFailure property must be set to True for the package and for every task or container that can act as a restart point.
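The restart behaviour described above can be sketched as a small Python model. This is a simplification written for this note, not how SSIS is implemented: the tasks are plain functions, the checkpoint file is JSON rather than the format SSIS writes, and the behaviour modelled is the IfExists setting.

```python
import json
import os
import tempfile


def run_package(tasks, checkpoint_file):
    """Run (name, callable) tasks in order, skipping those recorded as completed."""
    completed = []
    if os.path.exists(checkpoint_file):  # IfExists: resume only when a file is present
        with open(checkpoint_file) as handle:
            completed = json.load(handle)
    executed = []
    for name, task in tasks:
        if name in completed:
            continue  # finished in an earlier run, so do not rerun it
        try:
            task()
        except Exception:
            # Record what has finished so the next run can restart from here.
            with open(checkpoint_file, "w") as handle:
                json.dump(completed, handle)
            return executed, False
        completed.append(name)
        executed.append(name)
    if os.path.exists(checkpoint_file):
        os.remove(checkpoint_file)  # a successful run clears the checkpoint
    return executed, True


state = {"load_should_fail": True}


def extract():
    pass


def load():
    if state["load_should_fail"]:
        raise RuntimeError("simulated failure")


def cleanup():
    pass


tasks = [("Extract", extract), ("Load", load), ("Cleanup", cleanup)]
with tempfile.TemporaryDirectory() as folder:
    checkpoint_file = os.path.join(folder, "package.chk")
    first_run = run_package(tasks, checkpoint_file)   # fails at Load
    state["load_should_fail"] = False
    second_run = run_package(tasks, checkpoint_file)  # resumes at Load
    checkpoint_left = os.path.exists(checkpoint_file)

print(first_run, second_run, checkpoint_left)
```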

Error Handling
You can use the Event Handlers tab of the SSIS package development environment to handle errors, for example by placing tasks in the OnError event handler.

Executing
Once a package has been created and tested, it can be executed in several ways.
Locate the installed package in SQL Server Management Studio, right-click it and choose Run Package, which in turn invokes dtexec for the selected package.
Run the dtexecui utility.
From a SQL Server Agent job step, choose SQL Server Integration Services Package as the step type.
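For scripted execution, dtexec can also be called directly from the command line. The sketch below only builds the argument list, using the /FILE, /SQL and /SERVER options of dtexec; the helper name, package path, package name and server name are hypothetical.

```python
def build_dtexec_command(package, stored_in_msdb=False, server=None):
    """Return the dtexec argument list for a file-based or msdb-stored package."""
    if stored_in_msdb:
        # Package saved to SQL Server (msdb); optionally name the server.
        command = ["dtexec", "/SQL", package]
        if server:
            command += ["/SERVER", server]
    else:
        # Package saved as a .dtsx file on disk.
        command = ["dtexec", "/FILE", package]
    return command


file_command = build_dtexec_command(r"C:\packages\LoadSales.dtsx")
msdb_command = build_dtexec_command("LoadSales", stored_in_msdb=True, server="PRODSERVER01")
print(" ".join(file_command))
print(" ".join(msdb_command))

# To actually run it on a machine where SSIS is installed:
# import subprocess
# subprocess.run(file_command, check=True)
```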


Further Reading
Package Configurations
  http://204.9.76.233/articles/dba/package_configuration_2005_p1.aspx
Data Cleansing
  http://www.sql-server-performance.com/articles/dba/data_cleaning_ssis_p1.aspx
Import Text Files
  http://204.9.76.233/articles/dba/import_text_files_ssis_p1.aspx
Slowly Changing Dimension
  http://www.sql-server-performance.com/articles/biz/slowly_changing_dimension_type1_p1.aspx
  http://www.sql-server-performance.com/articles/biz/Slowly_Changing_Dimension_in_SQL_Server_2005_Part2_p1.aspx
Script Component as Data Source
  http://www.sql-server-performance.com/articles/biz/script_component_as_data_source_p1.aspx
Pivot and UnPivot with SSIS
  http://www.sql-server-performance.com/articles/per/ssis_pivot_unpivot_p1.aspx
