
Chapter 3 - Data Flow



3.1 Sources, Transforms and Destinations

Data flows through an SSIS data flow task just like marbles through a marble run:

The Source specifies where marbles (or data records) come from.

Transforms manipulate marbles (or data records): adding extra columns, perhaps, or (as here) sending them down different paths.

For any data flow task, you can have one or more
Destinations, specifying where marbles (or data
records) will end up.
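
To make the marble run concrete, here is a minimal Python sketch of the same idea. It is not how SSIS is implemented; the film records, the FilmName column and the IsWinner column are all made up for illustration. One function plays the source, two play transforms (one adds a column, one sends records down different paths) and two lists play the destinations.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical film records, invented purely for illustration.
FILMS = [
    {"FilmName": "Film A", "FilmOscarWins": 3},
    {"FilmName": "Film B", "FilmOscarWins": 0},
]


def source() -> Iterator[Dict]:
    """Source: says where the records come from."""
    yield from FILMS


def add_winner_flag(rows: Iterable[Dict]) -> Iterator[Dict]:
    """Transform: adds an extra column to every record."""
    for row in rows:
        yield {**row, "IsWinner": row["FilmOscarWins"] > 0}


def split_by_flag(rows: Iterable[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Transform: sends each record down one of two paths."""
    winners, others = [], []
    for row in rows:
        (winners if row["IsWinner"] else others).append(row)
    return winners, others


# The two lists stand in for two destinations.
winners, others = split_by_flag(add_winner_flag(source()))
```

Each stage only sees the records handed to it by the previous stage, which is the property the marble run picture is getting at.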

Our Example
We'll write a package to export films to a text file, then export this into another table:

The source for the data flow will be this SQL Server table.

Copyright 2015

The destination will be a flat file.


3.2 Creating a Project Connection

It's likely in our example that we'll reuse the same connection (i.e. the same link to a SQL Server
or other database) many times, so it makes sense to define it at project level:
a) Right-click on Connection Managers and choose to add a new one.

b) You can link to almost every common type of data. If you're using SQL Server, it makes sense to use an OLEDB connection.

c) Click at the bottom right of the dialog box which appears to create a new connection:

d) Type the name of your server (here we're using a named instance called sql2008r2 on the current computer).

e) Choose the database you want to link to from the drop down list.

f) Click to create a connection.

g) Give the connection a better name (we'll call this one Movies):
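
Behind the dialog box, a connection manager boils down to a connection string. The Python sketch below shows roughly what one might look like for the server typed in step d). The provider name SQLNCLI11, the database name Movies and the use of Windows authentication are assumptions made for illustration, and build_oledb_connection_string is a hypothetical helper rather than anything supplied by SSIS.

```python
def build_oledb_connection_string(server, database, provider="SQLNCLI11"):
    """Assemble an OLEDB-style connection string from its parts.

    The provider default is an assumption; use whichever OLEDB provider
    is actually installed on your computer.
    """
    parts = {
        "Provider": provider,
        "Data Source": server,           # ".\\name" means a named instance on this computer
        "Initial Catalog": database,     # the database chosen from the drop down list
        "Integrated Security": "SSPI",   # assumes Windows authentication
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Named instance sql2008r2 on the current computer; database name assumed.
conn_string = build_oledb_connection_string(r".\sql2008r2", "Movies")
```

The dialog box builds this string for you, so you would rarely need to type it yourself; seeing it spelled out mainly helps when reading a connection manager's properties later.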


3.3 Data Flow Tasks

To accomplish our task, we need a single data flow task (this can be as complex as you like;
many applications in SSIS will consist of one data flow task only).

Creating Data Flow Tasks


You can create a data flow task in the same way as for any other task:
a) Either drag this task onto the Control Flow window, or just click on it and press .

b) It makes sense to rename this data flow task to give it a better description.

Wise Owl's Hint: Another way to create a data flow task is to click on the Data Flow tab,
then click on the link which appears.

Switching to the Data Flow Tab


There is a separate tab for editing data flow tasks. Either double-click on the data flow
task icon to edit it, or switch to the Data Flow tab, click on the dropdown arrow and
choose which data flow task you want to edit.


3.4 Creating the Source

Default Connections and Connection Scope


As soon as you go to the new data flow task, you'll
see that it already has one connection listed.
This is the connection manager we created for the entire
project: it will be available to all of the packages in the
project, and so is listed here. Connections can have project
scope or package scope (in the latter case, they are only
visible to and available for use in a single package).

Creating Sources
You can create sources either using the assistant (basically a wizard) or the hard way:
This assistant will help you add sources to a data flow task (see hint below).

Possible sources for data flows are:

Source     Notes
ADO.NET    An alternative to the faster OLEDB source, required by some data providers.
CDC        Links to a Change Data Capture data source, retrieving only rows which have changed.
Excel      Any table contained in an Excel worksheet. If you're running a 64-bit copy of Excel, you will need to install an additional driver.
Flat File  Importing CSV or other text files (covered in a separate chapter).
ODBC       Uses any ODBC provider (useful when an application doesn't support OLEDB).
OLE DB     Any OLEDB-compliant data source, such as Access, SQL Server, Oracle or DB2.
Raw File   A fast but inflexible file format usually used to hold intermediate data stages at checkpoints.
XML        Connect to an XML file, possibly on a remote website.

Wise Owl's Hint: The advantage of using the source assistant is that SSIS will then
list only those sources for which you have drivers installed on your computer
(although this manual will show creating sources the long way).


Creating our OLEDB Source


Whichever method you choose for adding a data source will lead you to the same place, so we'll
use the OLEDB Source tool:
a) Double-click on this tool to add it to your Data Flow window.

b) Optionally, rename the data source to something easier to recognise:

Double-click on your new data source and set the following details:
What                      Notes
OLEDB connection manager  If you've created one for your project (as we've done), you can just use this; otherwise, click on the New button to create one specific to this package.
Table or view             The name of a table or view in your database (although it's better practice to use an SQL command, as described in the next section, to pick out only those columns you want to work with).


Specifying a SQL Command


You should usually build a SQL query to pull down only those columns you need to work with:

a) Choose to work with an SQL command.

b) Click here to build your query (or just paste it into the SQL command text window).

c) Generate the SQL command that you want to use as shown on the right.

Use this tool to run your query to see the results.

Use this tool to add a table to your query. This
query builder is almost identical to the ones used in
Access, Management Studio, Reporting Services
and other Microsoft applications.
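
To see why an SQL command beats picking a whole table, here is a small Python sketch using the built-in sqlite3 module as a stand-in for SQL Server. The table name tblFilm and every column except FilmOscarWins are assumptions made for illustration; your own table will differ.

```python
import sqlite3

# In-memory database standing in for the SQL Server source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblFilm ("
    "FilmID INTEGER, FilmName TEXT, FilmOscarWins INTEGER, FilmSynopsis TEXT)"
)
conn.execute("INSERT INTO tblFilm VALUES (1, 'Film A', 3, 'A long synopsis')")

# The kind of query you might paste into the SQL command text window:
# it names only the two columns the data flow needs.
sql_command = "SELECT FilmName, FilmOscarWins FROM tblFilm"

cursor = conn.execute(sql_command)
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
conn.close()
```

Only FilmName and FilmOscarWins come back, so the unused FilmID and FilmSynopsis columns never enter the data flow.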

Choosing Output Columns


Having chosen the connection to use, it's time to choose the output columns:
a) Click to specify which columns the data source should output.

b) For each column, you can effectively rename it here.


3.5 Creating the Flat File Connection Manager

To create the destination for our data flow, we first need to create a
connection manager for a flat file. Here's how to do this!

Step 1 - Ensure you have a Flat File Template

For the purposes of this chapter's example, you first need to create a new file in Notepad.
We'll export two bits of information: the name of each film, and how many
Oscars it won. We're using the vertical bar as a delimiter.
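
If you'd rather not open Notepad, the template can be created with a few lines of Python. This is only a sketch: the file name films_template.txt and the column heading Title are assumptions, and write_template is a hypothetical helper. The template needs nothing more than a heading row with the column names separated by vertical bars.

```python
import tempfile
from pathlib import Path


def write_template(path, columns=("Title", "Oscars"), delimiter="|"):
    """Write a one-line flat file template holding only the column headings."""
    header = delimiter.join(columns)
    Path(path).write_text(header + "\n", encoding="utf-8")
    return header


# Written to the temporary folder here; choose any folder you like.
template_path = Path(tempfile.gettempdir()) / "films_template.txt"
header = write_template(template_path)
```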

Step 2 - Starting to Create a Connection to the Flat File

You can create a connection while you create a
destination, but it might be easier to do the two
things separately, as here.

a) Right-click in the Connection Managers section of the data flow task, and choose to
create a new connection specific to this task.

Wise Owl's Hint: If you had several packages using the same flat file, you might choose
instead to define a connection manager at project level.


Step 3 - Configuring the New Connection Manager (General)

You can now point this connection manager at your flat file:
a) Give the connection manager a sensible name (it can be an idea to include
a reference to the type of connection in this name, as here).

b) The description is optional!

c) Usually the rows will be separated by carriage returns, and the first row will
contain column names.

d) Click on this button to find and select the flat file you want to connect to.


Step 4 - Configuring the Connection Manager's Columns

You should now say how your flat file's columns are separated:
a) Choose to show Columns.

b) Choose the column separator (ours is a vertical bar character).

c) Click on this button (disabled here) to bring the preview shown up to date.
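
The column separator setting does the same job as the delimiter argument in most text file readers. As a rough illustration (the sample text and the Title heading are invented), this Python sketch reads vertical-bar-separated text whose first row contains the column names:

```python
import csv
import io

# Invented sample standing in for the contents of the flat file.
sample = "Title|Oscars\nFilm A|3\nFilm B|0\n"

# delimiter="|" plays the role of the column separator setting;
# DictReader treats the first row as the column names.
reader = csv.DictReader(io.StringIO(sample), delimiter="|")
records = list(reader)
```

Note that every value arrives as text, including the Oscar counts; a flat file has no data types of its own.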

Step 5 - Choose Advanced Settings for the Connection Manager

To get our particular data transformation to work, you need to make one more change:
a) Show advanced settings for this connection manager.

b) Change the column width to a higher number, to accommodate longer film names.
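
How high should the number be? One way to decide is to measure the longest film name you expect to export. The Python sketch below does this for some invented names; the width of 50 is an assumed starting value for illustration, so check what your own connection manager actually shows.

```python
def values_too_wide(values, width=50):
    """Return the values that would not fit in a column of the given width."""
    return [value for value in values if len(value) > width]


# Invented film names: one short, one deliberately 60 characters long.
film_names = ["Film A", "A" * 60]

too_long = values_too_wide(film_names)           # names that would not fit
needed_width = max(len(name) for name in film_names)  # smallest safe width
```

Setting the column width to at least needed_width (with a little headroom) avoids truncation problems when the package runs.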


3.6 Creating and Configuring the Destination

Again, you could use the Destination Assistant to add a destination, but we'll do things the
(slightly) harder way.

Step 1 - Creating the Destination


To add a flat file destination to the data flow:

a) Choose to add a flat file destination.

b) Click where you want it to go.

c) SSIS adds the destination; you can now rename this if you like:

Step 2 - Connecting the Source to the Destination


It's a good idea to do this next, so that you can map columns in the destination:

a) Click on the Movies database source, then click on the blue arrow
emanating from it (this represents successful data; the red arrow is the flow
of failed data).

b) Drag this on to the destination; when you release the mouse, the source
and destination will be joined.
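
The two arrows can be pictured as two outputs from the same component. The Python sketch below is an analogy only, and assumes failed rows are redirected rather than stopping the whole task: rows which convert cleanly go down one path (the blue arrow), and rows which fail go down the other (the red arrow). The sample rows and the route_rows helper are hypothetical.

```python
def route_rows(rows):
    """Split rows into those processed successfully and those that failed."""
    succeeded, failed = [], []
    for row in rows:
        try:
            succeeded.append(
                {
                    "FilmName": row["FilmName"],
                    "FilmOscarWins": int(row["FilmOscarWins"]),
                }
            )
        except (KeyError, TypeError, ValueError) as error:
            # Keep the original row and the reason it failed.
            failed.append({"row": row, "error": str(error)})
    return succeeded, failed


succeeded, failed = route_rows(
    [
        {"FilmName": "Film A", "FilmOscarWins": "3"},
        {"FilmName": "Film B", "FilmOscarWins": "n/a"},  # not a number
    ]
)
```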


Step 3 - Assigning a Connection Manager to the Destination

It's time now to tell SSIS which connection manager the destination will use:
a) Double-click on this icon to edit the destination
(the error message and red circle both show that
there's a problem with the destination, which we're
about to remedy!).

b) SSIS automatically takes you to this tab.

c) SSIS guesses that you want to assign the flat file connection manager to
this flat file destination task: a not unreasonable guess!

Step 4 - Mapping Columns

Finally for the destination, choose which columns from the source map onto which columns in
the destination:
a) Choose this tab.

b) Click on this drop arrow and choose to map the FilmOscarWins column from the
source on to the Oscars column in the destination.
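
A column mapping is really just a lookup from source column names to destination column names. Here is a minimal Python sketch of the idea. The FilmOscarWins to Oscars pair comes from the step above; the FilmName and Title names are assumptions, and both helper functions are hypothetical.

```python
# Source column -> destination column. Only the second pair is taken
# from the steps above; the first pair is assumed for illustration.
COLUMN_MAP = {"FilmName": "Title", "FilmOscarWins": "Oscars"}


def map_columns(row, column_map=COLUMN_MAP):
    """Rename a source row's columns to the destination's column names."""
    return {dest: row[src] for src, dest in column_map.items()}


def to_line(row, dest_columns=("Title", "Oscars"), delimiter="|"):
    """Format a mapped row as one line of the flat file."""
    return delimiter.join(str(row[column]) for column in dest_columns)


line = to_line(map_columns({"FilmName": "Film A", "FilmOscarWins": 3}))
```

Any source column left out of the mapping simply never reaches the destination.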


3.7 Executing the Data Flow Package

You can now test to see if your package works:


a) Right-click in the Data Flow window to execute the task, or right-click on the
package in Solution Explorer to execute the entire package:

b) Things are going well: every part of the data flow has a tick next to it!

c) When you look at the text file, you should see the films listed.
