SSIS-IQ

Question 1 - True or False - Using a checkpoint file in SSIS is just like issuing the CHECKPOINT command against the relational engine. It commits all of the data to the database.
False. SSIS provides a Checkpoint capability which allows a package to restart at the point of failure.

Question 2 - Can you explain what the Import\Export tool does and the basic steps in the wizard?
The Import\Export tool is accessible via BIDS or by executing the dtswizard command. The tool identifies a data source and a destination to move data either within one database, between instances, or even from a database to a file (or vice versa).

Question 3 - What are the command line tools to execute SQL Server Integration Services packages?
DTEXECUI - When this command line tool is run, a user interface is loaded in order to configure each of the applicable parameters to execute an SSIS package.
DTEXEC - This is a pure command line tool where all of the needed switches must be passed into the command for successful execution of the SSIS package.

Question 4 - Can you explain the SQL Server Integration Services functionality in Management Studio?
You have the ability to do the following:
- Log in to the SQL Server Integration Services instance
- View the SSIS log
- View the packages that are currently running on that instance
- Browse the packages stored in MSDB or the file system
- Import or export packages
- Delete packages
- Run packages

Question 5 - Can you name some of the core SSIS components in the Business Intelligence Development Studio you work with on a regular basis when building an SSIS package?
- Connection Managers
- Control Flow
- Data Flow
- Event Handlers
- Variables window
- Toolbox window
- Output window
- Logging
- Package Configurations

Question Difficulty = Moderate

Question 1 - True or False: SSIS has a default means to log all records updated, deleted or inserted on a per table basis.
False, but a custom solution can be built to meet these needs.

Question 2 - What is a breakpoint in SSIS? How is it set up? How do you disable it?
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an opportunity to review the status of the data, variables and the overall status of the SSIS package. 10 unique conditions exist for each breakpoint. Breakpoints are set up in BIDS. In BIDS, navigate to the control flow interface. Right click on the object where you want to set the breakpoint and select the "Edit Breakpoints..." option.

Question 3 - Can you name 5 or more of the native SSIS connection managers?
- OLEDB connection - Used to connect to any data source requiring an OLEDB connection (e.g., SQL Server 2000)
- Flat file connection - Used to make a connection to a single file in the File System. Required for reading information from a File System flat file
- ADO.Net connection - Uses the .Net Provider to make a connection to SQL Server 2005 or other connection exposed through managed code (like C#) in a custom task
- Analysis Services connection - Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
- File connection - Used to reference a file or folder. The options are to either use or create a file or folder
- Excel
- FTP
- HTTP
- MSMQ
- SMO
- SMTP
- SQLMobile
- WMI

Question 4 - How do you eliminate quotes from being uploaded from a flat file to SQL Server?
In the SSIS package, on the Flat File Connection Manager Editor, enter the quote character into the Text qualifier field, then preview the data to ensure the quotes are not included.
Additional information: How to strip out double quotes from an import file in SQL Server Integration Services

Question 5 - Can you name 5 or more of the main SSIS tool box widgets and their functionality?
- For Loop Container
- Foreach Loop Container
- Sequence Container
- ActiveX Script Task
- Analysis Services Execute DDL Task
- Analysis Services Processing Task
- Bulk Insert Task
- Data Flow Task
- Data Mining Query Task
- Execute DTS 2000 Package Task
- Execute Package Task
- Execute Process Task
- Execute SQL Task
- etc.

Question Difficulty = Difficult

Question 1 - Can you explain one approach to deploy an SSIS package?
- One option is to build a deployment manifest file in BIDS, then copy the directory to the applicable SQL Server, then work through the steps of the package installation wizard.
- A second option is using the dtutil utility to copy, paste, rename, or delete an SSIS Package.
- A third option is to log in to SQL Server Integration Services via SQL Server Management Studio, then navigate to the "Stored Packages" folder, then right click on one of the children folders or an SSIS package to access the "Import Packages..." or "Export Packages..." option.
- A fourth option in BIDS is to navigate to File | Save Copy of Package and complete the interface.

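The dtutil option listed under the deployment question could be scripted along these lines. This is only a sketch in Python that assembles the command line. The package path, package name, and server name are hypothetical, and it assumes the /FILE, /DestServer, and /COPY options behave as described in the dtutil documentation for your SQL Server version. Nothing is executed here; the resulting list could be passed to subprocess.run on a machine where dtutil is installed.

```python
from typing import List, Optional


def build_dtutil_copy_command(package_file: str,
                              dest_package_name: str,
                              dest_server: Optional[str] = None) -> List[str]:
    """Assemble a dtutil call that copies a .dtsx file into SQL Server (msdb)."""
    cmd = ["dtutil", "/FILE", package_file]
    if dest_server:
        # Optional: target a specific server instead of the local default.
        cmd += ["/DestServer", dest_server]
    # "SQL;<name>" asks dtutil to store the copy in SQL Server under <name>.
    cmd += ["/COPY", f"SQL;{dest_package_name}"]
    return cmd


if __name__ == "__main__":
    # Hypothetical file, package, and server names, for illustration only.
    print(" ".join(build_dtutil_copy_command(
        r"C:\packages\LoadSales.dtsx", "LoadSales", "PRODSQL01")))
```

Building the arguments as a list rather than one string avoids quoting problems when a path contains spaces.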
Question 2 - Can you explain how to set up a checkpoint file in SSIS?
The following items need to be configured on the properties tab for the SSIS package:
- CheckpointFileName - Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path, it's a good idea to use an expression that concatenates a path defined in a package variable and the package name.
- CheckpointUsage - Determines if/how checkpoints are used. Choose from these options: Never (default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found, it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found, the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
- SaveCheckpoints - Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.

Question 3 - Can you explain different options for dynamic configurations in SSIS?
- Use an XML file
- Use custom variables
- Use a database per environment with the variables
- Use a centralized database with all variables

Question 4 - How do you upgrade an SSIS Package?
Depending on the complexity of the package, one or two techniques are typically used:
- Recode the package based on the functionality in SQL Server DTS
- Use the Migrate DTS 2000 Package wizard in BIDS, then recode any portion of the package that is not accurate
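The CheckpointUsage behavior described in the checkpoint question above can be pictured with a small toy model. This is a Python sketch, not SSIS code: the function name and task names are hypothetical, and a real package also restores variable values from the checkpoint file, which is left out here.

```python
from typing import List, Sequence


def tasks_to_run(checkpoint_usage: str,
                 checkpoint_file_exists: bool,
                 all_tasks: Sequence[str],
                 completed_tasks: Sequence[str] = ()) -> List[str]:
    """Return the tasks a run would execute under the given CheckpointUsage."""
    if checkpoint_usage == "Never":
        # Checkpoints are not used: always start with the first task.
        return list(all_tasks)
    if checkpoint_usage == "Always" and not checkpoint_file_exists:
        # Always requires the checkpoint file; a missing file is an error.
        raise FileNotFoundError("checkpoint file is required but was not found")
    if checkpoint_usage in ("IfExists", "Always"):
        if checkpoint_file_exists:
            # Restart at the point of failure: skip what already completed.
            done = set(completed_tasks)
            return [task for task in all_tasks if task not in done]
        # IfExists with no file: run everything from the first task.
        return list(all_tasks)
    raise ValueError(f"unknown CheckpointUsage: {checkpoint_usage!r}")


if __name__ == "__main__":
    tasks = ["Download", "Stage", "Load"]
    print(tasks_to_run("IfExists", True, tasks, completed_tasks=["Download"]))
```

The model makes the difference between IfExists and Always explicit: both resume when a file is present, but only Always treats a missing file as a failure.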

Question 5 - Can you name five of the Perfmon counters for SSIS and the value they provide?
SQLServer:SSIS Service
- SSIS Package Instances - Total number of simultaneous SSIS Packages running
SQLServer:SSIS Pipeline
- BLOB bytes read - Total bytes read from binary large objects during the monitoring period.
- BLOB bytes written - Total bytes written to binary large objects during the monitoring period.
- BLOB files in use - Number of binary large objects files used during the data flow task during the monitoring period.
- Buffer memory - The amount of physical or virtual memory used by the data flow task during the monitoring period.
- Buffers in use - The number of buffers in use during the data flow task during the monitoring period.
- Buffers spooled - The number of buffers written to disk during the data flow task during the monitoring period.
- Flat buffer memory - The total number of blocks of memory in use by the data flow task during the monitoring period.
- Flat buffers in use - The number of blocks of memory in use by the data flow task at a point in time.
- Private buffer memory - The total amount of physical or virtual memory used by data transformation tasks in the data flow engine during the monitoring period.
- Private buffers in use - The number of blocks of memory in use by the transformations in the data flow task at a point in time.
- Rows read - Total number of input rows in use by the data flow task at a point in time.
- Rows written - Total number of output rows in use by the data flow task at a point in time.

Source: http://www.dotnetspider.com/forum/158771-Sql-Server-Integration-services-Interview-questions.aspx

A common search by SSIS programmers looking for a job change is what questions to expect on SSIS. Based on the interviews I conduct on SSIS, I list my favorite and expected questions below.

Q1 Explain architecture of SSIS?

Integration Services Architecture
Microsoft SQL Server 2005 Integration Services (SSIS) consists of four key parts: the Integration Services service, the Integration Services object model, the Integration Services runtime and the run-time executables, and the Data Flow task that encapsulates the data flow engine and the data flow components. The following diagram shows the relationship of the parts.

[Diagram: Integration Services architecture]

Developers who access the Integration Services object model from custom clients or write custom tasks or transformations can write code by using any common language runtime (CLR) compliant language. For more information, see Integration Services Programming.

Integration Services Service
The Integration Services service, available in SQL Server Management Studio, monitors running Integration Services packages and manages the storage of packages.
For more information, see the following topics:
- Integration Services Service
- Introducing SQL Server Management Studio

Integration Services Object Model
The Integration Services object model includes managed application programming interfaces (API) for accessing Integration Services tools, command-line utilities, and custom applications.
For more information, see the following topics:
- Integration Services Programming
- Integration Services Tools and Utilities

Integration Services Runtime
The Integration Services runtime saves the layout of packages, runs packages, and provides support for logging, breakpoints, configuration, connections, and transactions. The Integration Services run-time executables are the package, containers, tasks, and event handlers that Integration Services includes, and custom tasks.
For more information, see the following topics:
- Integration Services Packages
- Integration Services Containers
- Integration Services Tasks
- Integration Services Event Handlers
- Microsoft.SqlServer.Dts.Runtime

Integration Services Data Flow
The Data Flow task encapsulates the data flow engine. The data flow engine provides the in-memory buffers that move data from source to destination, and calls the sources that extract data from files and relational databases. The data flow engine also manages the transformations that modify data, and the destinations that load data or make data available to other processes. Integration Services data flow components are the sources, transformations, and destinations that Integration Services includes. You can also include custom components in a data flow.
For more information, see the following topics:
- Data Flow Task
- Data Flow Elements
- Microsoft.SqlServer.Dts.Pipeline.Wrapper

Source: http://technet.microsoft.com/en-us/library/ms141709(SQL.90).aspx

Q2 Difference between Control Flow and Data Flow?
Very easy.

Q3 How would you do Logging in SSIS?
Log using the logging configuration inbuilt in SSIS or use Custom logging through Event handlers.

Monitoring How-to Topics (Integration Services)
This section contains procedures for adding log providers to a package and configuring logging by using the SQL Server Integration Services tools that Business Intelligence Development Studio provides.

How to: Enable Logging in a Package
This procedure describes how to add logs to a package, configure package-level logging, and save the logging configuration to an XML file. You can add logs only at the package level, but the package does not have to perform logging to enable logging in the containers that the package includes.
By default, the containers in the package use the same logging configuration as their parent container. For information about setting logging options for individual containers, see How to: Configure Logging by Using a Saved Configuration File.

To enable logging in a package
1. In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
2. On the SSIS menu, click Logging.
3. Select a log provider in the Provider type list, and then click Add.
4. In the Configuration column, select a connection manager or click <New connection> to create a new connection manager of the appropriate type for the log provider. Depending on the selected provider, use one of the following connection managers:
   o For Text files, use a File connection manager. For more information, see File Connection Manager.
   o For SQL Server Profiler, use a File connection manager.
   o For SQL Server, use an OLE DB connection manager. For more information, see OLE DB Connection Manager.
   o For Windows Event Log, do nothing. SSIS automatically creates the log.
   o For XML files, use a File connection manager.
5. Repeat steps 3 and 4 for each log to use in the package.
   Note: A package can use more than one log of each type.
6. Optionally, select the package-level check box, select the logs to use for package-level logging, and then click the Details tab.
7. On the Details tab, select Events to log all log entries, or clear Events to select individual events.
8. Optionally, click Advanced to specify which information to log.
   Note: By default, all information is logged.
9. On the Details tab, click Save. The Save As dialog box appears. Locate the folder in which to save the logging configuration, type a file name for the new log configuration, and then click Save.
10. Click OK.
11. To save the updated package, click Save Selected Items on the File menu.

How to: Configure Logging by Using a Saved Configuration File
This procedure describes how to configure logging for new containers in a package by loading a previously saved logging configuration file.
By default, all containers in a package use the same logging configuration as their parent container. For example, the tasks in a Foreach Loop use the same logging configuration as the Foreach Loop.

To configure logging for a container
1. In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
2. On the SSIS menu, click Logging.
3. Expand the package tree view and select the container to configure.
4. On the Providers and Logs tab, select the logs to use for the container.
   Note: You can create logs only at the package level. For more information, see How to: Enable Logging in a Package.
5. Click the Details tab and click Load.
6. Locate the logging configuration file you want to use and click Open.
7. Optionally, select a different log entry to log by selecting its check box in the Events column. Click Advanced to select the type of information to log for this entry.
   Note: The new container may include additional log entries that are not available for the container originally used to create the logging configuration. These additional log entries must be selected manually if you want them to be logged.
8. To save the updated version of the logging configuration, click Save.
9. To save the updated package, click Save Selected Items on the File menu.

Source: http://msdn.microsoft.com/en-us/library/ms141710.aspx

How to: View Log Entries in the Log Events Window

This procedure describes how to run a package and view the log entries it writes. You can view the log entries in real time. The log entries that are written to the Log Events window can also be copied and saved for further analysis.
It is not necessary to write the log entries to a log to write the entries to the Log Events window.

To view log entries
1. In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
2. On the SSIS menu, click Log Events. You can optionally display the Log Events window by mapping the View.LogEvents command to a key combination of your choosing on the Keyboard page of the Options dialog box.
3. On the Debug menu, click Start Debugging. As the runtime encounters the events and custom messages that are enabled for logging, log entries for each event or message are written to the Log Events window.
4. On the Debug menu, click Stop Debugging.
   Note: The log entries remain available in the Log Events window until you rerun the package, run a different package, or close Business Intelligence Development Studio.
5. View the log entries in the Log Events window.
6. Optionally, click the log entries to copy, right-click, and then click Copy.
7. Optionally, double-click a log entry, and in the Log Entry dialog box, view the details for a single log entry.
8. In the Log Entry dialog box, click the up and down arrows to display the previous or next log entry, and click the copy icon to copy the log entry.
9. Open a text editor, paste, and then save the log entry to a text file.

Source: http://msdn.microsoft.com/en-us/library/ms141727.aspx

Q4 How would you do Error Handling?
That one is for you to answer.

Q5 How to pass property value at Run time? How do you implement Package Configuration?

Package Configurations
SQL Server Integration Services provides package configurations that you can use to update the values of properties at run time. A configuration is a property/value pair that you add to a completed package. Typically, you create a package, set properties on the package objects during package development, and then add the configuration to the package. When the package runs, it gets the new values of the property from the configuration. For example, by using a configuration, you can change the connection string of a connection manager, or update the value of a variable.

Package configurations provide the following benefits:
- Configurations make it easier to move packages from a development environment to a production environment. For example, a configuration can update the path of a source file, or change the name of a database or server.
- Configurations are useful when you deploy packages to many different servers. For example, a variable in the configuration for each deployed package can contain a different disk space value, and if the available disk space does not meet this value, the package does not run.
- Configurations make packages more flexible. For example, a configuration can update the value of a variable that is used in a property expression.

Integration Services supports several different methods of storing package configurations, such as XML files, tables in a SQL Server database, and environment and package variables.
Each configuration is a property/value pair. The XML configuration file and SQL Server configuration types can include multiple configurations.
The configurations are included when you create a package deployment utility for installing packages. When you install the packages, the configurations can be updated as a step in the package installation.
Note: To become better acquainted with the concepts explained in this section, see Tutorial: Deploying Packages and Lesson 3: Adding Package Configurations of Tutorial: Creating a Simple ETL Package.

Package Configuration Types
The following table describes the package configuration types.

Type                      Description
XML configuration file    An XML file contains the configurations. The XML file can include multiple configurations.
Environment variable      An environment variable contains the configuration.
Registry entry            A registry entry contains the configuration.
Parent package variable   A variable in the package contains the configuration. This configuration type is typically used to update properties in child packages.
SQL Server table          A table in a SQL Server database contains the configuration. The table can include multiple configurations.

XML Configuration Files
If you select the XML configuration file configuration type, you can create a new configuration file, reuse an existing file and add new configurations, or reuse an existing file but overwrite existing file content.
An XML configuration file includes two sections:
- A heading that contains information about the configuration file. This element includes attributes such as when the file was created and the name of the person who generated the file.
- Configuration elements that contain information about each configuration. This element includes attributes such as the property path and the configured value of a property.

The following XML code demonstrates the syntax of an XML configuration file. This example shows a configuration for the Value property of an integer variable named MyVar.

    <?xml version="1.0"?>
    <DTSConfiguration>
      <DTSConfigurationHeading>
        <DTSConfigurationFileInfo
          GeneratedBy="DomainName\UserName"
          GeneratedFromPackageName="Package"
          GeneratedFromPackageID="{2AF06766-817A-4E28-9878-0DE37A150648}"
          GeneratedDate="2/01/2005 5:58:09 PM"/>
      </DTSConfigurationHeading>
      <Configuration ConfiguredType="Property" Path="\Package.Variables[User::MyVar].Value" ValueType="Int32">
        <ConfiguredValue>0</ConfiguredValue>
      </Configuration>
    </DTSConfiguration>

Registry Entry
If you want to use a registry entry to store the configuration, you can either use an existing key or create a new key in HKEY_CURRENT_USER. The registry key that you use must have a value named Value. The value can be a DWORD or a string.
If you select the Registry entry configuration type, you type the name of the registry key in the Registry entry box. The format is <registry key>. If you want to use a registry key that is not at the root of HKEY_CURRENT_USER, use the format <registry key\registry key\...> to identify the key. For example, to use the MyPackage key located in SSISPackages, type SSISPackages\MyPackage.

SQL Server
If you select the SQL Server configuration type, you specify the connection to the SQL Server database in which you want to store the configurations. You can save the configurations to an existing table or create a new table in the specified database.
The following SQL statement shows the default CREATE TABLE statement that the Package Configuration Wizard provides.

    CREATE TABLE [dbo].[SSIS Configurations]
    (
        ConfigurationFilter NVARCHAR(255) NOT NULL,
        ConfiguredValue NVARCHAR(255) NULL,
        PackagePath NVARCHAR(255) NOT NULL,
        ConfiguredValueType NVARCHAR(20) NOT NULL
    )

The name that you provide for the configuration is the value stored in the ConfigurationFilter column.

Direct and Indirect Configurations
Integration Services provides direct and indirect configurations. If you specify configurations directly, Integration Services creates a direct link between the configuration item and the package object property. Direct configurations are a better choice when the location of the source does not change. For example, if you are sure that all deployments in the package use the same file path, you can specify an XML configuration file.
Indirect configurations use environment variables. Instead of specifying the configuration setting directly, the configuration points to an environment variable, which in turn contains the configuration value. Using indirect configurations is a better choice when the location of the configuration can change for each deployment of a package.
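To make the layout of an XML configuration file concrete, the following Python sketch reads Configuration elements and lists each property path with its value type and configured value. It is an illustration under assumptions, not how SSIS itself loads configurations: the function name is hypothetical, and the sample document is a shortened copy of the example in this section with the package ID and date attributes omitted.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

# Shortened copy of the sample configuration file (ID and date attributes omitted).
SAMPLE_CONFIG = r"""<?xml version="1.0"?>
<DTSConfiguration>
  <DTSConfigurationHeading>
    <DTSConfigurationFileInfo GeneratedBy="DomainName\UserName" GeneratedFromPackageName="Package"/>
  </DTSConfigurationHeading>
  <Configuration ConfiguredType="Property" Path="\Package.Variables[User::MyVar].Value" ValueType="Int32">
    <ConfiguredValue>0</ConfiguredValue>
  </Configuration>
</DTSConfiguration>
"""


def read_configurations(xml_text: str) -> Dict[str, Tuple[str, str]]:
    """Map each property path to its (value type, configured value) pair."""
    root = ET.fromstring(xml_text)
    result: Dict[str, Tuple[str, str]] = {}
    # Each Configuration element is one property/value pair.
    for cfg in root.findall("Configuration"):
        value = cfg.findtext("ConfiguredValue", default="")
        result[cfg.get("Path")] = (cfg.get("ValueType"), value)
    return result


if __name__ == "__main__":
    for path, (value_type, value) in read_configurations(SAMPLE_CONFIG).items():
        print(path, value_type, value)
```

A listing like this can help when comparing the configuration files used in different environments before a deployment.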

Source: http://msdn.microsoft.com/en-us/library/ms141682.aspx

Q6 How would you deploy a SSIS Package on production?
1. Create the deployment utility by setting the project's CreateDeploymentUtility property to True.
2. The utility is created in the bin folder of the solution as soon as the package is built.
3. Copy all the files in the utility and use the manifest file to deploy it on Prod.

Q7 Difference between DTS and SSIS?
Everything, except that both are products of Microsoft :-)

Q8 What are new features in SSIS 2008?
http://sqlserversolutions.blogspot.com/2009/01/new-improvementfeatures-in-ssis-2008.html

Q9 How would you pass a variable value to Child Package?
http://sqlserversolutions.blogspot.com/2009/02/passing-variable-to-child-package-from.html

How to: Use Values of Parent Variables in Child Packages
New: 5 December 2005
This procedure describes how to create a package configuration that uses the parent variable configuration type to enable a child package that is run from a parent package to access a variable in the parent.
It is not necessary to create the variable in the parent package before you create the package configuration in the child package. You can add the variable to the parent package at any time, but you must use the exact name of the parent variable in the package configuration. However, before you can create a parent variable configuration, there must be an existing variable in the child package that the configuration can update. For more information about adding and configuring variables, see How to: Add a Variable to a Package Using the Variables Window.

The scope of the variable in the parent package that is used in a parent variable configuration can be set to the Execute Package task, to the container that has the task, or to the package. If multiple variables with the same name are defined in a package, the variable that is closest in scope to the Execute Package task is used. The closest scope to the Execute Package task is the task itself.

To add a variable to a parent package
1. In Business Intelligence Development Studio, open the Integration Services project that contains the package to which you want to add a variable to pass to a child package.
2. In Solution Explorer, double-click the package to open it.
3. In SSIS Designer, to define the scope of the variable, do one of the following:
   o To set the scope to the package, click anywhere on the design surface of the Control Flow tab.
   o To set the scope to a parent container of the Execute Package task, click the container.
   o To set the scope to the Execute Package task, click the task.
4. Add and configure a variable.
   Note: Select a data type that is compatible with the data that the variable will store.
5. To save the updated package, click Save Selected Items on the File menu.

To add a variable to a child package
1. In Business Intelligence Development Studio, open the Integration Services project that contains the package to which you want to add a parent variable configuration.
2. In Solution Explorer, double-click the package to open it.
3. In SSIS Designer, to set the scope to the package, click anywhere on the design surface of the Control Flow tab.
4. Add and configure a variable.
   Note: Select a data type that is compatible with the data that the variable will store.
5. To save the updated package, click Save Selected Items on the File menu.

To add a parent package configuration to a child package
1. If it is not already open, open the child package in Business Intelligence Development Studio.
2. Click anywhere on the design surface of the Control Flow tab.
3. On the SSIS menu, click Package Configurations.
4. In the Package Configuration Organizer dialog box, select Enable package configuration, and then click Add.
5. On the welcome page of the Package Configuration Wizard, click Next.
6. On the Select Configuration Type page, in the Configuration type list, select Parent package variable and do one of the following:
   o Select Specify configuration settings directly, and then in the Parent variable box, provide the name of the variable in the parent package to use in the configuration.
     Important: Variable names are case sensitive.
   o Select Configuration location is stored in an environment variable, and then in the Environment variable list, select the environment variable that contains the name of the variable.
7. Click Next.
8. On the Select Target Property page, expand the Variable node, and expand the Properties node of the variable to configure, and then click the property to be set by the configuration.
9. Click Next.
10. On the Completing the Wizard page, optionally, modify the default name of the configuration and review the configuration information.
11. Click Finish to complete the wizard and return to the Package Configuration Organizer dialog box.
12. In the Package Configuration Organizer dialog box, the Configuration box lists the new configuration.
13. Click Close.
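The closest-scope rule for parent variables can be pictured with a short lookup. This Python sketch is a toy model, not the SSIS runtime: the function name, variable names, and values are hypothetical, and scopes are passed as plain dictionaries ordered from the Execute Package task outward to the package.

```python
from typing import Dict, List


def resolve_parent_variable(name: str, scopes: List[Dict[str, object]]) -> object:
    """Return the value from the scope closest to the Execute Package task.

    `scopes` is ordered from the task itself, through its containers,
    out to the package. Names are matched case-sensitively.
    """
    for scope in scopes:
        if name in scope:
            return scope[name]
    raise KeyError(f"no parent variable named {name!r}")


if __name__ == "__main__":
    # Hypothetical variables defined at three scopes of a parent package.
    task_scope = {"BatchId": 3}
    container_scope = {"BatchId": 2, "Region": "EU"}
    package_scope = {"BatchId": 1, "Region": "US", "RunDate": "2005-12-05"}
    scopes = [task_scope, container_scope, package_scope]
    print(resolve_parent_variable("BatchId", scopes))  # task scope wins
```

Because dictionary keys are case-sensitive, a lookup for "batchid" fails in this model, which mirrors the requirement to use the exact name of the parent variable.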

Source: http://technet.microsoft.com/en-us/library/ms345179(SQL.90).aspx

Q10 What is Execution Tree?

Execution Trees
Execution trees demonstrate how your package uses buffers and threads. At run time, the data flow engine breaks down Data Flow task operations into execution trees. These execution trees specify how buffers and threads are allocated in the package. Each tree creates a new buffer and may execute on a different thread. When a new buffer is created, such as when a partially blocking or blocking transformation is added to the pipeline, additional memory is required to handle the data transformation; however, it is important to note that each new tree may also give you an additional worker thread.
Examine the execution trees in the example depicted in Figure 1 and Table 1, where two Employee datasets are combined together and then aggregated to load into a common destination table.

14

Figure 17 E(a"ple package !ote7 E ecution trees are listed in the table in the order that they e ecute. Ta%le 17 E(ecution trees #efine# E(ecution Tree 5escription Enu"eration begin e(ecution tree In E(ecution Tree &% SSIS reads data from the Employee 4.E "! Source into & the pipeline% a "erived Column transformation adds another column% and SSIS passes data to the )nion 6ll transformation. 6ll of the operations in output I4.E "! this e ecution tree use the same bufferS data is not copied again once it is Source 4utputI &;B' read into the 4.E "! Source 4utput. input I"erived Column InputI &$B;' output I"erived Column 4utputI &$BF' input I)nion 6ll Input $I &D$$' output I"erived

15

Column Error Output" (174)
end execution tree 2
begin execution tree 3
   output "OLE DB Source Error Output" (28)
   input "OLE DB Destination Input" (2603)
   output "OLE DB Destination Error Output" (2604)
end execution tree 3
begin execution tree 4
   output "Flat File Source Output" (2363)
   input "Union All Input 3" (2098)
end execution tree 4
begin execution tree 5
   output "Flat File Source Error Output" (2364)
   input "OLE DB Destination Input" (3818)
   output "OLE DB Destination Error Output" (3819)
end execution tree 5
begin execution tree 0
   output "Union All Output 1" (412)
   input "Aggregate Input 1" (2472)
end execution tree 0
begin execution tree 1
   output "Aggregate Output 1" (2473)
   input "OLE DB Destination Input" (150)
   output "OLE DB Destination Error Output" (151)
end execution tree 1

In Execution Tree 3, SSIS creates a buffer to hold error records from the asynchronous Employee OLE DB Source before loading them into a destination error table.

In Execution Tree 4, SSIS reads data from the Employee Flat File Source and passes it to the Union All. These two operations use the same buffer.

In Execution Tree 5, a buffer is created to hold errors from the asynchronous Employee Flat File Source before loading them into a destination error table.

In Execution Tree 0, the Partially Blocking Union All transformation is executed and a new buffer is created to store the combined data and the aggregate is calculated.

In Execution Tree 1, after the Fully Blocking Aggregate transformation is completed, the output from the Aggregate operation is copied into a new buffer and data is loaded into the OLE DB Destination.

This example demonstrates how execution trees can help you understand buffer usage in a common SSIS package. This example also highlights how Partially Blocking transformations like Union All and Fully Blocking transformations like Aggregate create new buffers and threads whereas Row Transformations like Derived Column do not. Execution trees are enormously valuable in understanding buffer usage. You can display execution trees for your own packages by turning on package logging, enabling logging for the Data Flow task, and then selecting the Pipeline Execution Tree event. Note that you will not see the execution trees until you execute the package. When you do execute the package, the execution trees appear in the Log Events window in Business Intelligence (BI) Development Studio.
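When reading a longer execution tree log, it can help to group the inputs and outputs by tree. The following is a minimal sketch in Python, assuming the log text follows the begin/end layout of the listing above. The function name parse_execution_trees and the SAMPLE_LOG constant are illustrative only and are not part of SSIS.

```python
import re
from collections import defaultdict

# Sample text in the layout of the execution tree listing (trees 0 and 1 only).
SAMPLE_LOG = '''
begin execution tree 0
   output "Union All Output 1" (412)
   input "Aggregate Input 1" (2472)
end execution tree 0
begin execution tree 1
   output "Aggregate Output 1" (2473)
   input "OLE DB Destination Input" (150)
   output "OLE DB Destination Error Output" (151)
end execution tree 1
'''

BEGIN_RE = re.compile(r'^\s*begin execution tree (\d+)\s*$')
END_RE = re.compile(r'^\s*end execution tree (\d+)\s*$')
ENTRY_RE = re.compile(r'^\s*(input|output)\s+"([^"]+)"\s+\((\d+)\)\s*$')


def parse_execution_trees(text):
    """Return {tree_number: [(kind, name, id), ...]} for each tree in the log text."""
    trees = defaultdict(list)
    current = None
    for line in text.splitlines():
        match = BEGIN_RE.match(line)
        if match:
            current = int(match.group(1))
            trees[current]  # register the tree even if it has no entries
            continue
        if END_RE.match(line):
            current = None
            continue
        match = ENTRY_RE.match(line)
        if match and current is not None:
            kind, name, ident = match.groups()
            trees[current].append((kind, name, int(ident)))
    return dict(trees)
```

Each key in the result corresponds to one execution tree, so the number of keys gives a rough count of the distinct buffer paths the data flow engine set up for the package.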

http://technet.microsoft.com/en-us/library/cc966529.aspx

Q11 What are the points to keep in mind for performance improvement of the package?
http://technet.microsoft.com/en-us/library/cc966529.aspx

Q12 You may get a question that states a scenario and then asks how you would create a package for it, e.g. How would you configure a data flow task so that it can transfer data to different tables based on the city name in a source table column?

Q13 Difference between Union All and Merge?
I have been asked by many new SSIS developers about the difference between the Merge and Union All transformations in SSIS.


Both of them essentially take outputs from more than one source and combine them into a single result set, but there are a couple of differences between the two:
a) The Merge transformation can accept only two inputs, whereas Union All can take more than two inputs.
b) Data has to be sorted before the Merge transformation, whereas Union All doesn't have any condition like that.
http://sqlserversolutions.blogspot.com/2009/01/difference-between-merge-and-union-all.html

Q14 You may get a question regarding what X transformation does. Lookup, Fuzzy Lookup, and Fuzzy Grouping transformations are my favorites. For you?

Q15 How would you restart a package from the previous failure point? What are checkpoints and how can we implement them in SSIS?

Using Checkpoints in Packages
Integration Services can restart failed packages from the point of failure, instead of rerunning the whole package. If a package is configured to use checkpoints, information about package execution is written to a checkpoint file. When the failed package is rerun, the checkpoint file is used to restart the package from the point of failure. If the package runs successfully, the checkpoint file is deleted, and then re-created the next time the package is run.

Using checkpoints in a package can provide the following benefits:
- Avoid repeating the downloading and uploading of large files. For example, a package that downloads multiple large files by using an FTP task for each download can be restarted after the downloading of a single file fails and then download only that file.
- Avoid repeating the loading of large amounts of data. For example, a package that performs bulk inserts into dimension tables in a data warehouse using a different Bulk Insert task for each dimension can be restarted if the insertion fails for one dimension table, and only that dimension will be reloaded.
- Avoid repeating the aggregation of values. For example, a package that computes many aggregates, such as averages and sums, using a separate Data Flow task to perform each aggregation, can be restarted after computing an aggregation fails and only that aggregation will be recomputed.

If a package is configured to use checkpoints, Integration Services captures the restart point in the checkpoint file. The type of container that fails and the implementation of features such as transactions affect the restart point that is recorded in the checkpoint file. The current values of variables are also captured in the checkpoint file. However, the values of variables that have the Object data type are not saved in checkpoint files.

Defining Restart Points


The task host container, which encapsulates a single task, is the smallest atomic unit of work that can be restarted. The Foreach Loop container, the Data Flow task and all that it contains, and a transacted container are also treated as atomic units of work.

If a package is stopped while a transacted container is running, the transaction ends and any work performed by the container is rolled back. When the package is restarted, the container that failed is rerun. The completion of any child containers of the transacted container is not recorded in the checkpoint file. Therefore, when the package is restarted, the transacted container and its child containers run again.

Note: Using checkpoints and transactions in the same package could cause unexpected results. For example, when a package fails and restarts from a checkpoint, the package might repeat a transaction that has already been successfully committed.

When a package is restarted from a checkpoint, the Foreach Loop container and its child containers are run again. If a child container in the loop previously ran successfully, this is not recorded in the checkpoint file; instead, the child container is run again.

If the package is restarted, the package configurations are not reloaded; instead the package uses the configuration information written to the checkpoint file. This ensures that, when the package is run again, the package uses the same configurations as when it failed.

A package can be restarted only at the control flow level. You cannot restart a package in the middle of a data flow. To avoid rerunning the whole data flow, the package might be designed to include multiple Data Flow tasks. This way the package can be restarted, and will rerun only the Data Flow tasks that failed.

Configuring a Package to Restart
The checkpoint file includes the execution results of all completed containers, the current values of system and user-defined variables, and package configuration information. The file also includes the unique identifier of the package. To successfully restart a package, the package identifier in the checkpoint file and the package must match; otherwise the restart fails. This prevents a package from using a checkpoint file written by a different package version. If the package runs successfully after it is restarted, the checkpoint file is deleted.

The following table lists the package properties that you set to implement checkpoints.

Property             Description
CheckpointFileName   Specifies the name of the checkpoint file.
CheckpointUsage      Specifies whether checkpoints are used.
SaveCheckpoints      Indicates whether the package saves checkpoints. This property must be set to True to restart a package from a point of failure.

Additionally, you must set the FailPackageOnFailure property to true for all the containers in the package that you want to identify as restart points.


You can use the ForceExecutionResult property to test the use of checkpoints in a package. By setting ForceExecutionResult of a task or container to Failure, you can imitate real-time failure. When you rerun the package, the failed task and containers will be rerun.

Checkpoint Usage
The CheckpointUsage property can be set to the following values:

Value      Description
Never      Specifies that the checkpoint file is not used and that the package runs from the start of the package workflow.
Always     Specifies that the checkpoint file is always used and that the package restarts from the point of the previous execution failure. If the checkpoint file is not found, the package fails.
IfExists   Specifies that the checkpoint file is used if it exists. If the checkpoint file exists, the package restarts from the point of the previous execution failure; otherwise, it runs from the start of the package workflow.

Note: The /CheckPointing on option of dtexec is equivalent to setting the SaveCheckpoints property of the package to True, and the CheckpointUsage property to Always. For more information, see dtexec Utility.

Securing Checkpoint Files
Package level protection does not include protection of checkpoint files and you must secure these files separately. Checkpoint data can be stored only in the file system and you should use an operating system access control list (ACL) to secure the location or folder where you store the file. It is important to secure checkpoint files because they contain information about the package state, including the current values of variables. For example, a variable may contain a recordset with many rows of private data such as telephone numbers. For more information, see Controlling Access to Files Used by Packages.

To configure the checkpoint properties
How to: Configure Checkpoints for Restarting a Failed Package
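The restart behavior described in this section can be imitated outside SSIS to make the three CheckpointUsage values concrete. The following is a rough Python sketch, not SSIS code: run_package, the JSON file layout, and the task names are all assumptions made for illustration. A real checkpoint file also stores variable values, configuration information, and the package identifier, which this sketch leaves out.

```python
import json
import os
import tempfile


def run_package(tasks, checkpoint_file, usage="IfExists", save_checkpoints=True):
    """Run (name, callable) tasks in order and return the names executed in this run.

    usage imitates CheckpointUsage: "Never", "IfExists", or "Always".
    """
    completed = []
    if usage != "Never":
        if os.path.exists(checkpoint_file):
            with open(checkpoint_file) as handle:
                completed = json.load(handle)["completed"]
        elif usage == "Always":
            raise FileNotFoundError("checkpoint file required but not found")

    executed = []
    for name, action in tasks:
        if name in completed:
            continue  # finished in an earlier run, so it is skipped on restart
        try:
            action()
        except Exception:
            if save_checkpoints:
                # Record what has completed so far, then let the package fail.
                with open(checkpoint_file, "w") as handle:
                    json.dump({"completed": completed}, handle)
            raise
        executed.append(name)
        completed.append(name)

    # A successful run deletes the checkpoint file.
    if os.path.exists(checkpoint_file):
        os.remove(checkpoint_file)
    return executed
```

With three tasks where the second one fails on the first run, a second call using the same checkpoint file executes only the second and third tasks, which mirrors the restart at the control flow level described above.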

http://msdn.microsoft.com/en-us/library/ms140226.aspx

Q16 Where are SSIS packages stored in SQL Server?
MSDB.sysdtspackages90 stores the actual content, and sysdtscategories, sysdtslog90, sysdtspackagefolders90, sysdtspackagelog, sysdtssteplog, and sysdtstasklog play supporting roles.

Q17 How would you schedule an SSIS package?
Using SQL Server Agent. Read about scheduling a job on SQL Server Agent.

Q18 Difference between asynchronous and synchronous transformations?
Asynchronous transformations have different input and output buffers, and it is up to the component designer in an async component to provide a column structure to the output buffer and hook up the data from the input. A synchronous transformation, by contrast, reuses the input buffer for its output.
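The buffer distinction can be pictured with plain Python lists standing in for pipeline buffers. This is only an analogy under that assumption: derived_column and aggregate_count_by_city are hypothetical functions, not SSIS components. The first changes rows in place and hands back the same buffer, while the second has to read every row and then builds a new buffer with a different column structure.

```python
def derived_column(buffer):
    """Synchronous-style step: adds a column to each row in the existing buffer."""
    for row in buffer:
        row["full_name"] = row["first"] + " " + row["last"]
    return buffer  # same list object, no new buffer is created


def aggregate_count_by_city(buffer):
    """Asynchronous-style step: consumes all rows, then emits a new buffer."""
    counts = {}
    for row in buffer:
        counts[row["city"]] = counts.get(row["city"], 0) + 1
    # The output has its own column structure (city, row_count).
    return [{"city": city, "row_count": count} for city, count in counts.items()]
```

This matches the execution tree behavior quoted earlier in the document, where a row transformation such as Derived Column stays in the same tree and an Aggregate starts a new one.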


Q19 How to achieve multiple threading in SSIS?

Source: http://sqlserversolutions.blogspot.com/2009/02/ssis-interview-questions.html

1) What is the control flow?
2) What is a data flow?
3) How do you do error handling in SSIS?
4) How do you do logging in SSIS?
5) How do you deploy SSIS packages?
6) How do you schedule SSIS packages to run on the fly?
7) How do you run a stored procedure and get data?
8) A scenario: You want to insert a text file into a database table, but during the upload you want to change a column called months (January, Feb, etc.) to a code (1, 2, 3, ...). This code can be read from another database table called months. After the conversion of the data, upload the file. If there are any errors, write them to an error table. Then, for all errors, read the errors from the database, create a file, and mail it to the supervisor. How would you accomplish this task in SSIS?
9) What are variables and what is variable scope?

The website also says "These are SSIS fundamentals and if you want to be a competent developer those are the MINIMUM that you need to know..."
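The core of the scenario in item 8 is a lookup with an error path: rows whose month is found get the code, and rows that do not match are redirected for the error table. The following Python sketch illustrates only that row-routing logic. The function convert_months and the dictionary standing in for the months table are assumptions for illustration, and the mailing step is left out.

```python
def convert_months(rows, month_lookup):
    """Return (good_rows, error_rows) after replacing month names with codes."""
    good_rows, error_rows = [], []
    for row in rows:
        code = month_lookup.get(row["month"])
        if code is None:
            # No match in the lookup: redirect the row instead of failing the load.
            error_rows.append({**row, "error": "month not found in lookup"})
        else:
            good_rows.append({**row, "month": code})
    return good_rows, error_rows
```

In a package, the same shape would typically be a Flat File Source feeding a Lookup transformation, with the matched rows going to the destination table and the redirected rows going to the error table.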

For Q 1 and 2:
In SSIS a workflow is called a control-flow. A control-flow links together our modular data-flows as a series of operations in order to achieve a desired result.

A control flow consists of one or more tasks and containers that execute when the package runs. To control order or define the conditions for running the next task or container in the package control flow, you use precedence constraints to connect the tasks and containers in a package. A subset of tasks and containers can also be grouped and run repeatedly as a unit within the package control flow. SQL Server 2005 Integration Services (SSIS) provides three different types of control flow elements: containers that provide structures in packages, tasks that provide functionality, and precedence constraints that connect the executables, containers, and tasks into an ordered control flow.

A data flow consists of the sources and destinations that extract and load data, the transformations that modify and extend data, and the paths that link sources, transformations, and destinations. Before you can add a data flow to a package, the package control flow must include a Data Flow task. The Data Flow task is the executable within the SSIS package that creates, orders, and runs the data flow. A separate instance of the data flow engine is opened for each Data Flow task in a package. SQL Server 2005 Integration Services (SSIS) provides three different types of data flow components: sources, transformations, and destinations. Sources extract data from data stores such as tables and views in relational databases, files, and Analysis Services databases. Transformations modify, summarize, and clean data. Destinations load data into data stores or create in-memory datasets.

Q3:
When a data flow component applies a transformation to column data, extracts data from sources, or loads data into destinations, errors can occur. Errors frequently occur because of unexpected data values. For example, a data conversion fails because a column contains a string instead of a number, an insertion into a database column fails because the data is a date and the column has a numeric data type, or an expression fails to evaluate because a column value is zero, resulting in a mathematical operation that is not valid.

Errors typically fall into one of the following categories:
-Data conversion errors, which occur if a conversion results in loss of significant digits, the loss of insignificant digits, and the truncation of strings. Data conversion errors also occur if the requested conversion is not supported.
-Expression evaluation errors, which occur if expressions that are evaluated at run time perform invalid operations or become syntactically incorrect because of missing or incorrect data values.
-Lookup errors, which occur if a lookup operation fails to locate a match in the lookup table.

Many data flow components support error outputs, which let you control how the component handles row-level errors in both incoming and outgoing data. You specify how the component behaves when truncation or an error occurs by setting options on individual columns in the input or output. For example, you can specify that the component should fail if customer name data is truncated, but ignore errors on another column that contains less important data.

Q 4:
SSIS includes logging features that write log entries when run-time events occur and can also write custom messages. Integration Services supports a diverse set of log providers, and gives you the ability to create custom log providers. The Integration Services log providers can write log entries to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files. Logs are associated with packages and are configured at the package level. Each task or container in a package can log information to any package log. The tasks and containers in a package can be enabled for logging even if the package itself is not.

To customize the logging of an event or custom message, Integration Services provides a schema of commonly logged information to include in log entries. The Integration Services log schema defines the information that you can log. You can select elements from the log schema for each log entry.

To enable logging in a package
1. In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
2. On the SSIS menu, click Logging.
3. Select a log provider in the Provider type list, and then click Add.


Q5:
SQL Server 2005 Integration Services (SSIS) makes it simple to deploy packages to any computer. There are two steps in the package deployment process:
-The first step is to build the Integration Services project to create a package deployment utility.
-The second step is to copy the deployment folder that was created when you built the Integration Services project to the target computer, and then run the Package Installation Wizard to install the packages.

Q9:
Variables store values that a SSIS package and its containers, tasks, and event handlers can use at run time. The scripts in the Script task and the Script component can also use variables. The precedence constraints that sequence tasks and containers into a workflow can use variables when their constraint definitions include expressions.

Integration Services supports two types of variables: user-defined variables and system variables. User-defined variables are defined by package developers, and system variables are defined by Integration Services. You can create as many user-defined variables as a package requires, but you cannot create additional system variables.

Scope
A variable is created within the scope of a package or within the scope of a container, task, or event handler in the package. Because the package container is at the top of the container hierarchy, variables with package scope function like global variables and can be used by all containers in the package. Similarly, variables defined within the scope of a container such as a For Loop container can be used by all tasks or containers within the For Loop container.

More to come...

Here are some more SSIS related Interview Questions which I got from dotnetspider. Hope they help.

Question 1 - True or False - Using a checkpoint file in SSIS is just like issuing the CHECKPOINT command against the relational engine. It commits all of the data to the database.
False. SSIS provides a Checkpoint capability which allows a package to restart at the point of failure.

Question 2 - Can you explain what the Import\Export tool does and the basic steps in the wizard?
The Import\Export tool is accessible via BIDS or executing the dtswizard command. The tool identifies a data source and a destination to move data either within 1 database, between instances or even from a database to a file (or vice versa).

Question 3 - What are the command line tools to execute SQL Server Integration Services packages?
DTSEXECUI - When this command line tool is run a user interface is loaded in order to configure each of the applicable parameters to execute an SSIS package.
DTEXEC - This is a pure command line tool where all of the needed switches must be passed into the command for successful execution of the SSIS package.
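Because DTEXEC takes everything as switches, scripts that launch packages usually assemble the argument list first. The Python sketch below only builds that list and does not run anything. It assumes the /File switch for a package stored in the file system and the /CheckPointing on option mentioned earlier in this document; the function name build_dtexec_args and the package path are hypothetical.

```python
def build_dtexec_args(package_path, checkpointing=False):
    """Return the dtexec argument list for a package stored in the file system."""
    args = ["dtexec", "/File", package_path]
    if checkpointing:
        # Same effect as SaveCheckpoints=True and CheckpointUsage=Always.
        args += ["/CheckPointing", "on"]
    return args
```

On a machine where dtexec is installed, the resulting list could be passed to subprocess.run to execute the package.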


Question 4 - Can you explain the SQL Server Integration Services functionality in Management Studio?
You have the ability to do the following:
- Login to the SQL Server Integration Services instance
- View the SSIS log
- View the packages that are currently running on that instance
- Browse the packages stored in MSDB or the file system
- Import or export packages
- Delete packages
- Run packages

Question 5 - Can you name some of the core SSIS components in the Business Intelligence Development Studio you work with on a regular basis when building an SSIS package?
- Connection Managers
- Control Flow
- Data Flow
- Event Handlers
- Variables window
- Toolbox window
- Output window
- Logging
- Package Configurations

Question Difficulty = Moderate

Question 1 - True or False: SSIS has a default means to log all records updated, deleted or inserted on a per table basis.
False, but a custom solution can be built to meet these needs.

Question 2 - What is a breakpoint in SSIS? How is it set up? How do you disable it?
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an opportunity to review the status of the data, variables and the overall status of the SSIS package. 10 unique conditions exist for each breakpoint. Breakpoints are set up in BIDS. In BIDS, navigate to the control flow interface. Right click on the object where you want to set the breakpoint and select the "Edit Breakpoints..." option.

Question 3 - Can you name 5 or more of the native SSIS connection managers?
- OLEDB connection - Used to connect to any data source requiring an OLEDB connection (i.e., SQL Server 2000)
- Flat file connection - Used to make a connection to a single file in the File System. Required for reading information from a File System flat file
- ADO.Net connection - Uses the .Net Provider to make a connection to SQL Server 2005 or other connection exposed through managed code (like C#) in a custom task
- Analysis Services connection - Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
- File connection - Used to reference a file or folder. The options are to either use or create a file or folder
- Excel
- FTP
- HTTP
- MSMQ
- SMO
- SMTP
- SQLMobile
- WMI

Question 4 - How do you eliminate quotes from being uploaded from a flat file to SQL Server?
In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included.
Additional information: How to strip out double quotes from an import file in SQL Server Integration Services

Question 5 - Can you name 5 or more of the main SSIS tool box widgets and their functionality?
- For Loop Container
- Foreach Loop Container
- Sequence Container
- ActiveX Script Task
- Analysis Services Execute DDL Task
- Analysis Services Processing Task
- Bulk Insert Task
- Data Flow Task
- Data Mining Query Task
- Execute DTS 2000 Package Task
- Execute Package Task
- Execute Process Task
- Execute SQL Task
- etc.

Question Difficulty = Difficult

Question 1 - Can you explain one approach to deploy an SSIS package?
- One option is to build a deployment manifest file in BIDS, then copy the directory to the applicable SQL Server then work through the steps of the package installation wizard
- A second option is using the dtutil utility to copy, paste, rename, delete an SSIS Package
- A third option is to login to SQL Server Integration Services via SQL Server Management Studio then navigate to the "Stored Packages" folder then right click on one of the children folders or an SSIS package to access the "Import Packages..." or "Export Packages..." option.
- A fourth option in BIDS is to navigate to File > Save Copy of Package and complete the interface.

Question 2 - Can you explain how to set up a checkpoint file in SSIS?
The following items need to be configured on the properties tab for the SSIS package:
- CheckpointFileName - Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path, it's a good idea to use an expression that concatenates a path defined in a package variable and the package name.
- CheckpointUsage - Determines if/how checkpoints are used. Choose from these options: Never (default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
- SaveCheckpoints - Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.

Question 3 - Can you explain different options for dynamic configurations in SSIS?
- Use an XML file
- Use custom variables
- Use a database per environment with the variables
- Use a centralized database with all variables

Question 4 - How do you upgrade an SSIS Package?
Depending on the complexity of the package, one of two techniques is typically used:
- Recode the package based on the functionality in SQL Server DTS
- Use the Migrate DTS 2000 Package wizard in BIDS then recode any portion of the package that is not accurate

Question 5 - Can you name five of the Perfmon counters for SSIS and the value they provide?
SQLServer:SSIS Service
- SSIS Package Instances - Total number of simultaneous SSIS Packages running
SQLServer:SSIS Pipeline
- BLOB bytes read - Total bytes read from binary large objects during the monitoring period.
- BLOB bytes written - Total bytes written to binary large objects during the monitoring period.
- BLOB files in use - Number of binary large objects files used during the data flow task during the monitoring period.
- Buffer memory - The amount of physical or virtual memory used by the data flow task during the monitoring period.
- Buffers in use - The number of buffers in use during the data flow task during the monitoring period.
- Buffers spooled - The number of buffers written to disk during the data flow task during the monitoring period.
- Flat buffer memory - The total number of blocks of memory in use by the data flow task during the monitoring period.
- Flat buffers in use - The number of blocks of memory in use by the data flow task at a point in time.
- Private buffer memory - The total amount of physical or virtual memory used by data transformation tasks in the data flow engine during the monitoring period.
- Private buffers in use - The number of blocks of memory in use by the transformations in the data flow task at a point in time.
- Rows read - Total number of input rows in use by the data flow task at a point in time.
- Rows written - Total number of output rows in use by the data flow task at a point in time.

Source: http://forums.keysoft.co.in/forum_posts.asp?TID=47


SQL Server Integration Services (SSIS) Interview questions
1. What is a for-loop container? Give an example of where it can be used.
2. What is a foreach-loop container? Give an example of where it can be used.
3. What is a sequence container? Give an example of where it can be used.
4. What is the difference between the Analysis Services processing task & the Analysis Services execute DDL task?
5. What is the difference between a for-loop container & a foreach-loop container?
6. What are the different parameters or configurations that the "send mail task" requires?
7. Mention a few mapping operations that the Character Map transformation supports.
8. Explain the functionality of: Import Column Transformation and Export Column Transformation
9. Explain the functionality of: Percentage Sampling transformation
10. Explain the functionality of: SCD transformation
11. Explain the functionality of: Union All transformation
12. What is the "Lookup" transformation used for?
13. What are checkpoints? For which objects do we define a checkpoint? How do you configure a checkpoint for a package?
14. What is the use of "package configurations" available in SSIS?
15. What are the different ways in which configuration details can be stored?
16. How do you deploy a package from a development server to a production server?
17. How do you create an Integration Services Package Deployment Utility?
18. How do you deploy packages to the file system?
19. How do you deploy packages to SQL Server? Where in the database will the packages be stored?
20. How do you set security for a package? Explain the same as per different deployment options.
21. Explain the architecture of SSIS
22. Explain how the SSIS engine workflow works

Source: http://www.datawarehousingguide.com/content/view/95/60/

Microsoft Business Intelligence frequently asked questions

Q. What is Business Intelligence and what does it do?

Business Intelligence, a complete suite of server, client, and developer applications fully integrated with the 2007 Microsoft Office system, delivers business intelligence on the desktop in an integrated, centrally managed environment. Business Intelligence simplifies information discovery and analysis, making it possible for decision-makers at all levels of an organization to more easily access, understand, analyze, collaborate, and act on information, anytime and anywhere. Move from just consuming information to developing deep contextual knowledge about that information. By tying strategy to metrics, organizations can gain competitive advantage by making better decisions faster, at all levels of the organization.

Business Intelligence delivers business intelligence to everyone in an organization by integrating two major components:
- The Business Intelligence platform, driven by Microsoft SQL Server 2005 and including its powerful relational database management system, SQL Server Integration Services, SQL Server Analysis Services, SQL Server Reporting Services, and SQL Server Data Mining capabilities. Business Intelligence is built on the scalable and reliable SQL Server 2005 platform, proven to support mission-critical environments, and integrated with the Microsoft Visual Studio 2005 development platform.
- The 2007 Microsoft Office system, delivering information through the tools that users already know and rely on. Users can share more powerful, interactive spreadsheets using improved charting and formula authoring, greater row and column capacity, and enhanced sorting and filtering along with enhanced PivotTable and PivotChart views. With server-based spreadsheets, you can share information broadly with confidence, knowing that your information is more secure and centrally managed, yet accessible to colleagues, customers, and partners through the Web. Dynamic scorecards combine the power of predictive analysis with real-time reporting. Strategy maps make it easy to visualize key areas: you can see trends, identify problem areas early, maximize success areas, and monitor performance against key goals in real time.

Q. Who is Business Intelligence for?

Business Intelligence is for businesses that want to drive intelligent decision-making throughout their organizations and make it easy for everyone in the organization to collaborate, analyze, share, and act on business information from a centrally managed, more secure source. Enterprise grade yet attractively priced, Business Intelligence supports IT professionals, information workers, and developers, and empowers organizations of all sizes.

Q. What if I have a small (fewer than 100-person) company? Can I still use Business Intelligence?

Yes. Business Intelligence provides an excellent business intelligence solution for organizations of all sizes. You can deploy reporting solutions to a small workgroup or department with SQL Server 2005 Reporting Services. You can also perform queries and analysis using Excel Services (new to the 2007 release of Microsoft Office) through Microsoft Office SharePoint Server 2007. This combination delivers Web-based query and analysis capabilities to every user in a format that is easy to use and centrally secured and managed.

Q. What's a typical way an organization might use Business Intelligence?

Business Intelligence connects the right people to the right information at the right time. For example, when reviewing the current financial scorecard, your sales manager, Margaret, notices that one particular region is not contributing as much as other regions. When analyzing the data from the spreadsheet for the low-performing region, she notices that one particular salesperson, Joe, has below-average sales numbers.

At the same time, Joe receives through e-mail a weekly status report that contains qualified leads in the region, pipeline information, and details about deals closed. Next, he opens the dashboard and searches on information about his top account, and he sees data from his enterprise resource planning (ERP) system related to that account. Joe notes that his average deal size is smaller than others in the region. It's easy for Joe to find out why this is by doing some "what if" analysis. He inputs different variables to determine the number of leads he needs to reach the company sales average. Next, by doing further analysis on data for the region, Joe can compare his sales numbers with regional averages. He adds more information that shows the discount rate, and then adds visualization to better understand the results. The visual representation of the data shows Joe that his discount rate is much lower than the average for the region.

Next, it's time to tell his manager. Joe publishes this information to the server, schedules a meeting with Margaret to discuss getting approval to increase the discount rate so that he'll be better able to compete, and alerts Margaret through online collaboration that he's just posted his analysis report. Joe and Margaret meet to discuss details. Afterward, he makes a note on the key performance indicator (KPI) that he owns for that region, and Margaret sees the annotation in her latest scorecard as a reminder that there's a new strategy in place to increase Joe's results and address the poor sales performance.

What programs are included in Business Intelligence? Is Business Intelligence available as a single product in a box?

Business Intelligence includes two major components: the business intelligence platform (SQL Server 2005) and end-user tools (the 2007 Microsoft Office system).

Source: http://www.datawarehousingguide.com/content/view/60/60/

What is ETL (Extract, Transform, and Load)?
ETL stands for extract, transform and load, the processes that enable companies to move data from multiple sources, reformat and cleanse it, and load it into another database, a data mart or a data warehouse for analysis, or on another operational system to support a business process.

ETL - Table of contents
- What is ETL?
- Extraction
- Transformation
- Loading
- Challenges of ETL
- Tools in market

What is ETL (Extract, Transform, and Load)?

Extract, Transform, and Load (ETL) is a process in data warehousing that involves extracting data from outside sources, transforming it to fit business needs (which can include quality levels), and ultimately loading it into the end target, i.e. the data warehouse.
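The three steps can be illustrated with a very small sketch. This is only an illustration under stated assumptions, not how SSIS or any particular ETL tool works: the CSV text, the `sales` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would be a file or a source system.
SOURCE_CSV = """store,qty,unit_price
North,2,10.00
South,1,4.50
North,3,2.00
"""


def extract(text):
    """Extract: read the raw source rows as dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cast the text fields and derive sale_amount for each row."""
    return [
        (row["store"], int(row["qty"]) * float(row["unit_price"]))
        for row in rows
    ]


def load(conn, rows):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (store TEXT, sale_amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


def run_etl(conn, text=SOURCE_CSV):
    """Run the three steps in order against the given connection."""
    load(conn, transform(extract(text)))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
    run_etl(conn)
    query = "SELECT store, SUM(sale_amount) FROM sales GROUP BY store ORDER BY store"
    for row in conn.execute(query):
        print(row)
```

Keeping each step in its own function mirrors the way the stages are described separately in the sections that follow.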

ETL is important, as it is the way data actually gets loaded into the warehouse. This article assumes that data is always loaded into a data warehouse, whereas the term ETL can in fact refer to a process that loads any database. ETL can also be used for integration with legacy systems. Usually ETL implementations store an audit trail of positive and negative process runs. In almost all designs, this audit trail is not at a level of granularity that would allow the ETL's result to be reproduced if the raw data were not available.

Extraction

The first part of an ETL process is to extract the data from the source systems. Most data warehousing projects consolidate data from different source systems. Each separate system may also use a different data organization / format. Common data source formats are relational databases and flat files, but may include non-relational database structures such as IMS or other data structures such as VSAM or ISAM, or even fetching from outside sources such as web spidering or screen-scraping. Extraction converts the data into a format for transformation processing. An intrinsic part of the extraction is the parsing of extracted data, resulting in a check of whether the data meets an expected pattern or structure. If not, the data is rejected entirely.

Transformation

The transform stage applies a series of rules or functions to the extracted data from the source to derive the data to be loaded to the end target. Some data sources will require very little or even no manipulation of data. In other cases, one or more of the following transformation types may be required to meet the business and technical needs of the end target:

- Selecting only certain columns to load (or selecting null columns not to load)
- Translating coded values (e.g., if the source system stores 1 for male and 2 for female, but the warehouse stores M for male and F for female); this is called automated data cleansing; no manual cleansing occurs during ETL
- Encoding free-form values (e.g., mapping "Male" to "1" and "Mr" to M)
- Deriving a new calculated value (e.g., sale_amount = qty * unit_price)
- Joining together data from multiple sources (e.g., lookup, merge, etc.)
- Summarizing multiple rows of data (e.g., total sales for each store, and for each region)
- Generating surrogate key values
- Transposing or pivoting (turning multiple columns into multiple rows or vice versa)
- Splitting a column into multiple columns (e.g., putting a comma-separated list specified as a string in one column as individual values in different columns)
- Applying any form of simple or complex data validation; if validation fails, the result is a full, partial or no rejection of the data, and thus none, part or all of the data is handed over to the next step, depending on the rule design and exception handling. Most of the above transformations might themselves result in an exception, e.g. when a code translation parses an unknown code in the extracted data.
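A few of the transformation types listed above (selecting columns, translating coded values, deriving a calculated value, summarizing rows, and validation with partial rejection) can be sketched as follows. This is a minimal illustration, assuming source rows arrive as dictionaries of strings; the column names, the `GENDER_CODES` mapping, the `RejectedRow` exception, and the function names are hypothetical and not taken from any ETL product.

```python
# Source system code -> warehouse code (the 1/2 to M/F example above).
GENDER_CODES = {"1": "M", "2": "F"}


class RejectedRow(Exception):
    """Raised when a row fails a translation or validation rule."""


def transform_row(row):
    """Apply the rules to one source row and return the warehouse row."""
    code = row["gender"]
    if code not in GENDER_CODES:
        # An unknown code is the kind of exception a code translation can raise.
        raise RejectedRow(f"unknown gender code: {code!r}")
    qty = int(row["qty"])
    if qty <= 0:
        raise RejectedRow(f"qty must be positive, got {qty}")
    # Only the selected columns are returned; any other source column is dropped.
    return {
        "store": row["store"],
        "gender": GENDER_CODES[code],
        "sale_amount": qty * float(row["unit_price"]),  # derived value
    }


def transform(rows):
    """Partial rejection: good rows continue, failed rows are set aside."""
    good, rejected = [], []
    for row in rows:
        try:
            good.append(transform_row(row))
        except (RejectedRow, KeyError, ValueError) as exc:
            rejected.append((row, str(exc)))
    return good, rejected


def total_sales_by_store(rows):
    """Summarize multiple rows: total sale_amount for each store."""
    totals = {}
    for row in rows:
        totals[row["store"]] = totals.get(row["store"], 0.0) + row["sale_amount"]
    return totals
```

Returning the rejected rows together with a reason, instead of discarding them, is one possible design choice; it keeps the material needed for the audit trail mentioned earlier.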

Loading

The load phase loads the data into the end target, usually the data warehouse (DW). Depending on the requirements of the organization, this process ranges widely. Some data warehouses might weekly overwrite existing information with cumulative, updated data, while other DWs (or even other parts of the same DW) might add new data in a historized form, e.g. hourly. The timing and scope to replace or append are strategic design choices dependent on the time available and the business needs. More complex systems can maintain a history and audit trail of all changes to the data loaded in the DW. As the load phase interacts with a database, the constraints defined in the database schema as well as in triggers activated upon data load apply (e.g. uniqueness, referential integrity, mandatory fields), which also contribute to the overall data quality performance of the ETL process.

Challenges and Complexities in ETL process

ETL processes can be quite complex, and significant operational problems can occur with improperly designed ETL systems. The range of data values or data quality in an operational system may be outside the expectations of designers at the time validation and transformation rules are specified. Data profiling of a source during data analysis is recommended to identify the data conditions that will need to be managed by transform rules specifications. This will lead to an amendment of validation rules explicitly and implicitly implemented in the ETL process.

DWs are typically fed asynchronously by a variety of sources which all serve a different purpose, resulting in e.g. different reference data. ETL is a key process to bring heterogeneous and asynchronous source extracts to a homogeneous environment.

The scalability of an ETL system across the lifetime of its usage needs to be established during analysis. This includes understanding the volumes of data that will have to be processed within service level agreements (SLAs).
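The data profiling recommended in this section can be as simple as counting what actually occurs in a source column before the transform rules are written. The sketch below is one hedged way to do that in plain Python; `profile_column` and `unexpected_values` are hypothetical helper names, and the rows are assumed to be dictionaries as produced by a CSV reader.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize one column of extracted rows (a list of dictionaries)."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "most_common": Counter(present).most_common(3),
    }


def unexpected_values(rows, column, allowed):
    """Return the values in the column that the transform rules do not cover."""
    found = {row.get(column) for row in rows}
    return sorted(v for v in found if v not in (None, "") and v not in allowed)
```

For example, calling `unexpected_values(rows, "gender", {"1", "2"})` on a source extract would list any codes that a 1/2 to M/F translation rule would fail on, which is the kind of finding that leads to amended validation rules.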
The time available to extract from source systems may change, which may mean the same amount of data may have to be processed in less time. Some ETL systems have to scale to process terabytes of data to update data warehouses with tens of terabytes of data. Increasing volumes of data may require designs that can scale from daily batch to intra-day micro-batch to integration with message queues or real-time change data capture (CDC) for continuous transformation and update.

ETL tools in the market

While an ETL process can be created using almost any programming language, creating them from scratch is quite complex. Increasingly, companies are buying ETL tools to help in the creation of ETL processes. By using an established ETL framework, you are more likely to end up with better connectivity and scalability. A good ETL tool must be able to communicate with the many different relational databases and read the various file formats used throughout an organization. ETL tools have started to migrate into Enterprise Application Integration, or even Enterprise Service Bus, systems that now cover much more than just the extraction, transformation and loading of data. Many ETL vendors now have data profiling, data quality and metadata capabilities. Some of the well-known ETL tools are Informatica, Ab Initio, SSIS, DataStage, Pentaho Kettle and more.

Source: http://www.datawarehousingguide.com/content/view/$$M/MM/

Thanks,
Ramu Ragavan
Thomson Reuters
Email: ragavan.ramu@thomsonreuters.com
M- EFDFEE;7$;
