SQL Server Integration Services (SSIS) is a component of SQL Server that performs a wide range of data migration and ETL (Extract, Transform, Load) operations. SSIS is part of the Microsoft Business Intelligence (MSBI) stack and serves as a platform for data integration and workflow applications.
2. What are the tools associated with SSIS?
We use Business Intelligence Development Studio (BIDS) to develop SSIS projects and SQL Server Management Studio (SSMS) to manage SSIS packages and projects.
3. What are the differences between DTS and SSIS?
Data Transformation Services (DTS) | SQL Server Integration Services (SSIS)
Limited error handling | Complex and powerful error handling
Message boxes in ActiveX scripts | Message boxes in .NET scripting
No deployment wizard | Interactive deployment wizard
Limited set of transformations | Good number of transformations
No BI functionality | Complete BI integration
4. What is a workflow in SSIS?
A workflow is a set of instructions that tells the program executor how to execute tasks and containers within SSIS packages.
5. What is the control flow?
A control flow consists of one or more tasks and containers that execute when the package runs. To control the order of execution, or to define the conditions for running the next task or container, you connect the tasks and containers in a package with precedence constraints. A subset of tasks and containers can also be grouped and run repeatedly as a unit within the package control flow. SQL Server 2005 Integration Services (SSIS) provides three types of control flow elements:
a. Containers, which provide structure in packages
b. Tasks, which provide functionality
c. Precedence constraints, which connect executables, containers, and tasks into an ordered control flow
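The way precedence constraints order a control flow can be sketched outside SSIS. The Python below is only an illustrative model, not SSIS code: the `Task`, `Constraint`, and `run_control_flow` names are hypothetical, and it assumes each task has a single incoming constraint and that the flow contains no cycles.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# The three constraint values described in the text.
SUCCESS, FAILURE, COMPLETION = "Success", "Failure", "Completion"


@dataclass
class Task:
    name: str
    action: Callable[[], None]  # raising an exception marks the task as failed


@dataclass
class Constraint:
    from_task: str
    to_task: str
    value: str = SUCCESS


def run_control_flow(tasks: List[Task], constraints: List[Constraint],
                     start: str) -> List[Tuple[str, str]]:
    """Run tasks from `start`, following constraints that match each outcome."""
    by_name = {task.name: task for task in tasks}
    executed: List[Tuple[str, str]] = []
    pending = [start]
    while pending:
        name = pending.pop(0)
        try:
            by_name[name].action()
            outcome = SUCCESS
        except Exception:
            outcome = FAILURE
        executed.append((name, outcome))
        for constraint in constraints:
            # A Completion constraint fires whether the task succeeded or failed.
            if constraint.from_task == name and constraint.value in (outcome, COMPLETION):
                pending.append(constraint.to_task)
    return executed


def failing_load() -> None:
    raise RuntimeError("load failed")


tasks = [
    Task("Extract", lambda: None),
    Task("Load", failing_load),
    Task("Archive", lambda: None),
    Task("Notify", lambda: None),
    Task("Cleanup", lambda: None),
]
constraints = [
    Constraint("Extract", "Load", SUCCESS),
    Constraint("Load", "Archive", SUCCESS),
    Constraint("Load", "Notify", FAILURE),
    Constraint("Load", "Cleanup", COMPLETION),
]
result = run_control_flow(tasks, constraints, "Extract")
for name, outcome in result:
    print(name, outcome)
```

In this run the hypothetical Load task fails, so Archive is skipped while Notify (Failure constraint) and Cleanup (Completion constraint) still execute.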
6. What is a data flow?
A data flow consists of the sources and destinations that extract and load data, the transformations that modify and extend data, and the paths that link sources, transformations, and destinations. The Data Flow task is the executable within the SSIS package that creates, orders, and runs the data flow. A separate instance of the data flow engine is opened for each Data Flow task in a package. Data sources, transformations, and data destinations are the three important categories in the data flow.
7. Difference between control flow and data flow?
Aspect | Control Flow | Data Flow
Orientation | Process oriented | Data oriented
Made up of | Tasks and containers | Sources, transformations, and destinations
Connected through | Precedence constraints | Paths
Smallest unit | Task | Component
Outcome | Finite: success, failure, or completion | Not fixed
8. How does Error-Handling work in SSIS?
When a data flow component applies a transformation to column data, extracts data from sources, or loads data into destinations, errors can occur. Errors frequently occur because of unexpected data values.
Typical types of errors in SSIS:
- Data connection errors, which occur when the connection manager cannot be initialized with the connection string. This applies to both data sources and data destinations, as well as to control flow tasks that use connection strings.
- Data transformation errors, which occur while data is being transformed in the data pipeline from source to destination.
- Expression evaluation errors, which occur if expressions that are evaluated at run time perform invalid operations.
Event handlers let us build flows that handle errors in the package quite easily. On the Event Handlers tab, we choose the event on which we want to handle errors and the task that should run when such an error arises. We can also send an email when an error occurs by adding a Send Mail task (which uses an SMTP connection) to the event handler. This is quite useful for failures outside office hours. In a data flow, we can handle errors for each component by following its error output path (the red arrow).
9. What is a Transformation?
A transformation simply means bringing the data into a desired format. For example, you are pulling data from the source and want to ensure that only distinct records are written to the destination, so duplicates are removed. Another example: you have master or reference data and want to pull only related data from the source, so you need some sort of lookup. There are around 30 transformations available, and this set can be extended with custom-built transformations if needed.
10. What is a Task?
A task is very much like a method in a programming language: it represents or carries out an individual unit of work. There are broadly two categories of tasks in SSIS: Control Flow tasks and Database Maintenance tasks. All Control Flow tasks are operational in nature except Data Flow tasks.
Although there are around 30 control flow tasks that you can use in your package, you can also develop your own custom tasks in the .NET programming language of your choice.
11. What is a Precedence Constraint and what types of Precedence Constraint are there?
SSIS allows you to place as many tasks as you want in the control flow. You connect these tasks using connectors called precedence constraints. Precedence constraints let you define the logical sequence in which tasks should be executed. You can also specify a condition to be evaluated before the next task in the flow is executed. The condition can be a constraint, an expression, or both. These are the types of precedence constraints:
a. Success (the next task is executed only when the previous task completed successfully)
b. Failure (the next task is executed only when the previous task failed)
c. Completion (the next task is executed whether the previous task succeeded or failed)
12. What is a container and how many types of containers are there?
A container is a logical grouping of tasks that allows you to manage the scope of the tasks together. These are the types of containers in SSIS:
a. Sequence Container - Used for grouping logically related tasks together
b. For Loop Container - Used when you want to have a repeating flow in a package
c. Foreach Loop Container - Used for enumerating each object in a collection, for example a record set or a list of files
Apart from the containers mentioned above, there is one more container called the Task Host Container, which is not visible in the IDE but contains every task (it is the default container for all tasks).
13. What are variables and what is variable scope?
A variable is used to store values. There are basically two types of variables: system variables (such as ErrorCode, ErrorDescription, and PackageName), whose values you can use but cannot change, and user variables, which you create, assign values to, and read as needed. A variable can hold a value of the data type you chose when you defined it. Variables can have different scopes depending on where they were defined. For example, package-level variables are accessible to all the tasks in the package, while container-level variables are accessible only to the tasks within that container.
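The scoping rule can be pictured as a chain of nested scopes in which a task sees its own variables plus those of every enclosing container up to the package. The Python below is a rough sketch of that idea only: the `Scope` class, the variable names, and the lookup order are illustrative assumptions, not SSIS internals.

```python
from typing import Any, Dict, Optional


class Scope:
    """Hypothetical model of a package, container, or task scope."""

    def __init__(self, name: str, parent: Optional["Scope"] = None) -> None:
        self.name = name
        self.parent = parent
        self.variables: Dict[str, Any] = {}

    def declare(self, variable: str, value: Any) -> None:
        self.variables[variable] = value

    def resolve(self, variable: str) -> Any:
        # Walk outward from this scope toward the package level.
        scope: Optional[Scope] = self
        while scope is not None:
            if variable in scope.variables:
                return scope.variables[variable]
            scope = scope.parent
        raise KeyError(f"{variable} is not visible from {self.name}")


# Package-level variable: visible to every task in the package.
package = Scope("Package")
package.declare("User::BatchId", 42)

# Container-level variable: visible only to tasks inside the container.
loop = Scope("Foreach Loop Container", parent=package)
loop.declare("User::CurrentFile", "sales_01.csv")

task_in_loop = Scope("Data Flow Task", parent=loop)
task_outside = Scope("Execute SQL Task", parent=package)

print(task_in_loop.resolve("User::BatchId"))
print(task_in_loop.resolve("User::CurrentFile"))
```

Here the task inside the loop can read both variables, while `task_outside.resolve("User::CurrentFile")` raises `KeyError` because the variable belongs to a container the task is not part of.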
14. How do you deploy SSIS packages?
Building an SSIS project produces a deployment manifest file. We run the manifest file and decide whether to deploy the packages to the file system or to SQL Server [msdb]. SQL Server deployment is faster and more secure than file system deployment. Alternatively, we can import the package in SSMS from the file system or from SQL Server.
15. What is conditional split?
As the name suggests, this transformation splits the data based on conditions and routes the rows to different paths. Its logic is similar to a CASE statement. Each condition is an expression. The transformation also provides a default output, to which rows that match no condition are routed. Conditional split is useful in scenarios such as telecom industry data where you want to divide the customer data by gender; the condition would be: GENDER == "F"
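The routing behaviour of a conditional split can be approximated in a few lines of Python. This is a sketch of the idea rather than SSIS code: `conditional_split`, the output names, and the sample customer rows are all made up for illustration, and it assumes conditions are checked in order with each row going to the first match.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Condition = Tuple[str, Callable[[Row], bool]]


def conditional_split(rows: List[Row], conditions: List[Condition],
                      default_output: str = "Default") -> Dict[str, List[Row]]:
    """Send each row to the first output whose condition it satisfies."""
    outputs: Dict[str, List[Row]] = {name: [] for name, _ in conditions}
    outputs[default_output] = []
    for row in rows:
        for name, predicate in conditions:
            if predicate(row):
                outputs[name].append(row)
                break
        else:
            # No condition matched, so the row goes to the default output.
            outputs[default_output].append(row)
    return outputs


# Hypothetical customer rows.
customers = [
    {"NAME": "Asha", "GENDER": "F"},
    {"NAME": "Ravi", "GENDER": "M"},
    {"NAME": "Sam", "GENDER": ""},
]

split = conditional_split(
    customers,
    [
        ("Female", lambda row: row["GENDER"] == "F"),
        ("Male", lambda row: row["GENDER"] == "M"),
    ],
)
for output_name, output_rows in split.items():
    print(output_name, [row["NAME"] for row in output_rows])
```

The row with an empty GENDER value matches neither condition and ends up in the default output, which is how unexpected values can be kept out of the main paths.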