SSIS - How To Watch Value Of Variables During Debugging In SSIS Package

Scenario: 

We have created a SSIS Package that takes the files from Source Folder and move them to Archive Folder. While it move them, It also add Time Stamp to file. Here is the blog post for the Package (Link) .In this post we want to watch the values of the variables at run time. This can be helpful when we are getting some errors and we are not sure if the value of variables are changed while executing SSIS Package Or our expressions are evaluated correctly that we have written on one or more variables. By looking at the value of variables, It will help us to debug our package.


Solution:

We will be using Break Point in SSIS Package to view variable values at run time. Here are the variable I have used in package. I have changed the values for VarArchiveFullPath and VarSourceFullPath variables.
Expressions are written on these variable those should evaluate at run time and change the values of these variable.

Step 1: 
Right Click on File System Task and go to Edit Breakpoints


Step 2: 
Choose the Break Condition.

Step 3:
Execute your SSIS Package
Step 4:
To view the values of your variable Go to Debug--> Windows--> Locals

Let's see the values of variables win Locals window
We can see that the values are changed according to the expression and we can debug if these values are correct according to our expectations, if not then we can go back to expressions and change them. 

SSIS - How To Unzip .Zip Files In SSIS [ How To Use Execute Process Task In SSIS ]

Scenario:

We download .zip file from FTP site to our computer and then we need to unzip this folder and load all the files in folder to our destination tables.

Solution :

After downloading the .zip file we have to unzip it, to unzip the file we will use Execute Process Task. The executable that we will use to unzip is 7z.exe

Step 1:
Create variables in SSIS Package for Executable location, Zip file location and Uncompressed folder location.


Step 2:
In your SSIS Package, Drag Execute Process Task from Toolbox to Control Flow Pane.
Double Click on Execute Process Task and then go to Expressions Tab.
Under the properties set Executable=@[User::VarExecuteablePath]
and in Arguments set = "x "+ @[User::VarZipFilePath]+" -o"+ @[User::VarUnzipFolderPath]
Click on evaluate expressions and then click OK.
Step 3:
Right Click on Execute Process Task and go to Properties. Set DelayValidation=True

Final Output : 
After executing our SSIS Package we can see that the .zip file is uncompressed to our given folder.



SSIS - How To Use Script Component AS Source [Generate Numbers In Data Flow]

Scenario:

We want to generate Numbers from 1-99 inside Data Flow Task without using any Source such as OLE DB or File.

Solution:

We can use Script Component Transformation as Source and generate numbers. 

Step 1: 
Create a new SSIS Package and Drag Data Flow Task on Control Flow Pane. Inside Data Flow Task, bring Script Component Transformation and choose to use as Source.

Step 2 :
Double Click on Script Component Transformation. Go to Inputs and Outputs Tab and add a new column SeqNo as shown below


Step 3: 
Click on Edit Tab and then Click on Edit Script and paste below Code. The only Code I added is in Red.


/* Microsoft SQL Server Integration Services Script Component
*  Write scripts using Microsoft Visual C# 2008.
*  ScriptMain is the entry point class of the script.*/

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{

    public override void PreExecute()
    {
        base.PreExecute();
        /*
          Add your code here for preprocessing or remove if not needed
        */
    }

    public override void PostExecute()
    {
        base.PostExecute();
        /*
          Add your code here for postprocessing or remove if not needed
          You can set read/write variables here, for example:
          Variables.MyIntVar = 100
        */
    }

    public override void CreateNewOutputRows()
    {
        for (int i = 1; i < 100; i++)
        {
            Output0Buffer.AddRow();
            Output0Buffer.SeqNo = i;
        }
        /*
          Add rows by calling the AddRow method on the member variable named "Buffer".
          For example, call MyOutputBuffer.AddRow() if your output was named "MyOutput".
        */
    }

}


Step 4:
Drag Multicast Transformation to Data Flow Pane and connection Script Component Transformation to it.Add Data Viewer between them and execute Package.


SSIS- How to Redirect Duplicate Records In SSIS Or Remove Duplicate Records In Data Flow Task

Scenario:

We need to load data from Flat file source but we want to redirect all duplicate records to Audit table or File  with count and load only records which are unique to our destination table.


Solution :

We will be using Aggregate Transformation with Conditional Split Transformation to perform this requirement.

Step 1:
Drag Data Flow Task to Control Flow Pane and then Drag Flat File Source inside Data Flow Pane. Create Connection to source file as shown below

Click on Columns and see if data is displaying correctly and then click Ok.

Step 2:
Bring Aggregate Transformation to Data Flow Pane and connect Flat File Source to Aggregate Transformation and configure as shown. We need to group them and get count.

Step 3: 
Drag Conditional Split Transformation to Data Flow Pane and connect Aggregate Transformation to it. By using Conditional Split we will be redirecting duplicate records to output where we can write them to table/file. I have converted [Count All] to Int by using Cast Function (DT_I4). 
Conditional Split Transformation is going to create two outputs, DuplicateRecords for duplicate records and CorrectRecords for unique records.

Step 4:
Bring two Multicast Transformations and connect each of output from Conditional Split Transformation to them. Put Data Viewer between them so we can display the records. In real time you will be writing these records to your destination tables/files.

Final Output:
By using the Conditional Split, the duplicate records are redirected to Multicast 1 transformation and unique records are directed to Multicast transformation.

Video Tutorial for the post

SSIS - How To Remove Duplicate Record/s and Keep Single Record in Data Flow Task [ Sort Transformation]

Scenario:

We are loading data from Flat File that can has duplicate records in it. We want to remove the duplicate records and keep only single record for those duplicate records.


Solution :

In SSIS, Sort Transformation can help us to perform this requirement. Sort Transformation will remove the duplicate records and let only single record pass through. If you have small file, Sort Transformation will work just fine but if you have millions of records and your computer (Server) has small RAM capacity (Random Access Memory) then you might want to load these records into some staging table and then write TSQL to extract unique records. 

Step 1 : 
Bring Data Flow task to Control Flow Pane and inside Data Flow Task Drag Flat File Source and create connection to your source file.

Click on Columns to see if all columns are correctly mapped


Step 2: 
Drag Sort Transformation and Connect Flat File Source to it. Double Click on Sort Transformation and Choose the columns to Sort. Also Check the Check Box : Remove rows with duplicate sort values and then click OK.


Step 3:
Drag Multicast Transformation and connect Sort Transformation to it. Put Data Viewer between Multicast Transformation and Sort Transformation to view the output.

The duplicate copies of records are removed and only single record is passed through Sort Transformation as we can see in Data Viewer display.

SSIS - Backwards Compatibility For SSIS Packages

Sometime we have different versions of SQL Server installed on our different machines such as Local Machines have SQL Server 2008. Development, UAT and Production has SQL Server 2005 Or our local machines have SQL Server 2012 and all other machines have SQL Server 2008 R2.

After developing our SSIS Package on local machine which has latest version than other machines. If we want to take these SSIS Packages to our Development machine and make some changes, It will not be possible. 


Backward compatibility is not supported for SSIS Package.When you create your SSIS Package/s make sure you have same version on other machines where you need to deploy your packages later.

If you have developed your SSIS Package by using SQL Server 2005. You can always upgrade them to SQL Server 2008 Or If you have developed them by using SQL Server 2008 then you can upgrade them to SQL Server 2012 but not backwards.

Couple of reasons why Backward compatibility  can be challenge :
There are features/components those are available in latest version but not in old version. E.g If you have used Cache Transformation in SSIS Package that was created by using SSIS 2008 and want to convert to 2005. Guess What! Cache Transformation is not available in SSIS 2005.

Or if you have used the Script task and C# coding in it by using SSIS 2008, SSIS 2005 only supports VB.Net in Script task. 









SSIS -How To Get Most Recent File from Directory And Load To a Table [ SSIS Script Task ]

Scenario:

We get the files in our Source Folder all day long. Each file is appended copy of previous file. We want to create a SSIS Package that will load only the most recent file to our table.


Solution :

We will be using Script task to find out the name of the most recent file and Data Flow Task to load that file to our table.Test_File_20131226.txt is the file that should be read as it is most recent file.

Step 1:
Create a variable VarFolderPath that will contain the folder path in which our files exist and second variable with name VarFileName which will hold the value of most recent File Name.

Step 2:
Drag Script Task to Control Flow Pane and Provide the variables to it as shown 

Click on Edit Script and write below script. I have only added the code which is in Red. I have included Messagebox.show just for debugging.

/*
   Microsoft SQL Server Integration Services Script Task
   Write scripts using Microsoft Visual C# 2008.
   The ScriptMain is the entry point class of the script.
*/

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.IO;



namespace ST_2650e9fc7f2347b2826459c2dce1b5be.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

          public void Main()
        {
   
     // TODO: Add your code here
            var directory= new DirectoryInfo(Dts.Variables["User::VarFolderPath"].Value.ToString());
                     
            FileInfo[] files = directory.GetFiles();
            DateTime lastModified = DateTime.MinValue;

            foreach (FileInfo file in files)
            {
                if (file.LastWriteTime > lastModified)
                {
                    lastModified = file.LastWriteTime;
                    Dts.Variables["User::VarFileName"].Value = file.ToString();
                }
            }

            MessageBox.Show(Dts.Variables["User::VarFileName"].Value.ToString());
         

            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}

Step 3:
Let's run our SSIS Package to make sure that our script is returning us expected file name( Most recent modified by)

Step 4 :
Next step will be to load this file to our destination table. Bring Data Flow Task to Control Flow Pane and connect Script Task to it.Drag Flat File Source inside Data Flow Task and make connection to any file in the Source Folder as all the files have same structure.


Step 5:
As latest file name will be changing, Let's configure our Flat File Connection Manager to use VarFileName variable.

Final Output:
Drag Multicast Transformation in Data Flow Task and connect Flat File Source to it. Double clicking on green line between them and put Data Viewer.  I am using Multicast Transformation just for test purpose. In real scenario you will be using OLE DB Destination if you are loading data to table or any other destination depending upon your requirements.

The latest file is read from folder and display by using Data Viewer.

SSIS - How To Load Multiple Files ( .txt or .csv ) To a Table With File Names

Scenario: 

We receive sale files for different regions ( Asia, Europe, North America) in our Source Folder. We need to load these files to RegionSale Table with File Name and Folder Path. In this example the files are .txt but they can be .csv.

Here is Sample data for each of the file that I am using 
Asia_Sale_20131224.txt
RegionCd,SalePersonName,SaleAmt
AS,Aamir,100
AS,Raza,200
AS,Sukhjeet,300

Europe_Sale_20131224.txt
RegionCd,SalePersonName,SaleAmt
EU,Mike,1000
EU,Robert,1500

NorthAmerica_Sale_20131224.txt
RegionCd,SalePersonName,SaleAmt
NA,Jimmy,700
NA,Lisa,200

Solution : 

We will be using Foreach Loop Container to loop through these files and then use Derived column to generate a new column for File Name with Folder Path. Let's start with step by step approach

Destination Table DDL 
Create table dbo.RegionalSale
 (RegionSaleID int identity(1,1),
 RegionCd Char(2) ,
SalePersonName Varchar(50) ,
SaleAmt INT,
FileQualifiedPath VARCHAR(100))

Step 1:
Create new SSIS Package. Create variable VarSourceFolder and set the value of this variable to your folder in which files exist. I created the variable so if we have to change the Source Folder path , we can do it by using SSIS Configuration for this variable.


Step 2: 
Bring Foreach Loop Container to Control Flow Pane and Set Directory=VarSourceFolder as shown below. As we are going to read only .txt files. Provide *.txt under Files text box.

Change *.* to *.txt as shown in #2 to only read .txt files. Also choose Full qualified to get complete path of file on each iteration.

Go to Variable Mapping and create a new variable VarFilePath that will hold fully qualified path on each iteration.

Step 3:
Bring Data Flow Task inside Foreach Loop Container and open it by double clicking. Inside Data Flow Task,Drag Flat File Source and create connection string by using any file from Source Folder.

Click on Columns to make sure everything working as expected. You can preview some data by clicking on Preview.

Step 4:
Drag Derived column Transformation and connect Flat File Source to it. Generate a new column by using the VarFilePath as shown below

Expression :(DT_STR,100,1252)@[User::VarFilePath]
I have converted the value of variable to Varchar(100) by using Cast(DT_STR,Length,Code)

Step 5:

Drag OLE DB Destination and connect Derived Column Transformation to it and choose dbo.SaleRegion as our destination table and map the input column to destination columns.



Step 6:
As we have created Flat File Connection to Asia file but the file name will be changing with each iteration so we have to go to Flat File Connection and write expression for connection string to use the VarFileName variable.

Set DelayValidation to True.

Step 7:
Run the SSIS Package and check if the data from all files are loaded to table.
All the files are loaded successfully with fully qualified path.

SSIS -Export Files from SQL Table To Folder [ How To Use Export Column Transformation In SSIS ]

Scenario:

We have saved different type of files such as Excel, Image, Text into SQL Server Table. We want to export them to folder. The folder path with file name is saved in FileName column.

Solution:

We will be using Export Column Transformation to save files from Sql table to folder. Let's start with step by step approach

Step 1:
Bring Data Flow Task to Control Flow Pane and then bring OLE DB Source and write query as shown

SQL Command : Select  FileName,[File] from dbo.ImportFiles
Step 2:
Bring Export Column Transformation to Data Flow Pane and then connect OLE DB Source to it. 
In Extract Column map it to File data column and File Path Column to where you want to save that file.

Final Output: 
Let's run our SSIS Package and check the Archive folder if the files are exported to folder successfully.

All the four files are exported to Archive Folder successfully from Sql Server table.

SSIS -How to Import Files (Text, Pdf, Excel, Image etc) To Sql Server Table [ How to Use Import Column In SSIS ]

Scenario : 

We have different types of files such as Text, Pdf, Image, Excel etc and we want to load them into Sql Server Table.


Solution:

First of all, let's create a table that can store these files. FileName column will be used to store the location from which the file is imported and File itself will be stored in File column whose data type is Varbinary.
We will use For-each loop container to loop through the files and Import Column Transformation to load these files into table.

CREATE TABLE dbo.ImportFiles
  (
     ID       INT IDENTITY,
     FileName VARCHAR(1000),
     [File]   VARBINARY(MAX)
  ) 

step 1: 

Create a variable of string type in SSIS Package with name "VarSourceFolder" and value = Folder Path.

Step 2:
Bring the For-each loop container to Control Flow Pane and configure as shown below to loop through all the files. We will be using VarSourceFolder as Directory in For-each loop. 

Go to Variable Mapping and then create a new variable VarFileName to save File name with extension.

Step 3:
Create a variable VarSQLQuery. We will write expressions on this variable to build TSQL Statement. The statement will contain complete path to file that we will pass to Import Column.

Expressions : 
"Select '"+ @[User::VarSourceFolder]+"\\"+ @[User::VarFileName] +"' AS FileName"
and 
Set EvaluateAsExpression: True


Step 4:

Bring Data Flow Task inside For-each Loop Container. Double Click on Data Flow Task and then Bring OLE DB Source and in SQL Statement provide the VarSQLQuery 

Step 5: 
Drag Import Column Transformation and connection OLE DB Source to it. After connecting, Double Click on Import Column and go to Input Columns Tab and Choose input column (FileName)


Step 6:
Go to Input and Output Properties, Then Output Columns and Add a new column "File" and note down the LienageID. The LineageID for File column in our case is equal to 85.


Step 7:
Click on FileName under Input Columns and then under Customer Properties. Set FileDataColumnId=85(LineageID for File Column that we created above).
Step 8:
Bring OLE DB Destination and map input columns to destination table. FileName is going to contain File name with source path and File column will contain file data itself.


Final Output:
Execute SSIS Package and query the table to see if all information is loaded.

All the four files are successfully loaded into our destination table.