SSIS - Read Top X Rows From Flat File / Excel File OR Range OF Rows

Scenario: 

We have Flat File/Excel File as our source and we have to load only top 10 rows into out destination table Or the requirement can be load the records from Row number 5 to 10.

Solution : 

Here is our solution, We will be reading the flat file by using Flat File Source, after that we will use Script component as transformation to generate the sequence number for each of the record. Once we have the sequence number we will be using Conditional Split to get top X records or range of rows depending upon our requirement. 


Step 1: 
Here is sample data that I am using for this sample package. I have 25 records. Please create sample file on your desktop or any drive you want. 

FirstName,LastName,PersonType
Ken,Sánchez,EM
Terri,Duffy,EM
Roberto,Tamburello,EM
Rob,Walters,EM
Gail,Erickson,EM
Jossef,Goldberg,EM
Dylan,Miller,EM
Diane,Margheim,EM
Gigi,Matthew,EM
Michael,Raheem,EM
Ovidiu,Cracium,EM
Thierry,D'Hers,EM
Janice,Galvin,EM
Michael,Sullivan,EM
Sharon,Salavaria,EM
David,Bradley,EM
Kevin,Brown,EM
John,Wood,EM
Mary,Dempsey,EM
Wanida,Benshoof,EM
Terry,Eminhizer,EM
Sariya,Harnpadoungsataya,EM
Mary,Gibson,EM
Jill,Williams,EM
James,Hamilton,EM

Step 2: 
Create new SSIS Package and Drag Data Flow task to Control Flow pane. Bring Flat File Source to Data Flow Pane and configure as shown below

Step 3:
Bring Script Component and connect your Flat File Source to Script Component, When you will bring Script Component to Data Flow Pane it will prompt you to Choose if you want to use Script Component as Source, Destination or Transformation,Choose Transformation.
After that configure as shown below

Choose the columns you want to Pass through

Step 4: 
Add a new output column Seqno as shown below

Step 5: 

Click on Script in Script Transformation Editor and Then Edit Script and paste below code or write what is shown in Red

/* Microsoft SQL Server Integration Services Script Component
*  Write scripts using Microsoft Visual C# 2008.
*  ScriptMain is the entry point class of the script.*/

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    Int32 RowNumber;
    public override void PreExecute()
    {
        base.PreExecute();
        /*
          Add your code here for preprocessing or remove if not needed
        */
    }

    public override void PostExecute()
    {
        base.PostExecute();
        /*
          Add your code here for postprocessing or remove if not needed
          You can set read/write variables here, for example:
          Variables.MyIntVar = 100
        */
    }

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
         RowNumber += 1;
         Row.Seqno = RowNumber;
        /*
          Add your code here
        */
    }

}

Step 6:
Bring Conditional Split Transformation and connect Script Component to it and write expressions to get only top 10 rows. You can also defined range if you your requirement is let's see read records from 5 to 15.

Final Output:
Let's execute our package and see the output , It should only display top 10 records. I used Multicast and Data Viewer just to display the records. In real time scenario you will be inserting these records to some destination table/file.

1 comment: