How to Read Excel Sheet Data after Skipping Rows in SSIS Package by using Script Task - SSIS Tutorial

Scenario: Download Script

In year 2013, I wrote a post Read Excel File After Skipping Few Rows by using OLE DB Source. Today we are going to use Script Task to perform the same task, How to skip Few Rows in Excel sheet and load rest of the records to SQL Server Table. 
In this post, we will loop through multiple Excel files. Each file can have single or multiple Sheets. Also we don't care about sheet name. It can be anything. 

Just keep in mind, the sheets on each file should be according to our criteria. I have few sample files. We are going to use variables to make our SSIS Package more dynamic. If our requirement change, we simple need to change the values of variables and package should work just fine.


Excel File TechBrothersIT.com has two sheets ( same meta data or column information)

Excel File TechBrothersIT.com_1 has only single sheet with same meta data as Excel Sheets from Excel file TechBrothersIt.com


Solution:

Take a detail look into your Excel Sheets. Notice that we don't need to load the first 7 Rows. We need to skip those rows and the data we have is only start from Column A and Ends at column C.
We need to use this information in our variables. If tomorrow we will get more columns, we can change our Start Column and End Column values. Also if rows start up or down, we can simple change the value for StartingReadingRow value for variable.

Let's create the table in which we are going to load the data. I am going to create Customer table with id, name and dob columns.

CREATE TABLE dbo.Customer (
    id INT
    ,name VARCHAR(100)
    ,dob DATE
    )
GO

Step 1: Create variables to make your SSIS Package dynamic
Create an SSIS Package in SSDT ( SQL Server Data Tools). Create below variables in SSIS Package.

FolderPath : Provide the Folder path where your excel files will be dropped or placed.
SchemaName : Schema Name of the table in which you would like to load the data
TableName: TableName in which you would like to load the data
StartReadingFromRow : Provide the Row number from which you would like to read. in my case I would like to read from row 8 where I have header row.
StartColumn: Provide the Starting Column from Which you would like to Read. In my case it is A
EndColumn : Provide the End Column, In my case it is C. 

Create Variables in SSIS Package - Skip Rows in Excel and Load Data Dynamically to SQL Server Tables


Step 2: Create ADO.NET Connection in SSIS Package to use in Script Task
Create ADO.NET Connection Manager so we can use in Script Task to Load data from Excel Sheets to SQL Server Table after skipping few rows by using Script Task in SSIS Package.

Create ADO.NET Connection in SSIS Package to use in Script Task to Load Excel File after skipping few rows


Step3: Add Variables to Script Task to use from SSIS Package
Bring the Script Task on Control Flow Pane in SSIS Package and open by double clicking Check-box in front of variable to add to Script Task.
How to Skip Rows in Excel Sheet and Load to SQL Server Table by using Script Task in SSIS Package

Step 4: Add Script to Script task Editor in SSIS To load Excel Data after skipping Rows in Excel Sheet

Click Edit Button and it will open Script Task Editor.
Under #region Namespaces, I have added below code
using System.IO;
using System.Data.OleDb;
using System.Data.SqlClient;


Under public void Main() { 
I have added below code. 

            string FolderPath=Dts.Variables["User::FolderPath"].Value.ToString();
            string TableName = Dts.Variables["User::TableName"].Value.ToString();
            string SchemaName = Dts.Variables["User::SchemaName"].Value.ToString();
            string StartingColumn = Dts.Variables["User::StartingColumn"].Value.ToString();
            string EndingColumn = Dts.Variables["User::EndingColumn"].Value.ToString();
            string StartReadingFromRow = Dts.Variables["User::StartReadingFromRow"].Value.ToString();
            var directory = new DirectoryInfo(FolderPath);
            FileInfo[] files = directory.GetFiles();
            
            //Declare and initilize variables
            string fileFullPath = "";
            
            //Get one Book(Excel file at a time)
            foreach (FileInfo file in files)
            {
                   fileFullPath = FolderPath+"\\"+file.Name;
                                   
                //Create Excel Connection
                string ConStr;
                string HDR;
                HDR="YES";
                ConStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileFullPath + ";Extended Properties=\"Excel 12.0;HDR=" + HDR + ";IMEX=0\"";
                    OleDbConnection cnn = new OleDbConnection(ConStr);
            
                //MessageBox.Show(TableName);
                //Get Sheet Name
                cnn.Open();
                DataTable dtSheet = cnn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
                string sheetname;
                sheetname="";
           foreach (DataRow drSheet in dtSheet.Rows)
            {
                if (drSheet["TABLE_NAME"].ToString().Contains("$"))
                {
                     sheetname=drSheet["TABLE_NAME"].ToString();
                     
                     //Load the DataTable with Sheet Data so we can get the column header
                     OleDbCommand oconn = new OleDbCommand("select top 1 * from [" + sheetname + StartingColumn + StartReadingFromRow + ":" + EndingColumn + "]", cnn);
                     OleDbDataAdapter adp = new OleDbDataAdapter(oconn);
                     DataTable dt = new DataTable();
                     adp.Fill(dt);
                     cnn.Close();

                     //Prepare Header columns list so we can run against Database to get matching columns for a table.
                     string ExcelHeaderColumn = "";
                     string SQLQueryToGetMatchingColumn = "";
                     for (int i = 0; i < dt.Columns.Count; i++)
                     {
                         if (i != dt.Columns.Count - 1)
                             ExcelHeaderColumn += "'" + dt.Columns[i].ColumnName + "'" + ",";
                         else
                             ExcelHeaderColumn += "'" + dt.Columns[i].ColumnName + "'";
                     }

                     SQLQueryToGetMatchingColumn = "select STUFF((Select  ',['+Column_Name+']' from Information_schema.Columns where Table_Name='" +
                         TableName + "' and Table_SChema='" + SchemaName + "'" +
                         "and Column_Name in (" + @ExcelHeaderColumn + ") for xml path('')),1,1,'') AS ColumnList";

                     // MessageBox.Show(SQLQueryToGetMatchingColumn);
                     //MessageBox.Show(ExcelHeaderColumn);

                     //USE ADO.NET Connection
                     SqlConnection myADONETConnection = new SqlConnection();
                     myADONETConnection = (SqlConnection)(Dts.Connections["DBConn"].AcquireConnection(Dts.Transaction) as SqlConnection);

                     //Get Matching Column List from SQL Server
                     string SQLColumnList = "";
                     SqlCommand cmd = myADONETConnection.CreateCommand();
                     cmd.CommandText = SQLQueryToGetMatchingColumn;
                     SQLColumnList = (string)cmd.ExecuteScalar();

                     //MessageBox.Show(" Matching Columns: " + SQLColumnList);


                     //Use Actual Matching Columns to get data from Excel Sheet
                     OleDbConnection cnn1 = new OleDbConnection(ConStr);
                     cnn1.Open();
                     OleDbCommand oconn1 = new OleDbCommand("select " + SQLColumnList + " from [" + sheetname + StartingColumn + StartReadingFromRow + ":" + EndingColumn + "]", cnn1);
                     OleDbDataAdapter adp1 = new OleDbDataAdapter(oconn1);
                     DataTable dt1 = new DataTable();
                     adp1.Fill(dt1);
                     cnn1.Close();


                     //Load Data from DataTable to SQL Server Table.
                     using (SqlBulkCopy BC = new SqlBulkCopy(myADONETConnection))
                     {
                         BC.DestinationTableName = SchemaName + "." + TableName;
                         foreach (var column in dt1.Columns)
                             BC.ColumnMappings.Add(column.ToString(), column.ToString());
                         BC.WriteToServer(dt1);
                     }

                }
                } 
            }


Step 5: Save Script in Script Task Editor and Run SSIS Package to Skip Rows and Load Data

Save the Script in Script Task Editor, close the editor and run your SSIS Pakcage. It should skip the rows we have provided as variable values and load the data for columns given.

I executed for sample files and here are my results. It skipped first 7 rows and read the data from Column A to C( id,name and dob columns) and loaded to dbo.Customer table.
Skip Rows in Excel Sheet and Load the Data to SQL Server Table in SSIS Package by using Script Task C# script language




Things we learnt in this post
How to load Data from Excel Files to SQL Server Table by using C#
How to create variables in SSIS Package and use them in Script Task to make package more dynamic
How to Skip Rows in Excel Sheet and Load Data to SQL server Table in SSIS Packge by using Script Task
How to read Excel File Name in SSIS Package
How to Check matching Header Column Names to SQL Server Table by using C sharp
How to save scalar string value from SQL Server Query to Variable in C Sharp
How to use BulkCopy to load data from DataTable to SQL Server Table
How to Map Columns Dynamically in BulkCopy C Sharp



Check out our other posts/videos for Dynamic Excel Source and Destination
  1. How to Load Data from Excel Files when Number of Columns can decrease or order is changed in Excel Sheet
  2. How to Load Only Matching Column Data to SQL Server Table from Multiple Excel Files (Single Sheet per file) Dynamically in SSIS Package
  3. How to Load Excel File Names with Sheet Names ,Row Count,Last Modified Date, File Size in SQL Server Table
  4. How to Load Multiple Excel Files with Multiple Sheets to Single SQL Server Table by using SSIS Package
  5. How to Load Matching Sheets from Excel to Table and Log Not Matching Sheets Information in SQL Server Table
  6. How to create Table for each sheet in Excel Files and load data to it dynamically in SSIS Package
  7. How to Create Table per Excel File and Load all Sheets Data Dynamically in SSIS Package by using Script Task 
  8. How to create CSV file per Excel File and Load All Sheets from Excel File to it in SSIS Package
  9. How to Create CSV File for Each Excel Sheet from Excel Files in SSIS Package
  10. How to Load Excel File Name and Sheet Name with Data to SQL Server in SSIS Package
  11. How to Import data from Multiple Excel Sheets with a pattern of sheet names from Multiple Excel File in SSIS Package
  12. How to import Data from Excel Files for specific Sheet Name to SQL Server Table in SSIS Package
  13. Load Data To Tables according to Excel Sheet Names from Excel Files dynamically in SSIS Package
  14. How to Load Excel Files with Single/ Multiple Sheets to SQL Server Tables according to Excel File Name Dynamically
  15. How to Read Excel Sheet Data after Skipping Rows in SSIS Package by using Script Task 
  16. How to read data from Excel Sheet and Load to Multiple Tables by using Script Task in SSIS Package
  17. How to create Excel File Dynamically from SQL server Table/View by using Script Task in SSIS Package
  18. How to create Excel File Dynamically for Stored Procedure Results in SSIS Package by using Script Task
  19. How to Export SQL Server Tables from Database to Excel File Dynamically in SSIS Package by using Script Task
  20. How to Convert CSV/Text Files to Excel Files in SSIS Package by using Script Task
  21. How to Load All CSV Files to Excel Sheets ( Sheet Per CSV) in single Excel File in SSIS Package
  22. How to Load All CSV Files to Single Excel Sheet with File Names in an Excel File Dynamically in SSIS Package
  23. How to Create Sample Excel file with Sheet from each table with Top 1000 Rows per sheet in SSIS Package
  24. How to Export Data to Multiple Excel Sheets from Single SQL Server Table in SSIS Package

17 comments:

  1. In starting you are hardcoding raw and column positions. Is it possible to get the positions of the header dynamically?

    ReplyDelete
  2. I am looking for dynamically finding the column and row where the headers start

    ReplyDelete
  3. Hello there, I'm trying to implement this exact logic.
    I'm on SQL Server 2016 using VS Enterprise 2015.
    I'm getting an error on this line:: adp.Fill(dt);
    I have commented out all lines, then uncommented them one by one and building and running the script task.
    This line is throwing this error::
    at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
    at System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj, Object[] parameters, Object[] arguments)
    at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
    at System.RuntimeType.InvokeMember(String name, BindingFlags bindingFlags, Binder binder, Object target, Object[] providedArgs, ParameterModifier[] modifiers, CultureInfo culture, String[] namedParams)
    at Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTATaskScriptingEngine.ExecuteScript()

    I'm googling it now to see what I can find.
    Any suggestions?
    -Dave.

    ReplyDelete
  4. A little more on this.
    It seems that the OleDbConnection cnn using OleDbCommand oconn has single quotes in the sheet name (seems to be related to spaces in the sheet name)
    This appears to cause an error, as 'sheet1$' doesn't parse like sheet1$.
    Now, I'm trying to string.Remove the single quotes in the name of the sheet.

    Anyone else had this issue?

    -Dave.

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete
    2. I am getting the same error as well.

      Delete
    3. Alex , i have same problem do you have the code which works correctly . can you post it here

      Delete
  5. This comment has been removed by the author.

    ReplyDelete
  6. I have this sorted now.
    I've relabeled my worksheets so that there are no spaces in the names (I'll deal with that on the client side later).

    This final table insert was giving me errors::
    BC.WriteToServer(dt1);

    I found this online and it works for me::
    BC.WriteToServer(dt1, DataRowState.Unchanged);

    Now I'm writing multiple worksheets to my single SQL Server table.

    I had an issue with column mapping. It's wasn't lining up. All columns were being written, but in a completely random order.

    These lines are very important for getting that working::
    foreach (var column in dt1.Columns)
    BC.ColumnMappings.Add(column.ToString(), column.ToString());


    -Dave.

    ReplyDelete
  7. When you use a genuine service, you will be able to provide instructions, share materials and choose the formatting style. Tableau Data Blending

    ReplyDelete
  8. I really thank you for the valuable info on this great subject and look forward to more great posts. Thanks a lot for enjoying this beauty article with me. I am appreciating it very much! Looking forward to another great article. Good luck to the author! All the best! buy insta likes

    ReplyDelete
  9. Thanks for the nice blog. It was very useful for me. I'm happy I found this blog. Thank you for sharing with us,I too always learn something new from your post. hidden marriage

    ReplyDelete
  10. getting syntax error Syntax error in query expression 'Select Select STUFF((Select ',['+Column_Name+']' from Information_schema.Columns where Table_Name='CatsData3' and Table_SChema='dbo'and Column_Name in ('EmplApplName','PersNo','DAS','Tower','Subtower','Date','Status','FS') for xml path('')),1,1,'') From Sheet1$ on line adp1.Fill(dt1)

    ReplyDelete
  11. I just thought it may be an idea to post incase anyone else was having problems researching but I am a little unsure if I am allowed to put names and addresses on here. here

    ReplyDelete
  12. It was wondering if I could use this write-up on my other website, I will link it back to your website though.Great Thanks labels

    ReplyDelete
  13. I find your tutorial wonderful and very helpful. My C# skills are beginner and I am stuck. I have multiple sheets in each excel file (.xls) but sometimes either one could be empty. I keep getting error/failure when the package runs. How would I/where would I in your code, add a part that if the sheet has no records, go on to the next sheet or continue on with the package?

    ReplyDelete