How to Load Matching Sheets from Excel to Table and Log Not Matching Sheets Information in SQL Server Table - SSIS Tutorial

Scenario: Link for Script Download

Think about a situation where you get single/Multiple Excel files in a folder. Each Excel file can have single or multiple sheets. We need to load all these files to SQL server table dbo.Customer. Definition of table is given below.

CREATE TABLE dbo.Customer (
    id INT
    ,name VARCHAR(50)
    ,dob DATE
    )

If all the columns in a sheet match with our table. We will load the data and if any column does not match with our table, we will load that information in below table. Keep it in mind, your excel sheet can have duplicate columns even that match with our table column, as the column will be duplicate in Excel, I am considering that scenario also not matching, because we are not sure from which column we need to load the data if we have duplicate columns in excel.

CREATE TABLE [dbo].[ExcelErrorFiles] (
    [Id] [int] IDENTITY(1, 1) NOT NULL
    ,[FileName] [varchar](100) NULL
    ,[SheetName] [varchar](100) NULL
    ,[HeaderColumn] [varchar](8000) NULL
    ,[CreatedOn] [datetime] NULL
    )

Here are the Excel files which I will try to load and log the information for Excel File with sheet which columns does not match with our table columns.

How to verify  column head in Excel and load to Table and Log not matching Sheets in SSIS

A sheet from Customer_TechBrothersIT1.xlsx with extra columns 

How to validate excel header before loading to SQL Server Table in SSIS Package


Sheet with exact same number of columns of our SQL Server Table

Excel Header validation by using Script Task in SSIS Package before loading to Table


I have one more file but not going to show the sheets. Think about that I have one correct sheet and one incorrect sheet with less columns as compared to our SQL Server Table.

Solution:

Step 1: 
We will be using Script Task with C# to perform this requirement. Let's create below variables


Create your SSIS Package and then create below variables
FolderPath:  Provide the folder path where Excel files will be dropped.
TableName: Provide the table name in which you would like to load sheet data from each File.
SchemaName: Provide the Schema of Table.

if you feel that your error log table name and schema can change, you can create the variables and then use them in Script Task so you can change in different environments such as QA,UAT and Prod by using SSIS Configuration.


Create variable to use in Script task for Excel file loading in SSIS Package


Step2: 
Create ADO.NET Connection Manager as shown below in your SSIS Package.You have to provide SQL Server Instance Name and Database. I have renamed connection to DBConn as can be seen below.
Create ADO.NET Connection Manager so can use in Script Task for Excel Header Validation and Loading


Step 3: Map variables in Script Task
Bring Script task on Control Flow Task in SSIS Package, Open Script Task and Map variables as shown below.
How to map variables in Script Task for dynamic Excel loading to SQL Server Table in SSIS Package



Step 4: 
Click on Edit Script Button and then copy below code to your Script Task Editor.

Under #Region NameSpaces, paste below namespaces.

using System.IO;
using System.Data.OleDb;
using System.Data.SqlClient;


Got to public void Main()
{
and paste below code.


            String FolderPath = Dts.Variables["User::FolderPath"].Value.ToString();
            String TableName = Dts.Variables["User::TableName"].Value.ToString();
            String SchemaName = Dts.Variables["User::SchemaName"].Value.ToString();
            var directory = new DirectoryInfo(FolderPath);
            FileInfo[] files = directory.GetFiles();

            //Declare and initilize variables
            string fileFullPath = "";


            SqlConnection myADONETConnection = new SqlConnection();
            myADONETConnection = (SqlConnection)(Dts.Connections["DBConn"].AcquireConnection(Dts.Transaction) as SqlConnection);

            //Get one Book(Excel file at a time)
            foreach (FileInfo file in files)
            {
                fileFullPath = FolderPath + "\\" + file.Name;

                //Create Excel Connection
                string ConStr;
                string HDR;
                HDR = "YES";
                ConStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileFullPath + ";Extended Properties=\"Excel 12.0;HDR=" + HDR + ";IMEX=0\"";
                OleDbConnection cnn = new OleDbConnection(ConStr);

                //Get Sheet Name
                cnn.Open();
                DataTable dtSheet = cnn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
                string sheetname;
                sheetname = "";
                //Get Row Count for each sheet
                foreach (DataRow drSheet in dtSheet.Rows)
                {
                    if (drSheet["TABLE_NAME"].ToString().Contains("$"))
                    {
                        sheetname = drSheet["TABLE_NAME"].ToString();

                        //Load the DataTable with Sheet Data so we can get the column header
                OleDbCommand oconn = new OleDbCommand("select top 1 * from [" + sheetname + "]", cnn);
                        OleDbDataAdapter adp = new OleDbDataAdapter(oconn);
                        DataTable dt = new DataTable();
                        adp.Fill(dt);
                        cnn.Close();

                        //Prepare Header columns list so we can run against Database to get matching columns for a table.
                        string ExcelHeaderColumn = "";
                      for (int i = 0; i < dt.Columns.Count; i++)
                        {
                            if (i != dt.Columns.Count - 1)
                             ExcelHeaderColumn += "" + dt.Columns[i].ColumnName + "" + ",";
                            else
                                ExcelHeaderColumn += "" + dt.Columns[i].ColumnName + "";
                        }

                       // MessageBox.Show(ExcelHeaderColumn);

                       
//prepare query to get if Header Columns match with SQL Server Table or Not.
          string SQLQueryIsMatched = "";
          SQLQueryIsMatched=";With CTE AS(Select '" + SchemaName + "'  as SchemaName,'" + TableName + "'  AS TableName,";
          SQLQueryIsMatched+="'"+ExcelHeaderColumn+"' as ColumnList),";
          SQLQueryIsMatched+="CTE1 AS (";
        
         SQLQueryIsMatched+="SELECT A.SchemaName,A.TableName,";
         SQLQueryIsMatched+="Split.a.value('.', 'VARCHAR(100)') AS ColumnName ";
         SQLQueryIsMatched+="FROM  (SELECT [SchemaName],[TableName], ";
         SQLQueryIsMatched+="CAST ('' + REPLACE([ColumnList], ',', '') + '' AS XML) AS ColumnName "; 
         SQLQueryIsMatched+="FROM  CTE ) AS A " ;
         SQLQueryIsMatched+="CROSS APPLY ColumnName.nodes ('/M') AS Split(a))";
         SQLQueryIsMatched+=",CTE_SQLTableInfo AS ";
         SQLQueryIsMatched+="(SELECT Table_Name ,Table_Schema ,Column_Name " ;
         SQLQueryIsMatched+="FROM Information_schema.Columns ";
         SQLQueryIsMatched+="WHERE table_Schema = '"+SchemaName+"'";
         SQLQueryIsMatched+="    AND table_name = '"+TableName+"'";
         SQLQueryIsMatched+="    ),CTE_Join AS (";
         SQLQueryIsMatched+="Select A.*,B.*,(Select count(*) From CTE1) -";
SQLQueryIsMatched+=" (Select count(*) From CTE_SQLTableInfo) RecordCntDiff from CTE1 A ";
         SQLQueryIsMatched+="    full Join CTE_SQLTableInfo B";
         SQLQueryIsMatched+="    on A.TableName=B.Table_Name";
         SQLQueryIsMatched+="    and A.SchemaName=B.Table_Schema";
         SQLQueryIsMatched+="    and A.ColumnName=B.Column_Name)";
         SQLQueryIsMatched+="    ,CTE_NonMatching AS(";
         SQLQueryIsMatched+="    Select count(*) NotMatchingCnt ";
         SQLQueryIsMatched+="    from CTE_Join ";
         SQLQueryIsMatched+="    where SchemaName is null ";
         SQLQueryIsMatched+=" or Table_Schema is null) ";
         SQLQueryIsMatched+="Select distinct ";
     SQLQueryIsMatched+="Case When RecordCntDiff=0 and (Select * From CTE_NonMatching)=0 ";
         SQLQueryIsMatched += "Then 'Matched' Else 'NotMatched' End as IsMatch ";
         SQLQueryIsMatched += " From CTE_Join ";

                         //MessageBox.Show(SQLQueryIsMatched);
                       // MessageBox.Show(ExcelHeaderColumn);

               
                        //Get value if Columns matched or not from SQL
                        string IsMatched = "";
                        SqlCommand cmd = myADONETConnection.CreateCommand();
                        cmd.CommandText = SQLQueryIsMatched;
                        IsMatched = (string)cmd.ExecuteScalar();

                      // MessageBox.Show(IsMatched);

                       if (IsMatched == "Matched")
                       {
                           //Use Actual Matching Columns to get data from Excel Sheet
                           OleDbConnection cnn1 = new OleDbConnection(ConStr);
                           cnn1.Open();
         OleDbCommand oconn1 = new OleDbCommand("select * from [" + sheetname + "]", cnn1);
                           OleDbDataAdapter adp1 = new OleDbDataAdapter(oconn1);
                           DataTable dt1 = new DataTable();
                           adp1.Fill(dt1);
                           cnn1.Close();


                           //Load Data from DataTable to SQL Server Table.
                           using (SqlBulkCopy BC = new SqlBulkCopy(myADONETConnection))
                           {
                               BC.DestinationTableName = SchemaName + "." + TableName;
                               foreach (var column in dt1.Columns)
                               BC.ColumnMappings.Add(column.ToString(), column.ToString());
                               BC.WriteToServer(dt1);
                           }
                       }
                       else
                       {
                   //Insert information in ExcelErrorLog Table for Not matching headers
                           //MessageBox.Show(" I am in else not matched");
                           string InsertQuery = "";
                           InsertQuery = " Insert into dbo.ExcelErrorFiles (";
                           InsertQuery += "[FileName],[SheetName],[HeaderColumn])";
                           InsertQuery += "Values('" + file.Name + "','" + sheetname + "','" + ExcelHeaderColumn + "')";

                           // MessageBox.Show(InsertQuery);
                           //Run Insert Query to Insert Data into SQL Server Table 
                           SqlCommand insertcmd = myADONETConnection.CreateCommand();
                           insertcmd.CommandText = InsertQuery;
                           insertcmd.ExecuteNonQuery();
                       }


                    }
                }
            }




Step 5:
Go ahead and save the script. Run SSIS Package and check the Customer Table and Excel Error Log Table.

How to load correct sheets to SQL Server Table by using Script Task in SSIS Package





 How to save Excel Header Not Matched Sheet Names in SQL Server Table- Script Task in SSIS

As we can see that as the columns were not matching with our table columns, those sheets information is saved in ExcelErrorFiles Table.

From here you can send email if you like. Check the Send Mail Heading how to use Email in SSIS Package on this link.




Items we learnt this in post
How to use variables in SSIS Package and Script Task
How to use ADO.NET Connection in Script Task for reading and writing data
Wrote TSQL Query that should return matching or Not Matching Scalar value
How to insert file information from Script Task to Error Log table
How to use C# Script to load data from Multiple Excel File with Multiple Sheets to SQL Server Table
How to perform Excel Header validation against SQL Server Table Definition
How to map variables in Script Task in SSIS Package


Check out our other posts/videos for Dynamic Excel Source and Destination
  1. How to Load Data from Excel Files when Number of Columns can decrease or order is changed in Excel Sheet
  2. How to Load Only Matching Column Data to SQL Server Table from Multiple Excel Files (Single Sheet per file) Dynamically in SSIS Package
  3. How to Load Excel File Names with Sheet Names ,Row Count,Last Modified Date, File Size in SQL Server Table
  4. How to Load Multiple Excel Files with Multiple Sheets to Single SQL Server Table by using SSIS Package
  5. How to Load Matching Sheets from Excel to Table and Log Not Matching Sheets Information in SQL Server Table
  6. How to create Table for each sheet in Excel Files and load data to it dynamically in SSIS Package
  7. How to Create Table per Excel File and Load all Sheets Data Dynamically in SSIS Package by using Script Task 
  8. How to create CSV file per Excel File and Load All Sheets from Excel File to it in SSIS Package
  9. How to Create CSV File for Each Excel Sheet from Excel Files in SSIS Package
  10. How to Load Excel File Name and Sheet Name with Data to SQL Server in SSIS Package
  11. How to Import data from Multiple Excel Sheets with a pattern of sheet names from Multiple Excel File in SSIS Package
  12. How to import Data from Excel Files for specific Sheet Name to SQL Server Table in SSIS Package
  13. Load Data To Tables according to Excel Sheet Names from Excel Files dynamically in SSIS Package
  14. How to Load Excel Files with Single/ Multiple Sheets to SQL Server Tables according to Excel File Name Dynamically
  15. How to Read Excel Sheet Data after Skipping Rows in SSIS Package by using Script Task 
  16. How to read data from Excel Sheet and Load to Multiple Tables by using Script Task in SSIS Package
  17. How to create Excel File Dynamically from SQL server Table/View by using Script Task in SSIS Package
  18. How to create Excel File Dynamically for Stored Procedure Results in SSIS Package by using Script Task
  19. How to Export SQL Server Tables from Database to Excel File Dynamically in SSIS Package by using Script Task
  20. How to Convert CSV/Text Files to Excel Files in SSIS Package by using Script Task
  21. How to Load All CSV Files to Excel Sheets ( Sheet Per CSV) in single Excel File in SSIS Package
  22. How to Load All CSV Files to Single Excel Sheet with File Names in an Excel File Dynamically in SSIS Package
  23. How to Create Sample Excel file with Sheet from each table with Top 1000 Rows per sheet in SSIS Package
  24. How to Export Data to Multiple Excel Sheets from Single SQL Server Table in SSIS Package

No comments:

Post a Comment