SSIS - How To Validate Excel Header Against Definition Table

Scenario: 

We get Excel File on daily basis, we want to validate the Header against our definition table before start loading. If Header does not match with our definition , we will strop the execution of package and send an email. For File Name validation please check Link


Solution:

Step 1: Create Table In SQL Server Database
Let's create definition table in which we can insert entry for our Test_File.
CREATE TABLE [dbo].[FileValidation](
 [FileValidationID] [int] IDENTITY(1,1) NOT NULL,
 [FileName] [varchar](100) NULL,
 [Header] [varchar](4000) NULL,
 [FolderPath] [varchar](200) NULL,
 [FileExtension] CHAR(4)
)
INSERT INTO Dbo.FileValidation (FileName,Header,FolderPath,FileExtension)
 VALUES ('Test_File','Name,Address,phone,SSN','C:\SourceFile\','xlsx')

Step 2: Create Test Excel File 
Fig:1 Test_File for Header Validation in SSIS

Step 3: Create Variables for SSIS Package
create four variables as shown in Fig 2.

Fig 2: Create Variables in SSIS Package

DefHeader: We will read the value of Header from definition table for 'Test_File'
FileHeader: We will use this inside script task for comparison with DefHeader 
HeaderObj: Read the Excel Header Row and save in this variable and then use the values from this variable to build comma separated Header.
ValidationFlg: If Excel column header matches with definition , we will set ValidationFlg=1 else 0.

Step 4: Set DefHeader
Read the Definition value for 'Test_File' entry from SQL Server Table as shown below
Fig 3: Read Header value from definition by using Execute SQL Task

Fig 4: Map the value to variable in Execute SQL Task

Step 5: Read the Excel Header

Create Excel Connection, make sure not to check the box for First row has column names
Fig 5: Create Excel Connection that can be used in Execute SQL Task

Use the Execute SQL Task to read the Header Row of Excel File.  I used Select * from [Sheet1$A0:zz1] query to get header row by assuming that the max columns I will have can go to ZZ.


Fig 6: Read Header Row from Excel by using Execute SQL Task

Map the resultset to HeaderObj variable as shown below
Fig 7: Map HeaderObj Variable for Full Resultset

Step 6: Compare the Excel Header with Excel in Script Task 
We will use  DefHeader,HeaderObj,FileHeader and ValidationFlg in Script task. After comparing the values of DefHeader and FileHeader, we will set the value of ValidationFlg.
First of all we have to read the HeaderObj data and convert that to comma separated row so we can compare it with DefHeader.
Fig 8: Map the variables to Script Task 

Write the Highlighted code after Edit Script button
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Data.OleDb;
namespace ST_a044e9edd1104a458a7a88d86eb8f8e9.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

 
        public void Main()
        {
            OleDbDataAdapter A = new OleDbDataAdapter();
            System.Data.DataTable dt = new System.Data.DataTable();
            A.Fill(dt, Dts.Variables["User::HeaderObj"].Value);
            int rowCount = 1;
            string colhdr;
            
            foreach (DataRow dr in dt.Rows)
            {
                if (rowCount == 1)
                {

                    for (int i = 0; i < dt.Columns.Count; i++)
                    {
           colhdr = dr[i].ToString();
Dts.Variables["User::FileHeader"].Value 
Dts.Variables["User::FileHeader"].Value + "," + colhdr;
                    }
                }
                rowCount += 1;

            }
if ((Dts.Variables["User::FileHeader"].Value.ToString().Remove(0, 1)).TrimEnd(',') ==
Dts.Variables["User::DefHeader"].Value.ToString())
            {
    Dts.Variables["User::ValidationFlg"].Value = 1;
    MessageBox.Show("Value matches");
            }
            else
            {
     Dts.Variables["User::ValidationFlg"].Value = 0;
     MessageBox.Show("Value Not matches");
            }
                
            // TODO: Add your code here
            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}



Final Output:
Let's use the ValidationFlag in Precedence Constraint to run the Data Flow When Value=1( Header Match) and stop if header does not match(ValidationFlg=0).




















Fig 8: Use Precedence Constraints


Execute your package couple of time by changing the header and see if all logic working fine. 
Fig 9: Header Match Execution

1 comment:

  1. How Can I make it dynamic in nature. Eg. I have multiple excel files with sheet name same as excel file and I need to validate headers of those excel file dynamically. How can I do that?

    ReplyDelete