How to Load Excel Files Information to SQL Server Table - SSIS Tutorial

Scenario: Link to Script

You are working as ETL Developer / SSIS Developer for Banking industry. They have process that archive Excel files daily to a folder. There are hundreds / thousands of Excel file sitting in folder. They want to create a table in which they can insert Information such as


  • Folder Path
  • Excel File Name
  • Sheet Name/s from Each Excel file
  • Number of Records per Sheet
  • Last Date Modified
  • Last Access Date
  • File Size in KB
This information is really useful and helpful in different scenario
You estimate the storage and do capactity planning
You need to find data in some file, you will be able to get Sheet Names so you don't have to open it
Data Growth can be estimated as you are going to log Row Count from each sheet
You can tell what time it was access and if somebody make any change to file.

Solution:

There could be multiple solutions for this requirement, we are going to keep it short and simple and use Script Task with C#. 

Step 1: 
Go ahead and create Table in SQL Server Database by using below DDL Statement so we can store Excel file information.


Create table dbo.ExcelFileInformation(
id int identity(1,1),
FolderPath VARCHAR(255),
FileName VARCHAR(255),
SheetName VARCHAR(255),
Rows Int,
LastDateModified DateTime,
LastAccessTime Datetime,
FileSizeinKB Int)


Step 2:
Open SQL Server Data Tools and Create new SSIS Package in SQL Server Integration Services Project. Create Three variables as shown

FolderPath: From where you would like to access Excel files
SchemaName : Schema of Table in which you want to insert the information. In our case it is dbo
TableName : Table in which you would like to insert this information.In our case it is ExcelFileInformation 

How to load Excel file information in SQL Server Table - Script Task C# SSIS Tutorial


Step 3:
Create ADO.NET Connection. You have to provide SQL Server Instance Name and Database where your table exists. I have renamed to DBConn in my case after creating.
Create ADO.NET Connect to use in Script Task in SSIS Package to Log Excel File information


Step 4:
Bring Script Task to Control Flow Pane and open it by double clicking. Here we will map the variables.
Map Variables in Script Task in SSIS Package to get Excel files information from a folder


Step 5:
Open script Task Editor by clicking on Edit Script button. you will be adding below name spaces under #Region NameSpaces

using System.IO;
using System.Data.OleDb;
using System.Data.SqlClient;




Go to Main() and paste below script


String FolderPath=Dts.Variables["User::FolderPath"].Value.ToString();
            String TableName = Dts.Variables["User::TableName"].Value.ToString();
            String SchemaName = Dts.Variables["User::SchemaName"].Value.ToString();
            var directory = new DirectoryInfo(FolderPath);
            FileInfo[] files = directory.GetFiles();
            
            //Declare and initilize variables
            string fileFullPath = "";


            SqlConnection myADONETConnection = new SqlConnection();
           myADONETConnection = (SqlConnection)(Dts.Connections["DBConn"].AcquireConnection(Dts.Transaction) as SqlConnection);
            
            //Get one Book(Excel file at a time)
            foreach (FileInfo file in files)
            {
                   fileFullPath = FolderPath+"\\"+file.Name;
                                   
                //Create Excel Connection
                string ConStr;
                string HDR;
                HDR="YES";
                ConStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileFullPath + ";Extended Properties=\"Excel 12.0;HDR=" + HDR + ";IMEX=0\"";
                    OleDbConnection cnn = new OleDbConnection(ConStr);
   
                //Get Sheet Name
                   cnn.Open();
                DataTable dtSheet = cnn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
                string sheetname;
                sheetname="";
                //Get Row Count for each sheet
           foreach (DataRow drSheet in dtSheet.Rows)
            {
                if (drSheet["TABLE_NAME"].ToString().Contains("$"))
                {
                     sheetname=drSheet["TABLE_NAME"].ToString();
                     OleDbCommand oconn = new OleDbCommand("select count(*) AS RowCnt from [" + sheetname + "]", cnn);
                     OleDbDataAdapter adp = new OleDbDataAdapter(oconn);
                     DataTable dt = new DataTable();
                     adp.Fill(dt);
                     string RC = "";
                     foreach (DataRow row in dt.Rows)
                     {
                         object[] array = row.ItemArray;
                         RC = array[0].ToString();
                     }
                  
                     cnn.Close();
 //If sheets start with Numbers, when we read it add extra '(single quote), this will take //care of that part
                     sheetname=sheetname.Replace("'", "");
                     
                    
                    //Prepare Insert Query 
                    string InsertQuery = "";
    // MessageBox.Show("Excel File Name:" + file.Name + " Sheet Name:" + sheetname + "Rows //per Sheet:" + RC + " File Modified Date:" + file.LastWriteTime+" File Size:"+file.Length///1024
                        // +"FileLastAccessTime:"+file.LastAccessTime);

                    InsertQuery=" Insert into ["+SchemaName+"].["+TableName+"] (";
               InsertQuery += "[FolderPath],[FileName],[SheetName],[Rows],[LastDateModified],[LastAccessTime],[FileSizeinKB])";
                    InsertQuery += "Values('" + FolderPath + "','" + file.Name + "','"
sheetname + "','" + RC + "','" + file.LastWriteTime + "','" + file.LastAccessTime + "'," + file.Length / 1024 + ")";

                   // MessageBox.Show(InsertQuery);
                    //Run Insert Query to Insert Data into SQL Server Table 
                     SqlCommand cmd = myADONETConnection.CreateCommand();
                    cmd.CommandText = InsertQuery;
                    cmd.ExecuteNonQuery();
                } 
            }
            }






Step 6:
Save all changes in Script Task. Exit the Script Task. Time to run the SSIS Package:)


How to load Excel Files Information to SQL Server Table- Script Task SSIS Tutorial





Things we learnt in this post
How to use C# to get Excel file information and load into SQL Server Table
How to read Row Count From Excel and save into SQL server Table
How to read Excel file name, sheet names, Last Access Date, Last Modified Date and Log it to Table
How to use ADO.NET Connection in Script Task
How to use SSIS Package Variables in Script Task
How to insert information from Script task to SQL Table



Check out our other posts/videos for Dynamic Excel Source and Destination
  1. How to Load Data from Excel Files when Number of Columns can decrease or order is changed in Excel Sheet
  2. How to Load Only Matching Column Data to SQL Server Table from Multiple Excel Files (Single Sheet per file) Dynamically in SSIS Package
  3. How to Load Excel File Names with Sheet Names ,Row Count,Last Modified Date, File Size in SQL Server Table
  4. How to Load Multiple Excel Files with Multiple Sheets to Single SQL Server Table by using SSIS Package
  5. How to Load Matching Sheets from Excel to Table and Log Not Matching Sheets Information in SQL Server Table
  6. How to create Table for each sheet in Excel Files and load data to it dynamically in SSIS Package
  7. How to Create Table per Excel File and Load all Sheets Data Dynamically in SSIS Package by using Script Task 
  8. How to create CSV file per Excel File and Load All Sheets from Excel File to it in SSIS Package
  9. How to Create CSV File for Each Excel Sheet from Excel Files in SSIS Package
  10. How to Load Excel File Name and Sheet Name with Data to SQL Server in SSIS Package
  11. How to Import data from Multiple Excel Sheets with a pattern of sheet names from Multiple Excel File in SSIS Package
  12. How to import Data from Excel Files for specific Sheet Name to SQL Server Table in SSIS Package
  13. Load Data To Tables according to Excel Sheet Names from Excel Files dynamically in SSIS Package
  14. How to Load Excel Files with Single/ Multiple Sheets to SQL Server Tables according to Excel File Name Dynamically
  15. How to Read Excel Sheet Data after Skipping Rows in SSIS Package by using Script Task 
  16. How to read data from Excel Sheet and Load to Multiple Tables by using Script Task in SSIS Package
  17. How to create Excel File Dynamically from SQL server Table/View by using Script Task in SSIS Package
  18. How to create Excel File Dynamically for Stored Procedure Results in SSIS Package by using Script Task
  19. How to Export SQL Server Tables from Database to Excel File Dynamically in SSIS Package by using Script Task
  20. How to Convert CSV/Text Files to Excel Files in SSIS Package by using Script Task
  21. How to Load All CSV Files to Excel Sheets ( Sheet Per CSV) in single Excel File in SSIS Package
  22. How to Load All CSV Files to Single Excel Sheet with File Names in an Excel File Dynamically in SSIS Package
  23. How to Create Sample Excel file with Sheet from each table with Top 1000 Rows per sheet in SSIS Package
  24. How to Export Data to Multiple Excel Sheets from Single SQL Server Table in SSIS Package

No comments:

Post a Comment