How to Load all text/csv files to single Text/CSV File from a folder in SSIS Package - SSIS Tutorial

Scenario : Download Script

You are working as SQL Server Integration Services(SSIS) developer. You receive tons of text or csv files in your source folder. You need to create an SSIS Package that should read all the files form Source folder and create single text or csv file with date-time and load data from all of source files.


How to load multiple Text or CSV files to single Text/CSV file in SSIS Package

Notice that all the files have same number of columns. From these two files, I need to create consolidated single file.


Solution:

This requirement can be done by using builtin tasks and transformation but I am going to perform this by using Script Task. 

Step 1: Create new SSIS Package with Variables to Make it Dynamic 
Open SSDT ( Sql Server Data Tools) and create new SSIS Package. After that create below variables

DestinationFolder: Folder path where you would like to create your destination consolidate file
FileDelimiter : Provide the delimiter such as comma (,), Pipe( | ) Whatever your files are using.
FileExtension : Provide the extension of files that you would like to read.
SourceFolder : Source folder path where text files exists
LogFolder: Provide the path of log folder where error file will be created in case of error happen in Script Task
FileName : Provide the file name for your destination file. Date time part will be added to it.
DestinationFileExtension: Provide the extension for your destination file.
Create variables to load all the flat files into single flat file in SSIS Package by using Script Task-SSIS Tutorial


Step 2: Add Script Task to SSIS Package and Map Variables
Bring Script Task to Control Flow Pane and open it by double clicking. Add the SSIS Package variables to it so we can use inside.

Add variables to Script Task to generate single flat file from multiple text or csv files in SSIS Package


Step 3: Add Script to Script task Editor in SSIS Package to Load Each CSV File from a folder to destination flat file ( csv or text)
Click Edit Button and it will open Script Task Editor.
Under #region Namespaces, I have added below code
using System.IO;

Under public void Main() { 
I have added below code.
            string datetime = DateTime.Now.ToString("yyyyMMddHHmmss");
            try
            {

                //Declare Variables
       string DestinationFolder = Dts.Variables["User::DestinationFolder"].Value.ToString();
       string FileDelimiter = Dts.Variables["User::FileDelimiter"].Value.ToString();
       string FileExtension = Dts.Variables["User::FileExtension"].Value.ToString();
       string SourceFolder = Dts.Variables["User::SourceFolder"].Value.ToString();
       string FileName = Dts.Variables["User::FileName"].Value.ToString();
       string DestinationFileExtension = Dts.Variables["User::DestinationFileExtension"].Value.ToString();

                //Building Destination file name
       string FileFullPath = DestinationFolder + "\\" + FileName + "_" + datetime + DestinationFileExtension;

                int counter = 0;

                //Looping through the flat files 
                string[] fileEntries = Directory.GetFiles(SourceFolder, "*" + FileExtension);
                foreach (string fileName in fileEntries)
                {

                   string line;

                    System.IO.StreamReader SourceFile =
                    new System.IO.StreamReader(fileName);

                    StreamWriter sw = null;
                    sw = new StreamWriter(FileFullPath, true);

                    int linecnt = 0;
                    while ((line = SourceFile.ReadLine()) != null)
                    {
                        //Write only the header from first file
                        if (counter == 0 && linecnt == 0)
                        {
                            sw.Write(line);
                            sw.Write(sw.NewLine);

                        }
                        //Write data records from flat files
                        if (linecnt != 0)
                        {
                            sw.Write(line);
                            sw.Write(sw.NewLine);

                        }
                        linecnt++;
                        counter++;
                    }

                    sw.Close();
                    Dts.TaskResult = (int)ScriptResults.Success;
                }

            }

            catch (Exception exception)
            {

                // Create Log File for Errors
                using (StreamWriter sw = File.CreateText(Dts.Variables["User::LogFolder"].Value.ToString() + "\\" +
                    "ErrorLog_" + datetime + ".log"))
                {
                    sw.WriteLine(exception.ToString());
                    Dts.TaskResult = (int)ScriptResults.Failure;

                }
            }
Step 4:
Save the script in script task editor and then exit. Run your SSIS Package, it should create single file in destination folder from multiple source files. You are free to change the value in variable to create txt or csv file from txt or csv files. 

I generate csv and txt destination file from my sample source files as shown below.

How to create single flat file from multiple flat files in SSIS Package by using Script Task





Video Demo: How to create single file from multiple cvs files in SSIS Package





 Related Posts / Videos on Dynamic Text / CSV files by Script Task 



9 comments:

  1. Hello There!!

    Its a great place to learn so much into SSIS, thanks for all your great support and help.
    Though i am working on SSIS since couple of years, i have never used script task and script component earlier. Hence i have learnt so much from your website into those areas.

    ReplyDelete
  2. I've got the following error:

    em System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
    em System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj, Object[] parameters, Object[] arguments)
    em System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
    em System.RuntimeType.InvokeMember(String name, BindingFlags bindingFlags, Binder binder, Object target, Object[] providedArgs, ParameterModifier[] modifiers, CultureInfo culture, String[] namedParams)
    em Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTATaskScriptingEngine.ExecuteScript()

    ReplyDelete
  3. Sorry, i got this error if you can help.

    System.ArgumentException: Illegal characters in path.
    at System.Security.Permissions.FileIOPermission.EmulateFileIOPermissionChecks(String fullPath)
    at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
    at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
    at System.IO.StreamWriter.CreateFile(String path, Boolean append, Boolean checkHost)
    at System.IO.StreamWriter..ctor(String path, Boolean append, Encoding encoding, Int32 bufferSize, Boolean checkHost)
    at System.IO.StreamWriter..ctor(String path, Boolean append)
    at ST_28e7c3290e0844aba3abae19fb3e8c85.ScriptMain.Main()

    ReplyDelete
  4. This comment has been removed by the author.

    ReplyDelete
  5. This comment has been removed by the author.

    ReplyDelete
  6. The script and the description have a defect/omission. The script reads the first line of the first file it reads, and then skips the first row of all the other files it reads, starting from the second file. The description simply omits such situations. The assumption evidently is that the first row contains headers in all the files. The assumption that it is always so is wrong. I have a package that creates several .CSV files first, puts them in a designated folder, then reads them and combines them into a single file. The files created by the package, do not have headers in the first row. The first file that is read, has headers for the combined file. It is not created by the package, I manually put it in the designated folder. I tried to publish the code to deal with this situation, here, but the 4096 char limitation does not allow me. Another helpful tip for such a situation is that the files generated by such a package, should have number 2, 3, etc, as the first character of the file name. The file with the headers for the combined file, should have its name start with the number 1. Example: file with headers name: 1medstat.csv. File names generated by the package and then combined into a single file: 2medstat.csv, 3medstat.csv, etc. In my case 2medstat and 3medstat are not interchangeable, the files must be read in one order only. Blog writer, please contact me at artshvetsov@hotmail.com so we can publish the modified script code.

    ReplyDelete
  7. Hi,

    I have a quick how we can add wait statement because we need to wait for file once file arrived the job need to trigger.
    Always we need to check every 10 to 15 min, once files arrive the job need to trigger.

    ReplyDelete
    Replies
    1. How do it wait or sleep and keep check on folder for every 10 min to check for above script.

      Delete