Scenario:
We get the files in our Source Folder all day long. Each file is appended copy of previous file. We want to create a SSIS Package that will load only the most recent file to our table.
Solution :
We will be using Script task to find out the name of the most recent file and Data Flow Task to load that file to our table.Test_File_20131226.txt is the file that should be read as it is most recent file.
Step 1:
Create a variable VarFolderPath that will contain the folder path in which our files exist and second variable with name VarFileName which will hold the value of most recent File Name.
Step 2:
Drag Script Task to Control Flow Pane and Provide the variables to it as shown
Click on Edit Script and write below script. I have only added the code which is in Red. I have included Messagebox.show just for debugging.
/*
Microsoft SQL Server Integration Services Script Task
Write scripts using Microsoft Visual C# 2008.
The ScriptMain is the entry point class of the script.
*/
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.IO;
namespace ST_2650e9fc7f2347b2826459c2dce1b5be.csproj
{
[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
#region VSTA generated code
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion
public void Main()
{
// TODO: Add your code here
var directory= new DirectoryInfo(Dts.Variables["User::VarFolderPath"].Value.ToString());
FileInfo[] files = directory.GetFiles();
DateTime lastModified = DateTime.MinValue;
foreach (FileInfo file in files)
{
if (file.LastWriteTime > lastModified)
{
lastModified = file.LastWriteTime;
Dts.Variables["User::VarFileName"].Value = file.ToString();
}
}
MessageBox.Show(Dts.Variables["User::VarFileName"].Value.ToString());
Dts.TaskResult = (int)ScriptResults.Success;
}
}
}
Step 3:
Let's run our SSIS Package to make sure that our script is returning us expected file name( Most recent modified by)
Step 4 :
Next step will be to load this file to our destination table. Bring Data Flow Task to Control Flow Pane and connect Script Task to it.Drag Flat File Source inside Data Flow Task and make connection to any file in the Source Folder as all the files have same structure.
Step 5:
As latest file name will be changing, Let's configure our Flat File Connection Manager to use VarFileName variable.
Final Output:
Drag Multicast Transformation in Data Flow Task and connect Flat File Source to it. Double clicking on green line between them and put Data Viewer. I am using Multicast Transformation just for test purpose. In real scenario you will be using OLE DB Destination if you are loading data to table or any other destination depending upon your requirements.
The latest file is read from folder and display by using Data Viewer.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.