Scenario:
Let’s say we need to load data from flat file source. The
Name column in flat file has some non ascii characters as well some other words
those we do not want to load and want to replace with blank space. To replace these
characters and words we used derived column but after few days we got a new
list of characters or garbage words those we also have to replace. We do not want to
open our package again and write expressions in Derived column to replace. Our
goal is to build a SSIS Package where we can create a list of garbage words and
then use that to clean data in our Name column.
Solution:
As our garbage words list can change anytime so we will be
creating a variable in our SSIS Package to hold list of garbage characters or
words. Each character or word will be separated by using Pipe (|). We can use
SSIS Configuration to provide values to this variable or we provide the values
in SQL Server agent job.
Here are steps how we are going to build our SSIS Package.
Step 1:
Create a variable in SSIS Package and provide the
characters/words those you want to replace in your input column with blank
space.
I have created “BadWordsList” variable with some characters
and words “Shahzad|Test|Œ|Õ|Û|ÞÞ” those are garbage for me and want to replace
them in my input column with blank space.
Step 2:
Create connection to your text file by using flat file
connection manager. I have used below data for this example in text file.
Name, Age
AamirŒ Shahzad Test,33
RazaÛ Ali Shahzad,32
Najaf Ali Shahzad,11
ÕJuily,23
ÞÞRobert,54
Step 3:
Bring Script component in Data Flow Pane and then connect
Flat File Source to it and configure as shown in snapshots.
Code that I included in Script Component can be seen in snapshot.
using System.Text.RegularExpressions;
string garbagelist = Variables.BadWordsList;
Row.SCName = Regex.Replace(Row.Name, garbagelist, "", RegexOptions.IgnoreCase);
As we can see in the snapshot that all the garbage characters/words are removed with blank space by using script component and SC_Name column does not have any of those characters/words.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.