Scenario:
We are loading data from Flat File that can has duplicate records in it. We want to remove the duplicate records and keep only single record for those duplicate records.Solution :
In SSIS, Sort Transformation can help us to perform this requirement. Sort Transformation will remove the duplicate records and let only single record pass through. If you have small file, Sort Transformation will work just fine but if you have millions of records and your computer (Server) has small RAM capacity (Random Access Memory) then you might want to load these records into some staging table and then write TSQL to extract unique records.
Step 1 :
Bring Data Flow task to Control Flow Pane and inside Data Flow Task Drag Flat File Source and create connection to your source file.
Click on Columns to see if all columns are correctly mapped
Step 2:
Drag Sort Transformation and Connect Flat File Source to it. Double Click on Sort Transformation and Choose the columns to Sort. Also Check the Check Box : Remove rows with duplicate sort values and then click OK.
Step 3:
Drag Multicast Transformation and connect Sort Transformation to it. Put Data Viewer between Multicast Transformation and Sort Transformation to view the output.
The duplicate copies of records are removed and only single record is passed through Sort Transformation as we can see in Data Viewer display.
I am very clear with this. Thank you :)
ReplyDeleteI think articles related to throwing light on more aspects of SSIS such as SSIS Postgresql Read can also help people learn more about it.
ReplyDeleteAivivu chuyên vé máy bay, tham khảo
ReplyDeletevé máy bay đi Mỹ giá rẻ
giá vé máy bay đi từ mỹ về việt nam
đăng ký bay từ canada về Việt Nam
vé máy bay từ nhật về việt nam
gia ve may bay vietjet tu han quoc ve viet nam
Vé máy bay từ Đài Loan về Việt Nam
danh sách khách sạn cách ly tại tphcm
ve may bay chuyen gia sang Viet Nam