- C# Scripts
- DWH INTERVIEW QUESTIONS
- MS Dynamics AX 2012 R2 Video Tutorial
- Project / Work Support
- SQL SERVER DBA INTERVIEW QUESTIONS
- SQL SERVER DBA Video Tutorial
- SQL Server / TSQL Tutorial
- SQL Server 2016
- SQL Server Scripts
- SSIS INTERVIEW QUESTIONS
- SSIS Video Tutorial
- SSRS INTERVIEW QUESTIONS
- SSRS Video Tutorial
- TSQL INTERVIEW QUESTIONS
- Team Foundation Server 2013 Video Tutorial
- Team Foundation Server 2015 Video Tutorial
- Windows 10
- Windows Server 2012 R2 Installation Videos
SSIS - How To Remove Duplicate Record/s and Keep Single Record in Data Flow Task [ Sort Transformation]
In SSIS, Sort Transformation can help us to perform this requirement. Sort Transformation will remove the duplicate records and let only single record pass through. If you have small file, Sort Transformation will work just fine but if you have millions of records and your computer (Server) has small RAM capacity (Random Access Memory) then you might want to load these records into some staging table and then write TSQL to extract unique records.
Step 1 :
Bring Data Flow task to Control Flow Pane and inside Data Flow Task Drag Flat File Source and create connection to your source file.
Click on Columns to see if all columns are correctly mapped
Drag Sort Transformation and Connect Flat File Source to it. Double Click on Sort Transformation and Choose the columns to Sort. Also Check the Check Box : Remove rows with duplicate sort values and then click OK.
Drag Multicast Transformation and connect Sort Transformation to it. Put Data Viewer between Multicast Transformation and Sort Transformation to view the output.
The duplicate copies of records are removed and only single record is passed through Sort Transformation as we can see in Data Viewer display.