Scenario:
Sometime we have to create Stored Procedure/s to Load Slowly Changing Dimension Type 1 (SCD1) table/s. If we have one or two tables it is not a big deal to write script for those but if we have more tables it take whole lot time to create scripts for each of dimension table.
Solution :
For this Scenario we are going to use Merge Statement to load Slowly Changing Dimension Type 1(SCD1). Our goal is not to write any code but only to provide information in Mapping table for Source and Dimension and below script will create Stored procedure for each of the table that we have provided in our Mapping table.
The mapping table will have below columns :
SourceDatabaseName --> From which database we are loading our Dimension
SourceTableName --> What is the name of our Source Table.
SourceColumnName --> Name of Column from Source Table
BusinessKeyFlt --> Either the column is used as business Key in Dimension or not
DimensionDatabaseName --> Name of the Database where our Dimension table is going to be.
DimensionTableName --> Name of Dimension Table.
DimensionColumnName --> Column name in Dimension Table
We will be creating a Stored procedure in DimensionDatabase with name usp_Load_DimensionTableName for each of the dimension table.
Step 1:
Create Mapping Table
--drop table dbo.DimensionMappingTable
CREATE TABLE dbo.DimensionMappingTable
( DimensionMappingTableID INT IDENTITY(1,1),
SourceDatabaseName VARCHAR(100),
SourceSchema VARCHAR(100),
SourceTableName VARCHAR(100),
SourceColumnName VARCHAR(100),
BusinessKeyFlg BIT,
DimensionDatabaseName VARCHAR(100),
DimensionSchema VARCHAR(100),
DimenionTableName VARCHAR(100),
DimensionColumnName VARCHAR(100)
)
Step 2:
Let's load dbo.DimensionMappingTable with some sample data. Provide the Source Table and Dimension table and all other related information for which you want to create stored procedure. I have create couple of sample tables with some sample columns and loaded metadata information for those tables in Mapping Table.
INSERT INTO [TEST1].[dbo].[DimensionMappingTable]
([SourceDatabaseName]
,[SourceSchema]
,[SourceTableName]
,[SourceColumnName]
,[BusinessKeyFlg]
,[DimensionDatabaseName]
,[DimensionSchema]
,[DimenionTableName]
,[DimensionColumnName])
SELECT *FROM (SELECT N'Test1' AS [SourceDatabaseName],
N'dbo' AS [SourceSchema],
N'Client' AS [SourceTableName],
N'ClientName' AS [SourceColumnName],
N'0' AS [BusinessKeyFlg],
N'Test1' AS [DimensionDatabaseName],
N'dbo' AS [DimensionSchema],
N'Client_SCD1' AS [DimenionTableName],
N'ClientName' AS [DimensionColumnName]
UNION ALL
SELECT N'Test1' AS [SourceDatabaseName],
N'dbo' AS [SourceSchema],
N'Client' AS [SourceTableName],
N'Country' AS [SourceColumnName],
N'0' AS [BusinessKeyFlg],
N'Test1' AS [DimensionDatabaseName],
N'dbo' AS [DimensionSchema],
N'Client_SCD1' AS [DimenionTableName],
N'Country' AS [DimensionColumnName]
UNION ALL
SELECT N'Test1' AS [SourceDatabaseName],
N'dbo' AS [SourceSchema],
N'Client' AS [SourceTableName],
N'ID' AS [SourceColumnName],
N'1' AS [BusinessKeyFlg],
N'Test1' AS [DimensionDatabaseName],
N'dbo' AS [DimensionSchema],
N'Client_SCD1' AS [DimenionTableName],
N'BusinessKey' AS [DimensionColumnName]
UNION ALL
SELECT N'Test1' AS [SourceDatabaseName],
N'dbo' AS [SourceSchema],
N'Client' AS [SourceTableName],
N'Town' AS [SourceColumnName],
N'0' AS [BusinessKeyFlg],
N'Test1' AS [DimensionDatabaseName],
N'dbo' AS [DimensionSchema],
N'Client_SCD1' AS [DimenionTableName],
N'Town' AS [DimensionColumnName]
UNION ALL
SELECT N'Test1' AS [SourceDatabaseName],
N'dbo' AS [SourceSchema],
N'MyTable' AS [SourceTableName],
N'ID' AS [SourceColumnName],
N'1' AS [BusinessKeyFlg],
N'Test1' AS [DimensionDatabaseName],
N'dim' AS [DimensionSchema],
N'MyTable_SCD' AS [DimenionTableName],
N'DimID' AS [DimensionColumnName]
UNION ALL
SELECT N'Test1' AS [SourceDatabaseName],
N'dbo' AS [SourceSchema],
N'MyTable' AS [SourceTableName],
N'Name' AS [SourceColumnName],
N'0' AS [BusinessKeyFlg],
N'Test1' AS [DimensionDatabaseName],
N'dim' AS [DimensionSchema],
N'MyTable_SCD' AS [DimenionTableName],
N'DimName' AS [DimensionColumnName]) t;
Step 2:
Use this Script to Create Stored Procedure/Procedures for given tables in [dbo].[DimensionMappingTable]
DECLARE @SourceDatabaseName VARCHAR(100)
DECLARE @SourceSchema VARCHAR(100)
DECLARE @SourceTableName VARCHAR(100)
DECLARE @SourceColumnName VARCHAR(100)
DECLARE @DimensionDatabaseName VARCHAR(100)
DECLARE @DimensionSchema VARCHAR(100)
DECLARE @DimensionTableName VARCHAR(100)
DECLARE @DimensionColumnName VARCHAR(100)
DECLARE db_cursor CURSOR FOR
SELECT [SourceDatabaseName],
[SourceSchema],
[SourceTableName],
[SourceColumnName],
[DimensionDatabaseName],
[DimensionSchema],
[DimenionTableName],
[DimensionColumnName]
FROM dbo.DimensionMappingTable
WHERE BusinessKeyFlg = 1
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @SourceDatabaseName,
@SourceSchema,
@SourceTableName,
@SourceColumnName,
@DimensionDatabaseName,
@DimensionSchema,
@DimensionTableName,
@DimensionColumnName
WHILE @@FETCH_STATUS = 0
BEGIN
DECLARE @SQL NVARCHAR(MAX)
DECLARE @DimensionColumns NVARCHAR(MAX)
DECLARE @SourcColumns NVARCHAR(MAX)
DECLARE @ColumnCompare NVARCHAR(MAX)
DECLARE @SetColumns NVARCHAR(MAX)
DECLARE @BigSQL NVARCHAR(MAX)
SET @SQL=NULL;
SET @DimensionColumns=NULL;
SET @SourcColumns=NULL;
SET @ColumnCompare=NULL;
SET @SetColumns=NULL;
SET @BigSQL=NULL;
--Build Merge Statement
SET @SQL='MERGE [' + @DimensionDatabaseName + '].['
+ @DimensionSchema + '].[' + @DimensionTableName
+ '] AS DST
USING [' + @SourceDatabaseName + '].['
+ @SourceSchema + '].[' + @SourceTableName
+ '] AS SRC
ON (SRC.' + @SourceColumnName + ' = DST.'
+ @DimensionColumnName
+ ')
WHEN NOT MATCHED THEN'
PRINT @SQL
-- Get Dimension Columns for INSERT PART for Query
SELECT @DimensionColumns = COALESCE(@DimensionColumns + ', ', '')
+ DimensionColumnName
FROM [DimensionMappingTable]
WHERE SourceTableName = @SourceTableName
ORDER BY DimensionMappingTableID
PRINT @DimensionColumns
--Get Source Columns for INSERT PART for Query
SELECT @SourcColumns = COALESCE(@SourcColumns + ', ', '') + COL
FROM (SELECT 'SRC.' + SourceColumnName AS COL,
DimensionMappingTableID
FROM [DimensionMappingTable]
WHERE SourceTableName = @SourceTableName) D
ORDER BY DimensionMappingTableID
PRINT @DimensionColumns
--Build Compare Column String for Query
SELECT @ColumnCompare = COALESCE(@ColumnCompare + ' OR ', '') + COL
FROM (SELECT 'SRC.' + SourceColumnName + ' <> ' + 'DST.'
+ DimensionColumnName AS COL
FROM [DimensionMappingTable]
WHERE SourceTableName = @SourceTableName
AND BusinessKeyFlg = 0) D
PRINT @ColumnCompare
--Build Set Column String for Query
SELECT @SetColumns = COALESCE(@SetColumns + ',', '') + COL
FROM (SELECT 'DST.' + DimensionColumnName + ' = ' + 'SRC.'
+ SourceColumnName AS COL
FROM [DimensionMappingTable]
WHERE SourceTableName = @SourceTableName
AND BusinessKeyFlg = 0) D
PRINT @SetColumns
SET @SQL='Create Procedure ['+@DimensionSchema+'].[usp_Load_'+@DimensionTableName+'] AS BEGIN '+@SQL
+ ' INSERT (' + @DimensionColumns + ')'
+ ' VALUES (' + @SourcColumns + ')' + '
WHEN MATCHED
AND ('
+ @ColumnCompare + ')' + '
THEN UPDATE
SET ' + @SetColumns
+ ';
END'
PRINT @SQL
SET @BigSQL = 'USE ' + @DimensionDatabaseName + '; EXEC sp_executesql N''' + @SQL + '''';
EXEC (@BigSQL)
FETCH NEXT FROM db_cursor INTO
@SourceDatabaseName,
@SourceSchema,
@SourceTableName,
@SourceColumnName,
@DimensionDatabaseName,
@DimensionSchema,
@DimensionTableName,
@DimensionColumnName
END
CLOSE db_cursor
DEALLOCATE db_cursor
Final Output :
Once you run above Code, two stored procedures will be created with below names. I am going to provide the definition for one. The Stored procedure can be used in SQL Server Agent Job or We can use these Stored procedure in SSIS Package to load our SCD1 Dimension on daily basis.
- usp_Load_Client_SCD1
- usp_Load_MyTable_SCD
Definition of "usp_Load_MyTable_SCD" Stored procedure created by above script.
USE [TEST1]
GO
/****** Object: StoredProcedure [Dim].[usp_Load_MyTable_SCD]
Script Date: 08/06/2013 20:06:49 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [Dim].[Usp_load_mytable_scd]AS
BEGIN
MERGE [Test1].[dim].[MyTable_SCD] AS DST
USING [Test1].[dbo].[MyTable] AS SRC
ON ( SRC.ID = DST.DimID )
WHEN NOT MATCHED THEN
INSERT (DimID,
DimName)
VALUES (SRC.ID,
SRC.Name)
WHEN MATCHED AND (SRC.Name <> DST.DimName) THEN
UPDATE SET DST.DimName = SRC.Name;
END