TSQL- Generate Scripts for Stored Procedures for Slowly Changing Dimensions Type 1 [TSQL Merge Statement Load SCD1]


Scenario: 

Sometime we have to create Stored Procedure/s to Load Slowly Changing Dimension  Type 1 (SCD1) table/s. If we have one or two tables it is not a big deal to write script for those but if we have more tables it take whole lot time to create scripts for each of dimension table.

Solution : 

For this Scenario we are going to use Merge Statement to load Slowly Changing Dimension Type 1(SCD1). Our goal is not to write any code but only to provide information in Mapping table for Source and Dimension and below script will create Stored procedure for each of the table that we have provided in our Mapping table.

The mapping table will have below columns :

SourceDatabaseName --> From which database we are loading our Dimension
SourceTableName --> What is the name of our Source Table.
SourceColumnName --> Name of Column from Source Table
BusinessKeyFlt --> Either the column is used as business Key in Dimension or not
DimensionDatabaseName --> Name of the Database where our Dimension table is going to be.
DimensionTableName --> Name of Dimension Table.
DimensionColumnName --> Column name in Dimension Table

We will be creating a Stored procedure in DimensionDatabase with name usp_Load_DimensionTableName for each of the dimension table.

Step 1: 
Create Mapping Table
--drop table dbo.DimensionMappingTable
CREATE TABLE dbo.DimensionMappingTable
  ( DimensionMappingTableID INT IDENTITY(1,1),
     SourceDatabaseName    VARCHAR(100),
     SourceSchema          VARCHAR(100),
     SourceTableName       VARCHAR(100),
     SourceColumnName      VARCHAR(100),
     BusinessKeyFlg        BIT,
     DimensionDatabaseName VARCHAR(100),
     DimensionSchema       VARCHAR(100),
     DimenionTableName     VARCHAR(100),
     DimensionColumnName   VARCHAR(100)
  )


Step 2: 
Let's load dbo.DimensionMappingTable with some sample data. Provide the Source Table and Dimension table and all other related information for which you want to create stored procedure. I have create couple of sample tables with some sample columns and loaded metadata information for those tables in Mapping Table.

INSERT INTO [TEST1].[dbo].[DimensionMappingTable]
           ([SourceDatabaseName]
           ,[SourceSchema]
           ,[SourceTableName]
           ,[SourceColumnName]
           ,[BusinessKeyFlg]
           ,[DimensionDatabaseName]
           ,[DimensionSchema]
           ,[DimenionTableName]
           ,[DimensionColumnName])

SELECT *FROM   (SELECT N'Test1'       AS [SourceDatabaseName],
               N'dbo'         AS [SourceSchema],
               N'Client'      AS [SourceTableName],
               N'ClientName'  AS [SourceColumnName],
               N'0'           AS [BusinessKeyFlg],
               N'Test1'       AS [DimensionDatabaseName],
               N'dbo'         AS [DimensionSchema],
               N'Client_SCD1' AS [DimenionTableName],
               N'ClientName'  AS [DimensionColumnName]
        UNION ALL
        SELECT N'Test1'       AS [SourceDatabaseName],
               N'dbo'         AS [SourceSchema],
               N'Client'      AS [SourceTableName],
               N'Country'     AS [SourceColumnName],
               N'0'           AS [BusinessKeyFlg],
               N'Test1'       AS [DimensionDatabaseName],
               N'dbo'         AS [DimensionSchema],
               N'Client_SCD1' AS [DimenionTableName],
               N'Country'     AS [DimensionColumnName]
        UNION ALL
        SELECT N'Test1'       AS [SourceDatabaseName],
               N'dbo'         AS [SourceSchema],
               N'Client'      AS [SourceTableName],
               N'ID'          AS [SourceColumnName],
               N'1'           AS [BusinessKeyFlg],
               N'Test1'       AS [DimensionDatabaseName],
               N'dbo'         AS [DimensionSchema],
               N'Client_SCD1' AS [DimenionTableName],
               N'BusinessKey' AS [DimensionColumnName]
        UNION ALL
        SELECT N'Test1'       AS [SourceDatabaseName],
               N'dbo'         AS [SourceSchema],
               N'Client'      AS [SourceTableName],
               N'Town'        AS [SourceColumnName],
               N'0'           AS [BusinessKeyFlg],
               N'Test1'       AS [DimensionDatabaseName],
               N'dbo'         AS [DimensionSchema],
               N'Client_SCD1' AS [DimenionTableName],
               N'Town'        AS [DimensionColumnName]
        UNION ALL
        SELECT N'Test1'       AS [SourceDatabaseName],
               N'dbo'         AS [SourceSchema],
               N'MyTable'     AS [SourceTableName],
               N'ID'          AS [SourceColumnName],
               N'1'           AS [BusinessKeyFlg],
               N'Test1'       AS [DimensionDatabaseName],
               N'dim'         AS [DimensionSchema],
               N'MyTable_SCD' AS [DimenionTableName],
               N'DimID'       AS [DimensionColumnName]
        UNION ALL
        SELECT N'Test1'       AS [SourceDatabaseName],
               N'dbo'         AS [SourceSchema],
               N'MyTable'     AS [SourceTableName],
               N'Name'        AS [SourceColumnName],
               N'0'           AS [BusinessKeyFlg],
               N'Test1'       AS [DimensionDatabaseName],
               N'dim'         AS [DimensionSchema],
               N'MyTable_SCD' AS [DimenionTableName],
               N'DimName'     AS [DimensionColumnName]) t;

Step 2:
Use this Script to Create Stored Procedure/Procedures for given tables in [dbo].[DimensionMappingTable]

DECLARE @SourceDatabaseName VARCHAR(100)
DECLARE @SourceSchema VARCHAR(100)
DECLARE @SourceTableName VARCHAR(100)
DECLARE @SourceColumnName VARCHAR(100)
DECLARE @DimensionDatabaseName VARCHAR(100)
DECLARE @DimensionSchema VARCHAR(100)
DECLARE @DimensionTableName VARCHAR(100)
DECLARE @DimensionColumnName VARCHAR(100)

DECLARE db_cursor CURSOR FOR
SELECT [SourceDatabaseName],
         [SourceSchema],
         [SourceTableName],
         [SourceColumnName],
         [DimensionDatabaseName],
         [DimensionSchema],
         [DimenionTableName],
         [DimensionColumnName]
  FROM   dbo.DimensionMappingTable
  WHERE  BusinessKeyFlg = 1
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @SourceDatabaseName,
 @SourceSchema,
  @SourceTableName,
   @SourceColumnName,
    @DimensionDatabaseName,
     @DimensionSchema,
      @DimensionTableName,
       @DimensionColumnName
WHILE @@FETCH_STATUS = 0
  BEGIN
 
      DECLARE @SQL NVARCHAR(MAX)
      DECLARE @DimensionColumns NVARCHAR(MAX)
      DECLARE @SourcColumns NVARCHAR(MAX)
      DECLARE @ColumnCompare NVARCHAR(MAX)
      DECLARE @SetColumns NVARCHAR(MAX)
      DECLARE @BigSQL NVARCHAR(MAX)
     
      SET @SQL=NULL;
      SET @DimensionColumns=NULL;
      SET @SourcColumns=NULL;
      SET @ColumnCompare=NULL;
      SET @SetColumns=NULL;
      SET @BigSQL=NULL;
--Build Merge Statement

      SET @SQL='MERGE [' + @DimensionDatabaseName + '].['
               + @DimensionSchema + '].[' + @DimensionTableName
               + '] AS DST
USING [' + @SourceDatabaseName + '].['
               + @SourceSchema + '].[' + @SourceTableName
               + ']  AS SRC
ON (SRC.' + @SourceColumnName + ' = DST.'
               + @DimensionColumnName
               + ')
WHEN NOT MATCHED THEN'

      PRINT @SQL

 -- Get Dimension Columns for INSERT PART for Query

      SELECT @DimensionColumns = COALESCE(@DimensionColumns + ', ', '')
                                 + DimensionColumnName
      FROM   [DimensionMappingTable]
      WHERE  SourceTableName = @SourceTableName
      ORDER  BY DimensionMappingTableID

      PRINT @DimensionColumns

 --Get Source Columns for INSERT PART for Query
  
      SELECT @SourcColumns = COALESCE(@SourcColumns + ', ', '') + COL
      FROM   (SELECT 'SRC.' + SourceColumnName AS COL,
                     DimensionMappingTableID
              FROM   [DimensionMappingTable]
              WHERE  SourceTableName = @SourceTableName) D
      ORDER  BY DimensionMappingTableID

      PRINT @DimensionColumns

  --Build Compare Column String for Query
  
      SELECT @ColumnCompare = COALESCE(@ColumnCompare + ' OR ', '') + COL
      FROM   (SELECT 'SRC.' + SourceColumnName + ' <> ' + 'DST.'
                     + DimensionColumnName AS COL
              FROM   [DimensionMappingTable]
              WHERE  SourceTableName = @SourceTableName
                     AND BusinessKeyFlg = 0) D

      PRINT @ColumnCompare

 --Build Set Column String for Query

      SELECT @SetColumns = COALESCE(@SetColumns + ',', '') + COL
      FROM   (SELECT 'DST.' + DimensionColumnName + ' = ' + 'SRC.'
                     + SourceColumnName AS COL
              FROM   [DimensionMappingTable]
              WHERE  SourceTableName = @SourceTableName
                     AND BusinessKeyFlg = 0) D

      PRINT @SetColumns

      SET @SQL='Create Procedure ['+@DimensionSchema+'].[usp_Load_'+@DimensionTableName+'] AS BEGIN '+@SQL
      + ' INSERT (' + @DimensionColumns + ')'
               + ' VALUES (' + @SourcColumns + ')' + '
    WHEN MATCHED
      AND ('
               + @ColumnCompare + ')' + '
      THEN UPDATE
       SET ' + @SetColumns
               + ';
               END'
               PRINT @SQL

     SET @BigSQL = 'USE ' + @DimensionDatabaseName + '; EXEC sp_executesql N''' + @SQL + '''';

    EXEC (@BigSQL)
 

      FETCH NEXT FROM db_cursor  INTO
      @SourceDatabaseName,
       @SourceSchema,
        @SourceTableName,
         @SourceColumnName,
          @DimensionDatabaseName,
           @DimensionSchema,
            @DimensionTableName,
             @DimensionColumnName
  END

CLOSE db_cursor
DEALLOCATE db_cursor

Final Output : 
Once you run above Code, two stored procedures will be created with below names. I am going to provide the definition for one. The Stored procedure can be used in SQL Server Agent Job or We can use these Stored procedure in SSIS Package to load our SCD1 Dimension on daily basis.

  • usp_Load_Client_SCD1
  • usp_Load_MyTable_SCD


Definition of "usp_Load_MyTable_SCD" Stored procedure created by above script.

USE [TEST1]
GO
/****** Object:  StoredProcedure [Dim].[usp_Load_MyTable_SCD] 
        Script Date: 08/06/2013 20:06:49 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [Dim].[Usp_load_mytable_scd]AS
  BEGIN
      MERGE [Test1].[dim].[MyTable_SCD] AS DST
      USING [Test1].[dbo].[MyTable] AS SRC
      ON ( SRC.ID = DST.DimID )
      WHEN NOT MATCHED THEN
        INSERT (DimID,
                DimName)
        VALUES (SRC.ID,
                SRC.Name)
      WHEN MATCHED AND (SRC.Name <> DST.DimName) THEN
        UPDATE SET DST.DimName = SRC.Name;
  END


 
 


1 comment: