Remove Special Characters from String and Convert to Multiple Columns by using ParseName Function - SQL Server / TSQL

Scenario:


Sometime we get the data in one column with some kind of delimiter (such as comma, pipe, space etc.) and also this string contains some special characters such as (.,./,$,#,@ etc.).
The current problem is related to the string that has space separator and some special characters. Our goal is to split this string into First Name, Middle Name and Last Name by using space as separator.

Solution:


Step 1: 
Let's create test table with some sample data

CREATE TABLE dbo.colsplit 
  ( 
     fullname VARCHAR(300) 
  ) 

go 

INSERT INTO dbo.colsplit 
VALUES      ( 'CHARLES. ROSAMILIA.') 

go 

INSERT INTO dbo.colsplit 
VALUES      ( 'CHARLES. .cc ROSAMILIA. @') 

go 

INSERT INTO dbo.colsplit 
VALUES      ( 'CHARLES. HANS. ROSAMILIA /') 

go 

INSERT INTO dbo.colsplit 
VALUES      ( 'CHARLES ROSAMILIA') 

go 

SELECT * 
FROM   dbo.colsplit 

How to Remove Special Characters from string and convert single value to multiple values in TSQL


Step 2:
Let's create the function that we can use to remove special characters from the string in SQL Server.

CREATE FUNCTION [dbo].[RemoveSpecialCharacters] ( @InputString NVARCHAR(max) ) 
returns VARCHAR(max) 
BEGIN 
  DECLARE @OutputString NVARCHAR(max ) 
  IF (Isnull(@InputString,'')='') 
  SET @OutputString = @InputString 
  ELSE 
  BEGIN 
    DECLARE @Length      INT 
    DECLARE @Counter  INT 
    DECLARE @CharCode INT 
    SET @OutputString = '' 
    SET @Length = Len(@InputString) 
    SET @Counter = 1 
    WHILE @Counter <= @Length
    BEGIN 
      SET @CharCode = Ascii(Substring(@InputString, @Counter, 1)) 
      IF @CharCode=32 
      OR 
      @CharCode BETWEEN 48 
      AND 
      57 
      OR 
      @CharCode BETWEEN 65 
      AND 
      90 
      OR 
      @CharCode BETWEEN 97 
      AND 
      122 
      SET @OutputString = @OutputString   + Char(@CharCode) 
      SET @Counter = @Counter + 1 
    END 
  END 
  RETURN @OutputString  
END



Step 3: 
Let's write out final query, we will use
RemoveSpecialCharacters function to remove special characters.
ParseName builtin function to split the Full Name depending upon ''( space).
Wrote Case Statement to adjust middle name to last name where there is no last name.

;WITH cte 
     AS (SELECT Reverse(Parsename(Replace(Reverse( 
                                  dbo.[Removespecialcharacters](fullname)), ' ', 
                                             '.'), 1 
                        )) AS [FirstName], 
                Reverse(Parsename(Replace(Reverse( 
                                  dbo.[Removespecialcharacters](fullname)), ' ', 
                                  '.'), 2 
                        )) AS [MiddleName], 
                Reverse(Parsename(Replace(Reverse( 
                                  dbo.[Removespecialcharacters](fullname)), ' ', 
                                  '.'), 3 
                        )) AS [LastName] 
         FROM   dbo.colsplit) 
SELECT firstname, 
       CASE 
         WHEN lastname IS NULL THEN NULL 
         ELSE middlename 
       END AS MiddleName, 
       CASE 
         WHEN lastname IS NULL 
              AND middlename IS NOT NULL THEN middlename 
         ELSE lastname 
       END AS LastName 
FROM   cte 


The above query can be changed according to the requirement. We can also write a user function to split string into multiple columns. Here we were required to split First Name,Middle Name and Last Name and that can be handled with built in function (ParseName).
How to Split single Column into Multiple Columns in TSQL - SQL Server Tutorial

2 comments: