Topic: How to Find Duplicate Records in Kusto Query Language.
In this
article, we are going to learn how to find duplicate records in table by using Kusto
query Language, that's very simple and easy to find the
duplicate records, you can easily learn how to find duplicate records by using
the below provided scripts. Kusto Query Language is a powerful tool to explore your data and discover patterns, identify anomalies and outliers, create statistical modeling, and more. The query uses schema entities that are organized in a hierarchy similar to SQL's: databases, tables, and columns. A Kusto query is a read-only request to process data and return results.
// How to find duplicate records in table by using Kusto Query
//create Table and Insert Sample Data in Azure Data Explorer DB for Testing
.drop table TotalSale
.create table TotalSale (
id: int
,SalePersonFName: string
,SalePersonLName : string
,ProductName : string
,ItemsSold : int
,SoldPrice :real
,SoldDate: date
,City : string
,State : string
,Country : string
,Region : string
)
//Insert data
.ingest inline into table TotalSale <|
11,Tamara,Tony,Cell Phone,2,1200,2015-03-03,Frankfurt,Hesse,Germany,Europe
9,Petra,Henry,TV,10,5000,2015-04-08,Paris,Île-de-France,France,Europe
3,Christy,Ladson,TV,3,1600,2015-04-02,High Point,NC,USA,North America
7,Chirag,Patel,Cell Phone,5,1500,2015-06-23,AhmadAbad,Gujrat,India,Asia
2,M,Raza,Cell Phone,2,800,2015-07-15,Charlotte,NC,USA,North America
5,Najaf,Ali,Computer,1,300,2015-06-20,Karachi,Sindh,Pakistan,Asia
6,Sukhjeet,Singh,TV,2,900,2015-06-21,ChandiGar,Punjab,India,Asia
4,John,Rivers,Laptop,5,2400,2014-03-09,Jersey City,NJ,USA,North America
8,Aleena,Aman,Laptop,2,800,2015-05-25,Lahore,Punjab,Pakistan,Asia
10,Rita,Roger,Laptop,7,2100,2015-04-11,Paris,Île-de-France,France,Europe
1,Aamir,Shahzad,TV,1,700,2015-07-15,Charlotte,NC,USA,North America
12,aamir,Shahzad,TV,1,7000,2015-07-15,Charlotte,NC,USA,North America
10,Rita,Roger,Laptop,7,2100,2015-04-11,Paris,Île-de-France,France,Europe
// Find duplicate records by
TotalSale
| summarize count() by id,SalePersonFName,SalePersonLName
| where count_ >1
TotalSale
| summarize rn=count() by tolower(SalePersonFName),tolower(SalePersonLName)
| where rn>1
Excellent post with all the necessary details JoeMcKinley
ReplyDeleteBut right now, I'm looking for the top.