Scenario:You are working as SQL Server developer. You are asked to provide sample random data from dbo.Customer table. You might asked to provide random 100 rows or some percent of total data from table. What query you will use to provide required output?
Solution:In previous post, we learnt how to get top n rows from table. We can either provide row count or percent of records we want to get from a table by using TOP in our select query.
We can use the same Top clause but as we are asked to provide the random records, we need to sort them randomly first. We can use newid() function in order by clause to sort them randomly.
Let's create dbo.Customer table with some sample data.
Create table dbo.Customer (Id int, FName VARCHAR(50), LName VARCHAR(50), CountryShortName CHAR(2)) GO insert into dbo.Customer Values ( 1,'Raza','M','PK'), (2,'Rita','John','US'), (3,'Sukhi','Singh',Null), (4,'James','Smith','CA'), (5,'Robert','Ladson','US'), (6,'Alice','John','US')
1) use NewID ( ) in Order by to get random records
Let's say if we are interested to get 3 random records from dbo.Customer table, we can use below query.
Select top 3 * From dbo.Customer order by NEWID()
|How to get Random records from SQL Server Table - SQL Server / TSQL Tutorial|
Select top 30 percent * From dbo.Customer order by NEWID()
|How to get random records from SQL Server table by using Top Percent with NewID() - SQL Server / TSQL Tutorial|
2) By using TABLESAMPLE SYSTEM
As per Microsoft Books Online "TABLESAMPLE SYSTEM returns an approximate percentage of rows and generates a random value for each physical 8-KB page in the table. Based on the random value for a page and the percentage specified in the query, a page is either included in the sample or excluded. Each page that is included returns all rows in the sample result set".
From here you can understand that if you have small table with few pages , you might not want to use TableSample as it will include or exclude the entire page. With few records in table, you might want to use method 1 and for large tables you can use TableSample.
If I would run below query on my dbo.Customer table, Sometime I will get no records and when get the records, it will return all the records as they are placed on single page.
Select * From dbo.Customer tablesample (30 percent)You can also use Rows you want with table sample as shown below. The rows returned can vary. you can limit them by using top n in select query.
Select * From dbo.Customer tablesample (2 rows)
When I executed above query on dbo.Customer table with total of 6 rows. It either returned me no rows or all six of them.
If you want to limit, you can use below query. Once again, I will suggest to use TableSample with big table where you have data on multiple data pages.
Select top 2 * From dbo.Customer tablesample (2 rows)