Use Distinct in Cassandra - Cassandra / CQL Tutorial

How to use Distinct in Cassandra Query Language

In cassandra you can only select the distinct records from Partition Key column or columns.  If Partition key consists of multiple columns, you have to provide all of the columns otherwise you will get an error.


CQLSH:techbrotherstutorials>SELECT DISTINCT partitioncolumn1, 
from            TableName;



Let's create employee table by using below CQL Statement.

CQLSH:techbrotherstutorials>CREATE TABLE employee 
                          employeeid INT , 
                          fname TEXT, 
                          lname TEXT, 
                          address TEXT, 
                          age TINYINT, 
                          PRIMARY KEY((employeeid,fname),lname) 
             WITH clustering 
ORDER BY     ( 
                          lname DESC);

Notice that I have primary key with columns employeeid,fname and lname. Our partiton key consist of employeeid and fname. We can insert duplicate records in partiton key but all the records will be unique according to primary key (employeeid,fname,lname).

INSERT INTO employee 
            VALUES (1,'John','Doe','ABC Address',23);
INSERT INTO employee 
            VALUES (2,'Robert','Ladson',' Address',40);
INSERT INTO employee 
            VALUES (3,'M','Raza','New Address',38);
            INSERT INTO employee 
            VALUES (3,'M','Ali','New Address',38);  


 I inserted 4 records in table, we have duplicate record 3 and 'M'  by partition key columns. If we want to get distinct records from Partition Key column, we can use below statement. 

CQLSH:techbrotherstutorials>SELECT DISTINCT employeeid, 
FROM            employee;



 employeeid | fname
          3 |      M
          1 |   John
          2 | Robert

(3 rows)