Get Row Count from Table in Cassandra - Cassandra / CQL Tutorial

How to get Row Count from Table in Cassandra by using CQL

You can use Count(*) for Count(1) to get row numbers from table.

Syntax:

SELECT Count(* |1 )
FROM tablename;

Example:

Let's create employee table by using below CQL Statement.

CQLSH:techbrotherstutorials>CREATE TABLE employee
             (
                          employeeid INT ,
                          fname TEXT,
                          lname TEXT,
                          address TEXT,
                          age TINYINT,
                          PRIMARY KEY((employeeid,fname),lname)
             )
             WITH clustering
ORDER BY     (
                          lname DESC);

Insert sample data by using CQL insert statement

INSERT INTO employee
(employeeid,fname,lname,address,age)
VALUES (1,'John','Doe','ABC Address',23);
INSERT INTO employee
(employeeid,fname,lname,address,age)
VALUES (2,'Robert','Ladson',' Address',40);
INSERT INTO employee
(employeeid,fname,lname,address,age)
VALUES (3,'M','Raza','New Address',38);
INSERT INTO employee
(employeeid,fname,lname,address,age)
VALUES (3,'M','Ali','New Address',38);

Get Total Row Count from Table

If we want to get total row count from employee table, we can use below statement

CQLSH:techbrotherstutorials>SELECT Count(*)
FROM employee;

CQLSH:techbrotherstutorials>SELECT Count(1)
FROM employee;

Output

count
-------
4

(1 rows)

Warnings :
Aggregation query used without partition key

Get Row Count with Where Clause:

You can use where clause in your Select query when geting the row count from table.If you are using where clause with partition keys , you will be good but if you try to use where clause with non partition key columns you will get a warning and will have to use Allow Filtering in select query to get row count. This can be expensive operation if you have large table.

Let's get row count where employeeid=3 and fname in ('M','J'), Remeber employeeid and fname are the partition key columns.

CQLSH:techbrotherstutorials>SELECT Count(1)
FROM   employee
WHERE  employeeid=3
AND    fname IN('M',
                'J');

output

count
-------
2

(1 rows)

Warnings :
Aggregation query used on multiple partition keys (IN restriction)

Let's say now if we want to get the row count where age> 25, Notice that Age is not Partition Key column, we have to use Allow Filtering otherwise we will get below warning.

CQLSH:techbrotherstutorials>SELECT Count(*)
FROM employee
WHERE age>25;

Ouput:

Above query is going to throw a warning.

InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"

Let's write the query with allow filtering to get the desired records.

SELECT Count(*)
FROM employee
WHERE age>25 allow filtering;

Output

count
-------
3

(1 rows)

Warnings :
Aggregation query used without partition key

How to Limit returned rows in Cassandra - Cassandra / CQL Tutorial

How to Limit returned rows in Cassandra Query Language

You can use LIMIT option in query to control the number of rows you want to get.

Syntax:

CQLSH:techbrotherstutorials>SELECT column1,
column2,
...FROM tablename LIMIT numberofrow;

Example:

Let's create employee table by using below CQL Statement.

Insert sample data by using CQL insert statement

Now if we want to get only 2 records then we can use below statement. Cassandra is going to return first 2 records.

CQLSH:techbrotherstutorials>SELECT *
FROM employee LIMIT 2;

Output

employeeid | fname | lname | address | age
------------+-------+-------+-------------+-----
3 | M | Raza | New Address | 38
3 | M | Ali | New Address | 38

Use Distinct in Cassandra - Cassandra / CQL Tutorial

How to use Distinct in Cassandra Query Language

In cassandra you can only select the distinct records from Partition Key column or columns. If Partition key consists of multiple columns, you have to provide all of the columns otherwise you will get an error.

Syntax:

CQLSH:techbrotherstutorials>SELECT DISTINCT partitioncolumn1,
partitioncolumn2...
from TableName;

Example:

Let's create employee table by using below CQL Statement.

Notice that I have primary key with columns employeeid,fname and lname. Our partiton key consist of employeeid and fname. We can insert duplicate records in partiton key but all the records will be unique according to primary key (employeeid,fname,lname).

I inserted 4 records in table, we have duplicate record 3 and 'M' by partition key columns. If we want to get distinct records from Partition Key column, we can use below statement.

CQLSH:techbrotherstutorials>SELECT DISTINCT employeeid,
fname
FROM employee;

Output

employeeid | fname
------------+--------
3 | M
1 | John
2 | Robert

(3 rows)

Use Where Clause in Cassandra - Cassandra / CQL Tutorial

How to use Where Clause in Cassandra (CQL)

Where clause in Cassandra Query Language is used to filter the records. To filter the records, you will be using

Partition Key ( Single or composite)
A cluster column with Partition Key
A column on which you have created the index
A column which is not part of Partition Key or no index is created on it, you can use in where clause but you have to use with Allow Filtering.

Allow Filtering:

Allow filtering is not a good choice, as Cassandra will load all the records and then filter the records which you do not need. If you have table with alot of records, the query can be very expensive as resource usage and can take long time to run. Better options could be designing the partition key as per your input queries or create indexes on columns which you need to use in where clause.

Example:

Let's create employee table with composite Parition key( Employeeid,Fname) and cluster column LName and insert sample data.

Insert records in employee table by using CQL

INSERT INTO employee
(employeeid,fname,lname,address,age)
VALUES (1,'John','Doe','ABC Address',23);

INSERT INTO employee
(employeeid,fname,lname,address,age)
VALUES (2,'Robert','Ladson',' Address',40);

INSERT INTO employee
(employeeid,fname,lname,address,age)
VALUES (3,'M','Raza','New Address',38);

INSERT INTO employee
(employeeid,fname,lname,address,age)
VALUES (4,'M','john','New Address',38);

Where Clause with Partition Key:

As we have composite Partition Key ( Employeeid and Fname), we can filter the records by using these columns. If I try to use one of these columns then we will get below error.

We can use Allow Filtering and it will work but remember it is going to bring all the records and then filter the records which we don't need. That can be expensive operation. In our case we have only 3 records so no big deal but think if you have millions of records in table.

Here is right way to get the data from our table, I have two columns as Partition Key so I am using both of them in where clause. if you have single partition key column, you will be using that.

CQLSH:techbrotherstutorials>>SELECT *
FROM   employee
WHERE  employeeid=3
AND    fname='M';

Output

(1 rows)

Use Clustering Column in Where Clause:

From table definition you can see that we have clustering column "lname". To use clustering column properly, we should use Partition key columns first in where and then clustering column to get results fast. If we will use clustering column alone, Cassandra will throw the warning and ask to use Allow Filtering. As discusses above, it can be expensive operation.

Let's write our query by using cluster column. Couple of things to remember,

The values you provide in where clauses are case senstive

CQLSH:techbrotherstutorials>SELECT employeeid,
       fname,
       lname,
       address,
       age
FROM   employee
WHERE  employeeid=1
AND    fname='John'
AND    lname='Doe';

Output

Use Index Column in Where Clause:

Now if we would like to filter the data by using Age, you can use without index on the age column but we have to use Allow Filtering and the can be expensive operation. Let's create index on Age column by using below query and then we will be able to use Age column in where clause without Allowing Filtering.

CQLSH:techbrotherstutorials>CREATE INDEX indexname ON tablename (columnname );

CQLSH:techbrotherstutorials>CREATE INDEX employee_age ON employee (age);

Now we are good to use the age column in our where clause. Below script will get us all the records where age=40.

CQLSH:techbrotherstutorials>SELECT *
FROM employee
WHERE age=40;

Output

Select Data from Table in Cassandra - Cassandra / CQL Tutorial

Select Data from Table in Cassandra

To Select the data from Cassandra , Select statement is used. If you need to select all columns , you can use "*" or provide list of column names. I am not using Where clause and other options in this post, Let's discuss them in next posts.

Syntax:

cqlsh>SELECT * FROM tablename;

cqlsh>SELECT column1, column2, column3, ....FROM tablename;

ALias:

You can also Alias the column names, you will be using "AS AliasName". if you want to use space in Alias Column name, you need to use double quotes around it e.g "My Alias Column Name"

cqlsh>SELECT column1 AS ColumnOne, column2 AS ColumnTwo, column3 AS "Column Three" FROM tablename;

Example:

Let's create employee table by using CQL create statement and insert sample data and then we will use Select statement to query the data.

Insert data into table by using CQL Insert command

INSERT INTO employee
(employeeid,fname,lname,address,age)
VALUES (1,'John','Doe','ABC Address',23);

INSERT INTO employee
(employeeid,fname,lname,address,age)
VALUES (2,'Robert','Ladson',' Address',40);

INSERT INTO employee
(employeeid,fname,lname,address,age)
VALUES (3,'M','Raza','New Address',38);

Let's select all the records from employee table. Employee table in my case is in TechBrothersTutorials keyspace, I am already in TechBrothersTutorial keyspace. If you need to be in keyspace where your table exists, you can use "USE KEYSPACENAME".

CQLSH:techbrotherstutorials>SELECT * FROM employee;

Output

employeeid | fname | lname | address | age

------------+--------+--------+--------------+-----
3 | M | Raza | New Address | 26
1 | John | Doe | ABC Address | 23
2 | Robert | Ladson | Xyz Address | 40

(3 rows)

Let's use column names in our Select query and alias , I am going to fname to FirstName, lname to LastName and address to "Home Address".

CQLSH:techbrotherstutorials>SELECT employeeid,
       fname   AS FirstName,
       lname   AS LastName,
       address AS "Home Address",
       age
FROM   employee;

Output

employeeid | firstname | lastname | Home Address | age

------------+--------+--------+--------------+-----
3 | M | Raza | New Address | 26
1 | John | Doe | ABC Address | 23
2 | Robert | Ladson | Xyz Address | 40

(3 rows)

Welcome To TechBrothersIT

Label

Get Row Count from Table in Cassandra - Cassandra / CQL Tutorial

How to get Row Count from Table in Cassandra by using CQL

Syntax:

Get Total Row Count from Table

Get Row Count with Where Clause:

How to Limit returned rows in Cassandra - Cassandra / CQL Tutorial

How to Limit returned rows in Cassandra Query Language

Syntax:

Example:

Use Distinct in Cassandra - Cassandra / CQL Tutorial

How to use Distinct in Cassandra Query Language

Syntax:

Example:

Use Where Clause in Cassandra - Cassandra / CQL Tutorial

How to use Where Clause in Cassandra (CQL)

Allow Filtering:

Example:

Where Clause with Partition Key:

Use Clustering Column in Where Clause:

Select Data from Table in Cassandra - Cassandra / CQL Tutorial

Select Data from Table in Cassandra

Syntax:

Output

Output