Cassandra
Distributed:
Every node in the cluster has the same role, so there is no single point of failure.
The data set is distributed across the cluster, and there is no master node: any
node can service any request.
Scalability:
Cassandra is designed so that read and write throughput both increase as new
machines are added, without downtime or interruption to applications.
Fault-tolerance:
Data is automatically replicated to multiple nodes for fault tolerance. If a node
fails, it can be replaced with no downtime.
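As a minimal sketch of how replication is configured, the statement below creates a
keyspace whose data is kept on three nodes. The keyspace name demo and the
replication factor of 3 are assumptions chosen for illustration, and SimpleStrategy
is generally suited to single-datacenter or test clusters.

```cql
-- Hypothetical keyspace, for illustration only.
-- Each row written to tables in this keyspace is stored on 3 nodes.
CREATE KEYSPACE demo
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- Make it the default keyspace for the statements that follow.
USE demo;
```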
MapReduce Support:
Cassandra integrates with Hadoop and supports MapReduce. Apache Hive and Apache
Pig are also supported.
Query Language:
Cassandra introduced CQL (Cassandra Query Language). It's a simple
interface for accessing Cassandra.
Data types in CQL:
The following built-in data types are commonly used for columns in Cassandra Query
Language (CQL):
Text: Represents variable-length Unicode character string data. It's used for
storing textual data.
Int: Represents a 32-bit signed integer value. It's used for storing whole numbers
within a specific range.
Boolean: Represents a boolean value (true or false). It's used for storing binary
truth values.
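A minimal sketch of these types in a table definition follows. The table name
accounts and its columns are hypothetical, and the statements assume a keyspace has
already been selected.

```cql
-- Hypothetical table, for illustration only.
CREATE TABLE accounts (
    account_id INT PRIMARY KEY,  -- 32-bit signed integer
    owner_name TEXT,             -- variable-length character string
    is_active  BOOLEAN           -- true or false
);

-- String literals use single quotes; booleans are written true / false.
INSERT INTO accounts (account_id, owner_name, is_active)
VALUES (1, 'John Doe', true);
```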
Collection in Cassandra:
In Cassandra, collections are data structures that allow the storage of
multiple values within a single column. They are useful for scenarios
where you need to handle multiple related pieces of data together. Cassandra
supports three collection types: list, set, and map.
Example: LIST<text>.
Using these collection types, you can embed complex data structures within a
single column of a Cassandra table.
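The sketch below shows one column of each collection type. The table name
user_profiles, its columns, and the sample values are all hypothetical and chosen
only for illustration.

```cql
-- Hypothetical table, for illustration only.
CREATE TABLE user_profiles (
    user_id       INT PRIMARY KEY,
    phone_numbers LIST<TEXT>,       -- ordered, duplicates allowed
    tags          SET<TEXT>,        -- unique values
    preferences   MAP<TEXT, TEXT>   -- key-value pairs
);

-- List literals use [ ], set literals use { }, map literals use { key: value }.
INSERT INTO user_profiles (user_id, phone_numbers, tags, preferences)
VALUES (123, ['555-0100', '555-0101'], {'admin', 'beta'}, {'theme': 'dark'});

-- Append to the list and add to the set without rewriting the whole column.
UPDATE user_profiles
SET phone_numbers = phone_numbers + ['555-0102'],
    tags = tags + {'verified'}
WHERE user_id = 123;
```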
Here's an example of updating data:
Let's say we have a table called users with columns user_id, name, and email.
To update the email of a user with user_id = 123, the CQL query would look like
this:
Query: UPDATE users SET email = 'newemail@example.com' WHERE user_id = 123;
This query updates the email column for the user with user_id 123 to
'newemail@example.com'. You can modify multiple columns in a single UPDATE
query by listing several comma-separated assignments in the SET clause.
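A sketch of a multi-column update on the same users table follows; the new name and
email values are illustrative. In CQL, the assignments are separated by commas
within one SET clause.

```cql
-- Update two columns of the same row in one statement.
UPDATE users
SET name  = 'Jane Doe',
    email = 'jane@example.com'
WHERE user_id = 123;
```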
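There are two types of DELETE operations in Cassandra: deleting an entire row by
primary key, or deleting specific columns from a row.
To delete an entire row, you'd use a query like this:
Query: DELETE FROM users WHERE user_id = 123;
This query removes the entire row where user_id is 123 from the users table.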
To delete specific columns for a particular row, you'd use a query like this:
Query: DELETE email FROM users WHERE user_id = 123;
This query removes the email column's value for the user with user_id 123. The
row remains intact, but the email value is deleted.
To create a table in Cassandra, you use the CREATE TABLE statement.
Let's create a simple table called users with columns for user_id, name, and
email.
Query: CREATE TABLE users (user_id INT PRIMARY KEY, name TEXT, email TEXT);
This query creates a table named users with three columns: user_id, name, and
email. The user_id column is designated as the primary key.
To insert data into the users table, you'd use the INSERT statement. For example:
Query: INSERT INTO users (user_id, name, email) VALUES (123, 'John Doe',
'john@example.com');
Here's an example of retrieving data:
To retrieve all columns for a specific user_id (let's assume user_id = 123), the
query would be:
Query: SELECT * FROM users WHERE user_id = 123;
This query fetches all columns (user_id, name, email) for the user with user_id
123.
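When only some columns are needed, they can be named in place of the asterisk. A
minimal sketch against the same users table:

```cql
-- Fetch only the name and email columns for one user.
SELECT name, email FROM users WHERE user_id = 123;
```

Filtering on the primary key (user_id here) lets Cassandra locate the row directly.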