Cassandra - Datastax Java Driver - Treselle Systems


9/26/2019 Cassandra – Datastax Java Driver | Treselle Systems

Cassandra – Datastax Java Driver


Big Data Treselle Engineering

Introduction

The NoSQL phenomenon has become the center of attraction in the past few years because there is a
rising demand to accommodate high volumes of real-time data. Hence, major internet companies have
popularized the use of data storage solutions that differ from traditional RDBMS.

Apache Cassandra

One good solution for data storage is Apache Cassandra, a distributed database management system
originally developed by Facebook. Cassandra combines a schema-flexible data model (from Google’s
BigTable) with a fully distributed, shared-nothing design (from Amazon’s Dynamo). This structure offers
high availability, linear scalability and high performance while relaxing some consistency guarantees.

This blog shows how to interact with Cassandra through the Datastax Java driver and how to design a
data model that suits our application.

Why prefer Datastax over other drivers:

The Datastax driver is one of the Java client drivers for Apache Cassandra. It works exclusively with
the Cassandra Query Language version 3 (CQL3), which is similar to SQL, and with Cassandra’s binary
protocol. CQL3 is considered a simpler and better-suited API for Cassandra than the Thrift API. Other
Cassandra client drivers appear more complex, both when interacting with Cassandra and when writing
queries.

Use Cases
Let’s work through a use case that starts with basic DML operations on Cassandra, using the Datastax
Java driver.

What we need to do:

Pre-requisites
Understand the terms and data storage structure of Cassandra
Design a data model that is convenient to read quickly
Create a Java program to do DML operations

www.treselle.com/blog/cassandra-datastax-java-driver/ 1/7

Solution

Pre-requisites:

Cassandra must be running on the node where we insert the data.
If you need help installing Cassandra or creating a keyspace and column family, refer to
https://wiki.apache.org/cassandra/GettingStarted.
Get the latest Datastax driver from
http://mvnrepository.com/artifact/com.datastax.cassandra/cassandra-driver-core
JDK 1.6 or later
Read the following to understand the terms, data storage structure,
and different types of Data Model in Cassandra

The best way to approach data modeling for Cassandra is to start with the queries and work backwards
from there. Think about the actions our application needs to perform, how we want to access the data,
and then design column families to support those access patterns.

To understand the Cassandra data model, we need to relate the conventions of an RDBMS
(Relational Database Management System) to the corresponding naming structure in Cassandra.

The following table shows the terms used in RDBMS and Cassandra

RDBMS

The number of columns is limited.
Column names are the same for every row of a table.
A column name does not store any value.

Cassandra

If we are familiar with Java, it is easy to understand how Cassandra stores data.
It stores the data as a map of maps: an outer map keyed by a row key, and an inner map keyed by a
column name/key, where both maps are sorted.

SortedMap<RowKey, SortedMap<ColumnKey, ColumnValue>>

Column names/keys can vary from row to row within a column family, but the column datatype is the same
for the entire column family. This is because the column name/key itself stores a value.

In Cassandra, there is no practical limit on the number of columns, as we can store billions of columns
under a single row key. Such rows are called wide rows.
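
The map-of-maps picture above can be tried out in plain Java. The following is a minimal sketch, not Cassandra code: the class name WideRowSketch, the row keys and the column names are hypothetical, and TreeMap only stands in for Cassandra's sorted storage.

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class WideRowSketch {

    // Outer map: row key -> columns; inner map: column name/key -> column value.
    // Both are TreeMaps, so keys stay sorted, as in the model described above.
    private final SortedMap<String, SortedMap<String, String>> columnFamily =
            new TreeMap<String, SortedMap<String, String>>();

    // Store one column value under a row key, creating the row on first use.
    public void put(String rowKey, String columnKey, String value) {
        SortedMap<String, String> row = columnFamily.get(rowKey);
        if (row == null) {
            row = new TreeMap<String, String>();
            columnFamily.put(rowKey, row);
        }
        row.put(columnKey, value);
    }

    // Return the sorted columns of one row, or an empty map if the row is absent.
    public SortedMap<String, String> row(String rowKey) {
        SortedMap<String, String> row = columnFamily.get(rowKey);
        return row == null ? new TreeMap<String, String>() : row;
    }

    public static void main(String[] args) {
        WideRowSketch cf = new WideRowSketch();
        cf.put("row2", "colB", "v3");
        cf.put("row1", "colZ", "v2");
        cf.put("row1", "colA", "v1");
        // Each row holds its own set of column names, returned in sorted order.
        System.out.println(cf.row("row1")); // {colA=v1, colZ=v2}
        System.out.println(cf.row("row2")); // {colB=v3}
    }
}
```

Note that the two rows carry different column names, which is the property that distinguishes this layout from a relational table.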

Before designing the data model, keep the following in mind:

Avoid thinking in terms of relational tables
Model the column families around query patterns
De-normalize and duplicate for read performance
There are many ways to model data in Cassandra
Indexing is no longer an afterthought
Think of the physical storage structure

Design a Data Model in a convenient way to retrieve it fast

cqlsh> CREATE KEYSPACE dmlkeyspace WITH replication = { 'class': 'SimpleStrategy
cqlsh> USE dmlkeyspace;
cqlsh> CREATE TABLE dmloperations ( dept_name text, emp_id int, emp_name text,

Create a Java Program to do the DML operation


DMLOperations.java
package com.treselle.cassandra.dml;

import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class DMLOperations {

    // Queries and connection settings loaded from queries.properties
    private static Properties properties;

    Cluster cluster = null;
    Session session = null;

    /**
     * Disconnect from the current cluster.
     */
    public void disconnect() {
        // shutdown() is the driver 1.x API; newer driver releases use close() instead
        this.session.shutdown();
        this.cluster.shutdown();
        System.out.println("Disconnected!!");
    }

    /**
     * Connect to the given node and keyspace.
     *
     * @param ip       address of a Cassandra node
     * @param keySpace keyspace to use for the session
     */
    public void connect(String ip, String keySpace) {
        this.cluster = Cluster.builder().addContactPoints(ip).build();
        this.session = cluster.connect(keySpace);
        System.out.println("Connected!!");
    }

    /**
     * Select and print all the rows of the column family.
     */
    public void selectALL() {
        PreparedStatement prepared = this.session.prepare(properties
                .getProperty("SELECT_ALL"));
        BoundStatement boundStatement = new BoundStatement(prepared);
        ResultSet rs = this.session.execute(boundStatement);

        for (Row row : rs) {
            System.out.println(row.toString());
        }
    }

    /**
     * Insert one row into the column family.
     *
     * @param deptName     value for dept_name (text)
     * @param employeeId   value for emp_id (int)
     * @param employeeName value for emp_name (text)
     */
    public void insertAll(String deptName, int employeeId, String employeeName) {
        PreparedStatement prepared = this.session.prepare(properties
                .getProperty("INSERT_ALL"));
        BoundStatement boundStatement = new BoundStatement(prepared);
        // Values are bound in the order of the ? markers in the query
        this.session.execute(boundStatement.bind(deptName, employeeId,
                employeeName));
    }

    /**
     * Update the employee name of the row identified by department and id.
     *
     * @param deptName     dept_name of the row to update
     * @param employeeName new value for emp_name
     * @param id           emp_id of the row to update
     */
    public void update(String deptName, String employeeName, int id) {
        PreparedStatement prepared = this.session.prepare(properties
                .getProperty("UPDATE_NAME"));
        BoundStatement boundStatement = new BoundStatement(prepared);
        // UPDATE_NAME expects emp_name first, then dept_name and emp_id
        this.session.execute(boundStatement.bind(employeeName, deptName, id));
    }

    /**
     * Delete the row identified by department and id.
     *
     * @param deptName dept_name of the row to delete
     * @param id       emp_id of the row to delete
     */
    public void delete(String deptName, int id) {
        PreparedStatement prepared = this.session.prepare(properties
                .getProperty("DELETE_EMP"));
        BoundStatement boundStatement = new BoundStatement(prepared);
        this.session.execute(boundStatement.bind(deptName, id));
    }

    /**
     * Load the values from a file into a Properties object.
     *
     * @param propertiesFileName file name without the .properties extension
     * @return the loaded java.util.Properties object
     */
    private Properties loadProperties(String propertiesFileName) {
        Properties prop = new Properties();
        FileInputStream in = null;
        try {
            in = new FileInputStream(propertiesFileName + ".properties");
            prop.load(in);
        } catch (IOException ex) {
            ex.printStackTrace();
        } finally {
            // Close the stream explicitly; try-with-resources needs Java 7
            if (in != null) {
                try {
                    in.close();
                } catch (IOException ignored) {
                    // nothing useful to do if close fails
                }
            }
        }

        return prop;
    }

    public static void main(String[] args) {
        DMLOperations object = new DMLOperations();
        properties = object.loadProperties("queries");
        object.connect(properties.getProperty("SERVER_IP"),
                properties.getProperty("keyspace"));
        // Plain decimal literals: a leading zero would make these octal
        object.insertAll("bigdata", 3, "sam");
        object.insertAll("bigdata", 5, "santhosh");
        object.insertAll("java", 4, "joe");
        System.out.println("Inserted ");
        object.selectALL();
        object.update("bigdata", "samKrish", 3);
        System.out.println("Updated ");
        object.selectALL();
        object.delete("bigdata", 3);
        System.out.println("Deleted");
        object.selectALL();
        object.disconnect();
    }

}

queries.properties
SELECT_ALL=SELECT * FROM dmloperations;
INSERT_ALL = insert into dmloperations (dept_name, emp_id, emp_name ) VALUES (?, ?, ?);
UPDATE_NAME = update dmloperations SET emp_name =? where dept_name=? and emp_id=?;
DELETE_EMP = delete from dmloperations where dept_name=? and emp_id=?;
SERVER_IP=127.0.0.1
keyspace=dmlkeyspace
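
The program reads every query and connection setting from this file, and Properties.getProperty returns null for a key that is missing, so a typo in a key name would only surface later. The following is a minimal sketch of a fail-fast loader. QueryConfig and its load method are hypothetical names, not part of the program above or of the Datastax driver, and inline text stands in for the file so the sketch runs on its own.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

public class QueryConfig {

    // Load properties from a Reader and fail fast if any required key is
    // missing or blank. QueryConfig and load are hypothetical names.
    static Properties load(Reader source, String... requiredKeys) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException ex) {
            throw new IllegalStateException("Could not read properties", ex);
        }
        for (String key : requiredKeys) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalStateException("Missing property: " + key);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Inline text stands in for queries.properties so the sketch runs alone.
        String text = "SERVER_IP=127.0.0.1\nkeyspace=dmlkeyspace\n";
        Properties props = load(new StringReader(text), "SERVER_IP", "keyspace");
        System.out.println(props.getProperty("keyspace")); // dmlkeyspace
        try {
            load(new StringReader(text), "SELECT_ALL");
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage()); // Missing property: SELECT_ALL
        }
    }
}
```

In the real program the Reader would wrap the properties file, and the required keys would be the six names listed above.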

Challenges:

Throws java.lang.UnsupportedClassVersionError: org/apache/cassandra/transport/FrameCompressor :
Unsupported major.minor version 51.0 when connecting with a higher version of Cassandra combined
with a lower version of the Datastax Java driver. Class file version 51.0 corresponds to Java 7, so
the JVM reports this error when it is older than the Java version the Cassandra classes were
compiled for.

Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/c
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClassCond(Unknown Source)
        at java.lang.ClassLoader.defineClass(Unknown Source)
        at java.security.SecureClassLoader.defineClass(Unknown Source)
        at java.net.URLClassLoader.defineClass(Unknown Source)
        at java.net.URLClassLoader.access$000(Unknown Source)
        at java.net.URLClassLoader$1.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at com.datastax.driver.core.Cluster$Builder.<init>(Cluster.java:288)
        at com.datastax.driver.core.Cluster.builder(Cluster.java:107)

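Solution: Check the version of the Datastax Java driver on the classpath and update it to one that
matches the Cassandra version in use. Also make sure the JVM running the program is at least as new
as the Java version the jars were compiled for.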
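
The "major.minor version 51.0" in the message can be compared with what the running JVM supports. The following is a minimal sketch under the assumption that the standard java.class.version system property is available; ClassVersionCheck and its helper methods are hypothetical names used only for illustration.

```java
public class ClassVersionCheck {

    // Map a class file major version to the Java release that produces it.
    // For example 50 is Java 6, 51 is Java 7 and 52 is Java 8.
    static int javaReleaseFor(int classFileMajor) {
        return classFileMajor - 44;
    }

    // Parse the major part of a "major.minor" version string such as "51.0".
    static int majorOf(String classVersion) {
        int dot = classVersion.indexOf('.');
        return Integer.parseInt(dot < 0 ? classVersion : classVersion.substring(0, dot));
    }

    public static void main(String[] args) {
        // Highest class file version the running JVM can load.
        String supported = System.getProperty("java.class.version");
        int required = majorOf("51.0"); // version named in the error message
        System.out.println("JVM supports class files up to " + supported);
        if (majorOf(supported) < required) {
            System.out.println("Need Java " + javaReleaseFor(required) + " or later");
        } else {
            System.out.println("This JVM can load version 51.0 classes");
        }
    }
}
```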

Throws NoHostAvailableException when the Cassandra version of the jars on the classpath does not
match the version of the running Cassandra instance.

Exception in thread "main" com.datastax.driver.core.exceptions.NoHostAvailableE
        at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:179)
        at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:77)
        at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:868)
        at com.datastax.driver.core.Cluster$Manager.newSession(Cluster.java:888)
        at com.datastax.driver.core.Cluster$Manager.access$200(Cluster.java:792)
        at com.datastax.driver.core.Cluster.connect(Cluster.java:155)
        at com.datastax.driver.core.Cluster.connect(Cluster.java:174)

Solution: Change the Cassandra jars on the classpath to match the version of the Cassandra instance
we are trying to connect to.
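
Since NoHostAvailableException is raised when the driver cannot reach any contact point, it can also help to rule out a plain connectivity problem before looking at versions. The following is a minimal sketch using only the JDK. ContactPointCheck is a hypothetical helper, and port 9042 is an assumption (the usual default for Cassandra's native protocol) that may differ in a given installation.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ContactPointCheck {

    // Return true if a TCP connection to host:port succeeds within the timeout.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException ex) {
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // nothing useful to do if close fails
            }
        }
    }

    public static void main(String[] args) {
        // 127.0.0.1 is the SERVER_IP from queries.properties; 9042 is assumed.
        System.out.println(isReachable("127.0.0.1", 9042, 2000));
    }
}
```

A result of true only shows that something is listening on that port; it does not confirm that the driver and server versions are compatible.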

Throws com.datastax.driver.core.exceptions.InvalidTypeException: Invalid type for value 0 of CQL type
text, expecting class java.lang.String but class java.lang.Long provided. This happens when the Java
type of a bound value does not match the data type of the corresponding column.

Exception com.datastax.driver.core.exceptions.InvalidTypeException:
Invalid type for value 0 of CQL type text, expecting class java.lang.String but
        at com.datastax.driver.core.BoundStatement.bind(BoundStatement.java:185)
        ..2 more

Solution: Check that the Java type of each value passed to BoundStatement.bind matches the CQL type
of the corresponding column in the query.
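
The mismatch can be caught before the values reach the driver. The following is a minimal sketch, assuming the column types of the dmloperations table (dept_name text, emp_id int, emp_name text) and the usual Java counterparts of those CQL types (String for text, Integer for int). BindTypeCheck and firstMismatch are hypothetical names, not part of the Datastax driver.

```java
public class BindTypeCheck {

    // Return the index of the first value whose Java class differs from the
    // expected class at the same position, or -1 if all values match.
    static int firstMismatch(Class<?>[] expected, Object... values) {
        if (expected.length != values.length) {
            throw new IllegalArgumentException("Expected " + expected.length
                    + " values but got " + values.length);
        }
        for (int i = 0; i < values.length; i++) {
            if (values[i] != null && !expected[i].isInstance(values[i])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed types for INSERT_ALL: dept_name text, emp_id int, emp_name text.
        Class<?>[] insertTypes = { String.class, Integer.class, String.class };
        System.out.println(firstMismatch(insertTypes, "bigdata", 3, "sam"));  // -1
        System.out.println(firstMismatch(insertTypes, 3L, 3, "sam"));         // 0
        System.out.println(firstMismatch(insertTypes, "bigdata", 3L, "sam")); // 1
    }
}
```

The second call reproduces the situation in the exception above: a Long is supplied at position 0, where a text column expects a String.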

Conclusion
We are able to connect to Cassandra through the Datastax Java driver and complete DML operations
using CQL (Cassandra Query Language) in Java, which is about as easy as using a JDBC driver with
MySQL.

Reference
Apache Cassandra: http://wiki.apache.org/cassandra/GettingStarted
Cassandra and Datastax: http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html
CQL Queries: https://cassandra.apache.org/doc/cql3/CQL.html