Cassandra - Course Curriculum
About Edureka
Edureka is a leading e-learning platform providing live instructor-led interactive online training.
We cater to professionals and students across the globe in categories like Big Data &
Hadoop, Business Analytics, NoSQL Databases, Java & Mobile Technologies, System Engineering,
Project Management and Programming.
We have an easy and affordable learning solution that is accessible to millions of learners. With our
students spread across countries like the US, India, UK, Canada, Singapore, Australia, Middle East,
Brazil and many others, we have built a community of over 1 million learners across the globe.
Cassandra is a distributed database from Apache that is highly scalable and designed to manage
very large amounts of structured data. The Apache Cassandra Certification Training covers the Gossip
Protocol, database and table operations, node operations in a cluster, managing and monitoring
the cluster, backup/restore and performance tuning, and hosting a Cassandra database on the cloud.
You will also learn to integrate Cassandra with Hadoop, Spark, and Kafka.
Module 1:
Introduction to Big Data and Cassandra
Goal: In this module you will get a brief introduction to Big Data and the problems it creates for traditional Database Management
Systems such as RDBMS. You will also learn how Cassandra solves these problems and understand Cassandra’s features.
Skills:
• Basic concepts of Cassandra
Objectives:
At the end of this module, you will be able to:
• Explain what Big Data is • List the Limitations of RDBMS • Define NoSQL and its Characteristics • Define the CAP Theorem •
Learn Cassandra • List the Features of Cassandra • Get a Tour of Edureka’s VM
Topics:
• Introduction to Big Data and the problems it causes • The 5 Vs: Volume, Variety, Velocity, Veracity and Value • Traditional Database
Management Systems • Limitations of RDBMS • NoSQL databases • Common characteristics of NoSQL databases • CAP theorem • How
Cassandra solves the limitations • History of Cassandra • Features of Cassandra
Hands On:
• Edureka VM tour
Module 2:
Cassandra Data Model
Goal: In this module, you will learn about database models and the similarities between the RDBMS and Cassandra data models. You will also
understand the key database elements of Cassandra and learn about the concept of the Primary Key.
Skills:
• Data Modelling in Cassandra • Data Structure Design
Objectives:
At the end of this module, you will be able to:
• Explain what Database Modelling is and its Features • Describe the Different Types of
Data Models • List the Differences between the RDBMS and Cassandra Data Models • Define the Cassandra Data Model • Explain Cassandra
Database Elements • Implement Keyspace Creation, Updating and Deletion • Implement Table Creation, Updating and Deletion
Topics:
• Introduction to Database Models • Understand the analogy between the RDBMS and Cassandra Data Models • Understand the following Database
Elements: a. Cluster b. Keyspace c. Column Family/Table d. Column
• Column Family Options • Columns • Wide Rows, Skinny Rows • Static and dynamic tables
Hands On:
• Creating Keyspace • Creating Tables
Module 3:
Cassandra Architecture
Goal: Gain knowledge of architecting and creating Cassandra Database Systems. In addition, learn about the complex inner workings of
Cassandra such as Gossip Protocol, Read Repairs and so on.
Skills:
• Cassandra Architecture
Topics:
• Cassandra as a Distributed Database • Key Cassandra Elements: a. Memtable b. Commit log c. SSTables
• Replication Factor • Data Replication in Cassandra • Gossip protocol – Detecting failures • Gossip: Uses • Snitch: Uses • Data Distribution
• Staged Event-Driven Architecture (SEDA) • Managers and Services • Virtual Nodes: Write path and Read path • Consistency level •
Repair • Incremental repair
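To give a flavour of the Replication Factor and Consistency level topics listed above, here is a minimal Python sketch of the standard quorum arithmetic: a quorum is a majority of the replicas, and a read is guaranteed to overlap the most recent write when the number of replicas read plus the number of replicas written exceeds the replication factor. The function names are illustrative only and are not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority for the given replication factor."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # QUORUM reads combined with QUORUM writes overlap on at least one replica.
    print(rf, q, reads_see_latest_write(q, q, rf))
    # Reading and writing a single replica each gives no such guarantee at RF 3.
    print(rf, 1, reads_see_latest_write(1, 1, rf))
```

With a replication factor of 3, reading and writing at quorum touches two replicas each, so the two sets always share at least one node.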
Module 4:
Deep Dive into Cassandra Database
Goal: In this module you will learn about keyspaces and their attributes in Cassandra. You will also create a keyspace, learn how to create a
table, and perform operations such as inserting, updating and deleting data in a table using cqlsh.
Skills:
• Database Operations • Table Operations
Topics:
• Replication Factor • Replication Strategy • Defining columns and data types • Defining a partition key • Recognizing a partition key •
Specifying a descending clustering order • Updating data • Tombstones • Deleting data • Using TTL • Updating a TTL
Hands-on/Demo:
• Create Keyspace in Cassandra • Check Created Keyspace in system_schema.keyspaces • Update Replication Factor of Previously
Created Keyspace • Drop Previously Created Keyspace • Create A Table Using cqlsh • Create A Table Using UUID & TIMEUUID • Create A
Table Using Collection & UDT Column • Create Secondary Index On a Table • Insert Data Into Table • Insert Data into Table with UUID &
TIMEUUID Columns • Insert Data Using COPY Command • Deleting Data from Table
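The partition key and replication topics in this module can be pictured with a toy token ring. The Python sketch below hashes a partition key to a token, finds the first node whose token is greater than or equal to it, and then takes the following nodes clockwise as additional replicas, which is how a simple (non-datacenter-aware) replication strategy behaves. It rests on several assumptions: MD5 stands in for Cassandra's default Murmur3 partitioner, each node holds a single token rather than virtual nodes, and the node names and helper functions are hypothetical.

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Hash a partition key to a token.

    Assumption: MD5 is used only because it ships with Python; Cassandra's
    default partitioner is Murmur3.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def replicas_for(partition_key, ring, replication_factor):
    """Return the node names holding a key.

    `ring` is a list of (token, node_name) pairs sorted by token. The first
    replica is the first node whose token is >= the key's token (wrapping
    around the ring); the rest are the next nodes clockwise.
    """
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, token_for(partition_key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical four-node cluster with one token per node.
ring = sorted((token_for(name), name)
              for name in ("node-a", "node-b", "node-c", "node-d"))

if __name__ == "__main__":
    print(replicas_for("user-42", ring, replication_factor=3))
```

Rows that share a partition key always hash to the same token, so they land on the same set of replicas.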
Module 5:
Node Operations in a Cluster
Goal: Learn how to add nodes to a Cassandra cluster and configure them using the “cassandra.yaml” file. Use Nodetool to remove a node and to bring
a node back into service. In addition, learn why repair matters and how the repair operation works by using the Nodetool repair
command.
Skills:
• Node Operations
Topics:
• Cassandra nodes • Specifying seed nodes • Bootstrapping a node • Adding a node (Commissioning) in a Cluster • Removing
(Decommissioning) a node • Removing a dead node • Repair • Read Repair • What’s new in incremental repair • Run a Repair Operation •
Cassandra and Spark Implementation
Hands On:
• Commissioning a Node • Decommissioning a Node • Nodetool Commands
Module 6:
Managing and Monitoring the Cluster
Goal: The key aspects of monitoring Cassandra are the resources used by each node, response latencies to requests, requests to offline nodes,
and the compaction process. In this module, learn to use Cassandra monitoring tools such as Nodetool and JConsole.
Skills:
• Clustering
Topics:
• Cassandra monitoring tools • Logging • Tailing • Using Nodetool Utility • Using JConsole • Learning about OpsCenter • Runtime Analysis
Tools
Hands On:
• JMX and JConsole • OpsCenter
Module 7:
Backup/Restore and Performance Tuning
Goal: In this module you will learn about the importance of backup and restore functions in Cassandra and create snapshots.
You will learn about hardware selection and performance tuning (configuring log files) in Cassandra. You will also learn about integrating
Cassandra with other frameworks.
Skills:
• Performance tuning • Cassandra Design Principles • Backup and Restoration
Topics:
• Creating a Snapshot • Restoring from a Snapshot • RAM and CPU recommendations • Hardware choices • Selecting storage • Types of
Storage to Avoid • Cluster connectivity, security and the factors that affect distributed system performance • End-to-end performance
tuning of Cassandra clusters against very large data sets • Load balance and streams
Hands On:
• Creating Snapshots • Integration with Kafka • Integration with Spark
Module 8:
Hosting Cassandra Database on Cloud
Goal: In this module you will learn about the design, implementation, and ongoing support of Cassandra operational data. Finally, you will
learn how to host a Cassandra database on the cloud.
Skills:
• Security • Design Implementation • Ongoing support of Cassandra Operational Data
Topics:
• Security • Ongoing Support of Cassandra Operational Data • Hosting a Cassandra Database on Cloud
Hands On:
• Hosting Cassandra Database on Amazon Web Services