
NoSQL Databases

By,
Dr. Zartasha Baloch
zartasha.baloch@faculty.muet.edu.pk
OBJECTIVES
In this lecture we will learn about:
❖ What is NoSQL
❖ Types of Databases
❖ Document oriented database
❖ Graph based database
❖ Column based database
❖ Key value database
UNDERSTANDING SQL AND NOSQL DATABASES
When it comes to managing data, there are two main types of databases:
🢝 SQL
🢝 NoSQL.

While both types of databases are used to store and organize data, they differ in
their structure, scalability, and query complexity.

DEPARTMENT OF COMPUTER SYSTEMS ENGINEERING, MUET JAMSHORO. 3


WHAT IS SQL?
Structured Query Language, or SQL, is the standard language for relational database
management systems (RDBMS), which use tables, columns, and rows to organize data.
SQL databases are highly structured, supporting ACID transactions and referential
constraints.

ACID transactions ensure data integrity, with properties that guarantee Atomicity,
Consistency, Isolation, and Durability.
🢝 Atomicity ensures that all parts of a transaction are completed or none are,
🢝 Consistency ensures that the database remains in a valid state before and after a transaction,
🢝 Isolation ensures that concurrent transactions do not interfere with each other, and
🢝 Durability ensures that once a transaction is committed, it will remain committed even if there is a
system failure.

Referential constraints maintain the consistency of data by enforcing relationships
between tables.
🢝 For example, if you have a table of customers and a table of orders, you can enforce a referential
constraint that ensures that each order is associated with a valid customer.
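
The customer/order constraint and the atomicity property above can be sketched with
Python's built-in sqlite3 module. This is only an illustration under assumptions: the
table layout, the sample customer and the helper names build_shop_db, place_orders and
count_orders are hypothetical and do not come from the lecture.

```python
import sqlite3


def build_shop_db() -> sqlite3.Connection:
    """Create an in-memory database with a customers and an orders table."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " item TEXT NOT NULL)"
    )
    # Hypothetical sample data: one valid customer with id 1.
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ayesha')")
    conn.commit()
    return conn


def place_orders(conn: sqlite3.Connection, rows: list) -> bool:
    """Insert all rows in one transaction; return False if any row is rejected."""
    try:
        # 'with conn' commits on success and rolls back on an exception,
        # so either every order is stored or none is (atomicity).
        with conn:
            conn.executemany(
                "INSERT INTO orders (customer_id, item) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        # Raised when an order refers to a customer that does not exist.
        return False


def count_orders(conn: sqlite3.Connection) -> int:
    """Return the number of rows currently stored in the orders table."""
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Calling place_orders with one order for customer 1 and one for a non-existent customer
stores neither order, because the failed foreign-key check rolls back the whole
transaction.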



WHAT IS NOSQL?
A NoSQL (originally referring to "non-SQL" or "non-relational") database provides a
mechanism for storage and retrieval of data that is modeled by means other than the tabular
relations used in relational databases.
It encompasses a wide variety of database technologies that were developed in
response to a rise in the volume of data stored about users, objects and products, the
frequency with which this data is accessed, and performance and processing needs.
Consider a blogging application that stores user blogs. Now suppose that you have to add
new features such as users liking blog posts, commenting on them, or liking those
comments. With a typical RDBMS implementation, this usually requires changes to the
existing database design, such as new tables or columns and a schema migration.
With NoSQL, you can more easily modify your data structure to match such agile
requirements: you can start inserting the new data into your existing structure without
first defining any new columns or structure.
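
A minimal sketch of the blogging example, using plain Python dictionaries to stand in for
schema-less records. The names blog_posts, add_post, like_post and add_comment are
hypothetical; a real NoSQL database would provide its own API for this.

```python
from typing import Any

# A hypothetical in-memory "collection": a list of free-form records.
blog_posts: list[dict[str, Any]] = []


def add_post(author: str, title: str) -> dict[str, Any]:
    """Store a post using only the fields the first version of the app knew about."""
    post: dict[str, Any] = {"author": author, "title": title}
    blog_posts.append(post)
    return post


def like_post(post: dict[str, Any], user: str) -> None:
    """New feature: the 'likes' field is created on first use, with no migration."""
    post.setdefault("likes", []).append(user)


def add_comment(post: dict[str, Any], user: str, text: str) -> dict[str, Any]:
    """New feature: comments are nested inside the post and can be liked as well."""
    comment: dict[str, Any] = {"user": user, "text": text, "likes": []}
    post.setdefault("comments", []).append(comment)
    return comment
```

Posts created before the new features simply lack the likes and comments fields, so old
and new records can sit side by side in the same collection.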

CHALLENGES OF RDBMS

RDBMS assumes a well-defined structure of data and assumes that the data is
largely uniform.

It needs the schema of your application and its properties (columns, types, etc.)
to be defined up-front before building the application. This does not match well
with the agile development approaches for highly dynamic applications.

As the data grows larger, you typically have to scale your database vertically, i.e. by
adding more capacity to the existing servers.

KEY DIFFERENCES BETWEEN SQL AND NOSQL DATABASES
The main differences between SQL and NoSQL databases include data structure, scalability, and
query complexity.

SQL databases are highly structured and use a strict schema to define the relationships between
tables. This makes them ideal for handling complex queries and supporting ACID transactions.
However, SQL databases can be less flexible than NoSQL databases, as changing the schema
can be time-consuming and difficult.

NoSQL databases, on the other hand, typically do not enforce a fixed schema, and data can be
added or changed easily. This makes them well suited to large and varied data sets that require
high scalability and availability. However, NoSQL databases can be less suitable for complex
queries, as many of them offer limited support for joins and other advanced query operations.

Ultimately, the choice between SQL and NoSQL databases depends on the specific needs of your
application. If you need to handle complex queries and require ACID transactions, a SQL
database may be the best choice. If you need to handle large and varied data sets with high
scalability and availability, a NoSQL database may be the way to go.



FEATURES OF NOSQL
➢ Does not follow the relational model
➢ Does not use tables with fixed columns
➢ Schema free
➢ Provides a shared-nothing environment
➢ Scalability
➢ Runs on low-cost commodity hardware
➢ Fast performance for simple access patterns



Sr. No. / Key / SQL / NoSQL

1. Type
   SQL: Generally classified as a relational database, i.e. RDBMS.
   NoSQL: Known as a non-relational or distributed database.

2. Language
   SQL: Uses Structured Query Language for CRUD operations. Data is stored in a structured
   form, and complex operations can be expressed as complex SQL queries.
   NoSQL: Has a dynamic schema for unstructured data. Data need not be structured and can be
   stored in document-oriented, column-oriented, graph-based or key-value form. Query syntax
   varies from database to database.

3. Scalability
   SQL: Typically scales vertically: capacity is extended on a single server by adding
   RAM, CPU or SSD storage.
   NoSQL: Typically scales horizontally: capacity is extended by installing new servers
   alongside the existing ones, which makes it a preferable choice for large or
   ever-changing data sets.

4. Internal implementation
   SQL: Follows the ACID properties: Atomicity, Consistency, Isolation and Durability.
   NoSQL: Design trade-offs are usually described with Brewer's CAP theorem, which concerns
   Consistency, Availability and Partition tolerance.

5. Performance and suited for
   SQL: Best suited for complex queries, but not preferred for large hierarchical data
   storage.
   NoSQL: Less suited for complex queries, because the query facilities are usually not as
   powerful as SQL, but best suited for large hierarchical data storage.

6. Examples
   SQL: Available as both freely available and commercial products, e.g. Postgres, MySQL
   and SQLite are freely available, while Oracle is commercial.
   NoSQL: Available as both open-source and proprietary products. Well-known implementations
   include MongoDB, Bigtable, Redis, RavenDB, Cassandra, HBase, Neo4j and CouchDB.
BENEFITS OF NOSQL OVER RDBMS
Schema Less
🢝 NoSQL databases being schema-less do not define any strict data structure.

Dynamic and Agile
🢝 NoSQL databases adapt well to changing requirements. They can handle structured,
semi-structured and unstructured data.

Scales Horizontally
🢝 In contrast to SQL databases which scale vertically, NoSQL scales horizontally by adding more servers and using
concepts of sharding and replication. This behavior of NoSQL fits with the cloud computing services such as Amazon
Web Services (AWS) which allows you to handle virtual servers which can be expanded horizontally on demand.

Better Performance
🢝 Many NoSQL databases claim to deliver faster performance than traditional RDBMS
implementations, particularly for simple lookups at large scale.
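
The sharding idea behind horizontal scaling can be sketched as follows. This is a
simplified, assumed design: shard_for and ShardedStore are hypothetical names, each
"server" is just a Python dictionary, and replication is left out.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Spreads keys over several shards; each dict stands in for one server."""

    def __init__(self, num_shards: int) -> None:
        self.shards: list[dict] = [{} for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        # The key alone decides which shard holds the value.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        # A lookup touches only the one shard that owns the key.
        return self.shards[shard_for(key, len(self.shards))].get(key, default)
```

With plain modulo sharding, changing the number of shards moves most keys to a different
shard. Production systems often use schemes such as consistent hashing to limit that
movement.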

As for limitations, since NoSQL is an entire set of databases (and not a single database), the
limitations differ from database to database. Some of these databases do not support ACID
transactions, while others may be lacking in reliability. But each one has its own strengths,
which make it well suited to specific requirements.

WHAT DO NOSQL DATABASES HAVE IN COMMON?

Each NoSQL database has its own unique features. At a high level, many NoSQL
databases have the following features:

▪ They are distributed databases based on a shared-nothing architecture.
▪ They can easily be scaled out horizontally, depending on the volumes of data.
▪ They have a flexible schema.
▪ They process structured, semi-structured and unstructured data.
▪ They store data in a format different from relational databases, and hence are
non-relational.



TYPES OF NOSQL DATABASES

❖ Document oriented database
❖ Graph based database
❖ Column based database
❖ Key value database
KEY VALUE DATABASES

The key of a key/value pair is a unique value in the set and can be easily looked
up to access the data.
Key/value stores are of varied types: some keep the data in memory, and some
provide the capability to persist the data to disk.
A simple, yet powerful, key/value store is Oracle’s Berkeley DB.

KEY VALUE DATABASES
Key-value databases are the simplest type of NoSQL database.

These NoSQL databases have a dictionary data structure that consists of a set of
objects that represent fields of data.
🢝 Each object is assigned a unique key.
🢝 To retrieve data stored in a particular object, you need to use a specific key.
🢝 In turn, you get the value assigned to the key. This value can be a number, a string, or even
another set of key-value pairs.

Unlike traditional relational databases, key-value databases do not require a
predefined structure. They offer more flexibility when storing data and have
faster performance.
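
A minimal sketch of the key-value model described above, assuming an in-memory dictionary
as the storage engine. KeyValueStore and its methods are hypothetical names; the JSON
snapshot only hints at how such a store could be persisted to disk.

```python
import json


class KeyValueStore:
    """In-memory key-value store: every value is reached through its unique key."""

    def __init__(self) -> None:
        self._data: dict = {}

    def put(self, key: str, value) -> None:
        # The value may be a number, a string, or a nested set of key-value pairs.
        self._data[key] = value

    def get(self, key: str, default=None):
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        """Remove a key; return True if it existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False

    def to_json(self) -> str:
        """Serialise the store; writing this string to a file would persist it."""
        return json.dumps(self._data)

    @classmethod
    def from_json(cls, text: str) -> "KeyValueStore":
        """Rebuild a store from a snapshot produced by to_json."""
        store = cls()
        store._data = json.loads(text)
        return store
```

The snapshot assumes that keys are strings and that values are JSON-serialisable.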

DOCUMENT-ORIENTED DATABASE

A document-oriented database is a computer program designed for storing, retrieving, and
managing document-oriented, or semi-structured, data.
Documents inside a document-oriented database are like records or rows in relational
databases.
Unlike a relational database, it does not store data in tables with uniform-sized fields for
each record.
These databases treat a document as a whole and avoid splitting a document into its constituent
name/value pairs. At a collection level, this allows a diverse set of documents to be put
together in a single collection. Document databases allow documents to be indexed not only on
their primary identifier but also on their properties.
A few examples are MongoDB, CouchDB, DocumentDB and Oracle NoSQL Database.
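
Indexing documents on a property as well as on the primary identifier can be sketched as
below. DocumentCollection is a hypothetical class that indexes a single field chosen at
construction time; real document databases expose richer indexing than this.

```python
from collections import defaultdict
from typing import Any


class DocumentCollection:
    """Stores whole documents by id and keeps a secondary index on one property."""

    def __init__(self, indexed_field: str) -> None:
        self.indexed_field = indexed_field
        self._docs: dict[str, dict[str, Any]] = {}
        # Maps a property value to the ids of the documents carrying that value.
        self._index: dict[Any, set] = defaultdict(set)

    def insert(self, doc_id: str, doc: dict[str, Any]) -> None:
        # If the id is being reused, drop its old index entry first.
        old = self._docs.get(doc_id)
        if old is not None and self.indexed_field in old:
            self._index[old[self.indexed_field]].discard(doc_id)
        self._docs[doc_id] = doc
        # Documents without the indexed field are stored but not indexed.
        # Indexed values are assumed to be hashable (strings, numbers, ...).
        if self.indexed_field in doc:
            self._index[doc[self.indexed_field]].add(doc_id)

    def get(self, doc_id: str):
        """Look a document up by its primary identifier."""
        return self._docs.get(doc_id)

    def find_by_field(self, value: Any) -> list:
        """Look documents up by the indexed property, ordered by document id."""
        return [self._docs[i] for i in sorted(self._index.get(value, ()))]
```

Documents in the same collection may have different fields, which mirrors the diverse
set of documents in a single collection mentioned above.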
COLUMN BASED DATABASES

Column-oriented storage allows sparse data to be stored efficiently.
It avoids consuming space when storing nulls by simply not storing a column
when a value doesn't exist for that column.
Each unit of data can be thought of as a set of key/value pairs, where the unit
itself is identified with the help of a primary identifier, often referred to as the
primary key.
Bigtable and its clones tend to call this primary key the row-key.
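
The sparse storage idea can be sketched as a mapping from row-key to the columns that row
actually has. ColumnFamilyStore is a hypothetical name, and the nested dictionary is only
an assumed stand-in for the on-disk layout of a real column store.

```python
class ColumnFamilyStore:
    """Each row-key maps to its own set of column/value pairs."""

    def __init__(self) -> None:
        self._rows: dict[str, dict[str, object]] = {}

    def put(self, row_key: str, column: str, value: object) -> None:
        # Only columns that have a value are ever stored for a row.
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key: str, column: str, default=None):
        # A missing column costs no space; it is simply absent from the row.
        return self._rows.get(row_key, {}).get(column, default)

    def columns(self, row_key: str) -> list:
        """Return the names of the columns present in one row, sorted."""
        return sorted(self._rows.get(row_key, {}))

    def stored_cells(self) -> int:
        """Count the cells actually stored across all rows."""
        return sum(len(cols) for cols in self._rows.values())
```

Two rows with different columns occupy only as many cells as they have values, whereas a
fixed relational table would hold a null for every missing column.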

GRAPH BASED DATABASES

A graph database uses graph structures with nodes, edges, and properties to
represent and store data. By one common definition, a graph database is any storage
system that provides index-free adjacency.
This means that every element contains a direct pointer to its adjacent elements,
so no index lookups are necessary to move from one element to the next.
General graph databases that can store any graph are distinct from specialized
graph databases such as triple-stores and network databases.
Indexes are typically used only to find the starting nodes of a traversal; the
traversal itself then follows the direct pointers.
Some examples include Amazon Neptune, Neo4j, OrientDB, and RedisGraph.
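
Index-free adjacency can be sketched with nodes that hold direct references to their
neighbours. Node and reachable are hypothetical names, and Python object references are
assumed here as a stand-in for the direct pointers of a real graph store.

```python
from collections import deque


class Node:
    """A graph node with properties and direct references to adjacent nodes."""

    def __init__(self, name: str, **properties) -> None:
        self.name = name
        self.properties = properties
        self.edges: list = []  # list of (edge label, neighbouring Node)

    def connect(self, label: str, other: "Node") -> None:
        """Add a directed, labelled edge from this node to another node."""
        self.edges.append((label, other))


def reachable(start: Node, label: str) -> list:
    """Breadth-first traversal along edges with the given label.

    Returns the names of the nodes reached, in visit order, excluding start.
    Each step follows a stored reference; no index is consulted.
    """
    seen = {id(start)}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for edge_label, neighbour in node.edges:
            if edge_label == label and id(neighbour) not in seen:
                seen.add(id(neighbour))
                order.append(neighbour.name)
                queue.append(neighbour)
    return order
```

The traversal cost depends on the number of edges visited rather than on the total size
of the data set, which is the usual argument for this storage model.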

POPULAR NOSQL DATABASES
Let us summarize some popular NoSQL databases that fall into each of the above
categories.
Document Oriented Databases − MongoDB, CouchDB, Amazon SimpleDB, etc.
Graph Based Databases − Neo4j, OrientDB, FlockDB, etc.
Column Based Databases − HBase, Cassandra, Hypertable, etc.
Key Value Databases − Membase, Redis, MemcacheDB, etc.

CAP THEOREM
❖ The CAP theorem can be used to explain some of the competing requirements in a
distributed system with replication.
❖ A distributed system can deliver only two of three desired characteristics:
consistency, availability, and partition tolerance (the 'C,' 'A' and 'P' in CAP).



CAP THEOREM
❖ Consistency:
It means that all clients see the same data at the same time, no
matter which node they connect to. For this to happen, whenever
data is written to one node, it must be instantly forwarded or
replicated to all the other nodes in the system before the write is
deemed ‘successful.’
❖ Availability:
This means that any client making a request for data gets a
response, even if one or more nodes are down. Another way to
state this—all working nodes in the distributed system return a
valid response for any request, without exception.
❖ Partition tolerance:
A partition is a communications break within a distributed
system—a lost or temporarily delayed connection between two
nodes. Partition tolerance means that the cluster must continue to
work despite any number of communication breakdowns between
nodes in the system.



CAP THEOREM
❖ CP database: A CP database delivers consistency and partition tolerance at the
expense of availability. When a partition occurs between any two nodes, the
system has to shut down the non-consistent node (i.e., make it unavailable) until
the partition is resolved.

❖ AP database: An AP database delivers availability and partition tolerance at
the expense of consistency. When a partition occurs, all nodes remain available
but those at the wrong end of a partition might return an older version of data
than others. (When the partition is resolved, the AP databases typically resync
the nodes to repair all inconsistencies in the system.)

❖ CA database: A CA database delivers consistency and availability across all
nodes. It can't do this if there is a partition between any two nodes in the
system, however, and therefore can't deliver fault tolerance.
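
The CP and AP behaviour described above can be sketched with a toy two-node cluster. The
names TwoNodeCluster, Replica, is_available and heal are hypothetical, and the shared
version counter is a simplification that a real partitioned system would not have.

```python
class Unavailable(Exception):
    """Raised when a replica refuses to serve a request."""


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.value = None
        self.version = 0


class TwoNodeCluster:
    """Toy cluster with one value, two replicas, and a 'CP' or 'AP' policy."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.primary = Replica("primary")
        self.secondary = Replica("secondary")
        self.partitioned = False
        self._clock = 0  # simplification: a global logical clock for versions

    def is_available(self, replica: Replica) -> bool:
        # CP: during a partition the secondary is taken offline so that it
        # can never return data that differs from the primary.
        return not (
            self.partitioned and self.mode == "CP" and replica is self.secondary
        )

    def write(self, replica: Replica, value) -> None:
        if not self.is_available(replica):
            raise Unavailable(replica.name)
        self._clock += 1
        # Without a partition the write reaches both replicas; with one,
        # it reaches only the replica the client is connected to.
        targets = [replica] if self.partitioned else [self.primary, self.secondary]
        for target in targets:
            target.value, target.version = value, self._clock

    def read(self, replica: Replica):
        if not self.is_available(replica):
            raise Unavailable(replica.name)
        return replica.value  # AP: may be stale while partitioned

    def heal(self) -> None:
        """Resolve the partition and resync both replicas to the newest version."""
        self.partitioned = False
        newest = max((self.primary, self.secondary), key=lambda r: r.version)
        for replica in (self.primary, self.secondary):
            replica.value, replica.version = newest.value, newest.version
```

In CP mode the secondary rejects requests while partitioned. In AP mode it keeps answering
but may return the older value until heal() is called.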



BASE PROPERTIES OF NOSQL
❖ NoSQL relies upon a softer model known as the BASE model: Basically Available,
Soft state, Eventual consistency.

❖ Basically Available: Guarantees the availability of the data. There will be a response to
any request (which can be a failure too).

❖ Soft state: The state of the system could change over time, even without new input.

❖ Eventual consistency: The system will eventually become consistent once it stops
receiving input.

❖ NoSQL databases give up the A, C and/or D requirements of ACID, and in return they improve
scalability.
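
Eventual consistency can be sketched as replicas that acknowledge a write locally and
pass it on later. EventuallyConsistentStore is a hypothetical class; the last-write-wins
rule and the single version counter are assumptions of this sketch, and real systems use
a range of conflict-resolution schemes.

```python
from collections import deque


class EventuallyConsistentStore:
    """Replicas accept writes locally and exchange them asynchronously."""

    def __init__(self, replica_count: int) -> None:
        # Each replica maps key -> (version, value).
        self.replicas: list[dict] = [{} for _ in range(replica_count)]
        self._pending: deque = deque()  # undelivered (target, key, version, value)
        self._version = 0  # simplification: one counter orders all writes

    def _apply(self, index: int, key: str, version: int, value) -> None:
        # Last write wins: an older version never overwrites a newer one.
        current = self.replicas[index].get(key)
        if current is None or version > current[0]:
            self.replicas[index][key] = (version, value)

    def write(self, replica_index: int, key: str, value) -> None:
        """Apply the write at one replica and queue it for the others."""
        self._version += 1
        self._apply(replica_index, key, self._version, value)
        for index in range(len(self.replicas)):
            if index != replica_index:
                self._pending.append((index, key, self._version, value))

    def read(self, replica_index: int, key: str):
        entry = self.replicas[replica_index].get(key)
        return None if entry is None else entry[1]

    def propagate(self) -> int:
        """Deliver all queued updates; return how many were delivered."""
        delivered = 0
        while self._pending:
            index, key, version, value = self._pending.popleft()
            self._apply(index, key, version, value)
            delivered += 1
        return delivered

    def is_consistent(self) -> bool:
        return all(replica == self.replicas[0] for replica in self.replicas)
```

Between a write and the next propagate() call, different replicas may answer the same
read differently (soft state). Once writes stop and the queue drains, all replicas agree.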
CONCLUSION

In this lecture, we learnt what NoSQL database technology is and how it
primarily differs from an RDBMS implementation. We then explored various
types of NoSQL databases, their applications and some of the most popular
databases of each type.
A lot of organizations today are adopting such databases for their huge
datasets and high-scale applications. This suggests that NoSQL has the potential
to become the next big thing in web and database technologies and to challenge
the years-long dominance of the RDBMS.
