
Introduction to NoSQL

THS & MSS


Overview
What is NoSQL?
Categories of NoSQL.
Business drivers.
RDBMS vs. NoSQL.

IBAD - 2014 - MSS 2


What is NoSQL?
What do you think it is?

NoSQL is a set of concepts that allows the rapid
and efficient processing of data sets, with a focus
on performance, reliability, and agility.

It is not the opposite of the relational world!



Ok, so, what is NoSQL actually?
It's more than rows.
It's free of joins.
It's schema-free.
It's distributed.
It's not about the language, SQL.
It's not always about the cloud.

Not only SQL



Categories of NoSQL



Business drivers

Volume,
Velocity,
Agility,
Variability.



RDBMS vs. NoSQL: pros & cons



RDBMS, pros:
ACID transactions: Atomicity, Consistency,
Isolation, Durability.
Security on columns and rows using views.
Most SQL code is portable (standardized).
Typed columns and constraints validate data
before it's added to the database, which
increases data quality.
Entity-relational design and SQL are popular.
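
As a rough illustration of the ACID and constraint points above, here is a minimal sketch using Python's built-in sqlite3 module. The account table, its columns, and the sample rows are hypothetical and chosen only for this example. A CHECK constraint rejects an invalid balance, and the transaction rolls back as a whole.

```python
import sqlite3

# In-memory database; the "account" table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account (owner, balance) VALUES ('ada', 100), ('lin', 50)")
conn.commit()

try:
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("UPDATE account SET balance = balance + 200 WHERE owner = 'lin'")
        # This update would make the balance negative, so the CHECK constraint fails.
        conn.execute("UPDATE account SET balance = balance - 200 WHERE owner = 'ada'")
except sqlite3.IntegrityError:
    rejected = True
else:
    rejected = False

# Both updates were undone together (atomicity).
balances = dict(conn.execute("SELECT owner, balance FROM account"))
```

The first update succeeded on its own, yet it is undone as well, because both statements belong to the same transaction.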



RDBMS, cons:
The object-relational mapping can be complex.
Entity-relationship modeling must be completed
before testing begins.
RDBMSs don't scale out easily when joins are required.
Full-text search requires third-party tools.
It can be difficult to store high-variability data.



NoSQL, pros:
No ER modeling is required, faster development.
Linear scaling takes place as new processing
nodes are added to the cluster.
There's no need for an object-relational
mapping layer.
It's easy to store high-variability data.
Designed for performance through data
distribution.
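
To make the point about high-variability data concrete, here is a small sketch in plain Python. The list named collection is only a stand-in for a document collection, and find is a hypothetical helper, not the API of any particular NoSQL product. Documents with different fields sit side by side without a schema change.

```python
import json

# Stand-in for a schema-free document collection (assumption for this sketch).
collection = []

# Two documents with different shapes; no table definition has to change.
collection.append({"name": "Ada", "email": "ada@example.com"})
collection.append({
    "name": "Lin",
    "phones": ["555-0100", "555-0101"],
    "address": {"city": "Oslo"},
})


def find(docs, **criteria):
    """Return the documents whose fields equal all the given criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


# Each document serializes as-is, nested fields included.
serialized = [json.dumps(d) for d in collection]

# Queries have to tolerate missing fields.
with_phones = [d["name"] for d in collection if "phones" in d]
```

The flexibility has a cost: nothing stops a document from omitting a field, so the reading code must handle that case itself.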



NoSQL, cons:
In many document stores, ACID transactions are
guaranteed only within a single document at the
database level. Other transactions must be
handled at the application level.
Document stores don't provide fine-grained
security at the element level.
NoSQL is relatively less widely adopted.
Each document store has its own nonstandard
query language, so queries are less portable.



Data sharding
Splitting the data into chunks, then distributing
those chunks to neighboring nodes in a
distributed setting.

This action is taken when a node is close to
exceeding its maximum capacity.
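
The idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the Node class, the 80% threshold, the capacity measured in number of records, and the choice to move the upper half of the key range. Real systems choose chunks and target nodes in more elaborate ways.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity: int                       # maximum number of records (hypothetical unit)
    records: dict = field(default_factory=dict)

    def is_nearly_full(self, threshold: float = 0.8) -> bool:
        """True once the node holds at least threshold * capacity records."""
        return len(self.records) >= threshold * self.capacity


def shard(source: Node, target: Node) -> int:
    """Move one chunk (the upper half of the sorted keys) from source to target."""
    keys = sorted(source.records)
    chunk = keys[len(keys) // 2:]
    for key in chunk:
        target.records[key] = source.records.pop(key)
    return len(chunk)


node_a = Node("node-a", capacity=10)
node_b = Node("node-b", capacity=10)
for i in range(8):
    node_a.records[f"user:{i}"] = {"id": i}

# node_a holds 8 of 10 records, so it is close to its capacity.
moved = 0
if node_a.is_nearly_full():
    moved = shard(node_a, node_b)
```

After the split, each node holds four records, and a client needs some way to know which node now owns a given key.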





The CAP theorem [1]
The CAP theorem concerns three properties of a
system working in a distributed setting.
The properties are:
Consistency: all clients reading the same data
get the same query result.
High availability: the system is guaranteed to
respond to every query.
Partition tolerance: the system keeps serving
queries when part of it is disconnected.

The CAP theorem [2]
The theorem, introduced by Brewer in 2000,
states that at most two of the three properties
can be guaranteed at the same time in a
distributed setting.
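
The trade-off can be illustrated with a toy model. The Replica class and the "CP" and "AP" mode labels below are assumptions made for this sketch, not part of any real database. When a partition cuts a replica off from the latest write, it can either refuse to answer (keeping consistency) or answer with possibly stale data (keeping availability).

```python
class Replica:
    """Toy replica that may be cut off from the node holding the latest write."""

    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (labels used only in this sketch)
        self.value = "v1"         # last value this replica received
        self.partitioned = False

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than risk returning stale data.
            raise RuntimeError("unavailable: cannot confirm the latest value")
        # Availability first: answer with whatever is stored locally.
        return self.value


def read_during_partition(mode):
    replica = Replica(mode)
    # The link is cut; assume another node has since accepted the write "v2".
    replica.partitioned = True
    try:
        return replica.read()
    except RuntimeError:
        return None


cp_result = read_during_partition("CP")   # no answer, but never a stale one
ap_result = read_during_partition("AP")   # an answer, which is stale here
```

Neither choice is wrong in itself. Which one fits depends on whether the application can tolerate stale reads or rejected requests.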



EOF
