
Case Study on Different NoSQL Data Models
Introduction
Relational database models have been used for the storage and retrieval of data since the 1970s.
They provide a suitable platform for recognizing relations between data stored in tabular
form. But as technology progressed, complex data became hard to store at an efficient cost in
a relational model. As databases grew larger, some ended up isolating their data, that is, it
became tedious to share information through such complex systems.

Therefore, in recent years, a new class of database systems emerged, called NoSQL databases.
The name is commonly read as "Not only SQL". These systems were developed as the popularity
of Web 2.0 grew, driven by large interactive websites such as Google, Facebook and Amazon.

NoSQL uses multiple data structures to accommodate information, such as graphs, wide
columns and key-value pairs. The ease of working with NoSQL data models gives them an
advantage over relational models: they are more flexible, they scale horizontally across many
machines rather than depending on a single larger server, and each model can be chosen
according to the problem it must solve.

Data Models of NoSQL


There are various ways to classify NoSQL databases. In this report, NoSQL has been
classified on the basis of its data models, each with its applications. NoSQL can be
classified into the following data models:

 Key value store
 Graph model
 Document store
 Column store

Key Value data model

In this data model, the data is represented as a collection of key-value pairs. It is
designed for storing and retrieving data in associative arrays, also called hashes or
dictionaries. This is a very simple, yet important way of storing NoSQL data. Some key value
stores also keep the keys in lexicographical order. When designing the keys, one has to keep
the access pattern in mind, as it should match the way the given application fetches data. In
all NoSQL systems except graph databases, which model data directly as nodes and edges, data
modelling takes place within the application layer.
This has certain advantages over the relational database system, as there is no query
processing layer. In the key value data model, a value is found directly from its key, so a
lookup involves very little overhead. This works well in distributed database systems. Also,
as query processing does not exist in this model, the database does not have to estimate the
amount of data or evaluate each relation in the database before answering a request.
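
The behaviour described above can be illustrated with a minimal sketch. It assumes an in-memory store; the class name KeyValueStore and its methods put, get and scan are hypothetical and do not correspond to any particular product. Values are looked up directly by key, and the keys are additionally kept in lexicographical order so that a prefix scan is possible.

```python
from bisect import bisect_left, insort


class KeyValueStore:
    """Toy in-memory key-value store (illustrative only)."""

    def __init__(self):
        self._data = {}         # key -> value, found directly by hashing the key
        self._sorted_keys = []  # the same keys, kept in lexicographical order

    def put(self, key, value):
        # Only new keys need to be inserted into the ordered key list.
        if key not in self._data:
            insort(self._sorted_keys, key)
        self._data[key] = value

    def get(self, key, default=None):
        # No query processing: the value is fetched straight from its key.
        return self._data.get(key, default)

    def scan(self, prefix):
        """Return (key, value) pairs whose key starts with prefix, in key order."""
        start = bisect_left(self._sorted_keys, prefix)
        result = []
        for key in self._sorted_keys[start:]:
            if not key.startswith(prefix):
                break
            result.append((key, self._data[key]))
        return result


store = KeyValueStore()
store.put("user:2", "Bob")
store.put("user:1", "Alice")
store.put("order:9", "book")
```

Here the key format ("user:1", "order:9") is chosen by the application to match its access pattern, which is the modelling decision the text places in the application layer.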

Application:

The key value data model is used in the Apache Cassandra database. This is an open source,
distributed NoSQL database intended to handle large data sets across multiple servers or
machines. Its main features are as follows:

 It is decentralized, that is, the data is distributed across various machines, with each
node holding its own share of the data.
 Machines can be added to the distributed network without interrupting running
applications.
 Easy back-up and maintenance process.

Cassandra combines the key value and column database models. The data is stored in tables
and can be partitioned in two ways:

 Random partitioning: It spreads the key value pairs over the network and balances the
partitions evenly.
 Order-preserving partitioning: It partitions the data so that keys with similar values
are kept together, so fewer nodes need to be searched when finding related pairs, although
it can leave the cluster unbalanced.
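
The trade-off between the two partitioning schemes can be sketched as follows. This is an illustration only and not Cassandra's implementation: the node names, the MD5-based placement and the first-letter ranges are all assumptions made for the example.

```python
import hashlib

# Hypothetical three-node cluster.
NODES = ["node-a", "node-b", "node-c"]


def random_partition(key):
    """Place a key by hashing it, which spreads keys evenly on average."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]


def order_preserving_partition(key):
    """Place a key by its first letter, so neighbouring keys share a node."""
    first = key[0].lower()
    if first < "i":
        return NODES[0]
    if first < "r":
        return NODES[1]
    return NODES[2]


keys = ["adam", "alice", "amy", "bob", "zoe"]
by_order = {key: order_preserving_partition(key) for key in keys}
by_hash = {key: random_partition(key) for key in keys}
```

With these keys the order-preserving scheme places four of the five on node-a, which shows how keeping similar keys together can unbalance the nodes, while the hashed placement ignores key order altogether.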

Cassandra has been used by Facebook, IBM, Netflix and Twitter for some of their storage or
search workloads.

Graph data model

This data model is frequently used for storing relations between data in a NoSQL database.
It stores data in a graph structure made up of nodes and edges. In today’s world data is
highly connected in complex ways, and a graph database is efficient at exploring such
information. Because relationships are stored directly as edges, this model removes the need
for joins and foreign keys. Graph databases can also be spread over multiple interconnected
machines, which makes them suitable for cloud platforms. Relational databases, which store
information in tables, have to rebuild such connections through joins instead.
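
A minimal sketch of the idea, assuming a small hypothetical "follows" graph stored as an adjacency list: related nodes are reached by walking edges, with no join step. The function name within_hops is illustrative and not part of any graph database API.

```python
from collections import deque

# Each node maps to the nodes it has an edge to (hypothetical data).
GRAPH = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": [],
}


def within_hops(graph, start, max_hops):
    """Return {node: distance} for nodes reachable from start in at most max_hops edges."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    del seen[start]  # the starting node itself is not a result
    return seen


direct = within_hops(GRAPH, "alice", 1)
two_hops = within_hops(GRAPH, "alice", 2)
```

The breadth-first walk visits each edge at most once, so the cost depends on the neighbourhood explored rather than on the total size of the data set.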

Application:

This type of database model is used in social networking platforms to form communities, to
link the road maps of a region, and to store large amounts of linked data such as the World
Wide Web.
An example of a graph database is Neo4j. It is a database management system that manages
data traversals and query processing. It uses efficient algorithms to search and update large
graph databases and to improve the user experience of managing them.

Neo4j uses a dedicated query language for graph databases called Cypher. It provides a
visual insight into the stored data and expresses relations in a form the end user can easily
comprehend. Neo4j supports the atomicity, consistency, isolation and durability (ACID)
properties, which very few NoSQL databases do. Neo4j also handles web application
databases, which helps in exploring data through linked nodes.

Document store

As the name suggests, this type of model stores data in the form of documents. This is an
important aspect of NoSQL. It is a subgroup of the key value store model in which the only
difference is the way data is processed. In a key value store, the application layer
interprets the data and the value is opaque to the underlying database, whereas in this model
the database relies on the internal structure of each document to extract metadata, which it
uses to manage and query the data.

The document store model differs from the relational model in the following ways:

 It stores the data for an object in a single instance, so mapping objects to storage
becomes easier. In a relational model, by contrast, an object can be stored across
multiple tables and linking the data can become a tedious task.
 Documents are encoded and encapsulated in a particular format for improved
security, whereas relational models need to adopt a different method.
 Document data does not follow a fixed schema. Each document can have its own
structure, unlike the relational model where every record contains the same
attributes, sometimes resulting in empty fields and wasted space.
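
These differences can be illustrated with a small sketch that uses plain Python dictionaries as stand-ins for documents. The customer data, the field names and the helper functions find and lives_in are all hypothetical. Each customer is stored as one nested document, and the second document carries a field the first one lacks.

```python
# Hypothetical documents: one object per document, nested data kept inside it.
customers = [
    {
        "_id": 1,
        "name": "Alice",
        "addresses": [{"city": "Pune"}, {"city": "Delhi"}],
    },
    {
        "_id": 2,
        "name": "Bob",
        "addresses": [{"city": "Delhi"}],
        "loyalty_tier": "gold",  # extra field; no schema change was needed
    },
]


def find(documents, field, value):
    """Return documents whose top-level field equals value; absent fields never match."""
    return [doc for doc in documents if field in doc and doc[field] == value]


def lives_in(documents, city):
    """Return names of customers with at least one address in the given city."""
    return [
        doc["name"]
        for doc in documents
        if any(addr["city"] == city for addr in doc.get("addresses", []))
    ]


gold_customers = find(customers, "loyalty_tier", "gold")
delhi_names = lives_in(customers, "Delhi")
```

In a relational design the addresses would typically live in a second table linked by a foreign key, and the loyalty tier would be a column that stays empty for most rows.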

Application:

A popular example of such a model is MongoDB. It is an open source NoSQL database that
stores document type data and is suitable for both developers and end-users. It stores data
in BSON (Binary JSON), a binary encoding of JSON-like documents. MongoDB has a flexible
data model that can change along with a continuously changing schema, allowing applications
to evolve over time.

MongoDB can spread across multiple data centres, which improves scalability as data volume
and throughput grow. It supports sharding, which allows the data to be hosted in the cloud,
potentially with lower latency than an RDBMS.
Similar documents are stored together and arranged as collections, which keeps related data
local and reduces the need for join operations. There is no need to describe documents
before adding them to the database, as documents describe themselves through their
structure. There is also no need to update the rest of the documents when one needs to be
altered.

Queries in this database are divided by the kind of action they perform, such as searches,
aggregation frameworks, key value queries and graph traversals. This reduces the tedious task
of developing complex algorithms to perform basic transactions.

MongoDB also provides a visual representation of the way data is stored, so that data can
easily be comprehended and analysed.

Column store

In this type of model, the data is stored in the form of cells which are further grouped in
columns. Similar columns are in turn grouped to form column families. This is in contrast to
the relational model, where data is stored in rows. In this method, reads and writes are done
by column rather than by row.

This offers advantages such as faster aggregation, search and access of data in columns
rather than rows. In the relational model, the cells of a row are stored together on disk,
whereas in this model the cells that belong to a particular column are stored together, so a
query that needs only a few columns reads less data.
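
The difference in layout can be sketched with the same three hypothetical records stored both ways. Python lists and dictionaries stand in for on-disk storage here, so the sketch shows only which data each layout has to touch, not real I/O cost.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "name": "Alice", "age": 30},
    {"id": 2, "name": "Bob", "age": 25},
    {"id": 3, "name": "Carol", "age": 35},
]

# Column layout: all values of one column are kept together.
columns = {
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Carol"],
    "age": [30, 25, 35],
}


def average_age_row_store(table):
    """Walk record by record; on disk, each record's other fields would be read too."""
    return sum(record["age"] for record in table) / len(table)


def average_age_column_store(table):
    """Read only the 'age' column; the other columns are never touched."""
    ages = table["age"]
    return sum(ages) / len(ages)


row_result = average_age_row_store(rows)
column_result = average_age_column_store(columns)
```

Both functions return the same average; the column layout simply reaches it by reading one list instead of every record.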

Application:

Google developed a database called Bigtable to compress large amounts of data and provide
high performance when storing it. It was built on the Google File System and SSTable. It was
proprietary software but has been made accessible to end users in recent years.

This system uses both the row and column store models to accommodate large amounts of data.
As large amounts of data are stored in a table, it forms a multi-dimensional mapping
between entities. It is designed to store data at the petabyte level across hundreds of
machines, thus forming a cloud based service.
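
The multi-dimensional mapping can be sketched as a map from a row key, a column key and a timestamp to a value, which is how Bigtable's data model is publicly described. The class SparseTable, its methods and the sample keys below are hypothetical and only illustrate the shape of the mapping, not how Bigtable stores or distributes data.

```python
class SparseTable:
    """Toy map from (row key, column key, timestamp) to a value."""

    def __init__(self):
        self._cells = {}  # (row, column) -> {timestamp: value}

    def put(self, row, column, timestamp, value):
        # Each write adds a new timestamped version of the cell.
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before timestamp (the latest if None)."""
        versions = self._cells.get((row, column), {})
        eligible = [t for t in versions if timestamp is None or t <= timestamp]
        if not eligible:
            return None
        return versions[max(eligible)]


table = SparseTable()
table.put("com.example", "contents:html", 1, "<html>v1</html>")
table.put("com.example", "contents:html", 5, "<html>v2</html>")
latest = table.get("com.example", "contents:html")
older = table.get("com.example", "contents:html", timestamp=3)
```

Only cells that were actually written take up space, so rows may have very different sets of columns without leaving empty fields behind.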

All the data stored in this database system is highly compressed and is used in many
applications such as Google Earth, Maps, Google Book Search, YouTube, Gmail and
Blogger.com.

Another application of the column store database model is HBase (Hadoop Database). In
HBase, each cell follows a key value type model, where the column acts as the key and its
value is stored in the row. This creates a logical relationship amongst the data stored.
Conclusion
As technology progresses, the amount of data increases drastically and storing sensitive and
essential data becomes a high priority. NoSQL databases, along with cloud storage, are one
approach to this problem in the present day. This report highlights the different data
models of NoSQL databases and their applications, with an insight into how they retrieve and
process data. To conclude, this technology is still new and it will be used in various other
applications in the coming years due to its effective way of accommodating data.

