Topic 2: Unstructured Data
UNSTRUCTURED DATA
OUTLINES
A NoSQL database provides a mechanism for storing and retrieving data that employs less constrained
consistency models than traditional relational databases.
Key features (advantages):
non-relational
do not require a schema
data are replicated to multiple nodes (so copies are identical and the system is fault-tolerant) and can be partitioned:
down nodes are easily replaced
no single point of failure
horizontally scalable
cheap and easy to implement (open source)
massive write performance
fast key-value access
avoid the complexity of SQL queries
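The replication and partitioning features above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node names, the replication factor of 2, and the helper names `replicas_for` and `read` are hypothetical, and real systems use more elaborate placement schemes.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names
REPLICATION_FACTOR = 2  # assumption: every key is stored on 2 nodes


def replicas_for(key):
    """Partition by hashing the key, then replicate to the next node(s)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]


def read(key, down):
    """Serve the key from the first live replica.

    The read fails only if every replica of the key is down, which is
    the sense in which there is no single point of failure.
    """
    for node in replicas_for(key):
        if node not in down:
            return node
    raise RuntimeError(f"all replicas for {key!r} are down")
```

If the first replica of a key is down, `read` falls through to the second one, so losing a single node does not make the key unavailable.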
WHAT IS SCHEMA-LESS DATA MODEL: SQL vs NoSQL
In relational databases: you can’t add a record which does not fit the schema.
In NoSQL databases: there is no schema to consider.
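A small Python sketch of this contrast, assuming an in-memory SQLite table for the relational side and a plain dictionary standing in for a schema-less store. The table name `customers` and the field `nickname` are made up for illustration.

```python
import sqlite3

# Relational side: the table schema fixes the allowed columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ann')")

try:
    # 'nickname' is not part of the schema, so this insert is rejected.
    conn.execute(
        "INSERT INTO customers (id, name, nickname) VALUES (2, 'Bob', 'bobby')"
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# Schema-less side: each record carries its own set of fields.
store = {}
store["1"] = {"name": "Ann"}
store["2"] = {"name": "Bob", "nickname": "bobby"}  # extra field is accepted
```

The relational insert fails because the record does not fit the schema, while the dictionary accepts records with different fields side by side.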
In a Column Store database, data is stored in columns, as opposed to being stored in rows as is done in most
relational database management systems.
A Column Store consists of one or more Column Families that logically group certain columns in the
database.
A key is used to identify and point to a number of columns in the database, with a keyspace attribute that defines
the scope of this key.
Each column contains tuples of names and values, ordered and comma separated.
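One way to picture this layout is as nested dictionaries: keyspace, then column family, then row key, then name/value pairs. The sketch below is an assumption-laden illustration (the family names `profile` and `orders`, the sample values, and the helper `get_columns` are hypothetical), not how any particular column store is implemented.

```python
# keyspace -> column family -> row key -> {column name: value}
keyspace = {
    "profile": {
        "Row1": {"name": "Ann", "country": "UK"},
        "Row2": {"name": "Bob"},  # rows need not have the same columns
    },
    "orders": {
        "Row1": {"product": "laptop", "sales": 1200},
    },
}


def get_columns(column_family, row_key):
    """Use the key to fetch the columns it points to within one family."""
    return keyspace[column_family].get(row_key, {})
```

For example, `get_columns("profile", "Row1")` returns the name/value pairs stored under that key in the `profile` family, and a key with no entry in a family yields an empty dictionary.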
NoSQL DATA MODEL : COLUMN FAMILY
HOW TO WRITE IT?
Row-oriented: Each row is an aggregate (for example, customer with the ID of Row1) with column families representing useful chunks of data (country, product, sales) within that aggregate.
Column-oriented: Each column family defines a record type (e.g., country) with rows for each of the records. You then think of a row as the join of records in all column families.
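The two readings describe the same data. As a hedged sketch, the Python below starts from the column-oriented reading (one dictionary of records per column family) and rebuilds the row-oriented aggregate by joining on the row key. The sample values and the helper name `row_as_join` are invented for illustration.

```python
# Column-oriented reading: each column family holds one record per row key.
column_families = {
    "country": {"Row1": {"name": "UK"}, "Row2": {"name": "FR"}},
    "product": {"Row1": {"name": "laptop"}},
    "sales": {"Row1": {"amount": 1200}, "Row2": {"amount": 300}},
}


def row_as_join(row_key):
    """Row-oriented reading: join the row's record from every column family."""
    return {
        family: records[row_key]
        for family, records in column_families.items()
        if row_key in records
    }
```

Calling `row_as_join("Row1")` yields the whole customer aggregate with its country, product, and sales chunks, while `row_as_join("Row2")` simply omits the `product` family because that row has no record there.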
NoSQL DATA MODEL : GRAPH
A document store is a collection of key-value pairs in which the stored values (referred to as “documents”) provide
some structure and encoding of the managed data, e.g. XML, JSON, BSON. The unique key is a simple identifier (string, URI, path).
Instead of storing each attribute of an entity under a separate key, document databases store multiple attributes in
a single document.
While key-value stores require the key to access a data value, a document store keeps metadata that allows data
to be accessed directly by attribute instead of only through the key.
Examples: CouchDB, MongoDB (document stores); Apache Cassandra (column-family store).
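The difference between key-only access and attribute access can be sketched with JSON documents held in a Python dictionary. The keys, field names, and the helpers `get_by_key` and `find_by_attribute` are hypothetical, and the linear scan stands in for the indexes a real document database would typically use.

```python
import json

# Each value is a whole document holding several attributes of one entity.
documents = {
    "customer:1": json.loads(
        '{"name": "Ann", "country": "UK", "orders": [{"product": "laptop"}]}'
    ),
    "customer:2": json.loads('{"name": "Bob", "country": "FR"}'),
}


def get_by_key(key):
    """Key-value style access: the caller must already know the key."""
    return documents[key]


def find_by_attribute(attribute, value):
    """Document style access: look inside the documents for an attribute."""
    return [key for key, doc in documents.items() if doc.get(attribute) == value]
```

Here `find_by_attribute("country", "UK")` locates the matching document without knowing its key, which a plain key-value store would not support because it treats the value as opaque.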
NoSQL DATA MODEL : DOCUMENT