Materialized View

Materialized view

 A materialized view is a database object that contains the results of a query.

 For example, it may be a local copy of data located remotely, or may be a subset of the
rows and/or columns of a table or join result, or may be a summary using an aggregate
function.

 The process of setting up a materialized view is sometimes called materialization.

 This is a form of caching the results of a query, similar to memoization of the value of a
function in functional languages, and it is sometimes described as a form of
precomputation.

 Database users typically use materialized views for performance reasons, i.e. as a form of
optimization.

 Materialized views which store data based on remote tables are also known as snapshots.
Difference Between View and Materialized View

Definition of View

 A view is a virtual table created with the Create View command.

 This virtual table contains the data retrieved by the query expression given in the
Create View command.

 A view can be created from one or more base tables or views.

 A view can be queried in the same way as the original base tables.


 A view is not precomputed and stored on disk; instead, it is computed each time
it is used or accessed.

 Whenever a view is used, the query expression in the Create View command is
executed at that moment.

 Hence, a view always returns up-to-date data.

 If you update data through an updatable view, the change is applied to the original
base table, and any change made to the base table is reflected in its view.

 However, recomputing the query on every access makes a view slower.

 For example, suppose a view is created from the join of two or more tables.

 In that case, the cost of resolving the joins is paid each time the view is used.

 However, a view has some advantages; for example, it does not require storage
space for its results.

 You can create a customized view of a complex database.

 You can restrict users from accessing sensitive information in a database.

 A view reduces the complexity of queries by gathering data from several tables
into a single customized view.
The syntax of View

Create View V As <Query Expression>
 Note: not all views are updatable.

 For example, a view created using the DISTINCT clause, the GROUP BY clause,
a CHECK constraint (if the update would violate the constraint), or the read-only
option cannot be updated.
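
The behaviour described above (a view stores only its query and is recomputed on every access) can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a reference implementation: the tables customers and orders, the view customer_totals, and the sample rows are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 25.0), (3, 2, 10.0);

    -- The view stores only the query text, not its result.
    CREATE VIEW customer_totals AS
        SELECT c.name AS name, SUM(o.amount) AS total
        FROM customers c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.name;
""")


def totals():
    # Every SELECT on the view re-runs the join and the aggregation.
    return dict(conn.execute("SELECT name, total FROM customer_totals"))


before = totals()
conn.execute("INSERT INTO orders VALUES (4, 2, 5.0)")  # change a base table
after = totals()  # the view reflects the change without any refresh step

print(before)
print(after)
```

Because this view uses GROUP BY, it is also an example of a view that cannot be updated directly; changes have to go through the base tables.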
Definition of Materialized View

 A materialized view is a physical copy of data from the original base tables.

 A materialized view is like a snapshot or picture of the original base tables.

 Like a view, it contains the data retrieved by the query expression of the
Create Materialized View command.

 But unlike a view, a materialized view is precomputed and stored on disk as an
object, and it is not recomputed each time it is used.

 Instead, the materialized view has to be updated manually or with the help of
triggers.

 The process of updating a materialized view is called materialized view
maintenance.
 A materialized view responds faster than a view.

 This is because the materialized view is precomputed, so no time is spent
resolving the query or the joins in the query that defines the materialized view.

 As a result, queries made on the materialized view are answered faster.
The syntax of Materialized View:

Create Materialized View V

Build [clause] Refresh [type]

ON [trigger]

As <query expression>
 Here the Build clause decides when to populate the materialized view.

 The Refresh type decides how to update the materialized view, and the trigger
decides when to update it.

 Materialized views are generally used in data warehouses.
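
The difference from an ordinary view can be sketched in Python with sqlite3. SQLite has no Create Materialized View command, so this example emulates one under that assumption: the query result is stored in an ordinary table and rebuilt by a hand-written refresh function. The names sales, sales_by_region, and refresh_sales_by_region are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('north', 100.0), ('south', 40.0), ('north', 60.0);

    -- Emulated materialized view: a real table that holds the query result.
    CREATE TABLE sales_by_region (region TEXT PRIMARY KEY, total REAL);
""")

VIEW_QUERY = "SELECT region, SUM(amount) FROM sales GROUP BY region"


def refresh_sales_by_region():
    """Complete refresh: discard the stored rows and recompute them."""
    with conn:  # commit both statements together
        conn.execute("DELETE FROM sales_by_region")
        conn.execute("INSERT INTO sales_by_region " + VIEW_QUERY)


def read_view():
    # Reading is a plain table lookup; no join or aggregation is executed.
    return dict(conn.execute("SELECT region, total FROM sales_by_region"))


refresh_sales_by_region()  # initial build
fresh = read_view()

conn.execute("INSERT INTO sales VALUES ('south', 10.0)")
stale = read_view()  # the base table changed, the stored copy did not

refresh_sales_by_region()  # manual maintenance step
refreshed = read_view()

print(fresh, stale, refreshed)
```

The stale read in the middle is the price paid for fast reads: until maintenance runs, the stored copy can lag behind the base table.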


Difference Between View and Materialized View

 The basic difference between a view and a materialized view is that views are
not stored physically on disk, whereas materialized views are stored on disk.

 A view can be defined as a virtual table created as a result of a query
expression. A materialized view is a physical copy, picture, or snapshot of
the base table.

 A view is always up to date because the query that defines it executes each time
the view is used. A materialized view is updated manually or by
applying triggers to it.

 A materialized view responds faster than a view because the materialized view is
precomputed.

 A materialized view uses storage space because it is stored on disk,
whereas a view is just a display and hence does not require storage
space.
 View vs. Materialized View: Comparison Chart

BASIS FOR COMPARISON | VIEW | MATERIALIZED VIEW
Basic | A view is never stored; it is only displayed. | A materialized view is stored on disk.
Define | A view is a virtual table formed from one or more base tables or views. | A materialized view is a physical copy of the base table.
Update | A view is updated each time the virtual table (view) is used. | A materialized view has to be updated manually or using triggers.
Speed | Slow processing. | Fast processing.
Storage usage | A view does not require storage space. | A materialized view uses storage space.
Syntax | Create View V As <query expression> | Create Materialized View V Build [clause] Refresh [clause] On [trigger] As <query expression>
Materialized Views: NoSQL

 NoSQL databases do not have views, but they can have precomputed
and cached queries.

 This is a central aspect of aggregate-oriented databases, since
some queries may not fit the aggregate structure.

 Often, materialized views are created using map-reduce
computation.
Materialized Views: Approaches

There are two approaches: eager and lazy.

Eager:

 The materialized view is updated when the base tables are updated.

 This approach is good when reads are more frequent than writes.

Lazy:

 The updates are run via batch jobs at regular intervals.

 This approach is good when seeing the latest data updates is not business critical.
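
The two approaches can be contrasted with a small in-memory sketch in Python. The classes EagerTotals and LazyTotals and the per-customer order totals they maintain are invented for illustration; a real database would do this work internally.

```python
from collections import defaultdict


class EagerTotals:
    """Eager: the view is maintained as part of every write."""

    def __init__(self):
        self.orders = []                 # base data
        self.view = defaultdict(float)   # materialized totals per customer

    def add_order(self, customer, amount):
        self.orders.append((customer, amount))
        self.view[customer] += amount    # extra work on each write


class LazyTotals:
    """Lazy: writes only touch base data; a batch job rebuilds the view."""

    def __init__(self):
        self.orders = []
        self.view = {}

    def add_order(self, customer, amount):
        self.orders.append((customer, amount))  # the view is left untouched

    def run_batch_refresh(self):
        totals = defaultdict(float)
        for customer, amount in self.orders:
            totals[customer] += amount
        self.view = dict(totals)


eager, lazy = EagerTotals(), LazyTotals()
for customer, amount in [("ann", 10.0), ("bob", 5.0), ("ann", 20.0)]:
    eager.add_order(customer, amount)
    lazy.add_order(customer, amount)

eager_view = dict(eager.view)        # already current
lazy_before = dict(lazy.view)        # still empty: no batch has run yet
lazy.run_batch_refresh()
lazy_after = dict(lazy.view)

print(eager_view, lazy_before, lazy_after)
```

The eager version makes every write slightly more expensive so that reads are always current; the lazy version keeps writes cheap and accepts stale reads between batch runs.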


 Moreover, we can create views outside the database.

 We can read the data, compute the view, and save it back to
the database (MapReduce).

 Often the databases support building materialized views themselves
(incremental MapReduce).

 We provide the needed computation, and the database executes it when needed.

Distribution Models:
 Aggregate-oriented databases make distribution of data easier, since the distribution
mechanism only has to move the aggregate and does not have to worry about related data, as all
the related data is contained in the aggregate.

There are two styles of distributing data:

 1. Sharding

 2. Replication

 Sharding: Sharding distributes different data across multiple servers, so each server
acts as the single source for a subset of data.

 Replication: Replication copies data across multiple servers, so each bit of data can be
found in multiple places.
Replication comes in two forms:

 Master-slave replication makes one node the authoritative copy
that handles writes, while slaves synchronize with the master and
may handle reads.

 Peer-to-peer replication allows writes to any node; the nodes
coordinate to synchronize their copies of the data.
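
A toy Python model of master-slave replication is sketched below. It is an illustration under simplifying assumptions (a single process, an explicit synchronize step, no failures); the classes Node and MasterSlaveCluster and the key "cart:42" are invented for the example.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}


class MasterSlaveCluster:
    def __init__(self, num_slaves):
        self.master = Node("master")
        self.slaves = [Node(f"slave-{i}") for i in range(num_slaves)]
        self.log = []  # writes accepted by the master but not yet shipped

    def write(self, key, value):
        # All writes go to the single authoritative copy.
        self.master.data[key] = value
        self.log.append((key, value))

    def synchronize(self):
        # Slaves catch up by replaying the master's pending writes.
        for key, value in self.log:
            for slave in self.slaves:
                slave.data[key] = value
        self.log.clear()

    def read(self, key, slave_index=0):
        # Reads may be served by a slave, which can lag behind the master.
        return self.slaves[slave_index].data.get(key)


cluster = MasterSlaveCluster(num_slaves=2)
cluster.write("cart:42", ["tea"])
before_sync = cluster.read("cart:42")                 # slave has not caught up
cluster.synchronize()
after_sync = cluster.read("cart:42", slave_index=1)   # now visible on slaves

print(before_sync, after_sync)
```

Since only the master accepts writes, two clients cannot produce conflicting versions on different nodes, but every write depends on that one node.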
Note:

 Master-slave replication reduces the chance of update conflicts, but
peer-to-peer replication avoids loading all writes onto a single
server, which would create a single point of failure.

 A system may use either or both techniques.

 For example, the Riak database shards the data and also replicates it based on
the replication factor.
What is Database Sharding?

 The term "sharding" is often attributed to Google and was popularized
through its publication of the Bigtable architecture.

 Database sharding is the process of partitioning the data
in a database or search engine so that the data is divided into
smaller, distinct chunks, or shards.

 Each shard could be a table, a Postgres schema, or a different
physical database held on a separate database server instance.
 Database sharding can be defined as a partitioning scheme for
large databases distributed across multiple servers; it enables
new levels of database performance and scalability.

 It divides a database into smaller parts called “shards” and
spreads those across a number of distributed servers.

 Sharding is a method for distributing data across multiple
machines.
 Sharding is a method of splitting and storing a single logical dataset in
multiple databases.

 Sharding is a type of database partitioning that separates very large
databases into smaller, faster, more easily managed parts called
data shards. The word shard means a small part of a whole.

 Sharding is also referred to as horizontal partitioning.

 A database can be

 split vertically, by storing different tables and columns in separate
databases, or

 split horizontally, by storing rows of the same table in multiple database nodes.
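
The two ways of splitting can be sketched in Python on a small in-memory table. The users rows and the helpers split_vertically and split_horizontally are hypothetical; the modulo rule used to place rows is just one simple choice.

```python
users = [
    {"id": 1, "name": "ann", "email": "ann@example.com"},
    {"id": 2, "name": "bob", "email": "bob@example.com"},
    {"id": 3, "name": "cat", "email": "cat@example.com"},
    {"id": 4, "name": "dan", "email": "dan@example.com"},
]


def split_vertically(rows, column_groups):
    """Each part keeps every row but only some columns (plus the id key)."""
    return [
        [{col: row[col] for col in ("id", *cols)} for row in rows]
        for cols in column_groups
    ]


def split_horizontally(rows, num_shards):
    """Each shard keeps all columns but only some of the rows."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[row["id"] % num_shards].append(row)
    return shards


profiles, contacts = split_vertically(users, [("name",), ("email",)])
shard_0, shard_1 = split_horizontally(users, 2)

print(profiles[0], contacts[0])
print([row["id"] for row in shard_0], [row["id"] for row in shard_1])
```

After the vertical split, rebuilding a full row needs a lookup in both parts by id; after the horizontal split, each row lives whole in exactly one shard.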


Why Sharding?
 By distributing the data among multiple machines, a cluster of
database systems can store a larger dataset and handle additional
requests.
 Sharding is necessary if a dataset is too large to be stored in a single
database. Moreover, many sharding strategies allow additional
machines to be added.
 Sharding allows a database cluster to scale along with its data and
traffic growth.
 The main appeal of sharding a database is that it can help to
facilitate horizontal scaling, also known as scaling out.
 Another reason why some might choose a sharded database
architecture is to speed up query response times.
 Some data within the database remains present in all shards
(vertical sharding), but some appears only in a single shard
(horizontal sharding).

 The following figure illustrates vertical sharding and horizontal
sharding.
Sharding key

 To shard data, the user needs to decide on a key, called a sharding key, on
which to partition the data.

 The shard key is either an indexed field or an indexed compound of fields
that exists in every document in the collection.

 There is no general rule for selecting a sharding key; which key the user
chooses depends on the application.

 For instance, the user may choose userID as the shard key in a social
media app.
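
One common way to use a shard key is to hash it and take the result modulo the number of shards. The Python sketch below assumes this hash-based scheme, which the slides do not prescribe; the shard names and the function shard_for are made up. hashlib is used instead of the built-in hash() because string hashes from hash() vary between Python processes.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical shard names


def shard_for(user_id, shards=SHARDS):
    """Map a userID shard key to one shard, the same way on every call."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Place some users and group them by the shard they land on.
placement = {}
for user_id in ("user-1", "user-2", "user-3", "user-4", "user-5", "user-6"):
    placement.setdefault(shard_for(user_id), []).append(user_id)

for shard, members in sorted(placement.items()):
    print(shard, members)
```

A plain modulo scheme like this moves most keys when the number of shards changes, which is one reason the choice of key and placement rule depends on the application.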
Queries

 Sharding allows your application to make fewer queries.

 When it receives a request, the application knows where to route
the request, so it has to look through less data rather than
going through the whole database.

 This improves the performance of your application and eases
worries about scalability.
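
The routing benefit can be illustrated with a small Python sketch that counts how many shards each kind of query touches. The class ShardedStore, its shards_touched counter, and the sample records are invented for the example; the store shards on an integer user id with a simple modulo rule.

```python
class ShardedStore:
    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]
        self.shards_touched = 0  # bookkeeping for the illustration only

    def _shard_for(self, user_id):
        return self.shards[user_id % len(self.shards)]

    def put(self, user_id, record):
        self._shard_for(user_id)[user_id] = record

    def get(self, user_id):
        """Query on the shard key: routed to exactly one shard."""
        self.shards_touched += 1
        return self._shard_for(user_id).get(user_id)

    def find_by_name(self, name):
        """Query on another field: every shard has to be asked."""
        matches = []
        for shard in self.shards:
            self.shards_touched += 1
            matches.extend(r for r in shard.values() if r["name"] == name)
        return matches


store = ShardedStore(num_shards=4)
for user_id in range(8):
    store.put(user_id, {"name": f"user{user_id}"})

record = store.get(5)
targeted_cost = store.shards_touched   # one shard consulted

store.shards_touched = 0
matches = store.find_by_name("user5")
scatter_cost = store.shards_touched    # all four shards consulted

print(record, targeted_cost, matches, scatter_cost)
```

The saving only applies to queries that include the shard key; queries on other fields fall back to asking every shard.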
