
DEPARTMENT OF INFORMATION TECHNOLOGY

INTERNAL ASSESSMENT TEST I - SET II KEY


COURSE CODE / SUBJECT CODE / NAME: C313 / CCS341 / Data Warehousing          DATE: 04.03.2024
BRANCH / YEAR / SEMESTER: Information Technology / III / VI                  TIME: 09.45 to 11.00 AM
ACADEMIC YEAR: 2023-2024                                                     MARK: 50
CO1: Design data warehouse architecture for various Problems
CO2: Apply the OLAP Technology
CO3: Analyse the partitioning strategy
CO4: Critically analyze the differentiation of various schemas for given problem
CO5: Frame roles of process manager & system manager
CO6: Build a Data Warehouse for real time applications
BLOOM'S TAXONOMY
Remembering Applying Evaluating
Understanding Analyzing Creating
PART A (5 x 2 = 10 Marks)

1. Enlist the components of data warehouse.


Ans.: Once the data has been extracted, transformed and ingested, users can access it in the data
warehouse using business intelligence tools, SQL clients and spreadsheets. A data warehouse merges
data from several sources to produce one large, integrated database. Its main components are:
Central database: The data warehouse is built around a database. Traditionally this was a common
relational database, hosted locally or in the cloud. However, in-memory databases are quickly
gaining acceptance due to Big Data, the need for true real-time performance and a sharp drop in
RAM prices.
Data integration: Several data integration methods are used to extract data from
source systems, change it and align it for easy analytical consumption. ETL (extract,
transform, load) and ELT procedures are among them, as well as real-time data
replication, bulk-load processing, data transformation and services for data quality and
enrichment.
Meta Data: Metadata is data about our data. It records all of the characteristics of the data sets
in the data warehouse: their origin, purpose, values and other details. Technical metadata describes
where and how the data is stored and how to get at it. Business metadata provides the business
context of the data.

2. Describe the tasks that are performed in the Data Staging component.


1) Data Extraction: This method has to deal with numerous data sources. We have to employ
the appropriate techniques for each data source.
2) Data Transformation: As we know, data for a data warehouse comes from many different
sources. If data extraction for a data warehouse poses big challenges, data transformation
presents even more significant challenges. We perform several individual tasks as part of data
transformation (a small sketch follows this answer).
First, we clean the data extracted from each source. Cleaning may be the correction of
misspellings, providing default values for missing data elements, or elimination of duplicates
when we bring in the same data from various source systems.
Standardization of data elements forms a large part of data transformation. Data transformation
also involves many forms of combining pieces of data from different sources: we combine data
from a single source record or related data parts from many source records.
On the other hand, data transformation also includes purging source data that is not useful and
separating out source records into new combinations. Sorting and merging of data take place on
a large scale in the data staging area. When the data transformation function ends, we have a
collection of integrated data that is cleaned, standardized, and summarized.

3) Data Loading: Two distinct categories of tasks form the data loading function. When we
complete the structure and construction of the data warehouse and go live for the first time, we
do the initial loading of the data into the data warehouse storage. The initial load moves
high volumes of data and uses up a substantial amount of time.
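The transformation tasks described above can be pictured with a small sketch. The Python snippet
below is illustrative only: the record layout, default value and standardization rule are
assumptions, not part of the syllabus answer. It cleans records from two hypothetical source
systems, fills a missing value, standardizes a field and removes duplicates before loading.

    # Minimal sketch of the data transformation step in the staging area.
    # The source records, field names and rules below are illustrative assumptions.
    raw_records = [
        {"cust_id": 101, "name": "Ravi", "city": "chennai", "credit_limit": None},   # from source A
        {"cust_id": 101, "name": "Ravi", "city": "Chennai", "credit_limit": 5000},   # same customer, source B
        {"cust_id": 102, "name": "Meena", "city": "MADURAI", "credit_limit": 7500},
    ]

    def clean(record):
        # Provide a default value for a missing data element.
        if record["credit_limit"] is None:
            record["credit_limit"] = 0
        # Standardize the representation of the city field.
        record["city"] = record["city"].strip().title()
        return record

    cleaned = [clean(dict(r)) for r in raw_records]

    # Eliminate duplicates brought in from different source systems,
    # keeping the record with the highest credit limit for each key.
    deduped = {}
    for r in sorted(cleaned, key=lambda r: r["credit_limit"]):
        deduped[r["cust_id"]] = r

    # Sorted, merged, integrated data ready for the load step.
    for rec in sorted(deduped.values(), key=lambda r: r["cust_id"]):
        print(rec)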

3. Examine the uses of an operational database?


Ans. :
• High-volume transaction processing is supported by operational systems.
• Operational systems are mainly concerned with current data.
• Operational systems are updated regularly, according to need.
• It is created for corporate operations and real-time transactions.
• It is designed to do a limited number of straightforward operations, often
adding or retrieving a single row per table at a time.
• It employs validation data tables and is designed to validate incoming
data during transactions.
• It can accommodate thousands of clients at once.

4. List the characteristics of OLAP.

The main characteristics of OLAP are as follows:

1. Multidimensional conceptual view: OLAP systems let business users have a dimensional and
logical view of the data in the data warehouse. It helps in carrying out slice and dice operations.
2. Multi-User Support: Since OLAP techniques are shared, the OLAP operation should
provide normal database operations, including retrieval, update, concurrency control, integrity,
and security.
3. Accessibility: OLAP acts as a mediator between data warehouses and front-ends. The OLAP
operations should sit between data sources and the OLAP front-end.
4. Storing OLAP results: OLAP results are kept separate from data sources.
5. Uniform reporting performance: Increasing the number of dimensions or the database size
should not significantly degrade the reporting performance of the OLAP system.
6. OLAP provides for distinguishing between zero values and missing values so that aggregates
are computed correctly.
7. The OLAP system should ignore all missing values and compute correct aggregate values.
8. OLAP facilitates interactive query and complex analysis for the users.
9. OLAP allows users to drill down for greater detail or roll up for aggregations of
metrics along a single business dimension or across multiple dimensions (a small roll-up sketch
follows this list).
10. OLAP provides the ability to perform intricate calculations and comparisons.
11. OLAP presents results in a number of meaningful ways, including charts and graphs.
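Characteristics 9 and 10 (drill-down/roll-up and calculations) are easiest to see with a tiny
example. The sketch below is illustrative Python over an assumed sales fact list: it aggregates a
measure up one business dimension (city to region); the data and dimension names are invented.

    # Illustrative roll-up of a sales measure along one dimension (city -> region).
    # The fact rows and the city-to-region mapping are assumed sample data.
    from collections import defaultdict

    facts = [
        {"city": "Chennai", "region": "South", "sales": 1200},
        {"city": "Madurai", "region": "South", "sales": 800},
        {"city": "Mumbai",  "region": "West",  "sales": 1500},
    ]

    # Drill-down level: sales by city.
    by_city = defaultdict(int)
    for f in facts:
        by_city[f["city"]] += f["sales"]

    # Roll-up: the same measure aggregated to the region level.
    by_region = defaultdict(int)
    for f in facts:
        by_region[f["region"]] += f["sales"]

    print(dict(by_city))    # {'Chennai': 1200, 'Madurai': 800, 'Mumbai': 1500}
    print(dict(by_region))  # {'South': 2000, 'West': 1500}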
5. Compare and Contrast OLAP Vs OLTP.
Process: OLTP is an online transactional system that manages database modification. OLAP is an online analysis and data retrieval process.
Characteristic: OLTP is characterized by large numbers of short online transactions. OLAP is characterized by a large volume of data.
Functionality: OLTP is an online database modifying system. OLAP is an online database query management system.
Method: OLTP uses a traditional DBMS. OLAP uses the data warehouse.
Query: OLTP queries insert, update and delete information in the database. OLAP queries are mostly select operations.
Table: Tables in an OLTP database are normalized. Tables in an OLAP database are not normalized.
Source: OLTP and its transactions are the source of data. Different OLTP databases become the source of data for OLAP.
Data integrity: An OLTP database must maintain data integrity constraints. An OLAP database is not frequently modified, hence data integrity is not an issue.
Response time: OLTP response time is in milliseconds. OLAP response time ranges from seconds to minutes.
Data quality: The data in the OLTP database is always detailed and organized. The data in the OLAP process might not be organized.
Usefulness: OLTP helps to control and run fundamental business tasks. OLAP helps with planning, problem-solving, and decision support.
Operation: OLTP allows read/write operations. OLAP is mostly read and rarely written.
Audience: OLTP is a customer orientated process. OLAP is a market orientated process.
Query type: OLTP queries are standardized and simple. OLAP queries are complex and involve aggregations (a short sketch of the two query styles follows this table).
Back-up: OLTP needs a complete backup of the data combined with incremental backups. OLAP only needs a backup from time to time; backup is not as important as in OLTP.
Design: OLTP database design is application oriented; for example, the design changes with the industry (retail, airline, banking, etc.). OLAP database design is subject oriented; for example, the design changes with subjects like sales, marketing, purchasing, etc.
User type: OLTP is used by data-critical users such as clerks, DBAs and database professionals. OLAP is used by data-knowledge users such as workers, managers, and the CEO.
Purpose: OLTP is designed for real-time business operations. OLAP is designed for analysis of business measures by category and attributes.
Performance metric: Transaction throughput is the OLTP performance metric. Query throughput is the OLAP performance metric.
Number of users: An OLTP database allows thousands of users. An OLAP database allows only hundreds of users.
Productivity: OLTP helps to increase the user's self-service and productivity. OLAP helps to increase the productivity of business analysts.
Challenge: Data warehouses historically have been a development project which may prove costly to build. An OLAP cube is not an open SQL server data warehouse, so technical knowledge and experience are essential to manage the OLAP server.
Process: OLTP provides fast results for daily used data. OLAP ensures that the response to a query is consistently quick.
Characteristic: OLTP is easy to create and maintain. OLAP lets the user create a view with the help of a spreadsheet.
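The contrast between the two query styles can be shown with two short statements. The sketch below
uses Python's built-in sqlite3 module purely as a stand-in database; the table and column names are
assumptions chosen only to illustrate an OLTP-style single-row write versus an OLAP-style aggregate
read.

    # Illustration only: an OLTP-style transaction vs an OLAP-style aggregate query.
    # sqlite3 stands in for the operational database / warehouse; the schema is assumed.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (order_id INTEGER, product TEXT, region TEXT, amount REAL)")

    # OLTP: short, standardized writes touching a single row at a time.
    con.execute("INSERT INTO sales VALUES (?, ?, ?, ?)", (1, "Laptop", "South", 55000.0))
    con.execute("INSERT INTO sales VALUES (?, ?, ?, ?)", (2, "Phone", "West", 20000.0))
    con.commit()

    # OLAP: mostly-read aggregate query over a large volume of data.
    for row in con.execute("SELECT region, SUM(amount), COUNT(*) FROM sales GROUP BY region"):
        print(row)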
PART B (2*13 = 26 Marks)

6a. Analyze the different components of a data warehouse and explain in detail.


Components or Building Blocks of Data Warehouse

Architecture is the proper arrangement of the elements. We build a data warehouse with software
and hardware components. To suit the requirements of our organization, we arrange these building
blocks in the most effective way; we may want to emphasize one component or boost up another part
with extra tools and services. All of this depends on our circumstances.

The figure shows the essential elements of a typical warehouse. The Source Data component is
shown on the left. The Data Staging element serves as the next building block. In the middle is
the Data Storage component that handles the data warehouse's data. This element not only stores
and manages the data; it also keeps track of the data using the metadata repository. The
Information Delivery component, shown on the right, consists of all the different ways of making
the information from the data warehouse available to the users.

Source Data Component


Source data coming into the data warehouse may be grouped into four broad categories:
Production Data: This type of data comes from the different operational systems of the
enterprise. Based on the data requirements in the data warehouse, we choose segments of the
data from the various operational systems.
Internal Data: In each organization, users keep their "private" spreadsheets, reports,
customer profiles, and sometimes even departmental databases. This is the internal data, part of
which could be useful in a data warehouse.
Archived Data: Operational systems are mainly intended to run the current business. In every
operational system, we periodically take the old data and store it in archived files.

External Data: Most executives depend on information from external sources for a large
percentage of the information they use. They use statistics relating to their industry produced
by external agencies.

Data Staging Component


After we have extracted data from various operational systems and external sources, we have to
prepare the files for storing in the data warehouse. The extracted data coming from several
different sources needs to be changed, converted, and made ready in a format that is suitable to
be saved for querying and analysis.
We will now discuss the three primary functions that take place in the staging area.

1) Data Extraction: This method has to deal with numerous data sources. We have to employ
the appropriate techniques for each data source.

2) Data Transformation: As we know, data for a data warehouse comes from many different
sources. If data extraction for a data warehouse poses big challenges, data transformation
presents even more significant challenges. We perform several individual tasks as part of data
transformation.
First, we clean the data extracted from each source. Cleaning may be the correction of
misspellings, providing default values for missing data elements, or elimination of duplicates
when we bring in the same data from various source systems.
Standardization of data elements forms a large part of data transformation. Data transformation
also involves many forms of combining pieces of data from different sources: we combine data
from a single source record or related data parts from many source records.
On the other hand, data transformation also includes purging source data that is not useful and
separating out source records into new combinations. Sorting and merging of data take place
on a large scale in the data staging area. When the data transformation function ends, we have
a collection of integrated data that is cleaned, standardized, and summarized.
3) Data Loading: Two distinct categories of tasks form the data loading function. When we
complete the structure and construction of the data warehouse and go live for the first time,
we do the initial loading of the data into the data warehouse storage. The initial load
moves high volumes of data and uses up a substantial amount of time.

Data Storage Components


Data storage for the data warehouse is a separate repository. The data repositories for the
operational systems generally include only the current data. Also, these data repositories
hold the data structured in highly normalized form for fast and efficient processing.

Information Delivery Component


The information delivery element enables the process of subscribing to data warehouse files and
having them delivered to one or more destinations according to some customer-specified
scheduling algorithm.

Metadata Component
Metadata in a data warehouse is similar to the data dictionary or the data catalog in a
database management system. In the data dictionary, we keep the data about the logical
data structures, the data about the records and addresses, the information about the
indexes, and so on.
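As a small illustration, a single technical-metadata entry can be thought of as a record like the
one below; the field names and values are assumptions used only to show the kind of details
(origin, structure, load information) a metadata repository keeps.

    # Hypothetical technical-metadata entry for one warehouse table.
    # The field names and values are illustrative assumptions.
    metadata_entry = {
        "table_name": "sales_fact",
        "source_system": "billing_oltp",      # origin of the data
        "load_frequency": "daily",
        "last_load_date": "2024-03-01",
        "columns": {
            "order_id": "INTEGER, surrogate key",
            "amount":   "REAL, in INR",
        },
        "business_description": "One row per order line, used for revenue reporting",
    }

    print(metadata_entry["table_name"], "comes from", metadata_entry["source_system"])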

Data Marts
It includes a subset of corporate-wide data that is of value to a specific group of users.
The scope is confined to particular selected subjects. Data in a data warehouse should be
fairly current, but not necessarily up to the minute, although developments in the data
warehouse industry have made standard and incremental data loads more achievable. Data
marts are smaller than data warehouses and usually contain data for a single department or
subject area. The current trend in data warehousing is to develop a data warehouse with
several smaller related data marts for particular kinds of queries and reports.

Management and Control Component


The management and control elements coordinate the services and functions within the
data warehouse. These components control the data transformation and the data transfer
into the data warehouse storage. On the other hand, they moderate the data delivery to the
clients. They work with the database management systems and ensure that data is
correctly saved in the repositories. They monitor the movement of data into the
staging area and from there into the data warehouse storage itself.
6b. Explain the architecture of data warehouse in detail.

Production applications such as payroll, accounts payable, product purchasing and inventory
control are designed for online transaction processing (OLTP). Such applications gather detailed
data from day-to-day operations. Data warehouse applications are designed to support the user's
ad-hoc data requirements, an activity recently dubbed online analytical processing (OLAP). These
include applications such as forecasting, profiling, summary reporting, and trend analysis.
Production databases are updated continuously, either by hand or via OLTP applications. In
contrast, a warehouse database is updated from operational systems periodically, usually during
off-hours. As OLTP data accumulates in production databases, it is regularly extracted, filtered,
and then loaded into a dedicated warehouse server that is accessible to users.

Data warehouses and their architectures vary depending upon the elements of an organization's
situation.

Three common architectures are:


o Data Warehouse Architecture: Basic
o Data Warehouse Architecture: With Staging Area
o Data Warehouse Architecture: With Staging Area and Data Marts

Data Warehouse Architecture: Basic

Operational System
In data warehousing, an operational system is the term used to refer to a system that processes
the day-to-day transactions of an organization.
Flat Files
A Flat file system is a system of files in which transactional data is stored, and every file in the
system must have a different name.
Meta Data
A set of data that defines and gives information about other data.
Metadata is used in the data warehouse for a variety of purposes, including:
Metadata summarizes necessary information about data, which can make finding and working with
particular instances of data easier. For example, author, date created, date modified, and file
size are examples of very basic document metadata.
Metadata is also used to direct a query to the most appropriate data source.
Lightly and highly summarized data
This area of the data warehouse stores all the predefined lightly and highly summarized
(aggregated) data generated by the warehouse manager. The goal of the summarized
information is to speed up query performance. The summarized record is updated
continuously as new information is loaded into the warehouse.
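A minimal sketch of how lightly summarized data might be kept current is shown below; the fact
rows and the summary key are assumptions, and a real warehouse manager would use aggregate tables
or materialized views rather than a Python dictionary.

    # Sketch: incrementally maintaining a lightly summarized (aggregated) table
    # as new detail rows are loaded. Data and keys are illustrative assumptions.
    summary = {}   # (product, month) -> total sales, refreshed as data arrives

    def load_detail_rows(rows):
        for row in rows:
            key = (row["product"], row["month"])
            summary[key] = summary.get(key, 0) + row["amount"]

    # Initial load.
    load_detail_rows([
        {"product": "Laptop", "month": "2024-01", "amount": 55000},
        {"product": "Phone",  "month": "2024-01", "amount": 20000},
    ])
    # Incremental load: the summarized record is updated as new information arrives.
    load_detail_rows([{"product": "Laptop", "month": "2024-01", "amount": 60000}])

    print(summary)   # {('Laptop', '2024-01'): 115000, ('Phone', '2024-01'): 20000}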

End-User access Tools


The principal purpose of a data warehouse is to provide information to the business managers
for strategic decision-making. These customers interact with the warehouse using end-client
access tools.

The examples of some of the end-user access tools can be:

o Reporting and Query Tools


o Application Development Tools
o Executive Information Systems Tools
o Online Analytical Processing Tools
o Data Mining Tools

Data Warehouse Architecture: With Staging Area


We must clean and process our operational data before putting it into the warehouse.
We can do this programmatically, although most data warehouses use a staging area (a place where
data is processed before entering the warehouse). A staging area simplifies data cleansing and
consolidation for operational data coming from multiple source systems, especially for
enterprise data warehouses where all relevant data of an enterprise is consolidated.

The Data Warehouse Staging Area is a temporary location where records from source systems are
copied.
Data Warehouse Architecture: With Staging Area and Data Marts

We may want to customize our warehouse's architecture for multiple groups within our
organization.
We can do this by adding data marts. A data mart is a segment of a data warehouse that can
provide information for reporting and analysis on a section, unit, department or operation in the
company, e.g., sales, payroll, production, etc. The figure illustrates an example where purchasing,
sales, and stock are separated. In this example, a financial analyst wants to analyze historical
data for purchases and sales, or mine historical information to make predictions about customer
behavior.
Properties of Data Warehouse Architectures
The following architecture properties are necessary for a data warehouse system:
1. Separation: Analytical and transactional processing should be kept apart as much as possible.
2. Scalability: Hardware and software architectures should be simple to upgrade as the data
volume that has to be managed and processed, and the number of user requirements that have to be
met, progressively increase.
3. Extensibility: The architecture should be able to accommodate new operations and technologies
without redesigning the whole system.
4. Security: Monitoring accesses is necessary because of the strategic data stored in the data
warehouse.
5. Administerability: Data Warehouse management should not be complicated.

7a. Explain ETL process with neat diagram.

Extract, transform, and load (ETL) is the process of combining data from multiple sources into a large,
central repository called a data warehouse. ETL uses a set of business rules to clean and organize raw
data and prepare it for storage, data analytics, and machine learning (ML). You can address specific
business intelligence needs through data analytics.

The mechanism of extracting information from source systems and bringing it into the data warehouse
is commonly called ETL, which stands for Extraction, Transformation and Loading.

The ETL process requires active inputs from various stakeholders, including developers,
analysts, testers and top executives, and is technically challenging.
How ETL Works?

ETL consists of three separate phases:

Extraction:

Extraction is the operation of extracting information from a source system for further use in a data
warehouse environment. This is the first stage of the ETL process.

The extraction process is often one of the most time-consuming tasks in ETL.
• The source systems might be complicated and poorly documented, and thus determining which
data needs to be extracted can be difficult.
• The data has to be extracted several times in a periodic manner to supply all changed data to the
warehouse and keep it up-to-date (a small incremental-extraction sketch follows this list).
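A common way to supply all changed data periodically is incremental extraction keyed on a
last-modified timestamp. The sketch below is illustrative only; the table, the column names and the
use of sqlite3 as the source system are assumptions.

    # Sketch of periodic (incremental) extraction from a source system.
    # sqlite3 and the table/column names are assumptions made for illustration.
    import sqlite3

    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE customers (id INTEGER, name TEXT, updated_at TEXT)")
    src.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
        (1, "Ravi",  "2024-02-28"),
        (2, "Meena", "2024-03-03"),
    ])

    last_extract_time = "2024-03-01"   # remembered from the previous ETL run

    # Pull only the rows changed since the last run, keeping the warehouse up to date.
    changed = src.execute(
        "SELECT id, name, updated_at FROM customers WHERE updated_at > ?",
        (last_extract_time,),
    ).fetchall()
    print(changed)   # [(2, 'Meena', '2024-03-03')]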

Cleansing

The cleansing stage is crucial in a data warehouse because it is supposed to improve
data quality. The primary data cleansing features found in ETL tools are rectification and
homogenization. They use specific dictionaries to rectify typing mistakes and to recognize synonyms,
as well as rule-based cleansing to enforce domain-specific rules and define appropriate associations
between values.
The following examples show the importance of data cleansing (a small cleansing sketch follows this list):

• If an enterprise wishes to contact its users or its suppliers, a complete, accurate and up-to-date list
of contact addresses, email addresses and telephone numbers must be available.
• If a client or supplier calls, the staff responding should be able to quickly find the person in the
enterprise database, but this requires that the caller's name or his/her company name is listed in the
database.
• If a user appears in the database with two or more slightly different names or different account
numbers, it becomes difficult to update the customer's information.
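Rectification and homogenization can be sketched as dictionary look-ups plus a simple rule, as
below; the correction dictionary, synonym table and rule are invented examples, not features of any
particular ETL tool.

    # Sketch of dictionary-based cleansing: rectify typing mistakes, recognize
    # synonyms, and apply a simple rule-based check. All data here is assumed.
    corrections = {"chenai": "Chennai", "mumbay": "Mumbai"}      # rectification
    synonyms    = {"Bombay": "Mumbai", "Madras": "Chennai"}      # homogenization

    def cleanse(contact):
        city = contact["city"].strip()
        city = corrections.get(city.lower(), city)   # fix known typing mistakes
        city = synonyms.get(city, city)              # map synonyms to one standard value
        contact["city"] = city
        # Rule-based cleansing: an email address must contain '@' to be usable.
        contact["email_ok"] = "@" in contact.get("email", "")
        return contact

    print(cleanse({"name": "Ravi",  "city": "chenai", "email": "ravi@example.com"}))
    print(cleanse({"name": "Meena", "city": "Madras", "email": "no-address"}))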

Transformation:

Transformation is the core of the reconciliation phase. It converts records from their operational source
format into a particular data warehouse format. If we implement a three-layer architecture, this phase
outputs our reconciled data layer.
The following points must be rectified in this phase:

• Loose text may hide valuable information. For example, "XYZ PVT Ltd" does not explicitly show
that this is a private limited company.
• Different formats can be used for individual data. For example, data can be saved as a string or as
three integers.

Following are the main transformation processes aimed at populating the reconciled data layer
(a small sketch follows below):
• Conversion and normalization that operate on both storage formats and units of measure to make
data uniform.
• Matching that associates equivalent fields in different sources.
• Selection that reduces the number of source fields and records.

Cleansing and Transformation processes are often closely linked in ETL tools.
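The three transformation processes just listed can be sketched as follows; the field names, the
unit conversion and the source layouts are assumptions made for illustration only.

    # Sketch of the reconciliation-phase transformations: conversion/normalization,
    # matching of equivalent fields, and selection. All field names are assumed.

    # Two sources describe the same entity with different formats and units.
    source_a = {"cust": "XYZ PVT Ltd", "revenue_usd": 1200.0}
    source_b = {"customer_name": "XYZ PVT Ltd", "revenue_inr": 95000.0}

    USD_TO_INR = 83.0   # assumed conversion rate for the example

    def to_reconciled(a, b):
        return {
            # Matching: associate equivalent fields from the two sources.
            "customer_name": a["cust"] or b["customer_name"],
            # Conversion and normalization: one storage format, one unit of measure.
            "revenue_inr": round(a["revenue_usd"] * USD_TO_INR + b["revenue_inr"], 2),
            # Selection has already happened: source fields not listed here are dropped.
        }

    print(to_reconciled(source_a, source_b))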
Loading:
The load is the process of writing the data into the target database. During the load step, it is
necessary to ensure that the load is performed correctly and with as few resources as possible.
Loading can be carried out in two ways (a small sketch follows this list):

• Refresh: Data warehouse data is completely rewritten. This means that the older data is replaced.
Refresh is usually used in combination with static extraction to populate a data warehouse
initially.

• Update: Only those changes applied to the source information are added to the data warehouse. An
update is typically carried out without deleting or modifying preexisting data. This method is
used in combination with incremental extraction to update data warehouses regularly.
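The two load modes can be contrasted with a tiny sketch; the 'warehouse' list and the record layout
are assumptions, and the point is only the difference between rewriting the target (refresh) and
appending changes (update).

    # Sketch of the two loading modes. The 'warehouse' list and records are assumed.
    warehouse = [{"id": 1, "amount": 100}]

    def refresh(new_full_extract):
        # Refresh: the warehouse data is completely rewritten (older data replaced).
        return list(new_full_extract)

    def update(warehouse_rows, incremental_extract):
        # Update: only the changes are added; preexisting data is not deleted.
        return warehouse_rows + list(incremental_extract)

    warehouse = refresh([{"id": 1, "amount": 120}, {"id": 2, "amount": 80}])
    warehouse = update(warehouse, [{"id": 3, "amount": 50}])
    print(warehouse)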

Selecting an ETL Tool:


• Selection of an appropriate ETL tool is an important decision that has to be made for an ODS or
data warehousing application. The ETL tools are required to provide coordinated access to multiple
data sources so that relevant data may be extracted from them. An ETL tool would generally contain
tools for data cleansing, re-organization, transformation, aggregation, calculation and automatic
loading of data into the target database.

• An ETL tool should provide a simple user interface that allows data cleansing and data
transformation rules to be specified using a point-and-click approach. When all mappings and
transformations have been defined, the ETL tool should automatically generate the data
extract/transformation/load programs, which typically run in batch mode.
Difference between ETL and ELT

ETL (Extract, Transform, and Load)
Extract, Transform and Load is the technique of extracting records from sources (which may be
external, on-premises, etc.) to a staging area, then transforming or reformatting them, with business
manipulation performed on them in order to fit the operational needs or data analysis, and later
loading them into the target or destination database or data warehouse.
Strengths:
• Development time: Designing from the output backwards ensures that only information
applicable to the solution is extracted and processed, potentially decreasing development, extraction,
and processing overhead.
• Targeted data: Due to the targeted nature of the load process, the warehouse contains only
information relevant to the presentation. Reduced warehouse content simplifies the security regime
to be enforced and hence the administration overhead.
• Tools availability: The number of tools available that implement ETL provides flexibility of
approach and the opportunity to identify the most appropriate tool. However, the proliferation of
tools has led to a competitive functionality war, which often results in loss of maintainability.

Weaknesses
Flexibility: Targeting only relevant information for output means that any future requirements
that may need data that was not included in the original design will have to be added to the ETL
routines. Due to the tight dependency between the methods developed, this often leads to a need for
fundamental redesign and development. As a result, this increases the time and cost involved.
Hardware: Most third-party tools utilize their own engine to implement the ETL phase. Regardless of
the scale of the solution, this can necessitate investment in additional hardware to implement the
tool's ETL engine. The use of third-party tools to achieve the ETL process also compels the learning
of new scripting languages and processes.

7b.Compare and contrast the types of OLAP.

ROLAP
• ROLAP stands for Relational Online Analytical Processing.
• The ROLAP storage mode causes the aggregations of the partition to be stored in indexed views in
the relational database that was specified in the partition's data source.
• ROLAP does not cause a copy of the source information to be stored in the Analysis Services data
folders. Instead, when the result cannot be derived from the query cache, the indexed views in the
data source are accessed to answer queries.
• Query response is frequently slower with ROLAP storage than with the MOLAP or HOLAP storage
modes. Processing time is also frequently slower with ROLAP.

MOLAP
• MOLAP stands for Multidimensional Online Analytical Processing.
• The MOLAP storage mode causes the aggregations of the partition and a copy of its source
information to be stored in a multidimensional structure in Analysis Services when the partition is
processed.
• The MOLAP structure is highly optimized to maximize query performance. The storage area can be
on the computer where the partition is defined or on another computer running Analysis Services.
Because a copy of the source information resides in the multidimensional structure, queries can be
resolved without accessing the partition's source records.
• Query response times can be reduced substantially by using aggregations. The data in the
partition's MOLAP structure is only as current as the most recent processing of the partition.

HOLAP
• HOLAP stands for Hybrid Online Analytical Processing.
• The HOLAP storage mode combines attributes of both MOLAP and ROLAP. Like MOLAP, HOLAP causes the
aggregations of the partition to be stored in a multidimensional structure in an SQL Server Analysis
Services instance.
• HOLAP does not cause a copy of the source information to be stored. For queries that access only
summary records in the aggregations of a partition, HOLAP is the equivalent of MOLAP.
• Queries that access source records (for example, drilling down to an atomic cube cell for which
there is no aggregation information) must retrieve data from the relational database and will not be
as fast as they would be if the source information were stored in the MOLAP structure.

8a. Implement a modern data warehouse with example.

Snowflake and Oracle Autonomous Data Warehouse are two cloud data warehouses that
provide you with a single source of truth (SSOT) for all the data that exists in your
organization. You can use either of these warehouses to run data through business
intelligence (BI) tools and automate insights for decision-making. But which one should you
add to your tech stack? In this guide, learn the differences between Snowflake vs. Oracle
and how you can transfer data to the warehouse of your choice.
Here are the key takeaways to know about Snowflake vs. Oracle:
Snowflake and Oracle are both powerful data warehousing platforms with their own unique
strengths and capabilities.
Snowflake is a cloud-native platform known for its scalability, flexibility, and performance.
It offers a shared data model and separation of compute and storage, enabling seamless
scaling and cost-efficiency.
Oracle, on the other hand, has a long-standing reputation and offers a comprehensive suite
of data management tools and solutions. It is recognized for its reliability, scalability, and
extensive ecosystem.
Snowflake excels in handling large-scale, concurrent workloads and provides native
integration with popular data processing and analytics tools.
Oracle provides powerful optimization capabilities and offers a robust platform for
enterprise-scale data warehousing, analytics, and business intelligence.
What Is Snowflake?
Snowflake is a data warehouse built for the cloud. It centralizes data from multiple sources,
enabling you to run in-depth business insights that power your teams.
At its core, Snowflake is designed to handle structured and semi-structured data from
various sources, allowing organizations to integrate and analyze data from diverse systems
seamlessly. Its unique architecture separates compute and storage, enabling users to scale
each independently based on their specific needs. This elasticity ensures optimal resource
allocation and cost-efficiency, as users only pay for the actual compute and storage utilized.
Snowflake uses a SQL-based query language, making it accessible to data analysts and SQL
developers. Its intuitive interface and user-friendly features allow for efficient data
exploration, transformation, and analysis. Additionally, Snowflake provides robust security
and compliance features, ensuring data privacy and protection.
One of Snowflake’s notable strengths is its ability to handle large-scale, concurrent
workloads without performance degradation. Its auto-scaling capabilities automatically
adjust resources based on the workload demands, eliminating the need for manual tuning
and optimization.
Another key advantage of Snowflake is its native integration with popular data processing
and analytics tools, such as Apache Spark, Python, and R. This compatibility enables
seamless data integration, data engineering, and advanced analytics workflows.
What Is Oracle?
Oracle is available as a cloud data warehouse and as an on-premise warehouse (available
through Oracle Exadata Cloud Service). For this comparison, we will review Oracle's cloud
service.
Like Snowflake, Oracle provides a centralized location for analytical data activities, making
it easier for businesses like yours to identify trends and patterns in large sets of big data.
Oracle’s flagship product, Oracle Database, is a robust and highly scalable relational
database management system (RDBMS). It is known for its reliability, performance, and
extensive feature set, making it suitable for handling large-scale enterprise data
requirements. Oracle Database supports a wide range of data types and provides advanced
features for data modeling, indexing, and querying.
In addition to its RDBMS, Oracle provides a complete ecosystem of data management tools
and technologies. Oracle Data Warehouse solutions, such as Oracle Exadata and Oracle
Autonomous Data Warehouse, offer high-performance, optimized platforms specifically
designed for data warehousing and analytics workloads.
Oracle’s data warehousing offerings come with a suite of powerful analytics and business
intelligence tools. Oracle Analytics Cloud (OAC) provides comprehensive self-service
analytics capabilities, enabling users to explore and visualize data, build interactive
dashboards, and generate actionable insights.
8b. Design a Data Warehouse for a Food Delivery App.

Database Design for a food delivery app like Zomato/Swiggy


Let’s design the database for a food delivery app like Zomato or Swiggy.
Let’s start with all the required tables in the database design.
Here is a basic database design for a delivery app like Zomato:
1. Users: This table will store information about app users, including their name, email address,
password, phone number, and delivery address.
2. Restaurants: This table will store information about restaurants, including their name,
address, phone number, and menu items.
3. Orders: This table will store information about orders, including the user who placed the
order, the restaurant from which the order was placed, the menu items included in the order,
the order total, and the delivery status of the order.
4. Drivers: This table will store information about drivers, including their names, phone
numbers, and current locations.
5. Payment: This table will store information about payments, including the payment method
used (e.g. credit card, cash), the amount paid, and the status of the payment (e.g. pending, paid,
refunded).
6. Rating: This table will store information about user ratings, including the user who left the
rating, the restaurant that is rated, and the rating (e.g. 1–5 stars).
These are the main tables you’ll need to create a functional delivery app. You may also need to
create additional tables to store other information, such as promotions, discounts, and other types
of deals, but initially let’s consider only these tables (a small schema sketch follows).
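A minimal sketch of these tables, using Python's built-in sqlite3 module, is shown below; the exact
columns and constraints are assumptions made for illustration and would be extended in a real design.

    # Illustrative schema for the core tables. Columns and constraints are assumed.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.executescript("""
    CREATE TABLE users (
        user_id   INTEGER PRIMARY KEY,
        name      TEXT, email TEXT, phone TEXT, password_hash TEXT
    );
    CREATE TABLE restaurants (
        restaurant_id INTEGER PRIMARY KEY,
        name TEXT, address TEXT, phone TEXT
    );
    CREATE TABLE orders (
        order_id      INTEGER PRIMARY KEY,
        user_id       INTEGER REFERENCES users(user_id),            -- Users-Orders: one-to-many
        restaurant_id INTEGER REFERENCES restaurants(restaurant_id),-- Restaurants-Orders: one-to-many
        order_total   REAL,
        status        TEXT
    );
    CREATE TABLE payments (
        payment_id INTEGER PRIMARY KEY,
        order_id   INTEGER UNIQUE REFERENCES orders(order_id),      -- Orders-Payment: one-to-one
        method     TEXT, amount REAL, status TEXT
    );
    CREATE TABLE ratings (
        rating_id     INTEGER PRIMARY KEY,
        user_id       INTEGER REFERENCES users(user_id),
        restaurant_id INTEGER REFERENCES restaurants(restaurant_id),
        stars         INTEGER CHECK (stars BETWEEN 1 AND 5)
    );
    """)
    print("tables:", [r[0] for r in con.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")])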

Let’s Create all the Relationships between the tables.


Here are the relationships between the tables in the database design:
1. Users — Orders: A user can place multiple orders, and each order is placed by a single user.
This is a one-to-many relationship.
2. Restaurants — Orders: A restaurant can have multiple orders placed for it, and each order is
placed at a single restaurant. This is a one-to-many relationship.
3. Orders — Payment: Each order will have a single payment associated with it, and each
payment is for a single order. This is a one-to-one relationship.
4. Orders — Drivers: Each order will have a single driver associated with it, and each driver is
assigned to multiple orders. This is a one-to-many relationship.
5. Users — Rating: A user can rate multiple restaurants, and each restaurant can be rated by
multiple users. This is a many-to-many relationship.
6. Restaurants — Rating: A restaurant can be rated multiple times by different users, and each
rating belongs to a single restaurant. This is a one-to-many relationship.
7. Users — Address: A user can have multiple delivery addresses, and each address belongs to a
single user. This is a one-to-many relationship.
8. Restaurants — Menu: A restaurant can have multiple menu items, and each menu item
belongs to a single restaurant. This is a one-to-many relationship.
These are the basic relationships for a delivery app like Zomato; you may need to add or update the
relationships based on your requirements.
