INFO8095-22S-SEC1-Big Data Analytics - Group 4
PRESENTATION
Team Members:
Syed Isabat Hussain Rizvi
Rohit Tiwari
Vivek Chaudhary
Jvalant Pandya
TABLE OF CONTENTS
PROJECT SUMMARY
MODEL-CLASSIFICATION MODEL-TEXT
PROJECT SUMMARY
This collaborative project exemplifies some of the important stages of the Big Data Lifecycle. It focuses on Data Preparation, Model Planning, and Model Construction.

Originally, as part of Data Preparation, the team became completely acquainted with the data; in this case, Grocery Bills were gathered by the team.

The data was gathered in order to extract 50 distinct types of items, date of purchase, store, and sales information encompassing quantity, transaction, and price, among other things.

Various models are created based on the data collected, such as a dimensional model and a classification model (decision tree).

Entity Relationship Diagram
An "ERD in Star Schema format" is built to depict how the dimensional model would appear for the 'Grocery Bill' data, ensuring all fields have names and all Primary and Foreign Keys are labelled correctly.
Tables
◦ Sales Table – The Sales table is the main table. It contains "DateKey", "StoreKey", "ProductKey", "ProductDepartment", "ProductCost", "TransactionID", "DollarSales", and "UnitsSold".
◦ Product Table – The Product table contains a unique product key "ProductKey", a column with the names of the products "ProductDescription", a column that defines the product's department "ProductDepartment", and the cost of each product "ProductCost".
◦ Date Table – The Date table consists of a unique date key "DateKey" and the columns "Day", "Month", and "Year".
◦ Store Table – The Store table holds the list of stores and their store keys: "StoreKey" and "StoreName".
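The four tables listed above can be sketched as a small star schema. The example below is a minimal illustration using Python's built-in sqlite3 module, not the team's actual implementation: the column types, the in-memory database, and the helper name build_star_schema are assumptions. The table and column names follow the slide.

```python
import sqlite3

# Column names follow the slide; the SQL types are assumptions.
SCHEMA = """
CREATE TABLE "Date" (
    DateKey INTEGER PRIMARY KEY,
    Day     INTEGER,
    Month   INTEGER,
    Year    INTEGER
);
CREATE TABLE Store (
    StoreKey  INTEGER PRIMARY KEY,
    StoreName TEXT
);
CREATE TABLE Product (
    ProductKey         INTEGER PRIMARY KEY,
    ProductDescription TEXT,
    ProductDepartment  TEXT,
    ProductCost        REAL
);
-- Fact table: one row per product line on a grocery bill.
CREATE TABLE Sales (
    DateKey           INTEGER REFERENCES "Date"(DateKey),
    StoreKey          INTEGER REFERENCES Store(StoreKey),
    ProductKey        INTEGER REFERENCES Product(ProductKey),
    ProductDepartment TEXT,
    ProductCost       REAL,
    TransactionID     INTEGER,
    DollarSales       REAL,
    UnitsSold         INTEGER
);
"""


def build_star_schema(conn: sqlite3.Connection) -> None:
    """Create the three dimension tables and the Sales fact table."""
    conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the foreign keys
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")  # hypothetical throwaway database
build_star_schema(conn)

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
sales_columns = [row[1] for row in conn.execute("PRAGMA table_info(Sales)")]
print(tables)
print(sales_columns)
```

Declaring the keys in the schema keeps the Primary and Foreign Key labelling from the ERD visible in the table definitions themselves.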
DIMENSIONAL MODEL
[Diagram: star schema with the Sales Table at the center, linked to the Store Table, Date Table, and Product Table.]
Sales Table: DateKey, StoreKey, ProductKey, ProductDepartment, ProductCost, TransactionID, DollarSales, UnitsSold
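To illustrate how a dimensional model like this is typically queried, the sketch below joins the Sales table to the Store and Date tables and totals sales per store and month. It again uses Python's sqlite3 module. The rows, store names, and the helper sales_by_store are hypothetical and are not taken from the team's grocery bills; only the columns needed for the query are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Date" (DateKey INTEGER PRIMARY KEY, Day INTEGER, Month INTEGER, Year INTEGER);
CREATE TABLE Store (StoreKey INTEGER PRIMARY KEY, StoreName TEXT);
CREATE TABLE Sales (
    DateKey INTEGER, StoreKey INTEGER, ProductKey INTEGER,
    TransactionID INTEGER, DollarSales REAL, UnitsSold INTEGER
);
-- Hypothetical sample rows, for illustration only.
INSERT INTO "Date" VALUES (1, 5, 6, 2022), (2, 12, 6, 2022);
INSERT INTO Store VALUES (1, 'Store A'), (2, 'Store B');
INSERT INTO Sales VALUES
    (1, 1, 10, 1001, 4.50, 2),
    (1, 2, 11, 1002, 3.00, 1),
    (2, 1, 10, 1003, 2.25, 1);
""")


def sales_by_store(conn: sqlite3.Connection) -> list:
    """Total dollar sales and units per store and month."""
    query = """
        SELECT s.StoreName, d.Year, d.Month,
               SUM(f.DollarSales) AS TotalSales,
               SUM(f.UnitsSold)   AS TotalUnits
        FROM Sales AS f
        JOIN Store  AS s ON s.StoreKey = f.StoreKey
        JOIN "Date" AS d ON d.DateKey  = f.DateKey
        GROUP BY s.StoreName, d.Year, d.Month
        ORDER BY s.StoreName
    """
    return conn.execute(query).fetchall()


rows = sales_by_store(conn)
for row in rows:
    print(row)
```

The fact table carries only keys and measures, so each descriptive attribute (store name, month, year) is reached through a single join to its dimension table.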
CLASSIFICATION MODEL
DECISION TREE
[Figure: decision tree diagram; values recovered from the image: 100, 80, 75, 27]