Designing Data Systems Project - v0.2
1. Staging
1.1. Student provides a screenshot after extracting the 6 Yelp files into the staging schema.
The screenshot should show 6 tables with their correct row counts.
1.2. Student provides a screenshot after extracting the 2 remaining files into the staging
schema. The screenshot should show two tables: temperature and precipitation.
1.3. Student provides a diagram showing the 8 files flowing into the staging database, then
on to the ODS, the DWH, and Reporting.
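As a rough illustration of the staging step, the sketch below loads one Yelp JSON file into a single-column staging table and then checks its row count. The table and column names (USER_STAGING_TABLE, USERJSON) follow the queries in section 2. The stage name @YELP_STAGE and the file name are assumptions made for this example only.

```sql
-- Sketch only: @YELP_STAGE and the file name are hypothetical.
USE SCHEMA "YELPRATINGPROJECT"."STAGING";

-- One VARIANT column holds each JSON document unchanged.
CREATE TABLE IF NOT EXISTS USER_STAGING_TABLE (USERJSON VARIANT);

-- Load the JSON file from the stage, one document per row.
COPY INTO USER_STAGING_TABLE
    FROM @YELP_STAGE/yelp_academic_dataset_user.json
    FILE_FORMAT = (TYPE = 'JSON');

-- Row count to compare against the source file for the screenshot.
SELECT COUNT(*) AS row_count FROM USER_STAGING_TABLE;
```

The same pattern would repeat for the other five Yelp files. The two weather files would presumably use a CSV file format instead.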
2. Operational Data Store
-- Copy data from the USER table in the staging layer into the USER table in the ODS
INSERT INTO USER_ODS_TABLE
SELECT
USERJSON:user_id,
USERJSON:name,
USERJSON:review_count,
USERJSON:yelping_since,
USERJSON:friends,
USERJSON:useful,
USERJSON:funny,
USERJSON:cool,
USERJSON:fans,
USERJSON:elite,
USERJSON:average_stars,
USERJSON:compliment_hot,
USERJSON:compliment_more,
USERJSON:compliment_profile,
USERJSON:compliment_cute,
USERJSON:compliment_list,
USERJSON:compliment_note,
USERJSON:compliment_plain,
USERJSON:compliment_cool,
USERJSON:compliment_funny,
USERJSON:compliment_writer,
USERJSON:compliment_photos
FROM "YELPRATINGPROJECT"."STAGING"."USER_STAGING_TABLE";
-- Copy data from the BUSINESS table in the staging layer into the BUSINESS table in the ODS
INSERT INTO BUSINESS_ODS_TABLE
SELECT
BUSINESSJSON:business_id,
BUSINESSJSON:name,
BUSINESSJSON:address,
BUSINESSJSON:city,
BUSINESSJSON:state,
BUSINESSJSON:postal_code,
BUSINESSJSON:latitude,
BUSINESSJSON:longitude,
BUSINESSJSON:stars,
BUSINESSJSON:review_count,
BUSINESSJSON:is_open,
BUSINESSJSON:attributes,
BUSINESSJSON:categories,
BUSINESSJSON:hours
FROM "YELPRATINGPROJECT"."STAGING"."BUSINESS_STAGING_TABLE";
-- Copy data from the REVIEW table in the staging layer into the REVIEW table in the ODS
INSERT INTO REVIEW_ODS_TABLE
SELECT
REVIEWJSON:review_id,
REVIEWJSON:user_id,
REVIEWJSON:business_id,
REVIEWJSON:stars,
REVIEWJSON:date,
REVIEWJSON:text,
REVIEWJSON:useful,
REVIEWJSON:funny,
REVIEWJSON:cool
FROM "YELPRATINGPROJECT"."STAGING"."REVIEW_STAGING_TABLE";
-- Copy data from the CHECKIN table in the staging layer into the CHECKIN table in the ODS
INSERT INTO CHECKIN_ODS_TABLE
SELECT
CHECKINJSON:business_id,
CHECKINJSON:date
FROM "YELPRATINGPROJECT"."STAGING"."CHECKIN_STAGING_TABLE";
-- Copy data from the COVID table in the staging layer into the COVID table in the ODS
INSERT INTO COVID_ODS_TABLE
SELECT
COVIDJSON:"business_id",
COVIDJSON:"highlights",
COVIDJSON:"Delivery or takeout",
COVIDJSON:"Grubhub enabled",
COVIDJSON:"Call To Action enabled",
COVIDJSON:"Request a Quote Enabled",
COVIDJSON:"Covid Banner",
COVIDJSON:"Temporary Closed Until",
COVIDJSON:"Virtual Services Offered"
FROM "YELPRATINGPROJECT"."STAGING"."COVID_STAGING_TABLE";
-- Copy data from the TEMPERATURE table in the staging layer into the TEMPERATURE table in the ODS
INSERT INTO TEMPERATURE_ODS_TABLE
SELECT
to_date(date,'yyyymmdd'),
min,
max,
normal_min,
normal_max
FROM "YELPRATINGPROJECT"."STAGING"."TEMPERATURE_STAGING_TABLE";
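The precipitation file from item 1.2 has no matching statement here. A parallel insert might look like the sketch below. It assumes a PRECIPITATION_STAGING_TABLE with string columns date, precipitation, and precipitation_normal (names inferred from the DWH dimension table in section 3.2) and a hypothetical PRECIPITATION_ODS_TABLE. Neither table is defined in this document.

```sql
-- Sketch only: both precipitation tables and their columns are assumed.
INSERT INTO PRECIPITATION_ODS_TABLE
SELECT
    TO_DATE(date, 'yyyymmdd'),
    -- TRY_TO_DOUBLE returns NULL instead of failing on non-numeric text.
    TRY_TO_DOUBLE(precipitation),
    TRY_TO_DOUBLE(precipitation_normal)
FROM "YELPRATINGPROJECT"."STAGING"."PRECIPITATION_STAGING_TABLE";
```

TRY_TO_DOUBLE is used here as a precaution in case the file contains non-numeric markers. If every value is numeric, a plain cast would do.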
2.2 Student provides SQL queries that use JSON functions to transform staging data from
a single JSON structure into multiple columns for ODS.
-- Copy data from the USER table in the staging layer into the USER table in the ODS
INSERT INTO USER_ODS_TABLE
SELECT
USERJSON:user_id,
USERJSON:name,
USERJSON:review_count,
USERJSON:yelping_since,
USERJSON:friends,
USERJSON:useful,
USERJSON:funny,
USERJSON:cool,
USERJSON:fans,
USERJSON:elite,
USERJSON:average_stars,
USERJSON:compliment_hot,
USERJSON:compliment_more,
USERJSON:compliment_profile,
USERJSON:compliment_cute,
USERJSON:compliment_list,
USERJSON:compliment_note,
USERJSON:compliment_plain,
USERJSON:compliment_cool,
USERJSON:compliment_funny,
USERJSON:compliment_writer,
USERJSON:compliment_photos
FROM "YELPRATINGPROJECT"."STAGING"."USER_STAGING_TABLE";
-- Copy data from the BUSINESS table in the staging layer into the BUSINESS table in the ODS
INSERT INTO BUSINESS_ODS_TABLE
SELECT
BUSINESSJSON:business_id,
BUSINESSJSON:name,
BUSINESSJSON:address,
BUSINESSJSON:city,
BUSINESSJSON:state,
BUSINESSJSON:postal_code,
BUSINESSJSON:latitude,
BUSINESSJSON:longitude,
BUSINESSJSON:stars,
BUSINESSJSON:review_count,
BUSINESSJSON:is_open,
BUSINESSJSON:attributes,
BUSINESSJSON:categories,
BUSINESSJSON:hours
FROM "YELPRATINGPROJECT"."STAGING"."BUSINESS_STAGING_TABLE";
-- Copy data from the REVIEW table in the staging layer into the REVIEW table in the ODS
INSERT INTO REVIEW_ODS_TABLE
SELECT
REVIEWJSON:review_id,
REVIEWJSON:user_id,
REVIEWJSON:business_id,
REVIEWJSON:stars,
REVIEWJSON:date,
REVIEWJSON:text,
REVIEWJSON:useful,
REVIEWJSON:funny,
REVIEWJSON:cool
FROM "YELPRATINGPROJECT"."STAGING"."REVIEW_STAGING_TABLE";
-- Copy data from the TIP table in the staging layer into the TIP table in the ODS
INSERT INTO TIP_ODS_TABLE
SELECT
TIPJSON:user_id,
TIPJSON:business_id,
TIPJSON:text,
TIPJSON:date,
TIPJSON:compliment_count
FROM "YELPRATINGPROJECT"."STAGING"."TIP_STAGING_TABLE";
-- Copy data from the CHECKIN table in the staging layer into the CHECKIN table in the ODS
INSERT INTO CHECKIN_ODS_TABLE
SELECT
CHECKINJSON:business_id,
CHECKINJSON:date
FROM "YELPRATINGPROJECT"."STAGING"."CHECKIN_STAGING_TABLE";
-- Copy data from the COVID table in the staging layer into the COVID table in the ODS
INSERT INTO COVID_ODS_TABLE
SELECT
COVIDJSON:"business_id",
COVIDJSON:"highlights",
COVIDJSON:"Delivery or takeout",
COVIDJSON:"Grubhub enabled",
COVIDJSON:"Call To Action enabled",
COVIDJSON:"Request a Quote Enabled",
COVIDJSON:"Covid Banner",
COVIDJSON:"Temporary Closed Until",
COVIDJSON:"Virtual Services Offered"
FROM "YELPRATINGPROJECT"."STAGING"."COVID_STAGING_TABLE";
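The path expressions above (for example REVIEWJSON:stars) return VARIANT values. One way to make the intended column types explicit is to cast each path with the :: operator, as in the preview query below. The target types and the column aliases are assumptions, since the ODS table definitions are not shown here.

```sql
-- Sketch only: the chosen types and aliases are assumptions.
SELECT
    REVIEWJSON:review_id::STRING    AS review_id,
    REVIEWJSON:user_id::STRING      AS user_id,
    REVIEWJSON:business_id::STRING  AS business_id,
    REVIEWJSON:stars::FLOAT         AS stars,
    REVIEWJSON:date::TIMESTAMP_NTZ  AS review_date,
    REVIEWJSON:text::STRING         AS review_text,
    REVIEWJSON:useful::INTEGER      AS useful,
    REVIEWJSON:funny::INTEGER       AS funny,
    REVIEWJSON:cool::INTEGER        AS cool
FROM "YELPRATINGPROJECT"."STAGING"."REVIEW_STAGING_TABLE"
LIMIT 10;
```

Running a preview like this before the INSERT is a cheap way to confirm that each JSON key exists and converts cleanly.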
2.3 Student provides screenshot of a table with three columns: raw files, staging, and ODS.
Each column should record the size of the data in the respective format. The table should have
eight rows, one for each file.
UPDATE "YELPRATINGPROJECT"."ODS"."METADATA_TABLE" AS m
SET m.ods_size = t.BYTES
FROM "YELPRATINGPROJECT"."INFORMATION_SCHEMA".TABLES AS t
WHERE CONCAT(UPPER(m.data_file_name), '_ODS_TABLE') = t.table_name;
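To cross-check the staging and ODS sizes side by side, a query along these lines could be used. It reads only standard INFORMATION_SCHEMA.TABLES columns and assumes the two schemas are named STAGING and ODS, as in the statements above. Raw file sizes are not stored in this view and would have to come from elsewhere.

```sql
-- Sketch only: lists stored size and row count per table in both layers.
SELECT
    table_schema,
    table_name,
    row_count,
    bytes
FROM "YELPRATINGPROJECT"."INFORMATION_SCHEMA".TABLES
WHERE table_schema IN ('STAGING', 'ODS')
ORDER BY table_name, table_schema;
```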
2.4 The student provides an ER diagram that includes all appropriate model information.
2.5 Submission should include a SQL query that shows how the datasets are integrated.
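An integration query for item 2.5 could be sketched as follows. It assumes the ODS columns are named after the JSON keys (business_id, name, stars, date), that the review date is stored as a timestamp, and that a hypothetical PRECIPITATION_ODS_TABLE exists with date and precipitation columns. None of these are confirmed by the statements above.

```sql
-- Sketch only: column names and PRECIPITATION_ODS_TABLE are assumptions.
SELECT
    r.review_id,
    b.name        AS business_name,
    r.stars,
    t.normal_min,
    t.normal_max,
    p.precipitation
FROM "YELPRATINGPROJECT"."ODS"."REVIEW_ODS_TABLE" r
JOIN "YELPRATINGPROJECT"."ODS"."BUSINESS_ODS_TABLE" b
    ON b.business_id = r.business_id
-- Weather data joins to reviews on the calendar date of the review.
LEFT JOIN "YELPRATINGPROJECT"."ODS"."TEMPERATURE_ODS_TABLE" t
    ON t.date = r.date::DATE
LEFT JOIN "YELPRATINGPROJECT"."ODS"."PRECIPITATION_ODS_TABLE" p
    ON p.date = r.date::DATE;
```

The LEFT JOINs keep reviews whose date has no weather reading. An inner join would drop them.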
3. Data Warehouse
3.1 Student provides a diagram of star schema with dimensions and fact tables.
3.2 Student provides the SQL queries necessary to move the data from ODS to DWH.
CREATE TABLE "YELPRATINGPROJECT"."DATAWAREHOUSE"."TEMPERATURE_PRECIPITATION_DIM_TABLE" (
    date DATE NOT NULL PRIMARY KEY,
    "min" FLOAT,
    "max" FLOAT,
    normal_min FLOAT,
    normal_max FLOAT,
    precipitation FLOAT,
    precipitation_normal FLOAT)
COMMENT = 'Temperature and precipitation dim table in DWH layer';
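The statement above only creates the dimension table. Populating it from the ODS might look like the sketch below. The ODS column names (min, max, and so on, taken from the temperature staging query) and the PRECIPITATION_ODS_TABLE are assumptions. Note that "min" and "max" are quoted, lowercase identifiers in the target table, so they must be quoted in the column list.

```sql
-- Sketch only: ODS column names and PRECIPITATION_ODS_TABLE are assumed.
INSERT INTO "YELPRATINGPROJECT"."DATAWAREHOUSE"."TEMPERATURE_PRECIPITATION_DIM_TABLE"
    (date, "min", "max", normal_min, normal_max, precipitation, precipitation_normal)
SELECT
    t.date,
    t.min,
    t.max,
    t.normal_min,
    t.normal_max,
    p.precipitation,
    p.precipitation_normal
FROM "YELPRATINGPROJECT"."ODS"."TEMPERATURE_ODS_TABLE" t
-- Keep every temperature date even if no precipitation reading exists.
LEFT JOIN "YELPRATINGPROJECT"."ODS"."PRECIPITATION_ODS_TABLE" p
    ON p.date = t.date;
```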
3.3 Student provides SQL queries that report business name, temperature, precipitation, and
ratings.
Result query:
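A report query of this kind could look roughly like the sketch below. Only TEMPERATURE_PRECIPITATION_DIM_TABLE comes from this document. The fact table (FACT_TABLE), the business dimension (BUSINESS_DIM_TABLE), and their columns are hypothetical placeholders for whatever the star schema in 3.1 defines.

```sql
-- Sketch only: FACT_TABLE, BUSINESS_DIM_TABLE, and their columns are hypothetical.
SELECT
    b.name          AS business_name,
    t.date          AS review_date,
    t."min"         AS temp_min,
    t."max"         AS temp_max,
    t.precipitation,
    AVG(f.stars)    AS avg_rating
FROM "YELPRATINGPROJECT"."DATAWAREHOUSE"."FACT_TABLE" f
JOIN "YELPRATINGPROJECT"."DATAWAREHOUSE"."BUSINESS_DIM_TABLE" b
    ON b.business_id = f.business_id
JOIN "YELPRATINGPROJECT"."DATAWAREHOUSE"."TEMPERATURE_PRECIPITATION_DIM_TABLE" t
    ON t.date = f.date
GROUP BY b.name, t.date, t."min", t."max", t.precipitation
ORDER BY b.name, t.date;
```

Grouping by business and date gives one row per business per day, which keeps the weather values and the average rating on the same grain.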