
SET 01

An online wine retailer needs a data warehouse designed to record the quantities and sales of the wines sold
to its customers. Part of the original database consists of the following tables:

CUSTOMER (Code, Name, Address, Phone, BDay, Gender)


WINE (Code, Name, Type, Vintage, BottlePrice, CasePrice, Class)
CLASS (Code, Name, Region)
TIME (TimeStamp, Date, Year)
ORDER (Customer, Wine, Time, nrBottles, nrCases)

Note that the tables represent the main entities of the ER schema, so the significant relationships among
them must be derived in order to design the data warehouse correctly.

i. Design a conceptual schema (Attribute tree and Fact schema) for sales.
ii. Design a Star Schema and a Snowflake Schema.
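
As a starting point for part ii, one possible star schema can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names (dim_customer, dim_wine, dim_time, fact_sales), the column types, and the choice to fold CLASS into the wine dimension are all hypothetical and not given in the question.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by three dimensions.
# All table names and column types here are assumptions for illustration.
STAR_SCHEMA_DDL = """
CREATE TABLE dim_customer (
    customer_code TEXT PRIMARY KEY,
    name TEXT, address TEXT, phone TEXT, bday TEXT, gender TEXT
);
CREATE TABLE dim_wine (
    wine_code TEXT PRIMARY KEY,
    name TEXT, type TEXT, vintage INTEGER,
    bottle_price REAL, case_price REAL,
    class_name TEXT, region TEXT  -- CLASS denormalized into the dimension
);
CREATE TABLE dim_time (
    time_stamp TEXT PRIMARY KEY,
    date TEXT, year INTEGER
);
CREATE TABLE fact_sales (
    customer_code TEXT REFERENCES dim_customer(customer_code),
    wine_code TEXT REFERENCES dim_wine(wine_code),
    time_stamp TEXT REFERENCES dim_time(time_stamp),
    nr_bottles INTEGER, nr_cases INTEGER,
    PRIMARY KEY (customer_code, wine_code, time_stamp)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(STAR_SCHEMA_DDL)
star_tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

A snowflake variant of the same sketch would move class_name and region out of dim_wine into a separate class dimension referenced by a foreign key. A revenue measure could be derived as nr_bottles * bottle_price + nr_cases * case_price, assuming prices do not change over time.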

SET 02
Suppose the following is the set of sales transactions of a super-shop company:

TransactionID Itemset
T1 6,7,8,5,4,10
T2 3,8,7,5,4,10
T3 6,1,5,4
T4 6,9,2,5,10
T5 2,8,8,5,4

I. Generate the candidate itemsets and frequent itemsets with a minimum support count of 3.
II. Generate association rules from the frequent itemsets you generated.
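
The two tasks above can be checked with a minimal Apriori sketch. It assumes each transaction is treated as a set (so the repeated 8 in T5 counts once) and, because the question gives no minimum confidence, it simply lists the confidence of every rule. The function names are illustrative.

```python
from itertools import combinations

# Transactions from the table above; the repeated 8 in T5 collapses in a set.
transactions = [
    {6, 7, 8, 5, 4, 10},
    {3, 8, 7, 5, 4, 10},
    {6, 1, 5, 4},
    {6, 9, 2, 5, 10},
    {2, 8, 5, 4},
]
MIN_SUPPORT = 3


def support(itemset):
    """Count the transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def apriori():
    """Return {frozenset: support count} for all frequent itemsets."""
    items = sorted(set().union(*transactions))
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while candidates:
        level = {c: support(c) for c in candidates}
        level = {c: s for c, s in level.items() if s >= MIN_SUPPORT}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


def rules(frequent):
    """List (antecedent, consequent, confidence) for each frequent itemset."""
    out = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                out.append((lhs, itemset - lhs, count / frequent[lhs]))
    return out


frequent_itemsets = apriori()
association_rules = rules(frequent_itemsets)
```

Under these assumptions the only frequent 3-itemset is {4, 5, 8}, with support count 3.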

SET 03
Given the following data (Money possessed by each of the 141 people in an area):
Range of Money (Thousands)   1-10   11-20   21-30   31-40   41-50   51-60   61-70   71-80
No. of People                5      6       11      21      35      30      22      11
Deduce Pearson's coefficient of skewness. Also interpret the value of the coefficient you derive.
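
A rough way to check the hand calculation is sketched below. It assumes class midpoints represent each range, class boundaries at 0.5, 10.5, and so on, and the population form of the standard deviation. Since the question does not say which of Pearson's coefficients is meant, both the mode-based and the median-based forms are computed.

```python
from math import sqrt

# Class limits and frequencies from the table above (money in thousands).
lower_limits = [1, 11, 21, 31, 41, 51, 61, 71]
freqs = [5, 6, 11, 21, 35, 30, 22, 11]
WIDTH = 10  # each class spans 10 units between boundaries, e.g. 0.5 to 10.5

n = sum(freqs)
midpoints = [lo + 4.5 for lo in lower_limits]  # midpoint of 1-10 is 5.5
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
# Population variance of the grouped data (an assumption, see the text above).
variance = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints)) / n
std_dev = sqrt(variance)

# Median by linear interpolation inside the median class.
cumulative = 0
for lo, f in zip(lower_limits, freqs):
    if cumulative + f >= n / 2:
        median = (lo - 0.5) + (n / 2 - cumulative) / f * WIDTH
        break
    cumulative += f

# Mode by the grouped-data formula; the modal class here is not an end class.
i = freqs.index(max(freqs))
f1, f0, f2 = freqs[i], freqs[i - 1], freqs[i + 1]
mode = (lower_limits[i] - 0.5) + (f1 - f0) / (2 * f1 - f0 - f2) * WIDTH

sk_mode = (mean - mode) / std_dev          # Pearson's first coefficient
sk_median = 3 * (mean - median) / std_dev  # Pearson's second coefficient
```

With these conventions the mean falls below both the median and the mode, so both coefficients come out negative, which indicates a mildly left-skewed distribution.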
SET 04
Divide the following two-feature (X1, X2) data instances into two clusters using the k-means algorithm,
iterating until convergence.
X1 1 2 2 3 4 5
X2 1 1 3 2 3 5
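
The iteration can be traced with a small pure-Python sketch. The question does not specify the initial centroids, so the first and last instances, (1, 1) and (5, 5), are assumed here. A different initialization may take a different path or converge to a different split.

```python
# Data instances from the table above, as (X1, X2) pairs.
points = [(1, 1), (2, 1), (2, 3), (3, 2), (4, 3), (5, 5)]


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def k_means(points, centroids):
    """Repeat assignment and update steps until the assignments stop changing."""
    labels = None
    while True:
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            return labels, centroids
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        # (No guard for empty clusters; none occur with this data and start.)
        centroids = [
            tuple(
                sum(p[d] for p, l in zip(points, labels) if l == c)
                / labels.count(c)
                for d in range(2)
            )
            for c in range(len(centroids))
        ]


# Assumed initial centroids: the first and last instances.
labels, centroids = k_means(points, [(1, 1), (5, 5)])
```

With this starting choice the assignments stabilize after the first update, grouping the first four instances together and the last two together.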

SET 05
The following data table comes from a survey among businessmen. “Business experience”, “Competition”, and
“Business Type” are feature attributes, while “Profit” is the target class attribute.
Business experience   Competition   Business Type   Profit
Old                   Yes           Software        Down
Old                   No            Software        Down
Old                   No            Hardware        Down
Mid                   Yes           Software        Down
Mid                   Yes           Hardware        Down
Mid                   No            Hardware        Up
Mid                   No            Software        Up
New                   Yes           Software        Up
New                   No            Hardware        Up
New                   No            Software        Up

Construct a decision tree using ID3.
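
The entropy and information-gain steps of ID3 can be sketched as follows. This is a minimal illustration that represents the tree as nested dictionaries; the function names and the tree representation are choices made here, not part of the question.

```python
from collections import Counter
from math import log2

FEATURES = ["Business experience", "Competition", "Business Type"]
# Rows from the table above; the last field is the target class "Profit".
ROWS = [
    ("Old", "Yes", "Software", "Down"),
    ("Old", "No", "Software", "Down"),
    ("Old", "No", "Hardware", "Down"),
    ("Mid", "Yes", "Software", "Down"),
    ("Mid", "Yes", "Hardware", "Down"),
    ("Mid", "No", "Hardware", "Up"),
    ("Mid", "No", "Software", "Up"),
    ("New", "Yes", "Software", "Up"),
    ("New", "No", "Hardware", "Up"),
    ("New", "No", "Software", "Up"),
]


def entropy(rows):
    """Shannon entropy (in bits) of the class labels in rows."""
    counts = Counter(r[-1] for r in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())


def info_gain(rows, col):
    """Entropy reduction obtained by splitting rows on column col."""
    remainder = 0.0
    for value in {r[col] for r in rows}:
        subset = [r for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


def id3(rows, cols):
    """Return a class label (leaf) or {feature: {value: subtree}}."""
    classes = {r[-1] for r in rows}
    if len(classes) == 1:
        return classes.pop()
    if not cols:
        # No features left: fall back to the majority class.
        return Counter(r[-1] for r in rows).most_common(1)[0][0]
    best = max(cols, key=lambda c: info_gain(rows, c))
    rest = [c for c in cols if c != best]
    return {
        FEATURES[best]: {
            value: id3([r for r in rows if r[best] == value], rest)
            for value in sorted({r[best] for r in rows})
        }
    }


tree = id3(ROWS, [0, 1, 2])
```

On this data the largest gain at the root belongs to "Business experience" (0.6 bits), and the mixed "Mid" branch is then separated by "Competition".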

SET 06
The incomes (in thousands of taka) of 10 farmers in a village are as follows:
50, 45, 11, 12, 80, 16, 17, 15, 14, 30
Derive the standard deviation and then normalize the above data using z-score.
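
A short check of the arithmetic, assuming the population standard deviation (dividing by n) is intended. If the sample form is wanted, statistics.stdev would replace statistics.pstdev.

```python
from statistics import mean, pstdev

incomes = [50, 45, 11, 12, 80, 16, 17, 15, 14, 30]  # thousands of taka

mu = mean(incomes)
sigma = pstdev(incomes)  # population SD (divides by n); stdev() divides by n - 1
z_scores = [(x - mu) / sigma for x in incomes]
```

Here the mean is 29 and the population standard deviation is about 21.55, so, for example, the income of 80 maps to a z-score of roughly 2.37.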

The prices of different types of pizza in a shop are given below:


100, 30, 120, 35, 28, 24, 35, 31, 38, 32, 64, 128
Smooth the above data using binning (smoothing by bin means and by bin boundaries).
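
The smoothing can be sketched as below. The question does not fix the number of bins, so three equal-frequency bins of four values each are assumed. For values equidistant from both boundaries, the lower boundary is chosen, which is an arbitrary convention.

```python
prices = [100, 30, 120, 35, 28, 24, 35, 31, 38, 32, 64, 128]
BIN_SIZE = 4  # assumed: three equal-frequency bins of four values each

ordered = sorted(prices)
bins = [ordered[i:i + BIN_SIZE] for i in range(0, len(ordered), BIN_SIZE)]

# Smoothing by bin means: every value becomes the mean of its bin.
by_means = [[sum(b) / len(b)] * len(b) for b in bins]

# Smoothing by bin boundaries: every value snaps to the nearer of the bin's
# minimum and maximum (ties go to the minimum, an arbitrary choice).
by_boundaries = [
    [b[0] if x - b[0] <= b[-1] - x else b[-1] for x in b] for b in bins
]
```

Under these assumptions the sorted bins are (24, 28, 30, 31), (32, 35, 35, 38), and (64, 100, 120, 128), with bin means 28.25, 35, and 103.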
SET 07
Suppose that a renowned sports super-shop company has obtained the following data (the choice of youth and
non-youth between Cricket and Football) through a survey of 1100 people in an area adjacent to our
university:
            Cricket   Football
Youth       520       120
Non-Youth   60        400

Deduce the contingency table and, by calculating the Pearson chi-square statistic, determine whether any
correlation exists between the choice of sport and the age group of the people.
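
The expected counts and the statistic can be checked with a few lines of Python. The significance level is not given in the question, so the comparison below assumes alpha = 0.001, for which the critical value with 1 degree of freedom is 10.828.

```python
# Observed counts; rows: Youth, Non-Youth; columns: Cricket, Football.
observed = [[520, 120], [60, 400]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count of a cell = (row total * column total) / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]
chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
dof = (len(observed) - 1) * (len(observed[0]) - 1)
CRITICAL_0_001 = 10.828  # assumed significance level: alpha = 0.001, 1 d.o.f.
```

The statistic comes out near 499.5, far above the assumed critical value, so the hypothesis of independence would be rejected: choice of sport and age group appear strongly correlated.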

SET 08
Following are two relational schemas from two data sources:
Source 01:
ProductProduced (ProductProducedCode, Name, Description, Warnings, Notes, CatalogueID)
ProductVersion (ProductProducedCode, ProductVersionCode, Size, Color, Name, Description, Stock, Price)
Source 02:
Commodity (CommodityCode, Name, Size, Color, CommodityIntro, Type, Price, InventoryQuantity)
Item (ItemCode, Name, Description)

Perform a view-based integration operation (create a global relational schema) using Local as View
(LAV).
