SMDM Assignment: Problem 1
ASSIGNMENT
By: Manas Vikram Singh
Problem 1
We imported the ‘Wholesale Customer data’ dataset into Python to analyze the
spend on each item category across regions and channels. The detailed
approach and answers are given below.
1.1 Use methods of descriptive statistics to summarize data. Which Region
and which Channel seems to spend more? Which Region and which
Channel seems to spend less?
Solution:
The data set covers 440 buyers across Portugal, grouped into 3 regions
(Lisbon, Oporto and Other) and 2 channels (Hotel and Retail).
In a Jupyter notebook we summed the spend by channel.
• The Hotel channel has the highest total spend, 7999569.
• The Retail channel has the lowest total spend, 6619931.
Below is the output from Python
Channel
Hotel 7999569
Retail 6619931
Similarly, we summed the spend by region.
The Other region has the highest total spend, 10677599, and the Oporto
region has the lowest total spend, 1555088.
Below is the output from Python –
Region
Lisbon 2386813
Oporto 1555088
Other 10677599
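As a quick consistency check on the two summaries above, the channel totals and the region totals should add up to the same grand total, since both groupings cover the same 440 customers. The sketch below uses only the totals reported above. The variable and function names are illustrative, and it does not reproduce the original notebook code.

```python
# Totals copied from the Python output shown above.
channel_totals = {"Hotel": 7999569, "Retail": 6619931}
region_totals = {"Lisbon": 2386813, "Oporto": 1555088, "Other": 10677599}

grand_total_by_channel = sum(channel_totals.values())
grand_total_by_region = sum(region_totals.values())

# Both groupings count every customer exactly once, so the totals must agree.
assert grand_total_by_channel == grand_total_by_region


def shares(totals):
    """Return each group's percentage share of the overall spend."""
    overall = sum(totals.values())
    return {name: round(100 * value / overall, 2) for name, value in totals.items()}


channel_shares = shares(channel_totals)
region_shares = shares(region_totals)

print(grand_total_by_channel)
print(channel_shares)
print(region_shares)
```

A larger total does not by itself mean a higher spend per customer; it may also reflect the number of customers in the group.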
1.2 There are 6 different varieties of items considered.
Do all varieties show similar behavior across Region and Channel?
Provide justification for your answer.
Solution:
Using a bar graph for each category and checking spend across Channel, we
get the following output from Python:
Looking at the above graphs, we see that Milk, Grocery, Detergents Paper
and Delicatessen have higher spend in the Retail channel than in the Hotel
channel, across all regions. On the other hand, Fresh and Frozen have higher
spend in the Hotel channel than in the Retail channel, across all regions.
Similarly, using a bar graph for each category and checking spend across
Region:
Looking at the above graph, we see that Grocery, Frozen and Detergents
Paper are consumed most in Oporto as compared to the other regions.
Delicatessen, Fresh and Milk are consumed more in the Other region than in
Lisbon and Oporto.
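Gender   Accounting  CIS  Economics/Finance  International Business  Management  Other  Retailing/Marketing  Undecided
Female            3    3                  7                       4           4      3                    9          0
Male              4    1                  4                       2           6      4                    5          3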
Solution:
Using a contingency table of Gender and Major, we got the total number of
males and the number of males opting for each major.
Total male = 29
Below is the output from Python:
Probability of male opting for accounting is 13.79%
Probability of male opting for CIS is 3.45%
Probability of male opting for Economics/Finance is 13.79%
Probability of male opting for International business is 6.90%
Probability of male opting for management is 20.69%
Probability of male opting for other is 13.79%
Probability of male opting for Retailing/Marketing is 17.24%
Probability of male opting for Undecided is 10.34%
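Each percentage above is one major's count divided by the 29 males. A minimal sketch of that calculation follows, assuming the male counts per major are the row of the Gender by Major table that sums to 29 (4, 1, 4, 2, 6, 4, 5, 3). The dictionary and variable names are illustrative and are not taken from the original notebook.

```python
# Male counts per major: the table row that sums to 29 (assumed to be the male row).
male_counts = {
    "Accounting": 4,
    "CIS": 1,
    "Economics/Finance": 4,
    "International Business": 2,
    "Management": 6,
    "Other": 4,
    "Retailing/Marketing": 5,
    "Undecided": 3,
}

total_male = sum(male_counts.values())

# Conditional probability of each major given that the student is male, in percent.
male_major_pct = {
    major: round(100 * count / total_male, 2)
    for major, count in male_counts.items()
}

for major, pct in male_major_pct.items():
    print(f"Probability of male opting for {major} is {pct:.2f}%")
```

The printed values match the percentages listed above, which supports reading that row as the male counts.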
Using a contingency table of Gender and Grad Intention, we got the total
number of males and the number of males who intend to graduate.
Given below is the output from Python:
Probability that a randomly chosen student is male and intends to graduate is
27.419%
Using a contingency table of Gender and Computer, we got the number of
females who do NOT have a laptop.
Given below is the output from Python:
Probability that a randomly chosen student is female and does NOT have a
laptop is 6.452%
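Both results above are joint probabilities: one cell of a contingency table divided by the total number of students. The sketch below back-solves the cell counts from the reported percentages, assuming 62 students in total (the total stated in this report). The counts 17 and 4 are therefore inferred from the percentages, not read from the original output, and the function names are illustrative.

```python
TOTAL_STUDENTS = 62  # total number of students stated in the report


def implied_count(percent, total=TOTAL_STUDENTS):
    """Back-solve the integer cell count implied by a reported percentage."""
    return round(percent / 100 * total)


def joint_probability(cell_count, total=TOTAL_STUDENTS):
    """Joint probability = cell count / grand total."""
    return cell_count / total


male_and_graduate = implied_count(27.419)    # inferred cell count
female_and_no_laptop = implied_count(6.452)  # inferred cell count

print(male_and_graduate, f"{100 * joint_probability(male_and_graduate):.3f}%")
print(female_and_no_laptop, f"{100 * joint_probability(female_and_no_laptop):.3f}%")
```

Recomputing the percentages from the inferred counts reproduces the reported figures to three decimal places.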
2.5. Assume that the sample is representative of the population of CMSU. Based
on the data, answer the following question:
2.5.1. Find the probability that a randomly chosen student is either a male or has
full-time employment?
2.5.2. Find the conditional probability that given a female student is randomly
chosen, she is majoring in international business or management.
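2.5.1 Solution
Probability of a student being male = 0.468
Probability of a student being male and having full-time employment = 0.112
By the addition rule,
P(male or full-time) = P(male) + P(full-time) - P(male and full-time)
= 51.7%
So, the probability that a randomly selected student is either male or has
full-time employment is 51.7%.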
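The addition rule behind this result can be checked with the rounded figures above. The report does not list P(full-time employment), so the sketch below back-solves it from the other three numbers. Treat that value as implied by the arithmetic, not as a reported figure; the variable names are illustrative.

```python
p_male = 0.468                # reported
p_male_and_full_time = 0.112  # reported
p_male_or_full_time = 0.517   # reported (51.7%)

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
# Rearranged to recover the term that the report does not show.
p_full_time = p_male_or_full_time - p_male + p_male_and_full_time
print(f"Implied P(full-time) = {p_full_time:.3f}")

# Plugging the implied value back in reproduces the reported result.
check = p_male + p_full_time - p_male_and_full_time
print(f"P(male or full-time) = {check:.3f}")
```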
2.5.2 Solution
Using a contingency table of Gender and Major, we got the total number of
females and the number of females opting for each major.
Total female = 33
Probability that a female student is majoring in international business or
management
= 8/33
= 0.242
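Conditioning on gender restricts the sample to the 33 females, so the denominator is the female total and not the total number of students. A small sketch using the counts above follows; `Fraction` is used only to keep the ratio exact, and the variable names are illustrative.

```python
from fractions import Fraction

total_female = 33
female_ib_or_management = 8  # females majoring in international business or management

# P(IB or Management | Female) = count in those majors / total females
p_conditional = Fraction(female_ib_or_management, total_female)

print(p_conditional, f"{float(p_conditional):.3f}")
```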
2.7.1. If a student is chosen randomly, what is the probability that his/her GPA is
less than 3?
2.7.2. Find the conditional probability that a randomly selected male earns 50 or
more. Find the conditional probability that a randomly selected female earns 50 or
more.
2.7.1 Solution:
Using Python, the number of students having a GPA less than 3 = 17
Total students = 62
Probability that a randomly chosen student has a GPA less than 3
= 17/62
= 0.274
2.7.2 Solution
2.8. Note that there are four numerical (continuous) variables in the data
set, GPA, Salary, Spending, and Text Messages. For each of them
comment whether they follow a normal distribution. Write a note
summarizing your conclusions.
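One simple way to begin such a check is to look at sample skewness and excess kurtosis, both of which are close to 0 for normally distributed data. The sketch below is an assumption about how one might proceed, not the method used in this report. The function name is hypothetical, and the two short lists are placeholder values for illustration only; they are not taken from the survey data.

```python
from statistics import fmean


def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) computed from population central moments."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


symmetric = [1, 2, 3, 4, 5]      # placeholder values, not survey data
right_skewed = [1, 1, 1, 2, 10]  # placeholder values, not survey data

skew_sym, kurt_sym = skewness_and_excess_kurtosis(symmetric)
skew_right, kurt_right = skewness_and_excess_kurtosis(right_skewed)

print(skew_sym, kurt_sym)
print(skew_right, kurt_right)
```

In practice these summary numbers would usually be read alongside a histogram of each variable (GPA, Salary, Spending, Text Messages) or a formal test such as Shapiro-Wilk.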
Problem 3
An important quality characteristic used by the manufacturers of ABC asphalt
shingles is the amount of moisture the shingles contain when they are
packaged. Customers may feel that they have purchased a product lacking in
quality if they find moisture and wet shingles inside the packaging. In some
cases, excessive moisture can cause the granules attached to the shingles for
texture and coloring purposes to fall off the shingles resulting in appearance
problems. To monitor the amount of moisture present, the company conducts
moisture tests. A shingle is weighed and then dried. The shingle is then
reweighed, and based on the amount of moisture taken out of the product, the
pounds of moisture per 100 square feet is calculated. The company would like
to show that the mean moisture content is less than 0.35 pound per 100
square feet.
The file (A & B shingles.csv) includes 36 measurements (in pounds per 100
square feet) for A shingles and 31 for B shingles.