
Sabyasachi Sahu

DMBD Assignment

1501100

Question 1:
See file: Q1.R
1. Read the data from the nifty50 file.
2. Get the prices and adjusted close prices of the Nifty index starting
from 2014-1-11 using get.hist.quote().
3. Calculate the Nifty returns from the differences of adjusted close prices.
4. Get the prices and adjusted close prices for each company in
the nifty50 file using get.hist.quote().
5. Calculate the returns for each stock from its adjusted close prices.
6. Use the lm() (linear model) function to regress each stock's returns on
the Nifty returns, calculate the coefficients and plot the regression line.
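Steps 3 to 6 can be sketched as follows. This is a minimal illustration and not the contents of Q1.R: the price series are simulated so that the example runs without a download, and the names simple_returns, index_adj and stock_adj, the starting prices and the assumed slope of 1.2 are all hypothetical. In the real script the adjusted close prices would come from get.hist.quote().

```r
# Sketch of steps 3-6 using simulated adjusted close prices (assumption:
# these stand in for the series returned by get.hist.quote()).
set.seed(42)
n_days <- 250

# Simulated daily returns; the stock is built with an assumed slope of 1.2.
sim_index <- rnorm(n_days, mean = 0.0005, sd = 0.01)
sim_stock <- 0.0002 + 1.2 * sim_index + rnorm(n_days, sd = 0.005)

# Turn the simulated returns into adjusted close price series.
index_adj <- 8000 * cumprod(1 + sim_index)
stock_adj <- 500 * cumprod(1 + sim_stock)

# Steps 3 and 5: returns from differences of adjusted close prices.
simple_returns <- function(p) diff(p) / head(p, -1)
index_ret <- simple_returns(index_adj)
stock_ret <- simple_returns(stock_adj)

# Step 6: regress stock returns on index returns and plot the fitted line.
fit <- lm(stock_ret ~ index_ret)
print(coef(fit))

plot(index_ret, stock_ret, xlab = "Nifty return", ylab = "Stock return")
abline(fit, col = "red")
```

The slope of the fitted line estimates how strongly the stock moves with the index. With real data the same lm() call would be repeated for each company in the nifty50 file.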
Results:

Plot:

Question 2:
See file: Q2.R
1. Define the number of overs, the number of trials and other relevant
variables.
2. Calculate the runs scored in each trial based on the probabilities of
the outcomes given in the question.
3. The above has to be calculated for two scenarios: a match played for
1 over and a match played for 2 overs.
4. Based on the runs scored, the winner can be determined at the end of
each round.

5. If the batsman is out, i.e. the random draw falls within the probability
range of that outcome, the match proceeds to the next trial.
6. After the 100000 trials, the winner in each of the scenarios
is calculated and shown below.
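The simulation steps above can be sketched as below. This is not the code in Q2.R. The outcome probabilities and the win rule are not stated in this report, so both are assumptions here: the per-ball probabilities in probs are hypothetical, and the batsman is assumed to win if the runs reach a hypothetical target of 8 runs per over.

```r
set.seed(1)
n_trials <- 100000
balls_per_over <- 6

# Hypothetical per-ball outcomes and probabilities (NOT from the question).
outcomes <- c(0, 1, 2, 4, 6, NA)            # NA marks "out"
probs    <- c(0.30, 0.30, 0.15, 0.12, 0.08, 0.05)

# Runs scored in one trial; the trial stops at the first "out".
simulate_innings <- function(n_overs) {
  balls <- sample(outcomes, n_overs * balls_per_over,
                  replace = TRUE, prob = probs)
  out_at <- which(is.na(balls))
  if (length(out_at) > 0) {
    balls <- balls[seq_len(out_at[1] - 1)]
  }
  sum(balls)
}

# Assumed win rule: the batsman wins if runs reach target_per_over * n_overs.
target_per_over <- 8
win_rate <- function(n_overs) {
  runs <- replicate(n_trials, simulate_innings(n_overs))
  mean(runs >= target_per_over * n_overs)
}

p_one_over  <- win_rate(1)
p_two_overs <- win_rate(2)
cat("Batsman win rate, 1 over: ", p_one_over, "\n")
cat("Batsman win rate, 2 overs:", p_two_overs, "\n")
```

Replacing outcomes, probs and the win rule with the values given in the question would reproduce the two scenarios described in the steps.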
Results:

Question 3:
See file: Q3.R
Code:
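# Set the working directory to the folder containing carseats.csv
setwd("C:\\Users\\shyam\\Downloads\\1501089")

# Load the data and inspect summary statistics
carseats_data <- read.csv("carseats.csv")
summary(carseats_data)

# Scatter plots of Sales against candidate predictors
plot(carseats_data$Price, carseats_data$Sales)
plot(carseats_data$Income, carseats_data$Sales)
plot(carseats_data$Advertising, carseats_data$Sales)
plot(carseats_data$Age, carseats_data$Sales)

# Fit a linear regression of Sales on Price, Urban and US
Regression <- lm(Sales ~ Price + Urban + US, data = carseats_data)
summary(Regression)

# Diagnostic plots for the fitted model
plot(Regression)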
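(ii) The output from the regression model can be written in
equation form as
Sales = 13.04 - 0.05 * Price - 0.02 * UrbanYes + 1.200 * USYes
Model Summary
For every 1 dollar decrease in price, predicted Sales increases by about
0.05 units, holding Urban and US fixed (about 50 car seats if Sales is
recorded in thousands of units, as in the standard Carseats data set).
The other coefficients are read the same way: with the other variables
held fixed, UrbanYes lowers predicted Sales by 0.02 and USYes raises it
by 1.200.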
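The interpretation of the Price coefficient can be checked with a short calculation. This sketch hard-codes the rounded coefficients from the equation, taking the Price and UrbanYes coefficients as negative (consistent with the inverse relation between Sales and Price noted in this answer). The function name predict_sales and the example price of 120 are hypothetical.

```r
# Rounded coefficients from the fitted equation; Urban and US are coded
# 1 for "Yes" and 0 for "No".
predict_sales <- function(price, urban, us) {
  13.04 - 0.05 * price - 0.02 * urban + 1.200 * us
}

# Same store (urban, in the US) at two prices one dollar apart.
base    <- predict_sales(price = 120, urban = 1, us = 1)
cheaper <- predict_sales(price = 119, urban = 1, us = 1)

# Change in predicted Sales for a 1 dollar price decrease (about 0.05).
cat("Change in predicted Sales:", round(cheaper - base, 4), "\n")
```

The difference does not depend on the starting price or on the Urban and US values, because the model is linear with no interaction terms.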

No significant conclusion from this graph

The above figure shows that Sales and Price are inversely related.
Question 4:
A/B testing, also called split testing or bucket testing, is a tool used for
comparing two different versions of a webpage or app (say A and B) against
each other. The two variants A and B are called the control and the variation, and
the experiment conducted is a controlled experiment.

Here the two variants (A and B) are shown to users at random, and
statistical analysis determines which of the two performs better on the
parameters being tested.
For example: the different versions of a webpage (A and B) are shown
randomly to different users and their engagement with each experience is
measured through a statistical engine. Based on relevant parameters such
as conversion rate, bounce rate etc., one can determine whether changing
the experience, i.e. from the control to the variation, had a positive, negative
or neutral effect.
Similarity to hypothesis testing: A/B testing is a form of two-sample
hypothesis testing. The current webpage (control) is associated with the
null hypothesis. A/B testing compares the control and the
variation (optimized webpage) and tests whether the change in
engagement in the variation could be due to random chance; it
also quantifies the confidence in that conclusion.
Null hypothesis: The difference in conversion rate/bounce rate
between webpage A and webpage B is caused by chance.
Alternate hypothesis: The difference in conversion rate/bounce rate
between webpage A and webpage B is caused by their differences
(design/ads etc.).
If the statistical analysis shows that the probability of the observed
difference between B and A on the given parameter occurring due to chance
is <5%, then we reject the null hypothesis.
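The decision rule above can be illustrated with a two-sample test of proportions. This is a minimal sketch using prop.test() from base R. The visitor and conversion counts are hypothetical and do not come from any real experiment.

```r
# Hypothetical counts for control (A) and variation (B).
visitors    <- c(A = 5000, B = 5000)
conversions <- c(A = 200,  B = 250)

# Two-sample test: null hypothesis is that both pages have the same
# conversion rate, i.e. the observed difference is due to chance.
test <- prop.test(x = conversions, n = visitors)

alpha <- 0.05
reject_null <- test$p.value < alpha

cat("Conversion rates:", conversions / visitors, "\n")
cat("p-value:", format(test$p.value, digits = 3), "\n")
cat("Reject null hypothesis:", reject_null, "\n")
```

A p-value below 0.05 corresponds to the <5% threshold in the text. A larger p-value would mean the difference between A and B cannot be distinguished from chance at that level.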
Question 5:
See file: Q5.R
Summary of data set:

Univariate Analysis: Boxplots and histograms of different variables


Bivariate Analysis: Duration vs Credit_amount

Base Case Model Classification Matrix (Accuracy: 70%)

Summary of Model:

Model Validation (Accuracy: 71.67%)
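For reference, the accuracy figure reported with a classification matrix is the share of cases on the diagonal. The sketch below shows that calculation only. The counts and the class labels "good" and "bad" are hypothetical and are not the matrices produced by Q5.R.

```r
# Hypothetical classification matrix: rows are actual classes,
# columns are predicted classes.
conf_matrix <- matrix(c(190, 20,
                         55, 35),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(actual    = c("good", "bad"),
                                      predicted = c("good", "bad")))

# Accuracy = correctly classified cases / all cases.
accuracy <- sum(diag(conf_matrix)) / sum(conf_matrix)

# Share of the largest actual class, a common reference point for accuracy.
majority_share <- max(rowSums(conf_matrix)) / sum(conf_matrix)

print(conf_matrix)
cat("Accuracy:", accuracy, "\n")
cat("Majority-class share:", majority_share, "\n")
```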

Question 6:
Problem: I worked as a consultant for Cognizant Business Consulting (CBC),
where I had to identify the trends in the technology sector
affecting the CIO across five different verticals, analyse CBC's offerings,
come up with a gap analysis of the features provided by CBC and recommend the
prospective changes that can be applied.

Analysis: This could be done by collating data about the technological factors
affecting the CIO, their importance, the maturity score across different
parameters, the cost overlay and the revenue generation capabilities of
the said trends. We could then group the trends (possibly using
factor analysis) and determine the correlation of the trends
(importance and maturity) with a suitable cost and revenue structure, and
thus determine which projects/trends the CIO should focus upon
(regression). Also, trends affecting all five verticals can be determined
along with their respective importance, thus giving CBC the areas in which it
needs to develop products and consulting services.
Data Needed: Survey of the importance given to different technologies,
maturity scores (based on the average score across a number of parameters for
some companies belonging to each vertical), possible cost overlay and
total cost of ownership (from reports) and estimated revenue structure.
