CHRIST (Deemed to be University), Bangalore – 560 029

Department of Statistics and Data Science


END SEMESTER EXAMINATION – MAY 2023
PG II Semester

Programme Name: MSc Data Science Max. Marks: 100


Course Name: Regression Analysis Time: 3 Hrs
Course Code: MDS232
General Instructions

● All rough work should be done in the answer script. Do not write or scribble in the question
paper except your register number.
● Verify the Course code / Course title & number of pages of questions in the question paper.
● Make sure your mobile phone is switched off and placed at the designated place in the hall.
● Malpractices will be viewed very seriously.
● Answers should be written on both sides of the paper in the answer booklet. No sheets should
be detached from the answer booklet.
● Answers without the question numbers clearly indicated will not be valued. No page should
be left blank in the middle of the answer booklet.

Course Outcomes (COs): The students will be able to


CO1: Explain the concept of regression models.
CO2: Assess regression models using various criteria.
CO3: Analyse the robustness of the regression models.
CO4: Build different types of regression models based on real-life problems of
global or national importance.

SECTION A

Answer all the questions 5 X 20 = 100 marks

Q. No Questions CO RBT

1 (CO: 1, RBT: L3)
a) i. What is the role of the error term in a simple linear regression model,
and how does it affect the estimation of the slope and intercept? Explain the
corresponding estimation procedure. (4+6 marks)
ii. A researcher is interested in the effect of temperature on the growth rate
of a certain type of bacteria. The researcher wants to test whether the
intercept and slope of the actual regression line of growth rate on
temperature are 1.2 and 0.5, respectively. Develop the corresponding testing
procedures at the significance level. (10 marks)
(OR)
b) i. What are the assumptions used in the simple linear regression model?
Show that the estimators of intercept and slope are unbiased in a simple
linear regression model. (3+7 marks)

ii. A company is interested in the relationship between the number of years
of work experience and the salary of its employees. Construct
confidence intervals for the slope and intercept of the actual
regression line of salary on work experience. (10 marks)

2 (CO: 1, RBT: L2)
a) i. Show that the fitted Y and the residuals are linear transformations of the observed
Y in the case of multiple linear regression. (6 marks)
ii. Discuss the properties of […] in the case of the multiple linear regression
model […]. (14 marks)
(OR)
b) i. Construct a confidence interval for the future Y value
based on a multiple linear regression model for the given X […]. (10 marks)

ii. Stating the required assumptions, obtain the test procedure for testing the
significance of the multiple linear regression model. (10 marks)

3 (CO: 2, RBT: L3)
a) i. Define intrinsic linear relationship in the linear regression model.
Examine different intrinsic linear relationships and the corresponding
remedies. (12 marks)
ii. Consider a regression problem with several independent variables.
Illustrate adding significant variables using the stepwise procedure. (8 marks)
(OR)
b) i. Consider a linear regression problem where the normality and
homoscedasticity assumptions are violated. As a data scientist, suggest a
remedy to correct this. Explain the corresponding procedure. (8 marks)

ii. Distinguish between forward selection and backward elimination
procedures. (12 marks)

4 (CO: 3, RBT: L5)
a) i. Differentiate between studentised residuals and PRESS residuals. (4 marks)
ii. Illustrate the effect of multicollinearity in a linear regression model.
Discuss the diagnostic procedures for multicollinearity. (16 marks)
(OR)
b) i. Explain the problem of heteroskedasticity. Formulate a test procedure
for testing heteroskedasticity. (12 marks)

ii. Discuss the different shapes of the residuals against fitted values plot
and their implications. (8 marks)

5 (CO: 4, RBT: L4)
a) i. Examine the components of a generalised linear model. (8 marks)
ii. Discuss the logistic regression model. (12 marks)
(OR)
b) Explain a regression model which is used to model count data. Discuss
the corresponding test procedure to check the goodness of fit of the model.
(20 marks)

Revised Bloom’s Taxonomy (RBT) Levels:

L1 – Remembering L2 – Understanding L3 – Applying

L4 – Analyzing L5 – Evaluating L6 – Creating
