Interview Questions On Data Analytics


Q1. You used the normal equation method to calculate the parameters of the predictive model.
What are the shortcomings of the normal equation method?


Ans:
1. Computing (X’X) and then inverting it is expensive: forming X’X costs O(Np²) and
inverting it costs O(p³), where N is the number of rows/observations and p is the number
of columns/features in the X matrix.
2. (X’X) might be non-invertible, which violates the assumption behind the method. In such
cases, linearly dependent (redundant) features can be removed, or a regularization method
such as lasso can be used to reduce the number of features.
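
The two shortcomings above can be illustrated with a minimal sketch, assuming NumPy is available. The function names and the toy data are hypothetical. Note that the optional penalty here is a ridge (L2) term, not lasso: ridge is swapped in only because it keeps a closed-form solution, whereas lasso needs an iterative solver.

```python
import numpy as np

def fit_normal_equation(X, y, ridge=0.0):
    """Solve (X'X + ridge * I) b = X'y for b.

    ridge=0.0 gives the plain normal equation. A small positive ridge
    makes the system solvable when X'X is singular. For simplicity this
    sketch penalizes every coefficient, including the intercept.
    """
    XtX = X.T @ X                      # p x p, costs O(N p^2) to form
    Xty = X.T @ y
    p = XtX.shape[0]
    # Solving the linear system is O(p^3) and avoids an explicit inverse.
    return np.linalg.solve(XtX + ridge * np.eye(p), Xty)

def fit_least_squares(X, y):
    """SVD-based least squares; also handles rank-deficient X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

# Toy data (hypothetical): y = 1 + 2x exactly.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])
b_full_rank = fit_normal_equation(X, y)

# Duplicating a column makes the features linearly dependent,
# so X'X loses rank and cannot be inverted.
X_dup = np.column_stack([X, X[:, 1]])
rank_dup = np.linalg.matrix_rank(X_dup.T @ X_dup)

b_ridge = fit_normal_equation(X_dup, y, ridge=1e-6)
b_lstsq = fit_least_squares(X_dup, y)
```

With the redundant column, the individual coefficients are no longer unique, but both `b_ridge` and `b_lstsq` should still reproduce `y` closely; dropping the duplicated feature is the simpler remedy.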

Q2. Why do you fit a second-order model with interaction terms? What is the significance
of each of these terms in this equation? How do you choose which parameter to retain &
which to neglect?
Ans:

Q3. The range mentioned in the question was 200-250 for temperature and 15-25 for
concentration, but you used data points outside this range in your calculation.

Team ID - OP074

Q1. You assumed that 250 K is the temperature at which the maximum rate is achieved, and
then varied the concentration to obtain a predictive model. This is wrong, because you
considered the temperature only at discrete points, not over its whole range. That is why
the optimal temperature deviates so much.
There might be an interaction between temperature and concentration, which is included
in the final model.
No 2D or 3D plots were made to show the overall variation with both temperature and
concentration.

Team ID - OP070

Q1. You provided no reason for the dataset you used to fit the model. What is the reason behind
selecting these points?
Q2. You used a derivative approach to find the maximum and the corresponding input
parameters. What is the condition for a maximum of a two-variable function?
Let f be a function of two variables with continuous second-order partial derivatives f_xx, f_yy
and f_xy at a critical point (a,b). Let
D = f_xx(a,b) f_yy(a,b) - [f_xy(a,b)]^2
a) If D > 0 and f_xx(a,b) > 0, then f has a relative minimum at (a,b).
b) If D > 0 and f_xx(a,b) < 0, then f has a relative maximum at (a,b).
c) If D < 0, then f has a saddle point at (a,b).
d) If D = 0, then no conclusion can be drawn.
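
As a hedged illustration of the test above, the sketch below applies it to a second-order model of the form f(x,y) = b0 + b1 x + b2 y + b11 x^2 + b22 y^2 + b12 xy, for which f_xx = 2 b11, f_yy = 2 b22 and f_xy = b12 everywhere. The function name and the coefficient values are hypothetical, not taken from any team's model.

```python
def classify_quadratic(b1, b2, b11, b22, b12):
    """Locate and classify the critical point of
    f(x, y) = b0 + b1*x + b2*y + b11*x**2 + b22*y**2 + b12*x*y.

    Returns ((x, y), label). b0 does not affect the result.
    """
    fxx, fyy, fxy = 2 * b11, 2 * b22, b12
    D = fxx * fyy - fxy ** 2
    if D == 0:
        # The gradient equations have no unique solution in this case.
        return None, "inconclusive"
    # Setting the gradient to zero gives the linear system
    #   fxx*x + fxy*y = -b1
    #   fxy*x + fyy*y = -b2
    # which is solved here with Cramer's rule.
    x = (-b1 * fyy + b2 * fxy) / D
    y = (-b2 * fxx + b1 * fxy) / D
    if D < 0:
        label = "saddle point"
    elif fxx > 0:
        label = "relative minimum"
    else:
        label = "relative maximum"
    return (x, y), label

# Hypothetical coefficients: f = 10 + 4x + 6y - x^2 - y^2
point, kind = classify_quadratic(b1=4, b2=6, b11=-1, b22=-1, b12=0)
```

For a fitted model, the critical point found this way is only useful if it also lies inside the experimental range of the inputs.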

The team did not mention the Hessian matrix (the Jacobian of the gradient), whose determinant is D.

Q3. What is the significance of the ‘xy’ term in your final predictive model?
Q4. What is ‘SSe’ and how is it calculated?
SSe = y’y - b’X’y
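
A small numerical check of this formula, as a sketch under stated assumptions: a straight-line model y = b0 + b1 x fitted by least squares to made-up data, with hypothetical function names. It compares y’y - b’X’y with the directly computed sum of squared residuals.

```python
def fit_line(x, y):
    """Least-squares fit of y = b0 + b1*x via the 2x2 normal equations."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx            # determinant of X'X
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1

def sse_matrix_form(x, y, b0, b1):
    """SSe = y'y - b'X'y, where X'y = [sum(y), sum(x*y)]."""
    yty = sum(yi * yi for yi in y)
    btXty = b0 * sum(y) + b1 * sum(xi * yi for xi, yi in zip(x, y))
    return yty - btXty

def sse_residual_form(x, y, b0, b1):
    """SSe as the sum of squared residuals (y - y_hat)^2."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
b0, b1 = fit_line(x, y)
sse_a = sse_matrix_form(x, y, b0, b1)
sse_b = sse_residual_form(x, y, b0, b1)
```

The two forms agree only when b is the least-squares estimate; for any other b the shortcut y’y - b’X’y no longer equals the residual sum of squares.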

Team ID - OP065
Q1. You used a derivative approach to find the maximum and the corresponding input
parameters. What is the condition for a maximum of a two-variable function?

Let f be a function of two variables with continuous second-order partial derivatives f_xx, f_yy
and f_xy at a critical point (a,b). Let
D = f_xx(a,b) f_yy(a,b) - [f_xy(a,b)]^2
a) If D > 0 and f_xx(a,b) > 0, then f has a relative minimum at (a,b).
b) If D > 0 and f_xx(a,b) < 0, then f has a relative maximum at (a,b).
c) If D < 0, then f has a saddle point at (a,b).
d) If D = 0, then no conclusion can be drawn.

The team did not mention the Hessian matrix (the Jacobian of the gradient), whose determinant is D.

Q2. General questions were asked about statistical terms such as SSR, SSe, adjusted R^2, and
parameter significance.
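
For reference, a minimal sketch of how these quantities relate, assuming SSR denotes the regression sum of squares (some texts use SSR for the residual sum of squares instead) and that the model contains an intercept. The function name and the numbers are hypothetical.

```python
def regression_summary(y, y_hat, n_params):
    """Return SST, SSe, SSR, R^2 and adjusted R^2.

    n_params counts every fitted parameter, including the intercept.
    SSR = SST - SSe holds for a least-squares fit with an intercept.
    """
    n = len(y)
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error
    ssr = sst - sse                                        # regression
    r2 = 1.0 - sse / sst
    # Adjusted R^2 compares mean squares, so extra parameters are penalized.
    r2_adj = 1.0 - (sse / (n - n_params)) / (sst / (n - 1))
    return {"SST": sst, "SSe": sse, "SSR": ssr, "R2": r2, "R2_adj": r2_adj}

# Made-up observations and fitted values from a line y_hat = 1.9 * x.
y = [2.0, 4.0, 5.0, 8.0]
y_hat = [1.9, 3.8, 5.7, 7.6]
summary = regression_summary(y, y_hat, n_params=2)
```

Parameter significance is usually judged separately, with a t-test on each coefficient, which this sketch does not cover.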

Some general questions asked by the professor


1. The optimum point obtained is not among the points selected to fit the model. The
model works on the training data; what about unseen data points?
2. Is the predictive model linear or non-linear in the parameters?
3. How many parameters are there in the predictive model?
4. Why is the predictive model quadratic and not cubic?
