
Group 11: Restaurant rating

Trương Hoàng Huy - K224131625


Phạm Thị Thu Huyền - K224131626
Nguyễn Đặng Anh Quốc - K224131639
Phạm Mai Minh Huy - K224131624

Midterm project

1. Determine dependent and independent variables.


The data set is a ranking table of restaurants. Each restaurant's score is modeled as a function of
its price and its type of cuisine: Italian, or seafood and steakhouse.
$y$ = score (dependent variable)
$x_1$ = type (independent variable)
$x_2$ = price (independent variable)

2. Check for outliers, influential points


Outliers (plots not reproduced): Price, Score, Type.

Histograms (plots not reproduced): Price, Score.

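
A numeric screen can complement the plots. The sketch below assumes the common 1.5 × IQR rule, which the report does not state was the rule used. The function name and the sample prices are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical prices, not the project's data
sample_prices = [10, 12, 11, 13, 12, 14, 11, 100]
print(iqr_outliers(sample_prices))
```

A flagged value is only a candidate outlier. Whether it is also an influential point depends on how much the fitted equation changes when the value is removed.
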
3. Develop an estimated regression equation

$\hat{y} = 57754.35\,x_1 + 98527.53\,x_2 + 55249.67$
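
As a minimal sketch, the estimated equation can be evaluated directly. The coefficients are copied from the equation above; the function name is hypothetical, and x1 is assumed to be a numeric code for the restaurant type.

```python
# Coefficients copied from the estimated regression equation
INTERCEPT = 55249.67
B_TYPE = 57754.35    # coefficient of x1 (type)
B_PRICE = 98527.53   # coefficient of x2 (price)

def predict_score(x1, x2):
    """Point estimate of y for a restaurant with type code x1 and price x2."""
    return INTERCEPT + B_TYPE * x1 + B_PRICE * x2

print(round(predict_score(1, 5), 2))
```

With these rounded coefficients the estimate at x1 = 1, x2 = 5 is about 605641.67. Section 7 uses 605641.63; the small gap presumably comes from unrounded coefficients.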

4. Did the estimated regression equation provide a good fit to the data? Explain
→ The R-squared ($R^2$) of the new model is 0.5434, so the model explains about 54.34% of the
variation in the score. This is below the 0.7 (70%) threshold. Hence, the estimated regression
equation does not provide a good fit to the actual data.
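
For reference, R-squared is 1 − SSE/SST. The sketch below shows that calculation on hypothetical observed and fitted values, not on the project's data; the function name is an assumption.

```python
def r_squared(observed, fitted):
    """R^2 = 1 - SSE/SST for paired observed and fitted values."""
    mean_y = sum(observed) / len(observed)
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))  # unexplained variation
    sst = sum((y - mean_y) ** 2 for y in observed)             # total variation
    return 1 - sse / sst

# Hypothetical values, not the project's data
y_obs = [3.0, 5.0, 7.0, 9.0]
y_fit = [3.5, 4.5, 7.5, 8.5]
print(r_squared(y_obs, y_fit))
```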

5. At the .05 level of significance, test for a significant relationship.

- Test for overall significance: F-test (p-value = 0.0009 < 0.05, the significance level). → We reject
the null hypothesis, which means that there is a significant overall relationship between the score
and the independent variables.
- Test for individual significance: t-test
● x1: p-value = 0.014 < 0.05
● x2: p-value = 0.001 < 0.05
→ All p-values of independent variables are lower than the significance level (0.05),
which means that all independent variables are individually significant.
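
The overall F statistic can be recovered from R-squared. The sketch below assumes p = 2 predictors and 18 residual degrees of freedom. The report does not state n; 18 is inferred from the critical value 2.101 used in section 7. With two numerator degrees of freedom the F p-value has a simple closed form, so no statistics library is needed. Function names are hypothetical.

```python
def f_from_r2(r2, p, df_resid):
    """Overall F statistic: (R^2 / p) / ((1 - R^2) / df_resid)."""
    return (r2 / p) / ((1 - r2) / df_resid)

def f_pvalue_two_predictors(f_stat, df_resid):
    """Upper-tail p-value of F(2, df_resid); valid only for 2 numerator df."""
    return (1 + 2 * f_stat / df_resid) ** (-df_resid / 2)

R2 = 0.5434      # from section 4
P = 2            # x1 and x2
DF_RESID = 18    # assumption: inferred from t = 2.101, not stated in the report

f_stat = f_from_r2(R2, P, DF_RESID)
p_value = f_pvalue_two_predictors(f_stat, DF_RESID)
print(round(f_stat, 2), round(p_value, 4))
```

Under these assumptions F is about 10.71 with a p-value of about 0.0009, in line with the reported value.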

6. Perform residual analysis to check for 4 assumptions

- Assumption 1: The error ε is a random variable with a mean of zero.

The p-value is 0.9825, which is greater than the significance level (0.05). Therefore, we do not reject
the null hypothesis that the residuals have a mean of zero.
→ Assumption 1 satisfied.

- Assumption 2: The variance of ε (residuals) is the same for all values of the
independent variable.

The p-value is 0.8692, which is greater than the significance level (0.05). Therefore, we
do not reject the null hypothesis that the variance of the standardized residuals is constant.
→ Assumption 2 satisfied.
- Assumption 3: The error ε is a normally distributed random variable.

The p-values in both tables are greater than the significance level (0.05). Therefore, we do not
reject the null hypothesis that the standardized residuals are normally distributed.
→ Assumption 3 satisfied.

- Assumption 4: The values of ε are independent.

→ The scatter plot of the standardized residuals shows no trend, so we conclude that
the residuals are independent.
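
A numeric check that is sometimes paired with the residual plot is the Durbin-Watson statistic. The report does not use it; it is shown here only as a hedged sketch. It is meaningful only when the observations have a natural order, and values near 2 suggest no first-order autocorrelation. The residuals below are hypothetical.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residuals in observation order, not the project's data
sample_residuals = [0.5, -0.3, 0.2, -0.6, 0.4, -0.1, 0.3, -0.4]
print(round(durbin_watson(sample_residuals), 2))
```
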
7. Give examples of confidence intervals and prediction intervals

Example:
The 95% confidence interval and the 95% prediction interval for a restaurant with
x1 = 1 and x2 = 5:
+ 95% confidence interval: $\hat{y} \pm t_{\alpha/2,\,n-p-1} \times \text{std error}$

$605641.63 \pm t_{0.025,\,n-3} \times 23012.18$

$= 605641.63 \pm (2.101 \times 23012.18) = (557293.0398;\ 653990.2202)$


+ 95% prediction interval: $\hat{y} \pm t_{\alpha/2,\,n-p-1} \times \text{std forecast}$

$605641.63 \pm t_{0.025,\,n-3} \times 53916.36$

$= 605641.63 \pm (2.101 \times 53916.36) = (492363.3576;\ 718919.9024)$
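
The interval arithmetic can be checked with a short sketch. The point estimate, the critical value 2.101 and both standard errors are taken from the example; the helper name is hypothetical.

```python
def t_interval(center, t_crit, std_err):
    """Return (lower, upper) = center -/+ t_crit * std_err."""
    margin = t_crit * std_err
    return center - margin, center + margin

Y_HAT = 605641.63   # point estimate at x1 = 1, x2 = 5
T_CRIT = 2.101      # t critical value used in the example

conf_int = t_interval(Y_HAT, T_CRIT, 23012.18)   # standard error of the mean response
pred_int = t_interval(Y_HAT, T_CRIT, 53916.36)   # standard error of the forecast
print([round(v, 4) for v in conf_int])
print([round(v, 4) for v in pred_int])
```

The prediction interval is wider because it covers the score of a single restaurant, not the mean score of all restaurants with the same x1 and x2.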
