
PRACTICAL – 03 CORRELATION AND REGRESSION

PROBLEM STATEMENT:
Beer is the most widely consumed alcoholic drink in the world and the third most popular beverage
overall, after tea and water. Besides whisky and wine, India also imports beer to meet domestic
demand. According to recent market research reports, beer is the third-largest import among all
alcoholic drinks in India. Beer has become one of the most popular alcoholic beverages in the country
only over the past two decades. At present, more than 60 beer brands are available in the Indian
alcoholic beverage market.

Most of the beer market's growth is driven by young consumers and by consumers who consider beer
a trendy drink compared with traditional spirits. There is also significant demand for foreign beer
in urban markets. The highest levels of beer consumption in India are observed in the southern
states. The healthy growth rate of the beer industry indicates the large potential open to breweries,
beer brand marketers, and manufacturers in India. Most of the major distilleries and breweries around
the world have now established a base in India, either as a manufacturing unit or through
distributors and joint ventures.

In all, 127 samples of beer were tested, and the alcohol and calorie content of each sample was
reported. The data for this problem are given in Table 1.

OBJECTIVES:
To estimate the correlation between the calorie and alcohol content of the imported beers in India
and to fit a regression model relating the two.

DATA COLLECTION:
The data are secondary data. In all, 127 beers were tested, and their alcohol and calorie content
was reported. Samples were collected at wholesale beer distributors in Mumbai by inspectors of the
Tax Division of the Department of Revenue Services during April and May 1987. Alcohol content was
analysed by AOAC methods using gas chromatography.

Table 1: Alcohol and calorie content of the imported beers in India.


Brand/Brewer % Alcohol Calories/100ml
Anchor Porter Anchor 5.66 59
Anchor Steam Beer Anchor 4.63 43
Asahi Draft Beer Asahi 5.21 41
Ballantine Private Stock Malt Liquor Narragansett 6.01 47
Ballantine India Pale Ale Falstaff 6.17 53

Ballantine Premium Lager Beer Falstaff 4.82 43
Ballantine XXX Ale Falstaff 5.08 46
Bass & Co's Pale Ale Bass 5.51 45
Beamish Irish Cream Stout Beamish - Crawford 4.95 41
Beck's Beer Brauerei Beck 5.13 42
Big Barrel Australian Lager Cooper & Sons 4.66 39
Black Horse Premium Draft Beer Black Horse 4.74 45
Blatz Beer G. Heileman 4.86 43
Blatz Milwaukee 1851 Beer Blatz 4.48 38
Boulder Porter Boulder 6.07 53
Budweiser King of Beers Anheuser Busch 4.82 40
Busch Beer Anheuser Busch 5.19 43
Carling Black Label Canadian Style Beer G. Heileman 4.38 39
Cerveza Carta Blanca Cerveceria Cauhtemoc 4.02 36
Cerveza Tecate Beer Cerveceria Cauhtemoc 4.49 41
Chester Golden Ale Greenall Whitley 5.43 44
Colt 45 Malt Liquor G. Heileman 6.11 49
Coors Banquet Beer Adolph Coors 5.03 41
Corona Extra Beer Cerveceria Modelo SA 4.84 45
Dos Equis XX Imported Beer Cauhtemoc 4.79 42
Dos Equis XX Special Lager Cerveceria Montezuma 4.96 44
Dragon Stout Desnoes & Geddes 6.79 62
Foster's Lager Carlton & United 5.25 42
Furstenberg German Beer Furstlich Furstenbergische 4.43 39
Genesee 12 Horse Ale Genesee 4.98 44
Genesee Beer Genesee 5.03 43
Genesee Cream Ale Genesee 4.7 42
George Killian's Irish Red Ale Adolph Coors 5.79 50
George Killian's Irish Red Brand Beer Adolph Coors 5.54 49
Great Wall Imported Chinese Beer Green Bamboo 4.63 45
Greenall's Cheshire English Pub Beer Greenall Whitley PLC 5 40
Grizzly Canadian Lager Hamilton 5.4 43
Grolsch Lager Beer Grolsch Bierbrouweri 5.17 44
Guinness Extra Stout Guinness 4.27 43
Haffenreffer Private Stock Malt Liquor Narragansett 6.87 50
Hamm's Beer Pabst 4.53 40

Harp Imported Lager Beer Harp 4.55 40
Heineken Lager Beer Heineken 5.41 45
Heineken Special Dark Beer Heineken 5.17 48
Hofenperle Special Feldschlosschen Bier 5.28 45
Kaiserdom Rauchbier-Smoked Bavarian Dark Beer 5.88 49
Kirin Beer Kirin 6.85 53
Knickerbocker Natural Beer Ruppert 4.16 38
Kronenbourg Beer Kronenbourg 5.11 43
Kronenbourg Imported Dark Beer Kronenbourg 5.08 46
Kuppers Kolsch Kuppers 5.38 45
LA Anheuser Busch Premium Pilsner Beer Anheuser Busch 2.29 26
Labatt's 50 Canadian Ale 5.34 43
Liberty Ale Anchor 6.12 53
Lord Chesterfield Ale D.G. Yuengling & Son 5.57 44
Lowenbrau Dark Special Beer Miller 5 45
Lowenbrau Special Beer Miller 5.12 45
McEwans Scotch Ale Scottish & Newcastle 9.5 83
Michelob Beer Anheuser Busch 4.99 45
Michelob Classic Dark Beer Anheuser Busch 4.76 45
Michelob Classic Dark Beer Anheuser Busch 4.93 45
Mickeys Fine Malt Liquor G. Heileman 5.7 45
Miller High Life Beer Miller 4.8 43
Miller High Life Genuine Draft Beer Miller 5.02 43
Molson Canadian Beer Molson 5.19 43
Molson Golden Beer Molson 6.04 48
Moosehead Canadian Lager Beer Moosehead 5.08 43
O'Keefe Canadian Beer O'Keefe 4.96 40
Olde English Brand 800 Malt Liquor Pabst 6.13 48
Old Milwaukee Beer Stroh 4.51 41
Olympia Premium Lager Beer Pabst 4.78 41
Pabst Blue Ribbon Beer Pabst 5.01 43
Piels Premium Draft Style Beer Stroh 4.23 39
Pilsener Urquell Beer Pilsener Urquell Pilzen 4.25 45
Red Stripe Lager Beer Desnoes & Geddes 5.04 43
Red White & Blue Special Lager Beer G. Heileman 5.15 43
Rheingold Premium Beer Rheingold 4.78 42
Rolling Rock Extra Pale Premium Beer Latrobe 4.64 40

Rolling Rock Premium Beer Latrobe 4.51 34
Samuel Adams Boston Lager Boston Beer 4.88 48
Schaefer Beer Stroh 4.66 40
Schlitz Beer Stroh 4.7 41
Schlitz Malt Liquor Stroh 6.29 52
Sheaf Stout Carlton & United 5.28 49
Sierra Nevada Pale Ale Sierra Nevada 4.82 45
Sierra Nevada Porter Sierra Nevada 5.34 48
Sierra Nevada Stout Sierra Nevada 5.1 56
Signature Stroh Beer Stroh 4.84 43
Sol Cerveza Especial Cerveceria Montezuma 4.13 37
Spaten Munich Special Dark Beer Spaten-Brau 6.63 52
St. Pauli Girl Beer St. Pauli 5 39
St. Pauli Girl Dark Beer St. Pauli 5.02 45
Stroh's Beer Stroh 4.68 42
Suntory Draft Beer Suntory 4.64 39
Superior Imported Beer Cerveceria Moctezuma 4.34 43
Thos Cooper & Sons Adelaide Lager Cooper & Sons 4.27 36
Thos Cooper & Sons Naturally Brewed Real Ale Cooper & Sons 6.77 45
Thos Cooper & Sons Naturally Brewed Stout Cooper & Sons 7.1 58
Tolly Original Premium Ale Tollemache & Cobbold 4.85 41
Tsingtao Beer Tsingtao 4.79 43
Tuborg Deluxe Dark Export Quality Beer G. Heileman 5.11 46
Tuborg Export Quality Beer G. Heileman 5.02 44
Tusker Malt Lager Bia Ni Bora 5.24 42
Utica Club Pilsener Lager Beer West End 4.82 27
Watney's Red Barrel Beer Stag 3.92 40
Wurzburger Hofbrau Pilsner Beer Wurtzburger Hofbrau AG 5.42 45
Yuengling Porter D.G. Yuengling & Son 4.13 40
Yuengling Premium Beer D.G. Yuengling & Son 4.65 39
Amstel Light Bier Amstel Brouwerij B.V. 3.96 28
Anheuser Busch Natural Light Beer Anheuser Busch 4.12 31
Bud Light Beer Anheuser Busch 3.88 33
Coors Light Beer Adolph Coors 4.36 30
Dribeck's Light Low Calorie Beer Brauerei Beck 3.39 28
Genesee Light Beer Genesee 3.55 27

Michelob Light Beer Anheuser Busch 4.52 39
Miller Lite Pilsner Beer Miller 4.4 29
Molson Light Beer Molson 2.41 23
Nordik Wolf Light Imported Beer A.B. Pripps Bryggerier 4.7 31
Old Milwaukee Premium Light Beer Stroh 3.82 32
Pabst Extra Light Low Alcohol Beer Pabst 2.5 19
Piels Naturally Light Beer Stroh 4.49 40
Rheingold Extra Light Beer Rheingold 4.32 27
Schaefer Light Lager Beer Stroh 4.07 34
Schlitz Light Pilsner Beer Stroh 4.28 31
Stroh Light Beer Stroh 4.45 35
Watney's London Light Beer Watney Combe Reid 3.56 29
Wurtzburger Hofbrau Pure Bavarian Light Beer Wurtzburger Hofbrau AG 5.44 43

SOURCE: http://www.theraven.com/beer.html. The data was collected in April and May, 1987.
THEORY:
Regression analysis involves identifying the relationship between a dependent variable and one or
more independent variables. A model of the relationship is hypothesized and estimates of the
parameter values are used to develop an estimated regression equation. Various tests are then
employed to determine if the model is satisfactory. If the model is deemed satisfactory, the estimated
regression equation can be used to predict the value of the dependent variable given values for the
independent variables.

Regression model:

In simple linear regression, the model used to describe the relationship between a single dependent
variable y and a single independent variable x is

y = a0 + a1 x + ε

where a0 and a1 are referred to as the model parameters, and ε is a probabilistic error term that
accounts for the variability in y that cannot be explained by the linear relationship with x. If the
error term were not present, the model would be deterministic; in that case, knowledge of the value
of x would be sufficient to determine the value of y.

Least squares method:

Either a simple or multiple regression model is initially posed as a hypothesis concerning the
relationship among the dependent and independent variables. The least squares method is the
most widely used procedure for developing estimates of the model parameters.
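
As an illustration of the least squares method, the sketch below computes the estimates of a0 and a1 from the closed-form formulas a1 = Sxy / Sxx and a0 = mean(y) - a1 * mean(x). It uses only the first five rows of Table 1, and it assumes calories as x and % alcohol as y; the function name least_squares is introduced here for illustration and is not part of the original analysis.

```python
# First five rows of Table 1 as (calories per 100 ml, % alcohol).
# Treating calories as x and alcohol as y is an assumption of this sketch.
calories = [59, 43, 41, 47, 53]
alcohol = [5.66, 4.63, 5.21, 6.01, 6.17]


def least_squares(x, y):
    """Return (a0, a1) minimising the sum of squared residuals of y = a0 + a1*x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    a1 = sxy / sxx               # slope estimate
    a0 = y_mean - a1 * x_mean    # intercept estimate
    return a0, a1


a0, a1 = least_squares(calories, alcohol)
# Residuals: observed y minus fitted y for each observation.
residuals = [yi - (a0 + a1 * xi) for xi, yi in zip(calories, alcohol)]
print(f"a0 = {a0:.4f}, a1 = {a1:.4f}")
```

With an intercept in the model, the least squares residuals sum to zero, which is a quick sanity check on any hand calculation.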

Assumptions of regression:

Before we conduct linear regression, we must first make sure that four assumptions are met:

1. Linear relationship: There exists a linear relationship between the independent variable, x, and
the dependent variable, y.

2. Independence: The residuals are independent. In particular, there is no correlation between
consecutive residuals in time series data.

3. Homoscedasticity: The residuals have constant variance at every level of x.

4. Normality: The residuals of the model are normally distributed.
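
One common way to check the independence assumption is the Durbin-Watson statistic, d = sum((e_t - e_(t-1))^2) / sum(e_t^2), where values near 2 suggest no first-order correlation between consecutive residuals. The sketch below is a minimal version; the residual values are hypothetical and are not taken from the beer data, and the check is only meaningful when the observations have a natural order.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of ordered residuals."""
    # Sum of squared differences between consecutive residuals.
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    # Sum of squared residuals.
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals, used only to show the call.
example_residuals = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0]
print(round(durbin_watson(example_residuals), 3))
```

The statistic ranges from 0 (strong positive autocorrelation) to 4 (strong negative autocorrelation).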

Correlation:

Correlation and regression analysis are related in the sense that both deal with relationships among
variables. The correlation coefficient is a measure of linear association between two variables. Values
of the correlation coefficient are always between -1 and +1.

• A correlation coefficient of +1 indicates that two variables are perfectly related in a positive
linear sense,
• A correlation coefficient of -1 indicates that two variables are perfectly related in a negative
linear sense,
• A correlation coefficient of 0 indicates that there is no linear relationship between the two
variables.
For simple linear regression, the sample correlation coefficient is the square root of the
coefficient of determination, with the sign of the correlation coefficient being the same as the sign
of a1, the coefficient of x in the estimated regression equation. The correlation coefficient
indicates how strongly two variables are associated, but it measures only the degree of linear
association between them. Any conclusions about a cause-and-effect relationship must be based on the
judgment of the analyst.
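
The relationship between the correlation coefficient and the coefficient of determination can be checked numerically. The sketch below computes the Pearson coefficient directly and again as the signed square root of R Square from a least squares fit. It uses the first five rows of Table 1 with calories as x and % alcohol as y (an assumption of this sketch); the helper names are illustrative.

```python
import math

# First five rows of Table 1: calories per 100 ml (x) and % alcohol (y).
calories = [59, 43, 41, 47, 53]
alcohol = [5.66, 4.63, 5.21, 6.01, 6.17]


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least squares estimates (a0, a1) for y = a0 + a1*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - a1 * mx, a1


def r_squared(x, y):
    """Coefficient of determination, 1 - SSE/SST, for the fitted line."""
    a0, a1 = fit_line(x, y)
    my = sum(y) / len(y)
    sse = sum((b - (a0 + a1 * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    return 1 - sse / sst


r = pearson_r(calories, alcohol)
a0, a1 = fit_line(calories, alcohol)
# Square root of R Square, carrying the sign of the slope a1.
r_from_fit = math.copysign(math.sqrt(r_squared(calories, alcohol)), a1)
print(r, r_from_fit)
```

The two printed values agree up to floating-point rounding, which is the identity stated above.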

Assumptions of correlation:

Before we calculate the Pearson correlation coefficient between two variables, we should make sure
that five assumptions are met:

1. Level of Measurement: The two variables should be measured at the interval or ratio level.
2. Linear Relationship: There should exist a linear relationship between the two variables.
3. Normality: Both variables should be roughly normally distributed.
4. Related Pairs: Each observation in the dataset should have a pair of values.
5. No Outliers: There should be no extreme outliers in the dataset.

JUSTIFICATION:
Through correlation analysis, we evaluate the correlation coefficient, which tells us how strongly
one variable changes when the other does. Correlation analysis thus measures the strength and
direction of the linear relationship between two variables.

Advantages of Correlation:

1. A correlation can demonstrate the presence or absence of a relationship between two factors, so it
is useful for indicating areas where experimental research could yield further results.

2. Correlation analysis also shows the direction of the correlation, positive or negative.

3. The coefficient of correlation is easy to calculate.

4. A correlation study can be conducted on variables that can be measured but not manipulated, for
example when an experimental method would be impractical or unethical to conduct.

Regression analysis is used to assess the relationship between an outcome variable and one or more
risk factors or confounding variables. It is especially useful when we have to identify the impact
of a unit change in the known variable (x) on the estimated variable (y).

Advantages of Regression:

1. Errors are easy to identify.
2. It supports increased operational efficiency.
3. It helps analysts forecast future opportunities and risks.
4. It helps in making better data-informed decisions.

DATA ANALYSIS:
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.875899686
R Square             0.767200259
Adjusted R Square    0.764176886
Standard Error       0.481599104
Observations         79

ANOVA
              df    SS          MS          F           Significance F
Regression    1     58.85567    58.85567    253.7564    4.39E-26
Residual      77    17.8592     0.231938
Total         78    76.71487

                Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept       0.868836616     0.254577          3.412859    0.001028    0.361908     1.375765
X Variable 1    0.097034069     0.006091          15.92973    4.39E-26    0.084905     0.109164
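
The figures in the summary output are tied together by a few identities: R Square = SS(Regression) / SS(Total), F = MS(Regression) / MS(Residual), Standard Error = sqrt(MS(Residual)), and t Stat = Coefficient / Standard Error. The sketch below recomputes them from the reported values as a consistency check. Reading X Variable 1 as calories per 100 ml and the response as % alcohol is an inference from the 10 to 90 range of the residual plot axis, not something the output states, and the predict helper is introduced here for illustration.

```python
import math

# Values copied from the summary output above.
ss_regression = 58.85567
ss_residual = 17.8592
df_regression = 1
df_residual = 77
intercept, intercept_se = 0.868836616, 0.254577
slope, slope_se = 0.097034069, 0.006091

ss_total = ss_regression + ss_residual          # Total SS
r_square = ss_regression / ss_total             # R Square
multiple_r = math.sqrt(r_square)                # Multiple R
ms_regression = ss_regression / df_regression   # MS for Regression
ms_residual = ss_residual / df_residual         # MS for Residual
f_stat = ms_regression / ms_residual            # F
standard_error = math.sqrt(ms_residual)         # Standard Error of the regression
t_intercept = intercept / intercept_se          # t Stat, Intercept
t_slope = slope / slope_se                      # t Stat, X Variable 1


def predict(x):
    """Fitted value for a given x (assumed here to be calories per 100 ml)."""
    return intercept + slope * x


print(f"R Square = {r_square:.6f}, Multiple R = {multiple_r:.6f}")
print(f"F = {f_stat:.4f}, Standard Error = {standard_error:.6f}")
print(f"t (intercept) = {t_intercept:.4f}, t (slope) = {t_slope:.4f}")
```

Small differences in the last digits of the t statistics are expected, because the standard errors in the output are rounded to six decimal places.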

[Figure: X Variable 1 Residual Plot. Vertical axis: Residuals; horizontal axis: X Variable 1.]

CORRELATION

            Column 1    Column 2
Column 1    1
Column 2    0.8759      1
