
Regression Analysis using STATA

Submitted by
Prashant Jain (547)
Master of Business Economics
Batch of 2013

Introduction

The objective of this work is to become familiar with the statistical package Stata, to learn
to use the PROWESS and Capital Line databases, and to develop a basic understanding of
regression.
Regression is the subject of this analysis. For each firm, data on Sales, profit after tax
(PAT) and earnings per share (EPS) have been collected.
Here, PAT is regressed on Sales.
We then interpret the estimated intercept $\alpha$, the slope $\beta$ and the coefficient of
determination $R^2$.
A comparative study then examines how the regression relationship between PAT and Sales
differs between two popular FMCG firms over the same years.

Regression Analysis of Dabur India Ltd.


Data has been taken for past 10 years.

DATA:

Year       PAT      Sales     EPS
2011    471.41    3264.37    2.52
2010    433.33    2855.96    4.65
2009    373.55    2396.16    4.02
2008    316.77    2083.4     3.41
2007    252.08    1600.43    2.72
2006    189.08    1342.79    3.05
2005    148.01    1226.23    4.83
2004    101.2     1082.58    3.28
2003     84.92    1158.93    2.86
2002     65.03    1102.58    2.23

Source: Capital Line database, www.capitaline.com

Graphical Interpretation:

For Dabur, the relationship between PAT and Sales is more or less linear.

The observed values in the scatter plot lie close to the fitted regression line.

[Figure: scatter plot of PAT (vertical axis, 100 to 500) against Sales (horizontal axis, 1000 to 3500) for Dabur, with fitted values]

Mathematical Interpretation using STATA

Output from Stata for the regression between Sales and PAT:

------------------------------------------------------------------------------
      name:  <unnamed>
       log:  C:\Users\Prashant Jain\Desktop\Ecotrix\DaburOutput.log
  log type:  text
 opened on:  1 Feb 2012, 15:56:57

. describe

Contains data from C:\Users\Prashant Jain\Desktop\Ecotrix\DaburProfit.dta
  obs:            10
 vars:                                        1 Feb 2012 15:56
 size:           160 (99.9% of memory free)
------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
PAT             float  %8.0g
Sales           float  %8.0g
EPS             float  %8.0g
------------------------------------------------------------------------------
Sorted by:

. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         PAT |        10     243.538    148.9451      65.03     471.41
       Sales |        10    1811.343     794.705    1082.58    3264.37
         EPS |        10       3.357    .8823964       2.23       4.83
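As a cross-check on the summary table, the same statistics can be recomputed outside Stata. The following is a minimal Python sketch, not part of the original Stata session. It assumes the Dabur figures exactly as listed in the data table and uses the sample standard deviation (divisor n - 1), which is what `summarize` reports. The helper name `describe` is an illustrative choice.

```python
import math

# Dabur data as listed in the data table (2011 down to 2002).
pat = [471.41, 433.33, 373.55, 316.77, 252.08, 189.08, 148.01, 101.2, 84.92, 65.03]
sales = [3264.37, 2855.96, 2396.16, 2083.4, 1600.43, 1342.79, 1226.23, 1082.58, 1158.93, 1102.58]
eps = [2.52, 4.65, 4.02, 3.41, 2.72, 3.05, 4.83, 3.28, 2.86, 2.23]


def describe(values):
    """Return (n, mean, sample standard deviation, min, max)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)  # divisor n - 1
    return n, mean, math.sqrt(variance), min(values), max(values)


for name, column in (("PAT", pat), ("Sales", sales), ("EPS", eps)):
    n, mean, sd, lowest, highest = describe(column)
    print(f"{name:>6} {n:4d} {mean:10.3f} {sd:10.4f} {lowest:9.2f} {highest:9.2f}")
```

The printed rows should agree with the Obs, Mean, Std. Dev., Min and Max columns of the Stata table up to rounding.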

. regress PAT Sales

      Source |       SS       df       MS              Number of obs =      10
-------------+------------------------------           F(  1,     8) =  153.97
       Model |  189800.193     1  189800.193           Prob > F      =  0.0000
    Residual |   9861.6202     8  1232.70252           R-squared     =  0.9506
-------------+------------------------------           Adj R-squared =  0.9444
       Total |  199661.813     9  22184.6459           Root MSE      =   35.11

------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       Sales |   .1827347   .0147266    12.41   0.000     .1487752    .2166943
       _cons |  -87.45728   28.89325    -3.03   0.016    -154.0852   -20.82933
------------------------------------------------------------------------------
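The quantities in the `regress PAT Sales` output follow from the textbook least-squares formulas for one regressor. The sketch below recomputes them in plain Python as an independent check. It is not the code Stata runs, and the function name `ols_simple` and the dictionary keys are illustrative. It assumes the Dabur figures exactly as listed in the data table.

```python
import math

# Dabur data as listed in the data table (2011 down to 2002).
sales = [3264.37, 2855.96, 2396.16, 2083.4, 1600.43, 1342.79, 1226.23, 1082.58, 1158.93, 1102.58]
pat = [471.41, 433.33, 373.55, 316.77, 252.08, 189.08, 148.01, 101.2, 84.92, 65.03]


def ols_simple(x, y):
    """Least-squares fit of y = alpha + beta * x, with basic fit statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)       # total sum of squares
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx                                # slope
    alpha = mean_y - beta * mean_x                  # intercept
    ss_model = beta * sxy                           # explained sum of squares
    ss_resid = syy - ss_model                       # residual sum of squares
    ms_resid = ss_resid / (n - 2)                   # residual mean square
    return {
        "alpha": alpha,
        "beta": beta,
        "r2": ss_model / syy,
        "f": ss_model / ms_resid,                   # F(1, n - 2)
        "se_beta": math.sqrt(ms_resid / sxx),
        "root_mse": math.sqrt(ms_resid),
    }


fit = ols_simple(sales, pat)
for key, value in fit.items():
    print(f"{key:>8} = {value:.6g}")
```

Small differences in the last digits are possible, since Stata stored the variables as `float` while Python works in double precision.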

. regress EPS PAT Sales

      Source |       SS       df       MS              Number of obs =      10
-------------+------------------------------           F(  2,     7) =    0.74
       Model |  1.22758031     2  .613790157           Prob > F      =  0.5096
    Residual |  5.78002978     7   .82571854           R-squared     =  0.1752
-------------+------------------------------           Adj R-squared = -0.0605
       Total |   7.0076101     9  .778623344           Root MSE      =  .90869

------------------------------------------------------------------------------
         EPS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         PAT |   .0100804   .0091504     1.10   0.307    -.0115569    .0317178
       Sales |  -.0016429    .001715    -0.96   0.370    -.0056982    .0024124
       _cons |   3.877828   1.095279     3.54   0.009     1.287905     6.46775
------------------------------------------------------------------------------
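The regression of EPS on both PAT and Sales has two regressors, so the coefficients come from a 2x2 system of normal equations on the mean-centred data. The Python sketch below solves that system in closed form and reports $R^2$ and adjusted $R^2$. It is an illustrative cross-check under the assumption that the data are exactly as in the Dabur table. The name `ols_two` is hypothetical and this is not Stata's internal algorithm.

```python
# Dabur data as listed in the data table (2011 down to 2002).
pat = [471.41, 433.33, 373.55, 316.77, 252.08, 189.08, 148.01, 101.2, 84.92, 65.03]
sales = [3264.37, 2855.96, 2396.16, 2083.4, 1600.43, 1342.79, 1226.23, 1082.58, 1158.93, 1102.58]
eps = [2.52, 4.65, 4.02, 3.41, 2.72, 3.05, 4.83, 3.28, 2.86, 2.23]


def ols_two(x1, x2, y):
    """Least-squares fit of y = b0 + b1 * x1 + b2 * x2 via centred normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    syy = sum(c * c for c in dy)
    det = s11 * s22 - s12 * s12                 # close to zero if x1 and x2 are collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    r2 = (b1 * s1y + b2 * s2y) / syy            # explained / total sum of squares
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 3)   # two regressors plus a constant
    return {"b0": b0, "b1": b1, "b2": b2, "r2": r2, "adj_r2": adj_r2}


fit_eps = ols_two(pat, sales, eps)
for key, value in fit_eps.items():
    print(f"{key:>7} = {value:.6g}")
```

A negative adjusted $R^2$, as reported by Stata here, means the two regressors explain less of the variation in EPS than the degrees-of-freedom penalty takes away.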

. twoway (line PAT Sales)

. graph save Graph "C:\Users\Prashant Jain\Desktop\Ecotrix\Graphdabur.gph"


(file C:\Users\Prashant Jain\Desktop\Ecotrix\Graphdabur.gph saved)

. twoway (line PAT Sales) (scatter PAT Sales)

. graph save Graph "C:\Users\Prashant Jain\Desktop\Ecotrix\Graphdab2.gph"


(file C:\Users\Prashant Jain\Desktop\Ecotrix\Graphdab2.gph saved)

Theoretical Interpretation
The regression analysis above was carried out in Stata to identify the relationship between
the PAT and the Sales of Dabur over the past 10 years.
From the data, and also by business logic, profit after tax is positively and almost linearly
related to the sales of a firm (in this case Dabur).
Let the profit after tax of Dabur be denoted by $P_D$
and the sales of Dabur by $S_D$.

Then, according to the linear regression model:

$P_D = \alpha + \beta S_D$

Values of the regression parameters
After processing the data through Stata, we have the following estimates:
Estimated value of $\alpha$ = -87.45728
Estimated value of $\beta$ = .1827347

Estimated value of $R^2$ = 0.9506

* Confidence intervals in the Stata output are reported at the 95% level.
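The 95% confidence intervals in the Stata output follow from each coefficient, its standard error and the critical value of Student's t with n - 2 = 8 degrees of freedom. The short Python sketch below reproduces them from the reported numbers. The critical value 2.306 is taken from a standard t table and hard-coded as an assumption, because the Python standard library has no t quantile function. The names `conf_interval` and `T_CRIT_DF8` are illustrative.

```python
# Two-sided 5% critical value of Student's t with 8 degrees of freedom
# (tabulated value, hard-coded as an assumption).
T_CRIT_DF8 = 2.306


def conf_interval(coef, std_err, t_crit=T_CRIT_DF8):
    """Return (lower, upper) bounds of coef +/- t_crit * std_err."""
    half_width = t_crit * std_err
    return coef - half_width, coef + half_width


# Coefficients and standard errors copied from the Dabur regression output.
slope_ci = conf_interval(0.1827347, 0.0147266)
cons_ci = conf_interval(-87.45728, 28.89325)

# The t statistic reported by Stata is the coefficient divided by its standard error.
slope_t = 0.1827347 / 0.0147266

print("Sales :", slope_ci, "t =", round(slope_t, 2))
print("_cons :", cons_ci)
```

Because neither interval contains zero, both the slope and the intercept differ significantly from zero at the 5% level, in line with the reported p-values.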

Analysis of $\alpha$
From $P_D = \alpha + \beta S_D$
we have $P_D = -87.45728 + 0.1827347 S_D$.
This shows that at zero sales PAT would not be zero but negative, that is, the firm would make
a loss. This is because even when the firm has no sales and production is halted, it still
incurs fixed costs. Note that zero sales lies far outside the observed range of Sales
(1082.58 to 3264.37), so this reading is an extrapolation.

Analysis of $\beta$
From $P_D = \alpha + \beta S_D$,
taking the first derivative with respect to $S_D$:
$dP_D/dS_D = \beta \approx 0.1827$
This indicates that a 1 unit change in Sales is associated with a change of about 0.1827 units in PAT.
The value of $R^2$ is 0.9506, so about 95% of the variation in PAT is explained by Sales.
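To make the interpretation of the two parameters concrete, the fitted line can be evaluated directly. The Python sketch below uses the estimates reported by Stata. The function names `predict_pat` and `break_even_sales` are illustrative, and the break-even level is an extrapolation well below the observed range of Sales, so it should be read with caution.

```python
# Estimates copied from the Stata output for Dabur.
ALPHA = -87.45728   # intercept (_cons)
BETA = 0.1827347    # slope on Sales


def predict_pat(sales):
    """Fitted PAT for a given level of Sales on the estimated line."""
    return ALPHA + BETA * sales


def break_even_sales():
    """Sales level at which fitted PAT is zero (extrapolated)."""
    return -ALPHA / BETA


pat_at_zero = predict_pat(0.0)                            # equals the intercept
pat_at_2000 = predict_pat(2000.0)
marginal = predict_pat(2001.0) - predict_pat(2000.0)      # equals the slope

print("fitted PAT at zero sales  :", pat_at_zero)
print("fitted PAT at Sales = 2000:", round(pat_at_2000, 2))
print("change in PAT per unit    :", round(marginal, 4))
print("break-even Sales          :", round(break_even_sales(), 1))
```

The difference between two predictions one unit apart is the slope at any level of Sales, which is what the derivative above expresses.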

Regression Analysis for Procter & Gamble

Data has been taken for past 10 years.

DATA:

Year       PAT      Sales      EPS
2011    150.88    1001.91    42.83
2010    179.77     904.45    51.65
2009    178.85     774.22    51.28
2008    131.41     646.01    37.09
2007     89.82     539.64    24.27
2006    139.51     567.59    39.47
2005    124.61     684.71    32.78
2004     92.17     577.24    25.78
2003     68.04     442.39    28.88
2002     77.01     409.42    35.59

Source: Capital Line database, www.capitaline.com

Looking at the data for P&G, we can see that, unlike Dabur India Ltd., its Sales and PAT
fluctuate from year to year.

Graphical Interpretation:
For Procter & Gamble, the relationship between PAT and Sales is not exactly linear.
The observed values in the scatter plot deviate considerably from the fitted regression line.

[Figure: scatter plot of PAT (vertical axis, 50 to 200) against Sales (horizontal axis, 400 to 1000) for P&G, with fitted values]

Mathematical Interpretation using STATA

------------------------------------------------------------------------------
      name:  <unnamed>
       log:  C:\Users\Prashant Jain\Desktop\Ecotrix\P&G.log
  log type:  text
 opened on:  1 Feb 2012, 16:20:05

. describe

Contains data from C:\Users\Prashant Jain\Desktop\Ecotrix\PnG.dta
  obs:            10
 vars:                                        1 Feb 2012 16:19
 size:           160 (99.9% of memory free)
------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
PAT             float  %8.0g
Sales           float  %8.0g
EPS             float  %8.0g
------------------------------------------------------------------------------
Sorted by:

. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         PAT |        10     123.207    40.34835      68.04     179.77
       Sales |        10     654.758    191.5596     409.42    1001.91
         EPS |        10      36.962    9.616517      24.27      51.65

. regress PAT Sales

      Source |       SS       df       MS              Number of obs =      10
-------------+------------------------------           F(  1,     8) =   18.72
       Model |  10265.3727     1  10265.3727           Prob > F      =  0.0025
    Residual |  4386.53304     8   548.31663           R-squared     =  0.7006
-------------+------------------------------           Adj R-squared =  0.6632
       Total |  14651.9058     9  1627.98953           Root MSE      =  23.416

------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       Sales |    .176304   .0407465     4.33   0.003     .0823424    .2702656
       _cons |   7.770545   27.68766     0.28   0.786    -56.07732    71.61841
------------------------------------------------------------------------------
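With a single regressor, $R^2$ is the square of the Pearson correlation between PAT and Sales, and the slope equals that correlation times the ratio of the two standard deviations. The Python sketch below checks the P&G output from this angle. It assumes the P&G figures exactly as listed in the data table, and the function name `correlation_fit` is illustrative rather than anything Stata provides.

```python
import math

# P&G data as listed in the data table (2011 down to 2002).
sales = [1001.91, 904.45, 774.22, 646.01, 539.64, 567.59, 684.71, 577.24, 442.39, 409.42]
pat = [150.88, 179.77, 178.85, 131.41, 89.82, 139.51, 124.61, 92.17, 68.04, 77.01]


def correlation_fit(x, y):
    """Return Pearson r, the implied slope r * s_y / s_x, R-squared and Root MSE."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = r * math.sqrt(syy / sxx)        # the n - 1 divisors cancel in s_y / s_x
    ss_resid = (1 - r * r) * syy            # unexplained part of the total variation
    root_mse = math.sqrt(ss_resid / (n - 2))
    return {"r": r, "slope": slope, "r2": r * r, "root_mse": root_mse}


png = correlation_fit(sales, pat)
for key, value in png.items():
    print(f"{key:>8} = {value:.6g}")
```

Seen this way, the lower $R^2$ for P&G simply reflects a weaker linear correlation between its PAT and Sales.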

. twoway (line PAT Sales)

. twoway (line PAT Sales) (lfit PAT Sales)

Comparative Analysis of Dabur & P&G:

                       Dabur                    P&G
Value of $\alpha$      -87.45728                7.770545
Value of $\beta$       .1827347                 .176304
$R^2$                  .9506                    .7006
Regression Eq.         P = -87.46 + 0.1827S     P = 7.77 + 0.1763S

The comparison shows that for Dabur the linear regression model is a good fit for the
relationship between PAT and Sales: a coefficient of determination of 0.9506 means that Sales
explains about 95% of the variation in PAT.

For P&G, the coefficient of determination is 0.7006, considerably further from 1, so the
linear model fits noticeably less well. The estimated slopes of the two firms are nevertheless
similar (about 0.18), while the P&G intercept is not statistically different from zero
(p = 0.786).
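The two coefficients of determination can be recovered from the sums of squares in each ANOVA table, which also gives the adjusted $R^2$ that penalises the number of regressors. The Python sketch below does this for both firms, using the Model, Residual and Total sums of squares copied from the Stata output. The function name `r_squared` and the dictionary layout are illustrative.

```python
def r_squared(ss_model, ss_resid, ss_total, n, k):
    """Return (R-squared, adjusted R-squared) for n observations and k regressors."""
    r2 = ss_model / ss_total
    adj_r2 = 1 - (ss_resid / (n - k - 1)) / (ss_total / (n - 1))
    return r2, adj_r2


# Sums of squares copied from the two "regress PAT Sales" ANOVA tables.
anova = {
    "Dabur": (189800.193, 9861.6202, 199661.813),
    "P&G": (10265.3727, 4386.53304, 14651.9058),
}

results = {}
for firm, (ss_model, ss_resid, ss_total) in anova.items():
    results[firm] = r_squared(ss_model, ss_resid, ss_total, n=10, k=1)
    r2, adj_r2 = results[firm]
    print(f"{firm:>6}: R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")
```

With only ten observations per firm, the adjusted values are the more conservative basis for comparing the two fits.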
