Statistics/Data Analysis

Project: Summarize the dataset and conduct a multiple linear regression

(Discrimination in Salaries)

2 . describe

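  obs:            52                          Discrimination in Salaries
 vars:             6                          16 Sep 2009 22:21
 size:         1,248                          (_dta has notes)
-------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
sx              float   %9.0g      sx         Sex (coded 1 for female)
rk              float   %9.0g      rk         Rank
yr              float   %9.0g                 Years in current rank
dg              float   %9.0g                 Highest degree earned
yd              float   %9.0g                 Years since highest degree earned
sl              float   %9.0g                 Academic year salary in dollars
-------------------------------------------------------------------------------
Sorted by: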

3 . summarize sl

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          sl |        52    23797.65    5917.289      15000      38045

4 . tabulate sx

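 Sex (coded |
      1 for |
    female) |      Freq.     Percent        Cum.
------------+-----------------------------------
       Male |         38       73.08       73.08
     Female |         14       26.92      100.00
------------+-----------------------------------
      Total |         52      100.00

5 . tabulate sx rk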

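 Sex (coded |
      1 for |               Rank
    female) | Assistant  Associate       Full |     Total
------------+---------------------------------+----------
       Male |        10         12         16 |        38
     Female |         8          2          4 |        14
------------+---------------------------------+----------
      Total |        18         14         20 |        52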
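
The two-way table reports raw counts only. As a rough illustration (not part of the original Stata session), the Python sketch below recomputes the margins and derives row percentages from the printed cell counts. The names `counts`, `row_percentages`, and `column_totals` are hypothetical helpers introduced here. In Stata, row percentages can be requested with the `row` option of `tabulate`.

```python
# Cell counts copied from the `tabulate sx rk` output above.
counts = {
    "Male": {"Assistant": 10, "Associate": 12, "Full": 16},
    "Female": {"Assistant": 8, "Associate": 2, "Full": 4},
}


def row_percentages(table):
    """Return each cell as a percentage of its row total, rounded to 2 places."""
    result = {}
    for sex, row in table.items():
        total = sum(row.values())
        result[sex] = {rank: round(100 * n / total, 2) for rank, n in row.items()}
    return result


def column_totals(table):
    """Return the total count for each rank across both sexes."""
    totals = {}
    for row in table.values():
        for rank, n in row.items():
            totals[rank] = totals.get(rank, 0) + n
    return totals


row_pct = row_percentages(counts)
col_tot = column_totals(counts)
grand_total = sum(col_tot.values())

for sex, row in row_pct.items():
    print(sex, row)
print("Column totals:", col_tot, "Grand total:", grand_total)
```

Under these counts, a larger share of the women than of the men in the sample hold the Assistant rank, which is worth keeping in mind when reading the regression that follows.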

6 . generate monthly_salary =sl/12
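
Because `monthly_salary` is just `sl` divided by 12, its mean, standard deviation, minimum, and maximum are the annual figures divided by 12. The short Python sketch below applies that scaling to the `summarize sl` output. It is an illustration only, and the dictionary names `annual` and `monthly` are hypothetical.

```python
# Annual-salary summary statistics copied from the `summarize sl` output.
annual = {"mean": 23797.65, "sd": 5917.289, "min": 15000.0, "max": 38045.0}

# Dividing every observation by 12 divides each of these statistics by 12.
monthly = {name: value / 12 for name, value in annual.items()}

for name, value in monthly.items():
    print(f"{name:>4}: {value:10.2f}")
```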

7 . regress sl sx rk yr dg

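      Source |       SS       df       MS              Number of obs =      52
-------------+------------------------------           F(4, 47)      =   64.33
       Model |  1.5099e+09     4   377479917           Prob > F      =  0.0000
    Residual |   275810189    47  5868301.88           R-squared     =  0.8455
-------------+------------------------------           Adj R-squared =  0.8324
       Total |  1.7857e+09    51  35014310.9           Root MSE      =  2422.5

------------------------------------------------------------------------------
          sl |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          sx |   608.1003   819.8035     0.74   0.462    -1041.132    2257.332
          rk |   4753.169   458.3069    10.37   0.000     3831.174    5675.164
          yr |   391.8404   76.05306     5.15   0.000     238.8413    544.8394
          dg |  -134.2195   715.4535    -0.19   0.852    -1573.526    1305.087
       _cons |   11101.27   1087.116    10.21   0.000     8914.279    13288.27
------------------------------------------------------------------------------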
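
The summary statistics in the regression header follow from the ANOVA table, and each t statistic is the coefficient divided by its standard error. The Python sketch below recomputes them from the printed values as a consistency check. It is not part of the original Stata session, and all variable names are hypothetical. The Model SS is rebuilt as MS times df because the log prints it only in scientific notation.

```python
import math

# Values copied from the regression output above.
n_obs = 52
df_model, df_resid = 4, 47
ms_model = 377479917.0          # Model MS
ss_resid = 275810189.0          # Residual SS
ss_model = ms_model * df_model  # Model SS is printed only as 1.5099e+09

ss_total = ss_model + ss_resid
ms_resid = ss_resid / df_resid

f_stat = ms_model / ms_resid                                  # F(4, 47)
r_squared = ss_model / ss_total                               # R-squared
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid  # Adj R-squared
root_mse = math.sqrt(ms_resid)                                # Root MSE

# (coefficient, standard error) pairs from the coefficient table.
coefficients = {
    "sx": (608.1003, 819.8035),
    "rk": (4753.169, 458.3069),
    "yr": (391.8404, 76.05306),
    "dg": (-134.2195, 715.4535),
    "_cons": (11101.27, 1087.116),
}
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}

print(f"F = {f_stat:.2f}, R-squared = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}, Root MSE = {root_mse:.1f}")
for name, t in t_stats.items():
    print(f"{name:>6}: t = {t:6.2f}")
```

On a reading of the table itself, the coefficient on `sx` has a 95% confidence interval that includes zero (p = 0.462), so after adjusting for rank, years in rank, and degree, this model does not show a statistically significant salary difference by sex.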
