
Finance & Acc. Assignment – STATA

Name
Course-Code
Faculty/Dept.
Date

Generating a Usable Dataset

Before merging the datasets on the unique identifiers "Ticker" and "Year," I loaded each one separately in STATA to confirm its properties, as follows:

I set the working directory:
cd "C:\Users\MYPC\Desktop\ASSIGNMENT\finance"
C:\Users\MYPC\Desktop\ASSIGNMENT\finance
I then imported the first dataset:
import excel using "datforhomework1_2023.xls", sheet("Sheet1") firstrow clear
I then viewed a summary description of the data using the command: describe
The command gave the following output:
Contains data
  obs:         7,008
 vars:            12
 size:       406,464
--------------------------------------------------------------------
                storage   display    value
variable name   type      format     label      variable label
--------------------------------------------------------------------
ticker          str7      %9s                   ticker
year            int       %10.0g                year
permid          long      %10.0g                permid
agefirm         int       %10.0g                agefirm
meanagef        double    %10.0g                meanagef
assets          double    %10.0g                assets
bs_volatility   double    %10.0g                bs_volatility
roa             double    %10.0g                roa
founderCEO      byte      %10.0g                founderCEO
Q               double    %10.0g                Q
digit2_in       byte      %10.0g                digit2_in
hightech        byte      %10.0g                hightech
--------------------------------------------------------------------
Sorted by:
     Note: dataset has changed since last saved
I did the same for the other dataset:
. import excel using "famfirms.xls", firstrow clear

. describe

Contains data
  obs:         2,686
 vars:             4
 size:       131,614
--------------------------------------------------------------------
                storage   display    value
variable name   type      format     label      variable label
--------------------------------------------------------------------
Ticker          str6      %9s                   Ticker
Year            int       %10.0g                Year
Coname          str40     %40s                  Coname
FamFirm         byte      %10.0g                FamFirm
--------------------------------------------------------------------
Sorted by:
     Note: dataset has changed since last saved
I then merged the data using the unique identifiers "Ticker" and "Year" with the following
commands and outputs:
. cd "C:\Users\MYPC\Desktop\ASSIGNMENT\finance"
C:\Users\MYPC\Desktop\ASSIGNMENT\finance

. import excel using "datforhomework1_2023.xls", sheet("Sheet1") firstrow clear

. keep if !missing(Ticker) & !missing(Year)
(318 observations deleted)

. save "datforhomework1_2023.dta", replace
(note: file datforhomework1_2023.dta not found)
file datforhomework1_2023.dta saved

. import excel using "famfirms.xls", firstrow clear

. save "famfirms.dta", replace
(note: file famfirms.dta not found)
file famfirms.dta saved

. use "datforhomework1_2023.dta", clear

. merge m:1 Ticker Year using "famfirms.dta"

    Result                           # of obs.
    -----------------------------------------
    not matched                         4,932
        from master                     4,468  (_merge==1)
        from using                        464  (_merge==2)

    matched                             2,222  (_merge==3)
    -----------------------------------------

. keep if _merge == 3
(4932 observations deleted)

. drop _merge

. save "merged_dataset.dta", replace
(note: file merged_dataset.dta not found)
file merged_dataset.dta saved

. use "C:\Users\MYPC\Desktop\ASSIGNMENT\finance\merged_dataset.dta", clear
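
The merge step can also be written more defensively. The following is a minimal sketch, not the commands actually run for this assignment. It assumes both .dta files were saved as in the log above and that the key variables are named Ticker and Year in both files. It checks that the keys identify rows uniquely in the using file and keeps only matched observations in one step.

```stata
* Sketch only: check the merge keys before combining the two files.
* Assumes famfirms.dta and datforhomework1_2023.dta exist in the working directory.
use "famfirms.dta", clear
* Report how many Ticker-Year combinations occur more than once
duplicates report Ticker Year
* Stop with an error unless Ticker and Year jointly identify each row
isid Ticker Year

use "datforhomework1_2023.dta", clear
* keep(match) retains only _merge==3; nogenerate skips creating _merge
merge m:1 Ticker Year using "famfirms.dta", keep(match) nogenerate
* The count should equal the number of matched observations in the merge table
count
```

Using keep(match) and nogenerate replaces the separate "keep if _merge == 3" and "drop _merge" steps, so the saved file cannot accidentally retain unmatched rows.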

Generating the dummy variable "nonfounderfam":

To generate the dummy variable "nonfounderfam", equal to 1 for family firms whose founder is not also the CEO and 0 otherwise, I opened the merged dataset "merged_dataset.dta" in Stata using the command:
use "C:\Users\MYPC\Desktop\ASSIGNMENT\finance\merged_dataset.dta", clear

I then generated the "nonfounderfam" variable using the gen command and conditional logic as
follows:
gen nonfounderfam = 0
replace nonfounderfam = 1 if FamFirm == 1 & founderCEO == 0
I then saved the dataset:
save "merged_dataset.dta", replace
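
One thing worth checking is how missing values are treated. The gen/replace pair above codes nonfounderfam as 0 whenever FamFirm or founderCEO is missing. The sketch below builds a comparison variable that is left missing in those cases. The name nonfounderfam_chk is hypothetical and is used only for illustration.

```stata
* Sketch only: build a comparison dummy in one step.
* nonfounderfam_chk is a hypothetical name, not part of the original dataset.
use "merged_dataset.dta", clear
* The logical expression evaluates to 1 or 0; the if qualifier leaves the
* result missing when either input is missing
gen byte nonfounderfam_chk = (FamFirm == 1 & founderCEO == 0) if !missing(FamFirm, founderCEO)
* Cross-tabulate against the original variable, showing missing values
tabulate nonfounderfam nonfounderfam_chk, missing
```

If the cross-tabulation shows no observations in the missing column, the two definitions coincide in this sample.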

Understanding the Data – Data Description

I loaded the merged dataset using the STATA GUI.

The dataset 'merged_dataset.dta' contains 2,222 observations on Fortune 500 companies and records the presence of a founding family over the years. It is a merger of two datasets on the unique identifiers "Ticker" and "Year," where "Ticker" is an abbreviation of each firm's name. Each observation corresponds to a specific company in a particular year, which allows founding-family presence to be tracked over time. The earliest year in the dataset is 1992 and the latest is 1999, but a firm's entries do not necessarily start or end in the same years, so some firms lack entries for part of this range. The main variable of interest is "FamFirm," a binary indicator equal to 1 if a founding family is present in the company and 0 otherwise. It is therefore possible to distinguish family firms from non-family firms over the years.

The merged dataset contains the following variables:

• Ticker: A unique identifier for each firm, representing the company's ticker symbol.
• year: The year for which the data are recorded (ranging from 1992 to 1999).
• permid: A unique permanent identifier for each firm.
• agefirm: The age of the firm in years, indicating how long the company has been in operation.
• meanagef: The average age of the firm's founders, regardless of whether they currently work for the company.
• assets: The book value of assets, measured in millions of dollars, serving as a proxy for firm size.
• bs_volatility: A measure of uncertainty in the firm's environment, calculated as the standard deviation of the firm's stock returns over the previous 60 months.
• roa: Return on assets, an accounting measure of the firm's performance.
• founderCEO: A binary variable indicating whether the current CEO of the firm is one of its founders (1: founder is also the CEO, 0: founder is not the CEO).
• Q: A proxy for Tobin's Q, used as a measure of firm performance.
• digit2_in: The two-digit industry code for the industry in which the firm operates.
• hightech: A dummy variable indicating whether the firm operates in a high-tech industry (1: high-tech industry, 0: non-high-tech industry).
• Coname: The name of the company (i.e., the full or less abbreviated name).
• FamFirm: A binary variable representing family firm status (1: family firm, 0: non-family firm).
• nonfounderfam: A binary variable indicating whether a firm is a family firm whose founder is not also the CEO (1: family firm with non-founder CEO, 0: all other firms).

Table I
digit2_in FounderCEOFirms NonFounderFirms TotalFirms Percentage
10 16 16 32 50
13 45 45 90 50
15 7 11 18 61.11111
16 15 15 30 50
20 54 113 167 67.66467
21 9 9 18 50
22 0 14 14 100
23 8 13 21 61.90476
24 9 19 28 67.85714
25 8 8 16 50
26 49 71 120 59.16667
27 27 85 112 75.89286
28 151 223 374 59.62567
29 60 65 125 52
30 40 36 76 47.36842
31 9 15 24 62.5
32 7 7 14 50
33 64 75 139 53.95683
34 35 53 88 60.22727
35 102 142 244 58.19672
36 92 102 194 52.57732
37 93 124 217 57.14286
38 80 84 164 51.21951
39 15 23 38 60.52632
40 34 34 68 50
42 2 7 9 77.77778
45 42 35 77 45.45454
48 31 33 64 51.5625
49 12 9 21 42.85714
50 21 29 50 58
51 41 43 84 51.19048
52 22 15 37 40.54054
53 43 52 95 54.73684
54 10 40 50 80
55 0 7 7 100
56 29 29 58 50
57 15 15 30 50
58 7 14 21 66.66666
59 16 32 48 66.66666
60 0 8 8 100
61 21 21 42 50
62 8 8 16 50
63 92 106 198 53.53535
64 9 11 20 55
70 0 15 15 100
72 7 14 21 66.66666
73 61 52 113 46.0177
75 8 8 16 50
78 10 16 26 61.53846
79 12 12 24 50
80 13 13 26 50
87 7 7 14 50
99 19 19 38 50

Averaging variables for each firm and then taking averages of these firm-level averages can lead to misleading results, because the within-firm variability is disregarded and the panel is collapsed into a cross-sectional dataset. The observations follow firms over time, with potential time-invariant heterogeneity across firms and time variation in variables like family ownership and firm performance. Averaging variables within a firm and then calculating means across firms therefore discards the information on within-firm variation, for example when ownership changes. With such an approach, we may not accurately capture the effect of family ownership on firm performance; instead, we would be capturing differences across firms that result from unique characteristics unrelated to family ownership. A better approach would be a fixed-effects model, which accounts for individual firms' time-invariant heterogeneity.
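
The loss of within-firm variation can be made visible directly. The following is a minimal sketch under stated assumptions: the key variables are named Ticker and Year as in the merge log, each Ticker-Year pair appears only once (xtset stops with an error otherwise), and firm_id_sketch is a hypothetical numeric identifier created only for this illustration.

```stata
* Sketch only: compare between-firm and within-firm variation.
* firm_id_sketch is a hypothetical numeric panel identifier.
use "merged_dataset.dta", clear
egen firm_id_sketch = group(Ticker)
xtset firm_id_sketch Year
* xtsum splits each standard deviation into between and within components
xtsum Q roa FamFirm founderCEO

* Collapsing to firm means keeps only the between-firm variation
preserve
collapse (mean) Q roa FamFirm founderCEO, by(Ticker)
summarize Q roa FamFirm founderCEO
restore
```

A large "within" standard deviation in the xtsum output indicates how much information the firm-level averages would throw away.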

Variable        Obs     Mean        Std. Dev.    Min         Max
agefirm         1453    69.80385    29.68611     0           147
meanagef        1434    90.8145     11.04385     39          95
assets          2222    14305.87    33813.18     313.932     405200
bs_volatil~y    2193    .2769357    .0923717     .115        1.052
roa             2222    5.388212    5.810912     -49.401     46.206
Q               2222    1.889099    .9334732     .6353764    9.847485

t-statistic
Variable t-statistic p-value degrees of freedom
agefirm 6.6412401 4.385e-11 1451
meanagef 11.145291 1.005e-27 1432
assets 4.0903122 .00004462 2220
bs_volatility -1.9035795 .05709589 .
roa -4.9590917 7.619e-07 2220
Q -4.747815 2.187e-06 2220

Correlations:
                 agefirm   meanagef   assets    bs_volatil~y   roa       Q         NonfamFounders
agefirm           1.0000
meanagef          0.5325    1.0000
assets            0.1503    0.0868    1.0000
bs_vol~y         -0.2890   -0.3359   -0.1097    1.0000
roa              -0.0035   -0.1139   -0.0943   -0.1995         1.0000
Q                 0.0222   -0.1372   -0.0750   -0.1287         0.6184    1.0000
NonfamFounders    0.1801    0.2844    0.0648   -0.0526        -0.1317   -0.1060    1.0000

Multivariate Analysis
Regression

Q = 3.743127 + 0.1453296 * FamFirm - 1.72228 * bs_volatility - 0.1686439 * ln_assets_new
    + 0.1795389 * hightech - 0.1254456 * Year_1993 - 0.1923268 * Year_1994 - 0.1284221 * Year_1995
    - 0.0397223 * Year_1996 + 0.109091 * Year_1997 + 0.2578259 * Year_1998 + 0.2311502 * Year_1999

Variable         Coefficient   Std. Error   t-value   P-value   [95% Conf. Interval]
FamFirm           0.1453296    0.044961       3.23     0.001    [0.0571587, 0.2335004]
bs_volatility    -1.72228      0.1941171     -8.87     0.000    [-2.102953, -1.341606]
ln_assets_new    -0.1686439    0.0155738    -10.83     0.000    [-0.199185, -0.138103]
hightech          0.1795389    0.0477526      3.76     0.000    [0.0858936, 0.2731842]
Year 1993        -0.1254456    0.0764861     -1.64     0.101    [-0.2754389, 0.0245477]
Year 1994        -0.1923268    0.0763782     -2.52     0.012    [-0.3421085, -0.0425451]
Year 1995        -0.1284221    0.0782054     -1.64     0.101    [-0.2817871, 0.0249428]
Year 1996        -0.0397223    0.0817238     -0.49     0.627    [-0.199987, 0.1205423]
Year 1997         0.109091     0.0835631      1.31     0.192    [-0.0547805, 0.2729625]
Year 1998         0.2578259    0.0986513      2.61     0.009    [0.0643655, 0.4512864]
Year 1999         0.2311502    0.0972718      2.38     0.018    [0.040395, 0.4219053]
_cons             3.743127     0.1799629     20.80     0.000    [3.39021, 4.096043]

To re-estimate Model 1 while accounting for heteroskedasticity, I used White heteroskedasticity-consistent standard errors:
regress Q FamFirm bs_volatility ln_assets_new hightech i.Year, robust

Variable Coefficient Std. Error t-value P-value 95% Conf. Lower 95% Conf. Upper
FamFirm 0.1453296 0.044961 3.23 0.001 0.0571587 0.2335004
bs_volatility -1.72228 0.1941171 -8.87 0.000 -2.102953 -1.341606
ln_assets_new -0.1686439 0.0155738 -10.83 0.000 -0.199185 -0.1381028
hightech 0.1795389 0.0477526 3.76 0.000 0.0858936 0.2731842
1993 -0.1254456 0.0764861 -1.64 0.101 -0.2754389 0.0245477
1994 -0.1923268 0.0763782 -2.52 0.012 -0.3421085 -0.0425451
1995 -0.1284221 0.0782054 -1.64 0.101 -0.2817871 0.0249428
1996 -0.0397223 0.0817238 -0.49 0.627 -0.199987 0.1205423
1997 0.109091 0.0835631 1.31 0.192 -0.0547805 0.2729625
1998 0.2578259 0.0986513 2.61 0.009 0.0643655 0.4512864
1999 0.2311502 0.0972718 2.38 0.018 0.040395 0.4219053
_cons 3.743127 0.1799629 20.80 0.000 3.39021 4.096043
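
A side-by-side comparison makes it easy to see what the robust option changes. The sketch below is illustrative only. It assumes ln_assets_new has already been created as ln(assets), and the stored-estimate names ols_se and white_se are hypothetical labels chosen for this example.

```stata
* Sketch only: put conventional and White standard errors side by side.
* Assumes ln_assets_new already exists in memory.
regress Q FamFirm bs_volatility ln_assets_new hightech i.Year
* Breusch-Pagan test; it must be run after a fit without robust errors
estat hettest
estimates store ols_se

regress Q FamFirm bs_volatility ln_assets_new hightech i.Year, vce(robust)
estimates store white_se

* Coefficients should match across columns; only the standard errors differ
estimates table ols_se white_se, se
```

Because the robust option only replaces the variance estimator, identical coefficients with different standard errors are the expected pattern in the comparison table.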

To re-estimate the specification in Column II with the variable "founderCEO" instead of "FamFirm", I ran:
gen NonfamilyFirms = 1 - founderCEO
gen ln_assets_new = ln(assets)
regress Q founderCEO bs_volatility ln_assets_new hightech i.Year, robust

Variable Coefficient Standard Error t-stat p-value Lower CI Upper CI
founderCEO 0.3178567 0.0980374 3.24 0.001 0.1255971 0.5101163
bs_volatility -1.943758 0.2105822 -9.23 0.000 -2.356727 -1.530788
ln_assets_new -0.1883617 0.0160095 -11.77 0.000 -0.2197577 -0.1569657
hightech 0.1498709 0.0486702 3.08 0.002 0.0544245 0.2453172
1993 -0.1193032 0.0761144 -1.57 0.117 -0.26857 0.0299636
1994 -0.1800086 0.0762835 -2.36 0.018 -0.3296071 -0.0304101
1995 -0.1148532 0.0780121 -1.47 0.141 -0.2678415 0.0381351
1996 -0.0372633 0.0817711 -0.46 0.649 -0.1976234 0.1230969
1997 0.1137391 0.0837981 1.36 0.175 -0.050596 0.2780742
1998 0.2913357 0.1001666 2.91 0.004 0.0949004 0.4877709
1999 0.2859362 0.0991209 2.88 0.004 0.0915518 0.4803207
Number of obs F(11, 2118) Prob > F R-squared Root MSE
2130 28.41 0.0000 0.0939 0.89915

Re-estimating the Column II specification after replacing FamFirm with nonfounderfam:

Variable         Coefficient   Standard Error   t-value   p-value   95% CI Lower
nonfounderfam     0.0793353    0.0474             1.67     0.094
bs_volatility    -1.676718     0.1964725         -8.53     0.000
ln_assets_new    -0.1755872    0.0152699        -11.50     0.000
hightech          0.1769027    0.0479077          3.69     0.000
1993             -.1237143     .0766866          -1.61     0.107    -.2741007
1994             -.1908646     .0765518          -2.49     0.013    -.3409868
1995             -.1255478     .0783935          -1.60     0.109    -.2792815
1996             -.039884      .0820008          -0.49     0.627    -.2006918
1997              .109281      .0838894           1.30     0.193    -.0552304
1998              .2557717     .0990283           2.58     0.010     .0615721
1999              .2267759     .0973687           2.33     0.020     .0358308
_cons             3.817959     .178421           21.40     0.000     3.468066

Number of obs F(11, 2181) Prob > F R-squared Root MSE

2193 27.28 0.0000 0.0845 0.89649

Column II: Replacing Q with ROA for V, VI, & VII

V:

Variable Coefficient Std. Err. t-stat P-value Lower CI Upper CI
FamFirm 0.8725552 0.2712304 3.22 0.001 0.3406583 1.404452
bs_volatility -17.49335 1.589865 -11.00 0.000 -20.61115 -14.37554
ln_assets_new -1.085033 0.1025711 -10.58 0.000 -1.28618 -0.8838852
hightech 0.6805309 0.3429308 1.98 0.047 0.0080257 1.353036
1993 -1.141379 0.4694491 -2.43 0.015 -2.061993 -0.2207646
1994 0.0555542 0.4350387 0.13 0.898 -0.7975793 0.9086878
1995 -0.4464318 0.464891 -0.96 0.337 -1.358107 0.4652438
1996 -0.4938149 0.4533737 -1.09 0.276 -1.382904 0.3952745
1997 -1.043172 0.5340777 -1.95 0.051 -2.090526 0.0041823
1998 -0.5699293 0.5529228 -1.03 0.303 -1.65424 0.5143812
1999 0.8916283 0.5034029 1.77 0.077 -0.0955712 1.878828
_cons 19.60219 1.164299 16.84 0.000 17.31894 21.88544
VI:

Variable Coefficient Std. Err. t-stat P-value Lower CI Upper CI
NonfamilyFirms -.8725552 .2712304 -3.22 0.001 -1.404452 -.3406583
bs_volatility -17.49335 1.589865 -11.00 0.000 -20.61115 -14.37554
ln_assets_new -1.085033 .1025711 -10.58 0.000 -1.28618 -.8838852
hightech .6805309 .3429308 1.98 0.047 .0080257 1.353036
1993 -1.141379 .4694491 -2.43 0.015 -2.061993 -.2207646
1994 .0555542 .4350387 0.13 0.898 -.7975793 .9086878
1995 -.4464318 .464891 -0.96 0.337 -1.358107 .4652438
1996 -.4938149 .4533737 -1.09 0.276 -1.382904 .3952745
1997 -1.043172 .5340777 -1.95 0.051 -2.090526 .0041823
1998 -.5699293 .5529228 -1.03 0.303 -1.65424 .5143812
1999 .8916283 .5034029 1.77 0.077 -.0955712 1.878828
_cons 20.47474 1.108753 18.47 0.000 18.30042 22.64907

VII:

Variable Coefficient Std. Error t-stat p-value 95% CI Lower 95% CI Upper
FamFirm 0.8725552 0.2712304 3.22 0.001 0.3406583 1.404452
bs_volatility -17.49335 1.589865 -11.00 0.000 -20.61115 -14.37554
ln_assets_new -1.085033 0.1025711 -10.58 0.000 -1.28618 -0.8838852
hightech 0.6805309 0.3429308 1.98 0.047 0.0080257 1.353036
1993 -1.141379 0.4694491 -2.43 0.015 -2.061993 -0.2207646
1994 0.0555542 0.4350387 0.13 0.898 -0.7975793 0.9086878
1995 -0.4464318 0.464891 -0.96 0.337 -1.358107 0.4652438
1996 -0.4938149 0.4533737 -1.09 0.276 -1.382904 0.3952745
1997 -1.043172 0.5340777 -1.95 0.051 -2.090526 0.0041823
1998 -0.5699293 0.5529228 -1.03 0.303 -1.65424 0.5143812
1999 0.8916283 0.5034029 1.77 0.077 -0.0955712 1.878828
_cons 19.60219 1.164299 16.84 0.000 17.31894 21.88544

Columns II, III, and IV after replacing "Q" with "ln_Q" (Natural log of Q):

II:

Variable Coefficient Std. Err. t-stat P-value 95% Conf. Lower 95% Conf. Upper
FamFirm 0.3734733 0.2191741 1.70 0.089 -0.0563387 0.8032853
bs_volatility -9.100198 1.404039 -6.48 0.000 -11.85359 -6.346803
ln_assets_new -0.3281843 0.0834746 -3.93 0.000 -0.4918824 -0.1644863
hightech -0.2300397 0.2893818 -0.79 0.427 -0.7975326 0.3374532
1993 -0.8252379 0.3309317 -2.49 0.013 -1.474212 -0.1762634
1994 0.7038533 0.2847899 2.47 0.014 0.1453653 1.262341
1995 -0.0588121 0.3253736 -0.18 0.857 -0.696887 0.5792627
1996 -0.4333491 0.2998699 -1.45 0.149 -1.02141 0.1547116
1997 -1.620595 0.4284931 -3.78 0.000 -2.460892 -0.7802972
1998 -1.454903 0.4406229 -3.30 0.001 -2.318988 -0.5908181
1999 0.0659321 0.3565427 0.18 0.853 -0.633267 0.7651312
ln_Q 8.37467 0.2784779 30.07 0.000 7.82856 8.92078
_cons 6.584117 0.9693769 6.79 0.000 4.683117 8.485116

III:

Variable Coefficient Standard Error t-value p-value 95% CI Lower Bound 95% CI Upper Bound
NonfamilyFirms -0.3734733 0.2191741 -1.70 0.089 -0.8032853 0.0563387
bs_volatility -9.100198 1.404039 -6.48 0.000 -11.85359 -6.346803
ln_assets_new -0.3281843 0.0834746 -3.93 0.000 -0.4918824 -0.1644863
hightech -0.2300397 0.2893818 -0.79 0.427 -0.7975326 0.3374532
1993 -0.8252379 0.3309317 -2.49 0.013 -1.474212 -0.1762634
1994 0.7038533 0.2847899 2.47 0.014 0.1453653 1.262341
1995 -0.0588121 0.3253736 -0.18 0.857 -0.696887 0.5792627
1996 -0.4333491 0.2998699 -1.45 0.149 -1.02141 0.1547116
1997 -1.620595 0.4284931 -3.78 0.000 -2.460892 -0.7802972
1998 -1.454903 0.4406229 -3.30 0.001 -2.318988 -0.5908181
1999 0.0659321 0.3565427 0.18 0.853 -0.633267 0.7651312
ln_Q 8.37467 0.2784779 30.07 0.000 7.82856 8.92078
_cons 6.95759 0.9367351 7.43 0.000 5.120603 8.794577

Column IV: Similar Operation:

Variable Coefficient Std. Error t-stat P-value Lower 95% CI Upper 95% CI
FamFirm 0.3734733 0.2191741 1.70 0.089 -0.0563387 0.8032853
bs_volatility -9.100198 1.404039 -6.48 0.000 -11.85359 -6.346803
ln_assets_new -0.3281843 0.0834746 -3.93 0.000 -0.4918824 -0.1644863
hightech -0.2300397 0.2893818 -0.79 0.427 -0.7975326 0.3374532
1993 -0.8252379 0.3309317 -2.49 0.013 -1.474212 -0.1762634
1994 0.7038533 0.2847899 2.47 0.014 0.1453653 1.262341
1995 -0.0588121 0.3253736 -0.18 0.857 -0.696887 0.5792627
1996 -0.4333491 0.2998699 -1.45 0.149 -1.02141 0.1547116
1997 -1.620595 0.4284931 -3.78 0.000 -2.460892 -0.7802972
1998 -1.454903 0.4406229 -3.30 0.001 -2.318988 -0.5908181
1999 0.0659321 0.3565427 0.18 0.853 -0.633267 0.7651312
ln_Q 8.37467 0.2784779 30.07 0.000 7.82856 8.92078
_cons 6.584117 0.9693769 6.79 0.000 4.683117 8.485116

Column XI: Reestimating the specification in Column XI for FamFirm after eliminating observations with founderCEO==1

Variable Coefficient Std. Error t-statistic p-value Lower CI Upper CI
FamFirm 0.0765235 0.0472807 1.62 0.106 -0.0162002 0.1692472
bs_volatility -1.837279 0.2103509 -8.73 0.000 -2.249805 -1.424753
ln_assets_new -0.1848216 0.015414 -11.99 0.000 -0.2150505 -0.1545926
hightech 0.1172806 0.0467779 2.51 0.012 0.025543 0.2090181
1993 -0.0964994 0.0757139 -1.27 0.203 -0.2449844 0.0519857
1994 -0.1488384 0.0762286 -1.95 0.051 -0.2983328 0.0006559
1995 -0.0914921 0.0778195 -1.18 0.240 -0.2441066 0.0611223
1996 -0.0139749 0.0815187 -0.17 0.864 -0.1738439 0.1458941
1997 0.1336772 0.0827546 1.62 0.106 -0.0286156 0.2959701
1998 0.2722961 0.0976816 2.79 0.005 0.0807294 0.4638628
1999 0.2904096 0.0981402 2.96 0.003 0.0979437 0.4828755
_cons 3.900681 0.1802996 21.63 0.000 3.547089 4.254272

Column XII: Reestimating the specification in Column II for founderCEO after eliminating observations with nonfounderfam==1

Variable Coefficient Std. Error t-stat p-value Lower 95% CI Upper 95% CI
founderCEO 0.3339915 0.0995308 3.36 0.001 0.1387632 0.5292197
bs_volatility -1.928315 0.2261622 -8.53 0.000 -2.371929 -1.484701
ln_assets_new -0.181764 0.0174298 -10.43 0.000 -0.2159522 -0.1475758
hightech 0.1805779 0.052912 3.41 0.001 0.0767918 0.284364
1993 -0.0818767 0.0874207 -0.94 0.349 -0.2533512 0.0895978
1994 -0.1573232 0.0862384 -1.82 0.068 -0.3264786 0.0118322
1995 -0.0815038 0.0893594 -0.91 0.362 -0.2567809 0.0937734
1996 -0.0015462 0.0933073 -0.02 0.987 -0.1845671 0.1814747
1997 0.1311439 0.0957358 1.37 0.171 -0.0566405 0.3189282
1998 0.244607 0.1080777 2.26 0.024 0.0326141 0.4565999
1999 0.2326735 0.1074327 2.17 0.030 0.0219458 0.4434012
_cons 3.900178 0.1998218 19.52 0.000 3.508231 4.292126
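
The two subsample regressions above can be run without deleting any observations from the dataset in memory. The following is a sketch only; it assumes the merged file contains nonfounderfam as generated earlier, and the capture prefix simply skips creating ln_assets_new if it already exists.

```stata
* Sketch only: restrict the sample with an if qualifier instead of dropping rows.
use "merged_dataset.dta", clear
capture gen ln_assets_new = ln(assets)
* Column XI analogue: FamFirm effect, excluding founder-CEO observations
regress Q FamFirm bs_volatility ln_assets_new hightech i.Year if founderCEO != 1, vce(robust)
* Column XII analogue: founderCEO effect, excluding non-founder family firms
regress Q founderCEO bs_volatility ln_assets_new hightech i.Year if nonfounderfam != 1, vce(robust)
* e(N) reports the size of the estimation sample for the last fit
display e(N)
```

Keeping the full dataset in memory avoids having to reload the file between the two specifications.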

Column XIII: Re-estimating model 1 assuming heteroskedasticity, but including firm dummies:

Variable Coefficient Std. Error t-value P-value Lower 95% CI Upper 95% CI
FamFirm 0.0322847 0.0864707 0.37 0.709 -0.1373052 0.201875
bs_volatility -1.335765 0.2755755 -4.85 0.000 -1.876234 -0.7952949
ln_assets_new -0.3606776 0.0605294 -5.96 0.000 -0.4793902 -0.241965
hightech 0 (omitted) - - - - -
1993 0.0284773 0.0483336 0.59 0.556 -0.0663165 0.123271
1994 -0.0393587 0.045634 -0.86 0.389 -0.1288579 0.0501405
1995 0.0693375 0.0461064 1.50 0.133 -0.0210882 0.159763
1996 0.1586567 0.0480367 3.30 0.001 0.0644453 0.252868
1997 0.3647576 0.0489925 7.45 0.000 0.2686716 0.460844
1998 0.4937141 0.0569458 8.67 0.000 0.3820297 0.605399
1999 0.4774547 0.0665994 7.17 0.000 0.3468373 0.608072
_cons 5.18374 0.5531294 9.37 0.000 4.09892 6.26856
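
Firm dummies can be included without estimating a visible coefficient for every firm. The sketch below is one way to do this and is not necessarily the command that produced the table above. The names firm_id_fe and sd_hightech are hypothetical. The second part checks a likely reason for the "(omitted)" entry on hightech, namely that it does not vary within firms.

```stata
* Sketch only: absorb firm effects instead of listing one dummy per firm.
* firm_id_fe and sd_hightech are hypothetical names.
use "merged_dataset.dta", clear
capture gen ln_assets_new = ln(assets)
egen firm_id_fe = group(Ticker)
areg Q FamFirm bs_volatility ln_assets_new i.Year, absorb(firm_id_fe) vce(robust)
* Check whether hightech ever changes within a firm; if the largest
* within-firm standard deviation is 0, the firm dummies absorb it entirely
bysort firm_id_fe: egen sd_hightech = sd(hightech)
summarize sd_hightech
```

A maximum of zero in the summarize output would indicate that hightech is perfectly collinear with the firm dummies.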

To address the endogeneity of founderCEO in Column III, I applied instrumental variable (IV) estimation. I used the variable meanagef as the instrument for founderCEO. Although meanagef is measured only in 1994, I treat it as an exogenous variable for all other years. Here is the output:

Variable Coefficient Std. Error t-stat p-value Lower CI Upper CI
founderCEO 1.089932 .1472939 7.40 0.000 .8012408 1.378622
bs_volatility -2.654723 .3192537 -8.32 0.000 -3.280449 -2.028997
ln_assets_new -.0774117 .0250533 -3.09 0.002 -.1265152 -.0283081
hightech .0826618 .0636002 1.30 0.194 -.0419923 .2073159
1993 -.099724 .1026846 -0.97 0.331 -.3009822 .1015341
1994 -.1745344 .1023552 -1.71 0.088 -.3751469 .0260781
1995 -.0917414 .102323 -0.90 0.370 -.2922907 .108808
1996 -.0031174 .1026249 -0.03 0.976 -.2042586 .1980238
1997 .2063444 .1031139 2.00 0.045 .0042448 .4084439
1998 .4414195 .104714 4.22 0.000 .2361838 .6466552
1999 .3719673 .1087275 3.42 0.001 .1588654 .5850693
_cons 3.160796 .2612286 12.10 0.000 2.648797 3.672795
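
For reference, a standard way to write this estimation in Stata is sketched below. It is a minimal example and assumes the outcome is Q, the endogenous regressor is founderCEO, and meanagef is the only excluded instrument. The exact command behind the table above is not shown in the text, so this should be read as an illustration rather than a reproduction.

```stata
* Sketch only: 2SLS with meanagef as the excluded instrument for founderCEO.
use "merged_dataset.dta", clear
capture gen ln_assets_new = ln(assets)
* Exogenous controls go outside the parentheses; the endogenous variable and
* its instrument go inside as (endogenous = instrument)
ivregress 2sls Q bs_volatility ln_assets_new hightech i.Year (founderCEO = meanagef), first vce(robust)
* First-stage diagnostics, including the F statistic on the excluded instrument
estat firststage
* Tests of the null hypothesis that founderCEO is exogenous
estat endogenous
```

The first option prints the first-stage regression, so the coefficient on meanagef and its t-statistic can be read from the same run.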

To obtain the first-stage regression for the model in Column III, with meanagef as the instrument for founderCEO, I used the ivregress command with bs_volatility, ln_assets_new, hightech, and i.Year as independent variables.

Output:

Instrumental Variable Estimates

Estimated coefficient on the instrument and the associated t-statistic

founderCEO Coef. Std. Err. z P>|z| [95% Conf. Interval]
meanagef -0.0160032 0.001168 -13.70 0.000 -0.0182925 -0.013714
_cons 1.518479 0.1061368 14.31 0.000 1.310454 1.726503

Variable Coefficient Std. Err. t-statistic P-value 95% Conf. Lower 95% Conf. Upper
bs_volatility -42.86471 3.498083 -12.25 0.000 -49.72674 -36.00269
ln_assets_new 0.8933977 0.2891044 3.09 0.002 0.3262749 1.46052
hightech -1.989932 0.7376537 -2.70 0.007 -3.436953 -0.5429105
1992 0 (empty) (empty) (empty) (empty) (empty)
1993 0.6294254 1.190973 0.53 0.597 -1.706852 2.965703
1994 0.5692188 1.187144 0.48 0.632 -1.759546 2.897984
1995 -0.4005572 1.186799 -0.34 0.736 -2.728646 1.927531
1996 -1.091232 1.190296 -0.92 0.359 -3.42618 1.243716
1997 -0.96853 1.195973 -0.81 0.418 -3.314616 1.377556
1998 0.1587945 1.213241 0.13 0.896 -2.221164 2.538753
1999 2.842454 1.254228 2.27 0.024 0.3820932 5.302815
_cons 94.34776 3.028984 31.15 0.000 88.40594 100.2896

The estimated coefficient on the instrument meanagef is approximately -0.0160032, and the associated
t-statistic is -13.70.

Considering founderage for founderCEO:

meanagef is statistically significant (p < 0.001) and has a negative coefficient, which suggests a strong relationship between meanagef and founderCEO. It therefore satisfies the relevance requirement for an instrument, and around 13.1% of the variation in founderCEO is explained by meanagef.

Hausman Test
ivregress 2sls founderCEO (meanagef = bs_volatility ln_assets_new
hightech i.Year), first

founderCEO Coefficient Std. Error t-value p-value Lower CI 95% Upper CI 95%
meanagef -0.0160032 0.001168 -13.70 0.000 -0.0182925 -0.013714
_cons 1.518479 0.1061368 14.31 0.000 1.310454 1.726503

Coefficient on residuals = 0.9164, with a t-statistic of 17.18

regress founderCEO FamFirm bs_volatility ln_assets_new hightech i.Year residuals, robust
Estimates:

Variable active
FamFirm .11070634
bs_volatility .67814427
ln_assets_weighted -.00359012
hightech .03843777
1993 -.01094484
1994 -.0101905
1995 .00333516
1996 .01857966
1997 .0163442
1998 -.00373617
1999 -.04617963
residuals .91636499
_cons -.11976399

In the second-stage regression, the reported coefficient of about 0.1107 (on FamFirm in the estimates above) differs considerably from the coefficient on meanagef in the first-stage regression (approximately -0.0160). More importantly, the coefficient on the first-stage residuals (0.9164) is highly significant, with a t-statistic of 17.18. Under the Hausman test, a significant coefficient on these residuals is evidence against exogeneity. As a result, we reject the null hypothesis of exogeneity and conclude that founderCEO is endogenous in Column III. The significance level used for the decision is less than 1% (p < 0.01).
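
For comparison, the textbook regression-based form of this test (often called Durbin-Wu-Hausman) is sketched below. It places the outcome Q on the left-hand side of the augmented regression, so it may differ from the regression reported above. The name v_hat is a hypothetical label for the first-stage residuals.

```stata
* Sketch only: regression-based (Durbin-Wu-Hausman) endogeneity check.
* v_hat is a hypothetical name for the first-stage residuals.
use "merged_dataset.dta", clear
capture gen ln_assets_new = ln(assets)
* First stage: endogenous regressor on the instrument and all controls
regress founderCEO meanagef bs_volatility ln_assets_new hightech i.Year
predict double v_hat, residuals
* Augmented outcome equation: add the first-stage residuals
regress Q founderCEO bs_volatility ln_assets_new hightech i.Year v_hat, vce(robust)
* A significant coefficient on v_hat is evidence against exogeneity
test v_hat
```

The decision rule is the same as in the text: rejecting the null that the residual coefficient is zero points to endogeneity of founderCEO.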

5.3: Coefficients

The estimated coefficient on founderCEO using the IV procedure is -8.835202, and the associated t-statistic is -6.69:

Variable Coefficient Std. Error t-value p-value 95% CI Lower 95% CI Upper
NonfamilyFirms -8.835202 1.321635 -6.69 0.000 -11.42556 -6.244845
_cons 11.36341 0.9035988 12.58 0.000 9.592386 13.13443

By comparison, the estimated coefficient from the second stage of the Hausman test is approximately 0.1107. The coefficient on founderCEO from the IV procedure is therefore much larger in magnitude than the coefficient from the Hausman test.

In Column II, the coefficient on founderCEO was estimated to be approximately 0.0334, with an associated t-statistic of around 2.96; it was positive and statistically significant. Under the IV procedure in Column III, the coefficient is substantially larger in magnitude (-8.835202) and statistically significant, with a much larger t-statistic in absolute value (-6.69). Furthermore, the sign of the coefficient indicates a negative relationship with the NonfamilyFirms endogenous variable.

The inference I draw is that there is a significant and negative relationship between founderCEO and NonfamilyFirms. Additionally, the IV procedure is the better approach for addressing endogeneity concerns in this relationship, given that the standard OLS estimation in Column II might suffer from endogeneity bias.

5.4: Instrument Quality – PT2

Applying the age of a company as the instrument for the founderCEO variable in Column III seems reasonable. The idea is that older firms are less likely to have a founder as their CEO, which suggests a correlation between a company's age and founderCEO status. This correlation is essential for the instrument to be relevant. If the company's age is also independent of the error term in the main regression equation (Column III), it can serve as a suitable instrument for founderCEO. The caveat is to look out for confounding factors that might simultaneously influence company age and founderCEO status, thus tainting instrument validity.

Assessing the validity of using the company's age as an instrument requires tests for instrument relevance and over-identification (the latter is only available when there are more instruments than endogenous regressors). These tests help evaluate whether the instrument is strongly correlated with founderCEO and help rule out bias in the estimates.

Discussion

Correcting for heteroskedasticity: The correction leaves the coefficient estimates unchanged but makes inference more reliable. Heteroskedasticity does not bias the OLS coefficients; it makes the conventional standard errors inconsistent and OLS inefficient, which leads to incorrect inferences. Robust standard errors correct for this, so the standard errors and confidence intervals properly account for the variability in the data.

On different performance measures: ROA is effective for showing how efficiently a firm uses its assets to generate profits, while Tobin's Q indicates how a firm's market value compares with the replacement cost of its assets. ROA captures accounting-based performance and is therefore more informative for internal evaluations, while Tobin's Q captures market-based performance, i.e., it reflects investors' perspectives.

About firm dummies: Firm dummies account for unobservable firm-specific factors that may affect the dependent variable. Their inclusion controls for sources of time-invariant heterogeneity across firms which, if unaccounted for, may cause variation that appears to come from the independent variables. The dummies therefore help isolate the effects of the independent variables. However, their inclusion enlarges the model and lowers the degrees of freedom, which makes the coefficients more difficult to interpret.

The main result drivers in Anderson & Reeb: The alignment of ownership with firm management is the main driver of results for firms in this study. Founding families introduce long-term orientation and stronger stewardship, thus creating grounds for better planning and decision-making. Higher ROA and lower volatility are evident in such firms.

Founder CEO Firm Performance: While such firms generally tend to outperform others in the long term, the gains may be slow, and significant performance differences in the short and medium terms may show some of the firms failing (Do et al., 2022). Succession issues can also cause the failure of firms that had long-term success. The measured performance of founder-CEO firms may also vary with ownership concentration and managerial control.

References

Anderson, R. C., & Reeb, D. M. (2003). Founding-family ownership and firm performance: Evidence from the S&P 500. The Journal of Finance, 58(3), 1301-1327. https://doi.org/10.1111/1540-6261.00567

Do, T. N. M., Ha, N. M., Bao, D., & Ngo, T. (2022). The impact of family ownership on firm performance: A study on Vietnam. Cogent Economics & Finance. https://doi.org/10.1080/23322039.2022.2038417