Empirical Tests of Asset Pricing Models with Individual Assets:
Resolving the Errors-in-Variables Bias in Risk Premium Estimation
by
Narasimhan Jegadeesh, Joonki Noh, Kuntara Pukthuanthong, Richard Roll, and Junbo Wang
September 15, 2015
Abstract
To attenuate an inherent errors-in-variables bias, portfolios are widely employed for risk
premium estimation; but portfolios might diversify away and thus mask relevant risk- or
return-related features of individual assets. We propose a resolution that allows the use of
individual assets while avoiding the bias. It hinges on specific instrumental variables, factor
sensitivities (β’s) calculated from alternate observations. Closed-form asymptotics are
provided for large cross-sections and time-series. Simulations indicate that the IV method
delivers unbiased risk premium estimates and well-specified tests with adequate power in
small samples. Empirical implementation finds some evidence of significant risk premiums
for the size and book-to-market factors. However, when controlling for non-β characteristics,
estimated risk premiums are insignificant for the CAPM, size, book-to-market,
investments, profitability, and the liquidity-adjusted CAPM.
Co-Author      Affiliation                                           Voice          E-Mail
Jegadeesh      Emory University, Atlanta GA 30322                    404-727-4821   Jegadeesh@Emory.Edu
Noh            Case Western Reserve University, Cleveland OH 44106   216-368-3737   Joonki.Noh@Case.Edu
Pukthuanthong  University of Missouri, Columbia MO 65211             619-807-6124   PukthuanthongK@Missouri.Edu
Roll           Caltech, Pasadena CA 91125                            626-395-3890   RRoll@Caltech.Edu
Wang           Louisiana State University, Baton Rouge LA 70808      781-258-8806   junbowang@lsu.edu
Key Words: Risk Premium Estimation, Errors-in-Variables Bias, Instrumental Variables,
Individual Stocks, Asset Pricing Models
Electronic copy available at: http://ssrn.com/abstract=2664332

1. Introduction
A fundamental precept of financial economics is that investors earn higher average
returns by bearing systematic risks. While this idea is well accepted, there is little agreement
about the identity of systematic risks or the magnitude of the supposed rewards. This is not
due to a lack of effort along two lines of enquiry. First, numerous candidates have been
proposed as underlying risk factors. Second, empirical efforts to estimate risk premiums
also have a long and varied history.
Starting with the single-factor CAPM (Sharpe, 1964; Lintner, 1965) and the multi-factor
APT (Ross, 1976), the first line of enquiry has brought forth an abundance of risk
factor candidates. Among others, these include the Fama and French size and book-to-market
factors, human capital risk (Jagannathan and Wang, 1996), productivity and capital
investment risk (Cochrane, 1996; Chen, Novy-Marx and Zhang, 2011; Eisfeldt and
Papanikolaou, 2013), different components of consumption risk (Lettau and Ludvigson,
2001; Ait-Sahalia, Parker, and Yogo, 2004; Li, Vassalou, and Xing, 2006), cash flow and
discount rate risks (Campbell and Vuolteenaho, 2004) and illiquidity risk (Pastor and
Stambaugh, 2003; Acharya and Pedersen, 2005). Harvey, Liu and Zhu (2015) survey the
literature and report that more than 300 factors have been proposed.
The second line of enquiry has produced empirical estimates of risk premiums for
many of what Cochrane (2011) terms a “zoo” of factors. Most estimation methods
have followed those originally introduced by Black, Jensen and Scholes (1972) (BJS) and
refined by Fama and MacBeth (1973) (FM). Their most prominent feature is the use of
portfolios rather than individual assets in testing asset pricing models. This has long been
considered essential because of an errors-in-variables (EIV) problem inherent in estimating
risk premiums.
The EIV problem is best appreciated by tracing through the BJS and FM methods. They
involve two-pass regressions: the first pass is a time series regression of individual asset
returns on the proposed factors. This pass provides estimates of factor loadings, widely
called “betas” in the finance literature. 1 The second pass regresses asset returns cross-sectionally
on the betas obtained from the first pass regression. Since the explanatory
variables in the second pass are estimates, rather than the true betas, the resulting risk
premium estimates are biased and inconsistent; and the direction of the bias is unknown
when there are multiple factors involved in the two-pass regressions.
1
Hereafter, we will adopt the shorthand nomenclature “Beta” to mean “factor sensitivity” or “factor
loading.”
With a large number (=N) of individual assets, the EIV bias can be reduced by working
with portfolios rather than individual assets. This process begins by forming diversified
portfolios classified by some individual asset characteristic such as a beta estimated over a
preliminary sample period. It then estimates portfolio betas on the factors using data for
a second period. Finally, it runs the cross-sectional regressions on estimated portfolio betas
using data for a third period. BJS, Blume and Friend (1973), and FM note that portfolios
have less idiosyncratic risk; so the errors-in-variables bias is reduced (and can be entirely
eliminated as N grows without bound).
But using portfolios, rather than individual assets, has its own defects. There is an
immediate issue of test power since dimensionality is reduced; i.e., there are unavoidably
fewer explanatory variables with portfolios than with individual assets.
Diversification into portfolios can mask cross-sectional phenomena in individual
assets that are unrelated to the portfolio grouping procedure. For example, advocates of
fundamental indexation (Arnott, Hsu and Moore, 2005) argue that high market value
assets are overpriced and vice versa, but any portfolio grouping by an attribute other than
market value itself could diversify away such mispricing, rendering it undetectable.
Another troubling result of portfolio masking involves the cross-sectional relation
between average returns and factor exposures (“betas”). Take the single-factor CAPM as
an illustration (though the same effect is at work for any linear factor model). The cross-sectional
relation between expected returns and betas holds exactly if and only if the market
index used for computing betas is on the mean/variance frontier of the individual asset
universe. Errors from the beta/return line, either positive or negative, imply that the index
is not on the frontier. But if the individual assets are grouped into portfolios sorted by
portfolio beta and the individual errors are not related to beta, the analogous line fitted to
portfolio returns and betas will display much smaller errors. This could lead to a mistaken
inference that the index is on the efficient frontier.
Test portfolios are typically organized by firm characteristics related to average
returns, e.g., size and book-to-market. Sorting on characteristics that are known to predict
returns helps generate a reasonable variation in average returns across test assets. But
Lewellen, Nagel, and Shanken (2010) point out that sorting on characteristics also imparts a
strong factor structure across test portfolios. Lewellen et al. (2010) show that, as a result,
even factors that are weakly correlated with the sorting characteristics would explain the
differences in average returns across test portfolios regardless of the economic merits of
the theories that underlie the factors.
Finally, the statistical significance and economic magnitudes of risk premiums could
depend critically on the choice of test portfolios. For example, the Fama and French size
and book-to-market risk factors are significantly priced when test portfolios are sorted
based on corresponding characteristics, but they do not command significant risk premiums
when test portfolios are sorted only based on momentum.
In an effort to overcome the deficiencies of portfolio grouping while avoiding the EIV
bias, we develop a new procedure to estimate risk premiums and to test their statistical
significance using individual assets. Our method adopts the instrumental variables
technique, a standard econometric solution to the EIV problem. We define a particular
set of well-behaved instruments and hereafter refer to our approach as the IV method.
To be specific, our IV method first estimates betas for individual assets from a portion
of the observations available in the data sample. These become the “independent” variables
for the second-stage cross-sectional regressions. Then, using completely different sample
observations, it re-estimates the same betas, which become the “instrumental” variables in
the second-stage cross-sectional regressions.
We explore several variants of this basic scheme. One variant estimates betas and beta
instruments using observations from sequential subsamples: betas from the first T
observations and beta instruments from observations T+1 to 2T. A cross-sectional
regression is then run for period 2T+1. The entire procedure is rolled forward by
one period and repeated up to the last available observation, thereby generating a time
series of risk premium estimates for statistical testing.
Another variant of the basic scheme is to estimate betas from observations in even-numbered
periods, e.g., months 2, 4, …, 2T, and beta instruments from observations in odd-numbered
periods, e.g., months 1, 3, …, 2T-1. The second-stage cross-sectional regressions
are then run using returns in month 2T+1. The roles of betas and their instruments are
interchangeable in the second-stage cross-sectional regressions.
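The even/odd split described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `ols_betas` and `split_sample_betas` and the array layout (a T × N return array and a T × K factor array) are assumptions made for the example.

```python
import numpy as np

def ols_betas(returns, factors):
    """Time-series OLS of each asset's returns on the factors.

    returns: (T, N) array; factors: (T, K) array.
    Returns a (K, N) array of slope estimates (intercepts are dropped).
    """
    X = np.column_stack([np.ones(len(factors)), factors])   # (T, K+1)
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)       # (K+1, N)
    return coef[1:]

def split_sample_betas(returns, factors):
    """Estimate betas separately from even- and odd-numbered periods."""
    periods = np.arange(1, len(returns) + 1)   # periods numbered 1..T
    even = periods % 2 == 0
    beta_even = ols_betas(returns[even], factors[even])
    beta_odd = ols_betas(returns[~even], factors[~even])
    return beta_even, beta_odd
```

Because the two sets of estimates use disjoint observations, their estimation errors are uncorrelated under the paper's assumption of serially uncorrelated residuals, which is what allows one set to instrument the other.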
The IV method produces N-consistent risk premium estimates for a finite length of
time-series. 2 In a more general case that allows both the size of the cross-section N and the
length of the time-series T to grow without bound, we prove that the IV method provides
NT-consistent risk premium estimates. We also develop the asymptotic distributions of the IV
estimator for those two different scenarios of N and T. 3
While large sample properties can provide some guidance, it is important to examine
the small sample performance of various estimators for practical applications. To do so,
we conduct a number of simulation experiments. We choose simulation parameters
matched to those in the actual data. Simulation results verify that the IV method produces
unbiased risk premium estimates even for relatively short time-series used to estimate
factor sensitivities. In contrast, we find that the standard approach that fits the second
stage regressions using OLS (hereafter we will refer to this standard approach as the OLS
method) suffers from severe EIV biases. The simulations also show that the root-mean-
squared errors of the IV method are substantially lower than those of the OLS method. For
example, in simulations with a single factor model under time-constant factor sensitivities,
we find that the OLS estimator, if used with individual stocks, is significantly biased
toward zero even when betas are estimated with 2640 time series observations. In contrast,
the IV estimator yields nearly unbiased risk premium estimates when only 264 time-series
observations are available (see Figure 1).
2
According to Shanken (1992), an N-consistent risk premium estimator converges to the ex-post risk
premium as N goes to infinity for fixed T.
3
In our simulations and empirical analyses, we employ a truncated version of the IV estimator as an
adjustment for finite N. For detailed discussion, see Section 3.1.
In terms of test size (i.e., the type I error rate) and power (i.e., one minus the type II
error rate), we find that the
conventional t-tests based on the IV estimator are well specified (under the null hypothesis
that true risk premiums are zero) and reasonably powerful (under the alternative
hypothesis that true risk premiums equal the sample means of factor realizations) when betas
are estimated from small samples. Similar results are found for the Fama-French three-factor
model.
With actual data, we apply the IV method to estimate the risk premiums for several
risk factors proposed in the literature, which include the CAPM, the three-factor and five-
factor models of Fama and French (1993 and 2014), the q-factor asset pricing model of
Hou, Xue, and Zhang (2014), and the liquidity-adjusted capital asset pricing model
(LCAPM) of Acharya and Pedersen (2005). These risk factors have been empirically
successful when they were tested with portfolios. In contrast to the original papers, when
controlling for corresponding non-β characteristics, we find that none of these factors is
associated with a significant risk premium in the cross-section of individual stock returns.
This failure to find significant risk premiums is not due to the lack of test power of the
IV method. We present simulation evidence that the t-tests based on the IV method provide
reasonably high power under the alternative hypotheses that the true risk premiums equal
the sample means of factor realizations. For example, when the true HML risk premium is
positive, the rejection rates of the null hypothesis (i.e., zero risk premium for HML) are
84.2% and 89.6% under time-constant and time-varying factor sensitivities, respectively.
When analyzing real data, in the absence of non-β characteristics, we find some evidence
that SMB and HML command significant risk premiums in the cross-section of individual
stock returns. However, this pricing evidence of SMB and HML betas is substantially
weakened when corresponding non-β characteristics are included in the cross-sectional
regressions. This stark difference in pricing evidence without and with non-β
characteristics indicates that insignificant risk premiums are not due to the lack of test
power of the IV method.
Our paper also contributes to a large literature on testing asset pricing models.
Shanken (1992) shows that, as the length of the time-series grows indefinitely, the EIV bias becomes
negligible because the estimation accuracy of betas improves. He also derives an
asymptotic adjustment for the FM standard errors of the OLS method. Jagannathan and
Wang (1998) extend Shanken’s asymptotic analysis to the case of conditionally
heterogeneous errors in the time series regression. Shanken and Zhou (2007) and Kan,
Robotti and Shanken (2013) extend the result to misspecified models. The evidence and
analyses in those papers mainly focus on test portfolios. Our paper focuses on individual
stocks as test assets and proposes the IV method to mitigate the EIV bias in testing asset
pricing models, which is likely more severe with individual stocks than with portfolios.
Accordingly, we provide a “double” asymptotic theory of the IV estimator, in which the
size of the cross-section and the length of the time-series grow simultaneously without bound.
The double asymptotics reflects a recent development in econometrics (e.g., Bai, 2003),
and it is more appropriate for individual stocks than single asymptotics. 4
Using individual stocks in testing asset pricing models is a recent development in the
literature. Kim (1995) corrects the EIV bias using lagged betas to derive a closed-form
solution for the MLE estimator of market risk premium. The solution proposed by Kim is
based on the adjustment by Theil (1971). Other methods proposed by Litzenberger and
Ramaswamy (1979), Kim and Skoulakis (2014), and Chordia et al. (2015) are similar,
producing the EIV correction terms to obtain N-consistent risk premium estimators. In
contrast, the IV method does not require any correction term for the EIV bias. To avoid the
EIV bias, Brennan et al. (1998) advocate risk-adjusted returns as the dependent variable in the
second-stage regressions. However, their method does not estimate the risk premiums of
factors. None of these existing papers provides double asymptotic theories as in our paper.
2. Risk-Return Models and IV Estimation
4
Existing papers in the literature analyze “single” asymptotic theories, i.e., one of two dimensions (cross-
section and time) goes to infinity while the other dimension is fixed. A notable exception is Gagliardini
et al. (2011) who show that the EIV bias in the BJS method converges to zero when N and T grow to
infinity.
2.1. Multifactor Asset Pricing Models
A number of asset pricing models predict that expected returns on risky assets are
linearly related to their covariances with certain risk factors. A general specification of a
K-factor asset pricing model can be written as:
$$E(r^i) = \gamma_0 + \sum_{k=1}^{K} \beta_{ik}\,\gamma_k, \tag{2.1}$$
where $E(r^i)$ is the expected excess return on stock i, $\beta_{ik}$ is the sensitivity of stock i to
factor k, and $\gamma_k$ is the risk premium on factor k. $\gamma_0$ is the excess return on the zero-beta
asset. If riskless borrowing and lending are allowed, then the zero-beta asset earns the risk-free
rate and its excess return is zero, i.e., $\gamma_0 = 0$. The CAPM predicts that only the market
risk will be priced in the cross-section. Several multifactor models identify additional risk
factors based on empirical findings or based on variations of models such as ICAPM. APT
also predicts a multifactor pricing model starting with a factor structure for returns.
Empirical tests of asset pricing models typically use the Fama-MacBeth (FM) two-
stage regression procedure to estimate factor risk premiums. The first stage estimates factor
sensitivities using the following time-series regressions with T periods of data:
$$r_t^i = \alpha_i + \sum_{k=1}^{K} \beta_{ik}\, f_{k,t} + \varepsilon_t^i, \tag{2.2}$$
where $f_{k,t}$ is the realization of factor k in time t and $\varepsilon_t^i$ is the regression residual. We
assume that the factors and residuals are stationary processes and that the residuals are
cross-sectionally and serially uncorrelated and uncorrelated with the factors. The time series estimates of
factor sensitivities, say $\hat{\beta}_{ik}$, are the independent variables in the following second-stage
cross-sectional regressions used to estimate factor risk premiums:
$$r_t^i = \gamma_{0,t} + \sum_{k=1}^{K} \hat{\beta}_{ik}\, \gamma_{k,t} + \xi_t^i, \tag{2.3}$$
where the excess return $r_t^i$ is the dependent variable. The standard FM approach fits OLS
regressions to estimate the parameters of regression (2.3). These OLS estimates are biased
due to the EIV problem since the $\hat{\beta}_{ik}$ are estimated with error. To mitigate this bias, the
literature typically uses selected portfolios as test assets rather than individual stocks since
portfolio betas are estimated more precisely than individual stock betas.
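The size of the EIV bias in the simplest setting can be illustrated with a short simulation. The sketch below assumes a single factor and classical measurement error (estimation errors in betas that are independent of the true betas and of the return residuals); all parameter values are hypothetical and are not taken from the paper. Under these assumptions the OLS slope converges to the true premium multiplied by var(β) / (var(β) + var(error)).

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000                      # large cross-section so sampling noise is small
gamma_true = 0.5                 # hypothetical risk premium
beta = rng.normal(1.0, 0.4, N)   # true betas
beta_hat = beta + rng.normal(0.0, 0.4, N)         # betas observed with estimation error
r = gamma_true * beta + rng.normal(0.0, 1.0, N)   # one cross-section of returns

# OLS slope of returns on the noisy betas
slope_ols = np.cov(beta_hat, r)[0, 1] / np.var(beta_hat, ddof=1)

# Classical attenuation factor: var(beta) / (var(beta) + var(error))
attenuation = 0.4**2 / (0.4**2 + 0.4**2)
print(slope_ols, gamma_true * attenuation)
```

With equal variances for the true betas and the estimation errors, the attenuation factor is one half, so the OLS slope settles near half of the true premium however large N becomes.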
The use of test portfolios, however, presents a different set of problems. The test
portfolios are typically sorted on a few characteristics such as size and book-to-market.
Sorting on characteristics known to predict returns helps generate a reasonable variation in
average returns across test assets. Lewellen, Nagel, and Shanken (2010) point out that sorting
on characteristics also imparts a strong factor structure across the test portfolios. They show
that as a result even factors that are weakly correlated with the sorting characteristics would
explain the differences in average returns across the test portfolios regardless of the
economic merits of the theories that underlie the factors.
Moreover, the statistical significance and economic magnitudes of risk premiums
estimated using regression (2.3) could critically depend on the choice of test portfolios.
For example, the Fama and French size and book-to-market risk factors are significantly
priced when test portfolios are sorted based on these characteristics, but they do not
command significant risk premiums if test portfolios are sorted only based on momentum.
This paper proposes a method that uses individual stocks as test assets, which addresses
these problems. The use of individual assets preserves the dimensionality of the variation
in expected returns that we observe in the stock market. Also, since our tests use all listed
stocks individually (after reasonable criteria for screening individual stocks), the results
are not dependent on subjective choices made to construct the test portfolios.
2.2. Instrumental Variables Estimator
Define $\hat{\beta}_{1e} \equiv [\mathbf{1}; \hat{\beta}_{even}]$ and $\hat{\beta}_{1o} \equiv [\mathbf{1}; \hat{\beta}_{odd}]$, where $\hat{\beta}_{even}$ and $\hat{\beta}_{odd}$ (each a $K \times N$
matrix) are the factor sensitivities (loadings) estimated from even and odd months,
respectively. “^” indicates an estimate, $\mathbf{1}$ is the $1 \times N$ vector whose entries are all 1, and the
operator “;” stacks the row vector on top of the matrix that follows it. Similarly, we define
$\hat{\beta}_{1IV} \equiv [\mathbf{1}; \hat{\beta}_{IV}]$ and $\hat{\beta}_{1EV} \equiv [\mathbf{1}; \hat{\beta}_{EV}]$, where the “IV” subscript indicates the beta
instruments and the “EV” subscript denotes the corresponding explanatory variables.
Rewrite regression (2.3) in matrix form as:
$$r_t = \hat{\gamma}\, \hat{\beta}_{1sample} + \xi_t,$$
where $r_t$ is the $1 \times N$ vector of realized excess returns in month t, $\hat{\gamma}$ is the $1 \times (K+1)$
vector of risk premiums, $\hat{\beta}_{1sample}$ is the $(K+1) \times N$ matrix that contains the intercept and
K factor loadings, and $\xi_t$ denotes the $1 \times N$ vector of return residuals. We then propose
the following IV estimator for risk premiums in month t:
$$\hat{\gamma}_t' = (\hat{\beta}_{1IV}\, \hat{\beta}_{1EV}')^{-1} (\hat{\beta}_{1IV}\, r_t'),$$
where $\hat{\beta}_{1IV}$ ($\hat{\beta}_{1EV}$) can be either $\hat{\beta}_{1e}$ (even-month betas) or $\hat{\beta}_{1o}$ (odd-month betas).
For example, if $\hat{\beta}_{1IV}$ contains odd-month betas, $\hat{\beta}_{1EV}$ would have even-month betas,
and vice versa.
In principle, we can use two sets of betas estimated over any non-overlapping periods
as instruments and independent variables. In our empirical tests, we employ even- and odd-
month betas so that they are estimated within the same overall sample period using about
the same number of time-series observations. In addition, we use odd-month beta estimates
as instruments when month t is even and the even-month beta estimates as instruments
when month t is odd.
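The estimator for a single month can be written in a few lines. The following is a sketch under simplifying assumptions, not the authors' implementation: the function name `iv_risk_premia` is hypothetical, the two beta matrices are taken as given, and the simulated inputs (one factor, independent normal estimation errors, arbitrary parameter values) are for illustration only.

```python
import numpy as np

def iv_risk_premia(beta_iv, beta_ev, r_t):
    """gamma_t' = (B_IV B_EV')^{-1} (B_IV r_t').

    beta_iv, beta_ev: (K, N) beta estimates from non-overlapping samples.
    r_t: length-N vector of excess returns in month t.
    Returns a length-(K+1) vector: intercept followed by K risk premiums.
    """
    n = beta_iv.shape[1]
    b_iv = np.vstack([np.ones(n), beta_iv])   # (K+1, N), row of ones stacked on top
    b_ev = np.vstack([np.ones(n), beta_ev])
    return np.linalg.solve(b_iv @ b_ev.T, b_iv @ r_t)

# Illustration with simulated data (all values hypothetical)
rng = np.random.default_rng(1)
N, gamma0, gamma1 = 100_000, 0.0, 0.5
beta = rng.normal(1.0, 0.4, (1, N))
b_even = beta + rng.normal(0.0, 0.4, (1, N))   # independent estimation errors
b_odd = beta + rng.normal(0.0, 0.4, (1, N))
r_t = gamma0 + gamma1 * beta[0] + rng.normal(0.0, 1.0, N)

gamma_iv = iv_risk_premia(b_odd, b_even, r_t)    # odd-month betas as instruments
gamma_ols = iv_risk_premia(b_even, b_even, r_t)  # same betas on both sides: plain OLS
print(gamma_iv, gamma_ols)
```

Passing the same matrix as instrument and explanatory variable reduces the formula to OLS, which makes the two estimators easy to compare on identical data: in this setup the IV slope stays near the true premium while the OLS slope is pulled toward zero.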
3. Asymptotics, Adjustment for Finite N, and Standard Errors

The proposed IV estimator in month t is:
$$\hat{\gamma}_t' = (\hat{\beta}_{1IV}\, \hat{\beta}_{1EV}')^{-1}\, \hat{\beta}_{1IV}\, r_t'. \tag{3.1}$$
Let $\hat{\gamma}'$ be the sample average of $\hat{\gamma}_t'$ over the sample period. Proposition 1 below shows
the N-consistency and NT-consistency of $\hat{\gamma}'$. We also provide the asymptotic distributions
of the IV estimator for two different scenarios of N and T interaction.
Proposition 1: Suppose that asset returns follow a factor structure described as in
equation (2.1). Assume that (1) the residual process $\varepsilon_s = [\varepsilon_s^1, \varepsilon_s^2, \ldots, \varepsilon_s^N]$ is stationary,
the elements of $\varepsilon_s$ are cross-sectionally uncorrelated, and $\varepsilon_s$ and $\varepsilon_t$ are
uncorrelated when s is not equal to t; and (2) the factor process is also stationary. If the number
of individual assets in the cross-section is sufficiently large then, under mild regularity
conditions, the sample average of the estimator in equation (3.1), $\hat{\gamma}'$, is N-consistent for
any finite T and it is also NT-consistent when both N and T go to infinity simultaneously.
The asymptotic distributions for N-consistent and NT-consistent IV estimators can be
derived. 5
Proof: See Appendix 1.

In contrast, the standard OLS estimator is biased toward zero as N goes to infinity when
T is fixed. It is straightforward to show that both IV and OLS estimators converge to the
true risk premiums as T goes to infinity when N is fixed.
3.1. Adjustment for Finite Number of Stocks (N)

As the number of assets grows without bound, we show that the IV estimator is normally
distributed. However, since the cross-product term of $\hat{\beta}_{1IV}$ and $\hat{\beta}_{1EV}$ in equation (3.1)
might not be positive definite when N is finite, there is a tiny but still positive probability
that the IV estimator takes an unreasonably large value, in which case its moments may not
exist. To avoid this ill-behaved property for finite N, we truncate the IV estimator based
on the sample means and standard deviations of risk factors. Shanken and Zhou (2007) also
employ a truncation for the ML estimator of risk premiums when portfolios are employed as
test assets. Specifically, suppose that $\hat{\gamma}_{j,t}$ is the risk premium estimate of factor j (j=1,…,K)
in $\hat{\gamma}_t$ in month t. We treat $\hat{\gamma}_{j,t}$ as a missing value if the deviation of $\hat{\gamma}_{j,t}$ from the
sample mean of factor j realizations is greater than six times their sample standard
deviation. Excluding the extreme values of $\hat{\gamma}_{j,t}$ ensures that all moments of the IV
estimator exist even for finite N. 6 For all results in our simulations and empirical analyses,
we employ the truncated IV estimator.
5
Shanken (1992) defines a risk-premium estimator as N-consistent if for a finite T the estimator converges
to the realized risk premium in the sample as N (the number of stocks in the cross-section) increases
indefinitely.
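The truncation rule can be sketched as below. The function name `truncate_premia` and the array layout are assumptions for illustration, and the code reads "their sample standard deviation" as the sample standard deviation of the factor realizations.

```python
import numpy as np

def truncate_premia(gamma_hat, factors, n_sd=6.0):
    """Mark extreme monthly estimates as missing (NaN).

    gamma_hat: (M, K) monthly risk premium estimates for K factors.
    factors:   (T, K) factor realizations over the sample.
    An estimate is dropped when it lies more than n_sd sample standard
    deviations of the factor away from the factor's sample mean.
    """
    mean = factors.mean(axis=0)
    sd = factors.std(axis=0, ddof=1)
    out = np.asarray(gamma_hat, dtype=float).copy()
    out[np.abs(out - mean) > n_sd * sd] = np.nan
    return out
```

Months flagged as NaN are then simply left out of the time-series average of the estimates for that factor.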
3.2. Standard Errors for Risk Premium Estimates

Equation (3.1) gives the IV risk premium estimate for one cross-section at time t. The
rolling beta method pioneered by FM re-estimates betas for a sequence of successive time
period samples and repeats the cross-sectional regression in each time period. The
analogous IV procedure is simply to repeat the regression in (3.1) for a sequence of periods,
and then to take the averages of the cross-sectional coefficients. These averages produce the
final point estimates of the risk premiums. To assess their statistical significance, we need
to compute their standard errors.
In Appendix 1, we derive the asymptotic distributions of the IV estimator and the
corresponding asymptotic covariance matrices for two different scenarios: 1) when N goes
to infinity for a fixed T and 2) when N and T go to infinity together. The first scenario is
the standard in the literature and consistent with the asymptotic analysis in Shanken (1992).
The double asymptotics in the second scenario is novel in the literature and related to
recent developments in the econometrics literature (see, among others, Bai, 2003). One
can employ these theoretical asymptotic covariance matrices to compute the standard errors
of IV risk premium estimates. A reasonable alternative is the sample covariance of the
monthly risk premium estimates, as in the standard FM procedure. We call the resulting FM
standard errors IV-FMSE for the IV method and OLS-FMSE for the OLS method; both are
straightforward to compute.
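A minimal sketch of the FM-style calculation follows. It assumes the usual Fama-MacBeth convention (time-series mean of the monthly estimates, with standard error equal to their sample standard deviation divided by the square root of the number of months) and ignores any serial correlation in the estimates; the function name `fama_macbeth_summary` is hypothetical.

```python
import numpy as np

def fama_macbeth_summary(gamma_t):
    """Point estimates, FM standard errors, and t-statistics.

    gamma_t: (M, K+1) monthly estimates; NaN marks a missing (truncated) month.
    """
    g = np.asarray(gamma_t, dtype=float)
    m = np.sum(~np.isnan(g), axis=0)                 # usable months per coefficient
    mean = np.nanmean(g, axis=0)
    se = np.nanstd(g, axis=0, ddof=1) / np.sqrt(m)
    return mean, se, mean / se
```

The same function applies to monthly OLS or IV estimates; only the input series differs.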
In our simulations, when tested under the CAPM and Fama-French three-factor model,
we find that IV-FMSEs are fairly close to the root-mean-squared errors (RMSEs) of the IV
estimator, which capture the variability of risk premium estimates relative to the true risk
premiums. In contrast, it turns out that OLS-FMSEs significantly overestimate the
accuracy of OLS risk premium estimates. This simulation evidence supports the view that the
IV-FMSEs provide accurate standard error estimates. In our empirical analyses, we thus
employ the FM standard errors (FMSEs) to compute the corresponding t-statistics.
6
The m-th population moment of random variable x is defined as $E(x^m) = \int x^m f(x)\,dx$, where
f(x) is the probability density function of x. If f(x) has positive values only for a finite interval,
then $E(x^m)$ exists for all m.
4. Small Sample Properties of the IV Method - Simulation Evidence

To evaluate the small sample properties of the IV method (i.e., when T is fixed and
small relative to N), we conduct a battery of simulations based on real data. We first
investigate the bias and RMSE of the IV estimator and then examine the size and power of
the t-test statistic based on the IV estimator.
4.A. Bias and RMSE

We fix the simulation parameters to correspond with actual data during the sample
period for empirical analyses in Section 5: January 1956 through December 2012. For a
single factor model, simulation parameters are matched to the average market risk premium,
the risk-free rate, the cross-sectional distribution of betas, and volatility of firm-specific
returns.
The CRSP value-weighted index is the market return and the short-term T-bill rate
is the risk-free rate. For each stock, a market model regression provides the beta and
residual returns. We conduct simulations with the cross-sectional size of N=2000 stocks,
which is matched to the real data. 7 We randomly generate daily returns using the following
procedure:
1) For each stock, randomly generate beta and the standard deviation of return
residuals $\sigma_{\varepsilon_i}$ from normal distributions with means and standard deviations equal
to the corresponding sample means and standard deviations from the data. 8 We
generate betas and $\sigma_{\varepsilon_i}$s at the beginning of each simulation and keep them constant
across 1000 repetitions.
2) For each day, generate the market excess return as a random draw from a
normal distribution with mean and standard deviation equal to the sample mean and
standard deviation from the data.
3) For each stock and each day, generate the residual return $\varepsilon_\tau^i$ from independent
normal distributions with mean zero and standard deviation corresponding to the
value generated in step (1).
For stock i, the excess return on day τ is defined as
$$r_\tau^i = \beta_i\, r_{MKT,\tau} + \varepsilon_\tau^i, \tag{4.1}$$
where $r_{MKT,\tau}$ is the market excess return.
7
In our empirical analyses, an average month has 1934 individual stocks (see Table 3).
8
If the random draw of $\sigma_{\varepsilon_i}$ is negative, we replace it with its absolute value.
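The three steps and equation (4.1) can be sketched as a single function. This is an illustrative reading of the procedure, not the authors' code: the function name, argument names, and any parameter values passed in are assumptions, and in the paper the step (1) draws are held fixed across repetitions, whereas this sketch redraws everything on each call.

```python
import numpy as np

def simulate_daily_returns(n_stocks, n_days, beta_mu, beta_sd, sig_mu, sig_sd,
                           mkt_mu, mkt_sd, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: per-stock beta and residual volatility (negative draws are reflected)
    beta = rng.normal(beta_mu, beta_sd, n_stocks)
    sigma = np.abs(rng.normal(sig_mu, sig_sd, n_stocks))
    # Step 2: daily market excess returns
    r_mkt = rng.normal(mkt_mu, mkt_sd, n_days)
    # Step 3: independent normal residuals, one column per stock
    eps = rng.normal(0.0, 1.0, (n_days, n_stocks)) * sigma
    # Equation (4.1): excess return = beta * market excess return + residual
    returns = r_mkt[:, None] * beta + eps
    return returns, r_mkt, beta, sigma
```

To mimic the paper's design more closely, one would draw `beta` and `sigma` once and reuse them while redrawing the market returns and residuals in each of the 1000 repetitions.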

For the first-stage regression in the simulation, we estimate betas using the
following market model regression with daily excess returns for each stock: 9
$$r_\tau^i = \alpha_i + \beta_i\, r_{MKT,\tau} + \varepsilon_\tau^i. \tag{4.2}$$
Each “month” in the simulation has 22 trading days and we use two years of daily returns
(T=528) to fit the time-series regression (4.2). For the IV method, we use daily returns
from odd and even months during the two-year estimation period to compute independent
and instrumental variables.
We fit the second stage regression with monthly returns, following the common
practice in the literature. We could have fit the second stage regression with daily returns
as well, but this would not improve the precision of the second stage estimates.
To see this intuitively, compare fitting one cross-sectional regression for month t with
fitting 22 separate daily regressions for the month and averaging the daily regression
estimates over the month. With the same set of firms in both regressions, the same betas
for the month, and monthly returns equal to the sum of daily returns, the slope coefficient
of the monthly regression would be exactly 22 times
the average slope coefficient of the daily regressions and the standard error of the monthly
regression would also be 22 times the standard error of the average daily regression coefficient.
As a result, both specifications would yield exactly the same t-statistic for the slope
coefficient. There would be some differences between the two specifications if daily
returns are compounded to compute monthly returns but such differences are likely small.
9
We use daily returns rather than monthly returns to obtain more precise beta estimates in the first stage
regression.
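The slope identity in this argument follows from the linearity of OLS in the dependent variable, and it can be checked numerically. The sketch below uses arbitrary simulated daily returns and betas (all values hypothetical) and a helper `cs_slope` introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stocks, n_days = 500, 22
beta_hat = rng.normal(1.0, 0.4, n_stocks)               # same betas for the whole month
daily = rng.normal(0.0005, 0.02, (n_days, n_stocks))    # hypothetical daily returns

def cs_slope(y, x):
    """Slope from a cross-sectional OLS regression of y on a constant and x."""
    xc = x - x.mean()
    return xc @ y / (xc @ xc)

daily_slopes = np.array([cs_slope(daily[d], beta_hat) for d in range(n_days)])
monthly_sum = cs_slope(daily.sum(axis=0), beta_hat)                 # summed daily returns
monthly_comp = cs_slope(np.prod(1 + daily, axis=0) - 1, beta_hat)   # compounded returns
print(monthly_sum, n_days * daily_slopes.mean(), monthly_comp)
```

The first two printed numbers agree up to floating-point error; the third, based on compounded returns, differs from them only through the cross-products of daily returns.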
We compound daily stock and factor returns to compute corresponding monthly
returns. We fit the cross-sectional IV regression equation (3.1) for each month t to estimate
$\gamma_{0,t}$ and $\gamma_{1,t}$. We then roll the two-year estimation window forward by one month and
repeat the two-stage IV estimation procedure over 660 months (=55 years). 10 Finally, we
take the averages of the monthly slope coefficients across 660 months and then compare
them to their FMSEs.
We conduct the three-factor model simulations analogously, but in addition to
market returns and market betas, additional factors and betas correspond to the Fama-French
SMB and HML factors and betas. We match means and standard deviations of the
simulation parameters to the actual data, then carry out the two-stage IV estimation
procedure to estimate $\gamma_0$, $\gamma_{MKT}$, $\gamma_{SMB}$, and $\gamma_{HML}$. Appendix 2 describes the
simulation parameters and the simulation experiment design in more detail.
For each repetition, we compare the true factor risk premiums used to generate
returns and the corresponding sample estimates. The averages of these differences over the
1000 repetitions are the biases in risk premium estimates relative to the true risk premiums,
which are the EIV-induced ex-ante biases. Since all risk premium estimates within a
sample are conditional on particular factor realizations, we also report the biases relative
to the average realized risk premiums in that particular sample, which are the EIV-induced
ex-post biases (see Shanken, 1992).
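The two bias measures can be written compactly. The sketch below is illustrative only: the function name, its arguments, and the toy inputs are hypothetical, and it assumes that both biases are scaled by the true premium, as Table 1 does for the market premium.

```python
def percentage_biases(estimates, true_premium, realized_premiums):
    """Average ex-ante and ex-post bias over repetitions, as % of the true premium.

    estimates[k]         : risk premium estimate in repetition k
    realized_premiums[k] : average realized factor premium in repetition k
    """
    n = len(estimates)
    ex_ante = sum(e - true_premium for e in estimates) / n
    ex_post = sum(e - r for e, r in zip(estimates, realized_premiums)) / n
    return 100 * ex_ante / true_premium, 100 * ex_post / true_premium

# Toy inputs (three repetitions), not values from the paper
ex_ante_pct, ex_post_pct = percentage_biases(
    estimates=[0.30, 0.40, 0.38],
    true_premium=0.50,
    realized_premiums=[0.40, 0.55, 0.52],
)
```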
Panel A of Table 1 presents the ex-ante and ex-post biases, as a percentage of the true
10
Out of 684 months (=57 years) that correspond to the sample period from Jan. 1956 to Dec. 2012, the
first 24 months are deducted.
market premium. The OLS estimates are biased towards zero by about 28% because of
the errors-in-variables problem. In contrast, the differences between average IV estimates
and both ex-ante and ex-post risk premiums are less than 1% and not statistically different
from zero.
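The attenuation of OLS and its removal by an instrument can be reproduced in a stylized single cross-section. The sketch below is not the paper's simulation design: it assumes one cross-section, normally distributed true betas, and two beta estimates whose measurement errors are independent of each other and of the return noise (the role played by odd- and even-month samples). All parameter values are hypothetical.

```python
import random

def cov(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

rng = random.Random(42)
n_stocks = 20000
gamma_true = 0.5                                   # hypothetical true risk premium

true_beta = [rng.gauss(1.0, 0.5) for _ in range(n_stocks)]
# Two noisy beta estimates with independent measurement errors
beta_odd = [b + rng.gauss(0, 0.3) for b in true_beta]
beta_even = [b + rng.gauss(0, 0.3) for b in true_beta]
returns = [gamma_true * b + rng.gauss(0, 0.5) for b in true_beta]

# OLS on the noisy regressor: attenuated by var(beta) / (var(beta) + var(error))
ols_gamma = cov(beta_odd, returns) / cov(beta_odd, beta_odd)
# IV: instrument the odd-sample beta with the even-sample beta
iv_gamma = cov(beta_even, returns) / cov(beta_even, beta_odd)
```

With these toy variances the OLS slope converges to roughly 0.25/(0.25+0.09) of the true premium, while the IV slope stays close to `gamma_true`.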
The next two columns in Panel A present the ex-ante and ex-post root-mean squared
errors (RMSEs). The bias and standard error of risk premium estimates contribute to
RMSE. It is well known that OLS standard error tends to be smaller than IV standard error.
Because of the tradeoff between bias and standard error, it is important to examine the
RMSE to assess the overall performances of the OLS and IV estimators. The ex-ante
RMSEs for the OLS and IV estimators are about equal. The ex-post RMSE is .156 for the OLS
estimator, compared with .088 for the IV estimator. These
results indicate that, because of the bias, the overall accuracy of the IV estimator is
better than that of the OLS estimator conditional on factor realizations.
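The tradeoff referred to here follows from the standard decomposition of mean squared error, stated below for a generic estimator \(\hat{\gamma}\) of a fixed target \(\gamma\) (the ex-ante case; this is a textbook identity, not a result specific to this paper):

```latex
\mathrm{RMSE}(\hat{\gamma})^{2}
  = E\!\left[(\hat{\gamma}-\gamma)^{2}\right]
  = \underbrace{\left(E[\hat{\gamma}]-\gamma\right)^{2}}_{\text{bias}^{2}}
  + \underbrace{E\!\left[(\hat{\gamma}-E[\hat{\gamma}])^{2}\right]}_{\text{variance}}
```

An estimator with a smaller standard error can therefore still have the same or a larger RMSE if its bias is large enough.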
Figure 1 plots the biases of the IV and OLS estimators as a function of the number of
time-series observations, with N=2000 stocks. The vertical axis reports the ex-ante and ex-
post biases as percentages of the true market risk premium. The bias of the OLS estimator
is fairly large at -43% when T=264 observations. 11 The bias is larger than 5% even when
T=2640 days, or 10 years. In contrast, the bias is fairly close to zero for the IV estimator
even for T=264 days, or 1 year.
Panel B of Table 1 presents the results for the Fama-French three-factor model. The
EIV problem always biases OLS slope coefficient estimates towards zero in univariate
regressions, but in theory the direction of the bias is indeterminate in multivariate
regressions. The results in Panel B indicate that the OLS estimates of the slope coefficients
in the case of the Fama-French model are all biased towards zero. For example, the ex-
ante biases of the OLS estimates are -64.7% and -66.3% for SMB and HML, respectively.
In contrast, the biases of the IV estimates are all less than 2.1%. The last two columns in
Panel B indicate that the IV method outperforms the OLS method substantially in terms
of ex-ante and ex-post RMSEs.
11
Since the simulation assumes 22 days per month, T=264 corresponds to one year.
4.B. Size and Power of t-Test
Our tests follow the Fama-MacBeth approach to test whether the risk premiums
associated with various common factors are reliably different from zero. For example, in
the case of a single factor model, the test statistic is:
\[ t_{\gamma} = \frac{\hat{\gamma}}{\hat{\sigma}_{\gamma}}, \qquad (4.3) \]
where \(\hat{\gamma}\) is the time-series average of monthly IV risk premium estimates and \(\hat{\sigma}_{\gamma}\) is the
corresponding Fama-MacBeth standard error (FMSE).
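As an illustration of equation (4.3), the sketch below computes the statistic from a series of monthly estimates. It assumes the plain Fama-MacBeth standard error (time-series sample standard deviation divided by the square root of the number of months), with no autocorrelation adjustment; the function name and the toy inputs are hypothetical.

```python
import math
import statistics

def fama_macbeth_t(monthly_gammas):
    """Return (mean estimate, Fama-MacBeth standard error, t-statistic)."""
    n_months = len(monthly_gammas)
    gamma_hat = statistics.fmean(monthly_gammas)
    # FMSE: sample standard deviation of the monthly estimates over sqrt(T)
    fmse = statistics.stdev(monthly_gammas) / math.sqrt(n_months)
    return gamma_hat, fmse, gamma_hat / fmse

# Toy monthly slope estimates (percent per month), not values from the paper
gamma_hat, fmse, t_stat = fama_macbeth_t([0.2, -0.1, 0.4, 0.3, 0.0, 0.1])
```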

To examine the small sample distribution of the t-statistic in equation (4.3) under the
null hypotheses, we follow the same steps as above to generate simulated data, but we set
all true risk premiums equal to zero. We then examine the percentage of repetitions (out of
1000 total repetitions) when the t-statistics are positively significant at the various levels
(one-sided) using critical values based on the standard normal distribution.
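The size calculation described above amounts to counting rejections across repetitions. A minimal sketch, assuming the t-statistics from the simulated repetitions are already available in a list (the function name and toy inputs are hypothetical):

```python
from statistics import NormalDist

def one_sided_rejection_rate(t_stats, level):
    """Fraction of t-statistics above the one-sided standard normal critical value."""
    critical_value = NormalDist().inv_cdf(1 - level)
    return sum(t > critical_value for t in t_stats) / len(t_stats)

# Toy t-statistics from four hypothetical repetitions
toy_t_stats = [2.0, 1.7, 0.3, -1.0]
size_at_5pct = one_sided_rejection_rate(toy_t_stats, 0.05)
size_at_1pct = one_sided_rejection_rate(toy_t_stats, 0.01)
```

Under a well-specified test with true premiums set to zero, this rate should be close to the nominal level when computed over many repetitions.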
Panels A and B of Table 2 present the test sizes under the CAPM and the Fama-French
three-factor model for N=2000 stocks, respectively. The results indicate that the tests are
well specified when T=528 days (=two years of daily data) are used for rolling beta
estimation. For example, the test sizes for all risk premiums at the 5% significance level
are between 4.7% and 5.3% and those at the 10% significance level are between 9.8% and
10.3%. In untabulated results, we found that the distribution of the test statistic was closer
to the theoretical distribution as we increased T. These results indicate that we can draw
reliable statistical inferences about risk premiums based on conventional t-test statistics
with the IV approach.
We now investigate the power of the IV tests to reject the null hypotheses when the
alternative hypotheses are true. To evaluate power, we modify the simulation experiments
by adding risk premiums equal to the average risk premiums that we observe in the sample.
All the other simulation parameters are the same as in the simulations under the null
hypotheses. We fix the size of IV tests at the 5% significance level.
Panel C of Table 2 shows that the power of the IV test to reject the null hypothesis
under the CAPM is 82.8%. Under the three-factor model (Panel D), we find that the
frequency of rejection of the null of zero market risk premium is 78.1% and that of zero
HML risk premium is 84.2%. The test power is somewhat weaker to detect the positive
SMB premium but it is still greater than 50%. We also find that in 98.2% of the simulations,
at least one of the three factor risk premiums is different from zero. Overall, these results
indicate that our IV tests are reasonably powerful to detect non-zero risk premiums.
4.C. Time-varying Factor Sensitivities

Our simulations so far assume that betas are constant over time. In practice,
however, betas may vary over time. Therefore, we also conduct simulations to
investigate the small sample properties and power of our tests with time-varying betas.
When we allow betas to follow AR(1) processes, we find that the small sample properties
of IV risk premium estimates and the size and power of the IV tests are similar to what we
report with constant betas in Tables 1 and 2. For brevity, we report the details of this
simulation and the results in Appendix 2.
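For concreteness, one way to generate a beta that follows an AR(1) process around a long-run mean is sketched below. The parameterization (persistence `phi`, shock standard deviation, starting the path at the mean) is a hypothetical choice for illustration; the actual design is the one described in Appendix 2.

```python
import random

def simulate_ar1_beta(mean_beta, phi, shock_sd, n_periods, seed=0):
    """Simulate beta_t = mean + phi * (beta_{t-1} - mean) + shock_t."""
    rng = random.Random(seed)
    beta = mean_beta                     # assumption: start at the long-run mean
    path = []
    for _ in range(n_periods):
        beta = mean_beta + phi * (beta - mean_beta) + rng.gauss(0, shock_sd)
        path.append(beta)
    return path

beta_path = simulate_ar1_beta(mean_beta=1.0, phi=0.9, shock_sd=0.05, n_periods=36)
constant_path = simulate_ar1_beta(mean_beta=1.0, phi=0.9, shock_sd=0.0, n_periods=5)
```

Setting `shock_sd` to zero recovers the constant-beta case used in the earlier simulations.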
5. IV Risk Premium Estimates for Selected Asset Pricing Models

The simulations of the previous section indicate that our IV method accurately
assesses risk premiums when they are present and does not falsely indicate their presence
when they are absent. In this section, we apply the IV method to ascertain whether risk
premiums seem to be present for several of the most prominent asset pricing models. We
first describe the data, then carry out a battery of tests, and finally provide evidence
that weak instruments are not a problem in these applications.
5.A. Data
Stock return and market capitalization data from CRSP and balance sheet data from
COMPUSTAT are compiled from January 1956 through December 2012. 12 With respect
12
The sample periods vary depending on asset pricing models tested.
to equities, we include only common shares (CRSP share codes of 10 or 11) 13 and also
exclude stocks with prices below $1 and market capitalizations less than $500,000 at the
end of a month from the sample in the following month. Since daily returns are employed
to estimate betas, we restrict the sample to stocks with at least 200 daily observations per
year. 14
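The screens above can be summarized as a single predicate. The sketch below is an illustration with hypothetical argument names; it reads the price and market capitalization screens as excluding a stock if either condition holds, which is an interpretation of the text rather than something the text states explicitly.

```python
def passes_sample_filters(share_code, prev_month_end_price,
                          prev_month_end_market_cap, daily_obs_in_year):
    """True if a stock-month satisfies the screens described in Section 5.A."""
    is_common_share = share_code in (10, 11)          # CRSP share codes 10 or 11
    # Assumption: a stock is dropped if EITHER screen fails
    price_ok = prev_month_end_price >= 1.0            # dollars
    size_ok = prev_month_end_market_cap >= 500_000    # dollars
    enough_daily_data = daily_obs_in_year >= 200
    return is_common_share and price_ok and size_ok and enough_daily_data

keep_a = passes_sample_filters(11, 12.50, 3_000_000, 250)
keep_b = passes_sample_filters(11, 0.80, 3_000_000, 250)   # price below $1
keep_c = passes_sample_filters(31, 12.50, 3_000_000, 250)  # not a common share
```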
Table 3 presents summary statistics for stocks included in our empirical analyses. A
total of 7508 distinct stocks enter the sample at different points in time; 1934 stocks are
available in an average month.
5.B. The CAPM and the Fama-French Three-Factor Model

This section tests the CAPM and the Fama-French three-factor model. We first test
whether the estimated factor risk premiums under the CAPM and the Fama-French three-
factor models are different from zero using the IV method and individual stocks as test
assets. We then examine whether the risk premiums are present after controlling for stock
characteristics.
Early empirical tests of the CAPM by Fama and MacBeth (1973) and others find strong
support for the CAPM. However, several subsequent papers find that market betas are not
priced after controlling for other characteristics. For instance, Jegadeesh (1992) and Fama
and French (1993) conclude that the market risk premium is not significantly different from
zero after controlling for the firm size.
The inability of the CAPM to account for any of the cross-sectional differences in
expected returns reinvigorated the search for alternative asset pricing models. The arbitrage
pricing theory of Ross (1976) provides a general multifactor framework, but the
Fama-French three-factor version of the APT is perhaps the most widely used alternative. This
model identifies size and book-to-market factors in addition to the market factor.
The empirical support for the Fama-French three-factor model is mixed. Fama and
13
Excluded are American depository receipts (ADRs), shares of beneficial interest, Americus Trust
components, closed-end funds, preferred stocks, and real estate investment trusts (REITs).
14
We repeat the asset pricing tests with different thresholds for the number of observations per year, i.e.,
100 and 150 observations per year, and find that our conclusions on the asset pricing tests do not change.
French (1992) estimate factor risk premiums using the portfolios sorted on size and book-
to-market and show that both premiums are significantly positive. But the loadings on these
risk factors are highly correlated with the size and book-to-market characteristics of the
test portfolios. Therefore, as Lewellen et al. (2010) point out, it is hard to reliably conclude
that these risk premiums are indeed compensation for systematic risks rather than for
portfolio characteristics. This highlights the low dimensionality problem when portfolios
are used as test assets in testing asset pricing models.
The conflicting results of the empirical tests in Daniel and Titman (1997) and Davis,
Fama, and French (2000) further illustrate the difficulty in making reliable inferences with
portfolios as test assets. Daniel and Titman argue that the differences of average returns in
size and book-to-market sorted portfolios are due to their characteristics and are not
necessarily related to factor risks. However, Davis, Fama, and French (2000) extend the
sample period back to 1925 and argue, based on this extended sample period, that the SMB
and HML factor risks are priced significantly. They counter that the differences in average
returns across the test portfolios are due to factor risks and not due to non-risk portfolio
characteristics.
This subsection uses individual stocks in the tests and avoids the low dimensionality
problem inherent in the tests that employ characteristics-sorted portfolios as test assets. We
use daily rolling windows from month t-36 to month t-1 to estimate betas for month t. In
unreported tests, we find similar asset pricing test results when betas are estimated with
60-, 24-, and 12-month rolling windows.
To account for non-synchronous trading effects in daily returns, beta estimation is
supplemented with one-day lead and lag of the independent variables (Dimson, 1979). For
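example, the following regression estimates betas for the CAPM: for firm i and day τ,
\[ r_{\tau}^{i} = \alpha^{i} + \sum_{k=-1}^{1} \beta^{i}_{MKT,k}\, r_{MKT,\tau-k} + \varepsilon_{\tau}^{i}, \qquad (5.1) \]
\[ \hat{\beta}^{i}_{MKT} = \hat{\beta}^{i}_{MKT,-1} + \hat{\beta}^{i}_{MKT,0} + \hat{\beta}^{i}_{MKT,1}. \]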
We estimate odd- and even-month betas separately using returns on days belonging to odd
and even months, respectively. Because of the non-synchronous trading adjustment in (5.1),
the first and the last days of each month are excluded to avoid overlap. 15 An analogous
multivariate regression estimates the three betas for the Fama-French three-factor model.
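The estimation in (5.1), together with the odd/even split and the exclusion of the first and last day of each month, can be sketched as follows. This is an illustrative implementation under stated assumptions: months are identified by an integer `month_id` whose parity defines odd and even months, the function name is hypothetical, and the toy data are noise-free so that the known answer can be recovered exactly.

```python
import numpy as np

def dimson_beta(stock_ret, mkt_ret, month_id, parity):
    """Sum of lead, contemporaneous, and lagged market slopes for one parity sample.

    parity=1 uses days in odd months, parity=0 uses days in even months.
    Days whose neighbours fall in another month are dropped, which removes
    the first and last trading day of every month.
    """
    r = np.asarray(stock_ret, dtype=float)
    m = np.asarray(mkt_ret, dtype=float)
    month_id = np.asarray(month_id)
    rows = np.array([
        t for t in range(1, len(r) - 1)
        if month_id[t - 1] == month_id[t] == month_id[t + 1]
        and month_id[t] % 2 == parity
    ])
    X = np.column_stack([
        np.ones(len(rows)),
        m[rows + 1],   # k = -1: market return on day tau + 1 (lead)
        m[rows],       # k =  0: market return on day tau
        m[rows - 1],   # k = +1: market return on day tau - 1 (lag)
    ])
    coef, *_ = np.linalg.lstsq(X, r[rows], rcond=None)
    return coef[1:].sum()

# Toy data: 8 months of 22 days; the stock loads 0.9 on today's and 0.2 on
# yesterday's market return, so the summed beta should be 1.1.
rng = np.random.default_rng(1)
n_days = 22 * 8
mkt = rng.normal(0.0004, 0.01, n_days)
stock = np.empty(n_days)
stock[0] = 0.0
stock[1:] = 0.001 + 0.9 * mkt[1:] + 0.2 * mkt[:-1]
month_id = 1 + np.arange(n_days) // 22

beta_odd = dimson_beta(stock, mkt, month_id, parity=1)
beta_even = dimson_beta(stock, mkt, month_id, parity=0)
```

The two estimates use disjoint sets of days, which is what allows one to serve as an instrument for the other.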
For each stock and month, the Size characteristic is the natural logarithm of market
capitalization at the end of the previous month. BM is the book value divided by the market
value where book value is the sum of book equity value plus deferred taxes and credits
minus the book value of preferred stock. We compute correlations between each pair of
firm-specific variables each month and Table 4 presents the average cross-sectional
correlations among betas and characteristics. The CAPM beta estimated using the market
model is negatively correlated with both Size and BM. In the Fama-French model, the
correlations between market betas and the betas on the other factors are positive. The correlation
between Size and SMB betas is negative, and the correlation between HML betas and BM
is positive, which reflect the fact that the SMB and HML factors are constructed using
these characteristics.
To see the impact of portfolio formation, Table 4 also presents the average
cross-sectional correlations for 25 Fama-French size and book-to-market sorted portfolios. For
each portfolio, we compute Size and BM each month as the value-weighted averages across
all stocks in the portfolio. The correlations among portfolio betas and characteristics are
much larger in magnitude; between the SMB betas and Size it is -.97 and between the HML betas and
BM it is .88.
We next estimate factor risk premiums using the IV method. Table 5 presents the factor
risk premium estimates for several different specifications of the second stage regressions.
We first test the CAPM using betas estimated with the univariate regression. The market
risk premium estimate is -.189%, which is not reliably different from zero (Table 5,
column (1)). Therefore, we do not find support for the CAPM with individual stocks.
For the Fama-French three-factor model, betas come from multivariate time-series
15
We find almost identical results while including the first and last days of each month. Also, the results
are qualitatively similar when there is no adjustment for non-synchronous trading, i.e., β̂ iMKT = β̂ iMKT,0 .
regressions with all three factors. The market risk premium estimate is now an insignificant
-.315% and the SMB and HML risk premiums are .311% and .504%. The risk premiums
of SMB and HML are significant at conventional levels (column (2)).
The significance of SMB and HML suggests that these risks are priced, but this
inference might be compromised by the correlation between SMB and HML betas and the
underlying size and book-to-market characteristics documented in Table 4. To examine
this issue, the Size and BM characteristics are included as additional independent variables
in the second stage cross-sectional regressions. With the single-factor (CAPM) model,
column (3), the mean coefficients of the Size and BM characteristics are -.152% and .163%,
respectively, and both are statistically significant at the 1% level. The market’s risk
premium estimate is .010%, which is still not significantly different from zero. In the
regression, column (4), that includes betas and characteristics for the FF three-factor model,
none of the risk premiums is significant at the 5% level, including the previously significant
SMB and HML betas. Both the Size and book-to-market characteristics are significant at all
conventional levels.
Table 5 also reports on two roughly equal subperiods. The factor risk premiums are not
significant in any subperiod when Size and book-to-market characteristics are included in
the cross-sectional regressions. The Size characteristic is significant in both subperiods,
while the BM is significant only in the first subperiod at the 5% level.
Overall, using the IV method to correct the EIV problem while still relying on
individual stocks, factor risk premium estimates are not significant for the CAPM or any
of the Fama-French factors. However, the individual firm characteristics, Size and BM, are
significant for the entire 1956-2012 sample and Size is significant in both subsamples.
Given that the IV method works very well with simulated data, several interpretations
of these empirical results are possible. The first interpretation is that something in the real
data, missing from the simulated data, compromises the IV method. For
example, although our simulation evidence indicates that the conventional t-tests based on
the IV method are reasonably powerful, they might not be in the real data. This interpretation
does not seem convincing, for two reasons: (i) without controlling for
characteristics, Panel A finds that the risk premiums of SMB and HML are significant,
which indicates that test power is not a big issue when an average month has
about 2000 stocks; and (ii) even with 40% larger cross-sections on average, i.e., with more
powerful t-tests, none of the risk premiums becomes significant in the second subperiod.
The second interpretation is that the Size and BM characteristics are associated with loadings
on risk factors that are poorly captured by the Fama-French SMB and HML portfolios, while the
CRSP value-weighted index is a poor proxy for the true aggregate market. The third
interpretation is that the Size and BM
characteristics represent anomalies that offered the opportunity to earn sizeable returns
without bearing much risk. For the BM characteristic, this third interpretation is buttressed
by its disappearance from the 1986-2012 data after being strong from 1956 through 1985.
Size, in contrast, is impressively persistent and is even more significant in the second
subperiod than in the first subperiod.
5.C. The Fama-French Five-Factor Model

Novy-Marx (2013) and Aharoni, Grundy, and Zeng (2013) among others find that stock
returns are significantly related to profitability and investment after controlling for Fama-
French three factors. Fama and French (2014) propose the following five-factor model that
adds factors to capture these anomalies as well:
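\[ E(r_t^i) = \beta^i_{MKT}\gamma_{MKT} + \beta^i_{SMB}\gamma_{SMB} + \beta^i_{HML}\gamma_{HML} + \beta^i_{RMW}\gamma_{RMW} + \beta^i_{CMA}\gamma_{CMA}, \qquad (5.2) \]
where \(\beta^i_{MKT}\), \(\beta^i_{SMB}\), \(\beta^i_{HML}\), \(\beta^i_{RMW}\) and \(\beta^i_{CMA}\) are the betas with respect to market, size,
book-to-market, profitability, and investment factors, and \(\gamma_{MKT}\), \(\gamma_{SMB}\), \(\gamma_{HML}\), \(\gamma_{RMW}\) and
\(\gamma_{CMA}\) are the corresponding risk premiums. The RMW factor is the difference between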
the returns on diversified portfolios of stocks with robust and weak operating profitability
and the CMA factor is the difference between the returns on diversified portfolios of the
stocks of low and high investment.
We use the same procedure as in Fama and French (2014) and construct daily HML,
RMW, and CMA factors. For example, to construct the RMW factor we first
independently sort stocks into two Size groups and three operating profitability groups.
We compute the value-weighted returns for the six size-profitability portfolios. The
average of the small and big high profitability portfolio returns minus the average of the
small and big low profitability portfolio returns is the RMW factor.
Following Fama and French (1993 and 2014), we use the annual balance sheet data to
compute the levels of book-to-market, operating profitability and investment and allow a
six-month delay when combining with financial variables. 16 As in Fama and French (2014),
the sample period for the tests in this subsection is from 1964 through 2012.
Panel A of Table 6 presents the results of asset pricing tests of the Fama-French
five-factor model. Consistent with Table 5 (although the sample periods are different and
an average month has a larger cross-section in Table 6), columns (1) to (3) indicate that SMB
and HML risks are priced, while RMW and CMA are not priced in the cross-section of
individual stock returns. In column (6), the pricing evidence of SMB and HML risks
disappears when we control for firm characteristics in the regressions. The slope
coefficients of characteristics, especially for investment/total assets, are highly significant
and reliable. We find similar results in the subperiods as well.
5.D. The q-factor Asset Pricing Model

Cochrane (1991) and Liu, Whited and Zhang (2009) present production-based asset
pricing models in which productivity shocks are tied to the changes in the investment
opportunity set, which is consistent with Merton’s (1973) ICAPM framework. Since
shocks to productivity are difficult to accurately measure, Hou, Xue, and Zhang (2015)
(HXZ) propose an investment factor and an ROE factor to capture productivity shocks.
The q-factor model is specified as:
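\[ E(r_t^i) = \beta^i_{MKT}\gamma_{MKT} + \beta^i_{ME}\gamma_{ME} + \beta^i_{I/A}\gamma_{I/A} + \beta^i_{ROE}\gamma_{ROE}, \qquad (5.3) \]
where \(\beta^i_{MKT}\), \(\beta^i_{ME}\), \(\beta^i_{I/A}\) and \(\beta^i_{ROE}\) are the betas with respect to market, size, investment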
16
The investment for June of year t is the change in total assets from the fiscal year ending in year t-2 to the
fiscal year ending in year t-1, divided by total assets in year t-2. The operating profitability for June of year
t is annual revenues minus cost of goods sold, interest expense, and selling, general, and administrative
expenses divided by book equity for the last fiscal year end in year t-1.
and ROE factors, respectively, and \(\gamma_{MKT}\), \(\gamma_{ME}\), \(\gamma_{I/A}\) and \(\gamma_{ROE}\) are the corresponding
risk premiums.
The investment factor captures the level of investments and the ROE factor captures
the return on investments, i.e., profitability. The investment factor is constructed as the
return difference between firms with low and high levels of investment and the ROE factor
is constructed as the return difference between firms with high and low return on
investment. Following HXZ, we control for size when constructing the investment and
ROE factors. Intuitively, investments and rates of return on investments are likely to reflect
sensitivity to unanticipated productivity shocks, and these factors are supposed to capture
the price impact of such shocks. HXZ argue that their factors better explain cross-sectional
return differences across portfolios constructed based on various firm-level anomalies, e.g.,
book-to-market, size, momentum, and earnings surprise than the Fama-French three-factor
model and the Carhart four-factor model.
The HXZ model is appealing since an underlying theory rather than empirical
regularities suggests the factors. Also, HXZ’s empirical approach employs a variety of
different common factors and test portfolios. For instance, their tests of size and
book-to-market use the 25 Fama-French size and book-to-market sorted portfolios, the test of
momentum uses 10 portfolios formed based on momentum, and the test of earnings
surprises (SUE) uses 10 SUE sorted portfolios. However, all their tests employ portfolios
and are subject to a potential low dimensionality problem.
We examine whether the HXZ factors are priced using individual stocks as test
assets. Here, we follow the procedure in HXZ to construct daily market, size, investment and
ROE factors. For example, we first sort firms by size, investment as a fraction of total
assets (I/A), and ROE based on the NYSE breakpoints. We then assign stocks to groups
according to the top and bottom 50% of size and the top and bottom 30% and the middle
40% of I/A and ROE, producing a total of 18 (= 2 × 3 × 3) groups. We form value-weighted
portfolios of stocks in each of the 18 groups. The investment factor is the equal-weighted
portfolio that is long the six low I/A portfolios and short the six high I/A portfolios. The
ROE factor is the equal-weighted portfolio that is long the six high ROE portfolios and
short the six low ROE portfolios.
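The factor construction just described can be sketched for a single day as follows. The dictionary layout, group labels, and function name are hypothetical choices made for illustration; the inputs are assumed to be the value-weighted returns of the 18 groups.

```python
from itertools import product

SIZES = ("small", "big")
TERCILES = ("low", "mid", "high")     # bottom 30%, middle 40%, top 30%

def q_investment_and_roe_factors(port_ret):
    """port_ret maps (size, ia_group, roe_group) -> value-weighted group return."""
    def avg(keys):
        return sum(port_ret[k] for k in keys) / len(keys)

    low_ia = [(s, "low", r) for s, r in product(SIZES, TERCILES)]      # six portfolios
    high_ia = [(s, "high", r) for s, r in product(SIZES, TERCILES)]
    high_roe = [(s, i, "high") for s, i in product(SIZES, TERCILES)]
    low_roe = [(s, i, "low") for s, i in product(SIZES, TERCILES)]

    investment_factor = avg(low_ia) - avg(high_ia)     # long low I/A, short high I/A
    roe_factor = avg(high_roe) - avg(low_roe)          # long high ROE, short low ROE
    return investment_factor, roe_factor

# Toy returns: lower for high-investment groups, higher for high-ROE groups
rank = {"low": 0, "mid": 1, "high": 2}
toy_returns = {
    (s, i, r): 0.010 - 0.002 * rank[i] + 0.003 * rank[r]
    for s, i, r in product(SIZES, TERCILES, TERCILES)
}
inv_factor, roe_factor = q_investment_and_roe_factors(toy_returns)
```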
We use the last announced quarterly financial statement data to compute the level
of investments and ROE each month. 17 HXZ apply earnings announcement dates to
determine when financial data become available to the market. Since earnings
announcement dates on COMPUSTAT are available only after 1972, as in HXZ (2015),
the sample period for this portion of the study is from 1972 to 2012.
Table 7, Panel A, reports average cross-sectional correlations among estimated
factor sensitivities (betas) and firm characteristics. Sensitivities to I/A and ROE factors are
positively correlated across stocks. I/A beta is negatively correlated with size and positively
correlated with BM, and the ROE beta is positively correlated with size and negatively
correlated with BM. The correlations between these betas and the characteristics are
smaller than those for the SMB and HML factors in Table 4. In Panel B, Table 7 also
reports analogous correlations for the 25 Fama-French size and book-to-market sorted
portfolios. For these portfolios, the correlation between I/A beta and BM is .88 and the
correlation between ROE beta and Size is .74. Such high correlations suggest that the issues
discussed in Lewellen et al. (2010) could influence results of tests that use the 25 Fama-
French portfolios.
Table 8 presents results of asset pricing tests with individual stocks and the IV
method. For comparison (since the sample period is different), column (1) reports the
single-factor market risk premium; it is quantitatively similar to the premium reported in
Table 5 and is still insignificant statistically. Column (5) reports the slope coefficients of
HXZ’s q-factor loadings without controlling for characteristics. In this case, both the I/A
and ROE risk premiums are negative; the former is insignificant and the latter
is significant at the 5% level. The mean of the ROE factor during the sample is .7% per
month, which is significantly positive, so if the ROE factor reflected risk, its premium
should be positive as well. Columns (6) to (8) in Table 8 indicate that the inclusion of Size
and BM does not change the results of asset pricing tests. The slope coefficients of Size
17
Following HXZ (2015), the investment to total assets is defined as the annual change in total assets
(COMPUSTAT annual item AT) divided by 1-year-lagged total assets. ROE is income before
extraordinary items (COMPUSTAT quarterly item IBQ) divided by book equity lagged by one quarter.
and BM are significant and have the signs consistent with existing studies.
5.E. The Liquidity-Adjusted CAPM

This subsection examines the liquidity-adjusted capital asset pricing model (LCAPM)
proposed by Acharya and Pedersen (2005), which accounts for the impact of illiquidity-
based trading frictions on asset pricing. 18 According to the LCAPM, the level of illiquidity
and the covariances of return and illiquidity innovation with the market return and
illiquidity innovation vary across assets. The unconditional expected return in excess of the
risk-free rate (\(E(r_t^i)\)) under the LCAPM is:
\[ E(r_t^i) = E(c_t^i) + \lambda(\beta_1^i + \beta_2^i - \beta_3^i - \beta_4^i), \qquad (5.4) \]
where \(c_t^i\) is the illiquidity cost, the risk premium is the market excess return minus
aggregate illiquidity cost (i.e., \(\lambda = E(r_{MKT,t} - c_{MKT,t})\)), and the betas are
\[ \beta_1^i = \frac{Cov(r_t^i,\; r_{MKT,t} - E_{t-1}(r_{MKT,t}))}{Var(r_{MKT,t} - E_{t-1}(r_{MKT,t}) - [c_{MKT,t} - E_{t-1}(c_{MKT,t})])}, \qquad (5.5) \]
\[ \beta_2^i = \frac{Cov(c_t^i - E_{t-1}(c_t^i),\; c_{MKT,t} - E_{t-1}(c_{MKT,t}))}{Var(r_{MKT,t} - E_{t-1}(r_{MKT,t}) - [c_{MKT,t} - E_{t-1}(c_{MKT,t})])}, \]
\[ \beta_3^i = \frac{Cov(r_t^i,\; c_{MKT,t} - E_{t-1}(c_{MKT,t}))}{Var(r_{MKT,t} - E_{t-1}(r_{MKT,t}) - [c_{MKT,t} - E_{t-1}(c_{MKT,t})])}, \]
\[ \beta_4^i = \frac{Cov(c_t^i - E_{t-1}(c_t^i),\; r_{MKT,t} - E_{t-1}(r_{MKT,t}))}{Var(r_{MKT,t} - E_{t-1}(r_{MKT,t}) - [c_{MKT,t} - E_{t-1}(c_{MKT,t})])}. \]
18
Several other papers, e.g., Pastor and Stambaugh (2003), also propose models where a stock’s return
sensitivity to market-wide liquidity is priced. Since we do not have daily Pastor and Stambaugh liquidity
factors, we do not examine this model here.
The term \(E(c_t^i)\) is the reward for firm-specific illiquidity level, which is the compensation
for holding an illiquid asset as in Amihud and Mendelson (1986). Acharya and Pedersen
define illiquidity-adjusted net beta as:
\[ \beta^i_{LMKT} = \beta_1^i + \beta_2^i - \beta_3^i - \beta_4^i. \qquad (5.6) \]

The LCAPM implies that the linear relation between risk and return applies for
liquidity-adjusted beta and not for the market beta under the standard CAPM. The LCAPM
also implies that the linearity between risk and return applies to excess returns net of firm
specific illiquidity cost.
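To make the beta definitions concrete, the sketch below computes the four betas and the net beta from time series that are assumed to be already available as innovations (for illiquidity costs and market returns). It assumes, as in Acharya and Pedersen (2005), that all four betas share the denominator shown in (5.5); the function names and toy data are hypothetical.

```python
import random

def cov(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def lcapm_betas(r_i, c_i_innov, r_m_innov, c_m_innov):
    """Return (beta1, beta2, beta3, beta4, net beta) for one stock."""
    net_market = [rm - cm for rm, cm in zip(r_m_innov, c_m_innov)]
    denom = cov(net_market, net_market)        # common denominator (assumption)
    beta1 = cov(r_i, r_m_innov) / denom        # return with market return
    beta2 = cov(c_i_innov, c_m_innov) / denom  # illiquidity with market illiquidity
    beta3 = cov(r_i, c_m_innov) / denom        # return with market illiquidity
    beta4 = cov(c_i_innov, r_m_innov) / denom  # illiquidity with market return
    return beta1, beta2, beta3, beta4, beta1 + beta2 - beta3 - beta4

# Toy innovation series, not calibrated to any data
rng = random.Random(7)
n_obs = 500
r_m = [rng.gauss(0, 0.04) for _ in range(n_obs)]
c_m = [rng.gauss(0, 0.01) for _ in range(n_obs)]
r_i = [1.2 * rm - 0.5 * cm + rng.gauss(0, 0.05) for rm, cm in zip(r_m, c_m)]
c_i = [0.8 * cm - 0.1 * rm + rng.gauss(0, 0.01) for rm, cm in zip(r_m, c_m)]

b1, b2, b3, b4, net_beta = lcapm_betas(r_i, c_i, r_m, c_m)
```

With a common denominator, the net beta equals the beta of the stock's net-of-cost return on the market's net-of-cost return innovation, which is a convenient consistency check.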
Acharya and Pedersen test the LCAPM using two sets of test portfolios formed based
on illiquidity and the standard deviation of illiquidity. They sort stocks based on Amihud
(2002) illiquidity measures during each year and form 25 value-weighted illiquidity test
portfolios for the subsequent year. They also form 25 σ (illiquidity) portfolios similarly
by sorting based on the standard deviation of illiquidity.
We examine the correlations between \(\beta^i_{LMKT}\) and the value-weighted averages of Size
and BM for these portfolios used by Acharya and Pedersen. Correlations of \(\beta^i_{LMKT}\) with
Size for illiquidity and σ (illiquidity) portfolios are -.96 and -.97, and those with BM are
.71 and .74, respectively. Such high correlations between liquidity-adjusted betas, i.e.,
\(\beta^i_{LMKT}\), and size suggest that it would be particularly hard to determine empirically whether
average returns differ across test portfolios due to size or illiquidity-adjusted betas. This
situation parallels that in Chan and Chen (1988) who use 20 size-sorted portfolios as test
assets and find strong support for the CAPM. The correlations between market betas and
Size for Chan and Chen’s test portfolios range from -.988 to -.909 over different periods,
and the corresponding correlations in the case of illiquidity and σ (illiquidity) portfolios
are within this range. Jegadeesh (1992) shows that when test portfolios are constructed so
that size and market beta have low correlations, the market risk is not priced and that the
significant market risk premium found using size-sorted portfolios is due to a high
correlation between Size and market beta.
To avoid such ambiguity, we use the IV method with individual stocks to investigate
whether the liquidity-adjusted market risk \(\beta^i_{LMKT}\) under the LCAPM is priced in the
cross-section. To facilitate comparability, we follow the same procedure as in Acharya and
Pedersen (2005) in all other respects. Because of the differences in the market structures
of the NYSE/AMEX and NASDAQ, the trading volumes reported in these two markets are
not comparable and hence NASDAQ stocks are excluded for this test. In addition to
existing screening criteria, following Acharya and Pedersen, we exclude stocks that do not
trade for at least 100 days per year, which can suppress noisy illiquidity measures. 19
Acharya and Pedersen define illiquidity cost as follows: 20
\[ ILLIQ^i_{\tau} = \frac{|r^i_{\tau}|}{\nu^i_{\tau}}, \qquad (5.7) \]
\[ c^i_{\tau} = \min(0.25 + 0.3\, ILLIQ^i_{\tau} P_{MKT,\tau-1},\; 30), \qquad (5.8) \]
where \(r^i_{\tau}\) is the return on day τ, \(\nu^i_{\tau}\) is the dollar volume (in millions) and \(P_{MKT,\tau-1}\) is
the day τ-1 value of $1 invested in the market portfolio as of the end of July 1962.
Equation (5.7) is based on Amihud’s (2002) illiquidity measure. Acharya and Pedersen use
equation (5.8) as a measure of illiquidity cost where \(P_{MKT,\tau-1}\) is used to adjust for inflation
and the illiquidity cost is capped at 30% to avoid an obviously unreasonable value for it.
Market illiquidity cost \(c_{MKT,\tau}\) is the value-weighted average of the illiquidity costs of the
individual stocks.
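Equations (5.7) and (5.8) translate directly into code. The sketch below is illustrative: the function names and the toy inputs are hypothetical, and the units of the return input are assumed to be whatever convention makes the resulting cost a percentage, as the 30% cap suggests.

```python
def amihud_illiq(ret, dollar_volume_millions):
    """Equation (5.7): absolute return divided by dollar volume (in millions)."""
    return abs(ret) / dollar_volume_millions

def illiquidity_cost(ret, dollar_volume_millions, p_mkt_prev):
    """Equation (5.8): scaled illiquidity, adjusted by P_MKT and capped at 30."""
    raw = 0.25 + 0.3 * amihud_illiq(ret, dollar_volume_millions) * p_mkt_prev
    return min(raw, 30.0)

# Toy inputs, not values from the paper
typical_cost = illiquidity_cost(ret=0.02, dollar_volume_millions=0.5, p_mkt_prev=10.0)
capped_cost = illiquidity_cost(ret=0.10, dollar_volume_millions=0.0001, p_mkt_prev=1.0)
```

The second call shows the cap binding for a very thinly traded stock.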
As in Acharya and Pedersen (2005), we estimate innovations in illiquidity using an AR
model and then estimate each individual component of betas in equation (5.6) using a time-
series GMM approach and Dimson-type corrections. 21 We then fit the following cross-
sectional regression each month:
19
We follow Acharya and Pedersen and impose the 100 days per year data requirement for inclusion in the
sample.
20
Acharya and Pedersen use illiquidity costs at monthly frequency but we use them at daily frequency.
21
Appendix 3 presents the AR models that we use to estimate expected and unexpected components of
\[ r_t^i = \alpha_t + \gamma_{ILLIQ,t}\, c_t^i + \gamma_{LMKT,t}\, \hat{\beta}^i_{LMKT} + \varepsilon_t^i, \qquad (5.9) \]
where \(c_t^i\) is the average illiquidity for stock i in month t. 22
The IV estimator in month t is:
\[ \hat{\gamma}_t' = (\hat{\Psi}_{IV,t}\, \hat{\Psi}_{EV,t}')^{-1}\, \hat{\Psi}_{IV,t}\, r_t', \]
where \(\hat{\Psi}_{IV,t}\) is \(\hat{\Psi}_{even,t}\) when month t is odd and is \(\hat{\Psi}_{odd,t}\) when month t is even,
and
\(\hat{\Psi}_{even,t}\) ≡ 3×N matrix of independent variables with the unit vector as the first row, and
\(c_t^i\) and the estimated even-month LMKT betas for N stocks as the second and
third rows, respectively. We estimate the even-month LMKT betas using
daily data in even months in the period from month t-36 to month t-1.
\(\hat{\Psi}_{odd,t}\) ≡ Analogous to \(\hat{\Psi}_{even,t}\), estimated using all daily data in odd months.
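The matrix form of the IV estimator can be sketched as follows. This is an illustration under stated assumptions: \(\hat{\Psi}_{EV,t}\) is taken to be the regressor matrix built from the other parity's betas (its definition is not given in this passage), the measurement errors in the two beta estimates are independent, and the toy data and function name are hypothetical.

```python
import numpy as np

def iv_risk_premiums(psi_iv, psi_ev, returns):
    """Compute (Psi_IV Psi_EV')^{-1} Psi_IV r' for K x N matrices and N returns."""
    psi_iv = np.asarray(psi_iv, dtype=float)
    psi_ev = np.asarray(psi_ev, dtype=float)
    r = np.asarray(returns, dtype=float)
    return np.linalg.solve(psi_iv @ psi_ev.T, psi_iv @ r)

# Toy cross-section: intercept 0.1, illiquidity slope 0.2, beta premium 0.5
rng = np.random.default_rng(0)
n_stocks = 5000
cost = rng.uniform(0.25, 2.0, n_stocks)
true_beta = rng.normal(1.0, 0.5, n_stocks)
beta_ev = true_beta + rng.normal(0, 0.3, n_stocks)   # explanatory-variable sample
beta_iv = true_beta + rng.normal(0, 0.3, n_stocks)   # instrument sample
ones = np.ones(n_stocks)

psi_ev = np.vstack([ones, cost, beta_ev])            # 3 x N
psi_iv = np.vstack([ones, cost, beta_iv])            # 3 x N
r = 0.1 + 0.2 * cost + 0.5 * true_beta + rng.normal(0, 0.1, n_stocks)

gamma_hat = iv_risk_premiums(psi_iv, psi_ev, r)
```

If the same matrix is passed for both arguments, the expression reduces to the OLS estimator.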
We use the FMSEs to compute the standard errors of IV risk premium estimates.
Table 9 presents the regression estimates with individual stocks. The slope coefficient
on the Amihud illiquidity measure is .184%, and it is significantly positive at the 1% level.
However, the liquidity-adjusted market risk premium (the risk premium for β LMKT ) is
.140%, which is not reliably different from zero. These results suggest that firm-specific
illiquidity, a firm characteristic, is positively related to average returns, but a stock’s
liquidity-adjusted beta, supposedly a systematic risk, does not command a risk premium.
Table 9 also shows that illiquidity risk is not priced in either subperiod and that the
Amihud illiquidity characteristic is not reliably associated with average returns at the 5%
level in the second subperiod.
In comparison, Acharya and Pedersen report a liquidity-adjusted market risk premium
estimate of about 2.5% per month using the value-weighted index (see Panel B of Table 5
in AP), which is about 30% per year. 23 The equity risk premium puzzle literature argues
illiquidity.
22
As in Acharya and Pedersen (2005), the 30% cap is applied after taking the monthly average.
23
The liquidity-adjusted market risk premium equals market risk premium minus expected illiquidity costs
that even an annual risk premium of about 6% observed in the data is hard to justify with
realistic levels of risk aversion, and larger risk premiums would be harder to justify. The
large risk premium estimate obtained with portfolios seems likely to be the result of
correlation between $\beta_{LMKT}$ and portfolio characteristics rather than a true depiction of
the rewards to systematic risk.
These findings further illustrate the problems that arise when portfolios are used as test
assets in asset pricing tests. In the earlier size versus beta debate in the literature, portfolios
were formed based on size ranks and hence it would be fairly natural to check the
correlation between size and beta and to uncover the problem. In the case of illiquidity-
sorted portfolios, size was not explicitly used as a sorting variable to form portfolios and
hence it is not readily apparent that one should check the correlation with this variable, but
such correlations could lead to mistaken statistical inferences. Our tests with individual
stocks avoid such confounding issues.
5.F. On the Strength of Instrumental Variables

An important issue to consider in instrumental variable regressions is the correlation
between the instrumental variables and the corresponding independent variables. The
cross-product matrix of instrumental variables and independent variables could be close to
singular if the correlation is too low. Nelson and Startz (1990) show that if the instruments
are sufficiently weak then the expected value of the IV estimator may not exist. The
intuition behind this result can be seen in a univariate regression with weak instruments. If
the covariance between the independent variable and the instrument is close to zero then
the sample covariance could be small and be either negative or positive, resulting in large
variations in both the sign and magnitude of the slope coefficient estimates in finite samples.
However, if the covariance and the sample size are sufficiently large, then the likelihood
that the sample estimate of the covariance is close to zero becomes negligibly small, and
the IV estimator is well behaved.
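The intuition in the preceding paragraph can be illustrated with a small Monte Carlo sketch. The data-generating process, sample size, and instrument loadings below are assumptions chosen for illustration and are not taken from the paper. The univariate IV slope is the ratio of two sample covariances, and when the instrument barely covaries with the regressor the denominator wanders around zero.

```python
import numpy as np


def iv_slope(x, z, y):
    """Univariate IV slope: sample cov(z, y) divided by sample cov(z, x)."""
    zc = z - z.mean()
    return (zc @ (y - y.mean())) / (zc @ (x - x.mean()))


def simulate_slopes(loading, n_obs=300, n_reps=500, seed=0):
    """IV slope estimates across replications; the true slope is 1."""
    rng = np.random.default_rng(seed)
    slopes = np.empty(n_reps)
    for r in range(n_reps):
        x_true = rng.normal(size=n_obs)
        x = x_true + rng.normal(size=n_obs)             # regressor measured with error
        z = loading * x_true + rng.normal(size=n_obs)   # instrument
        y = x_true + rng.normal(size=n_obs)
        slopes[r] = iv_slope(x, z, y)
    return slopes


def iqr(values):
    """Interquartile range, a spread measure that is robust to extreme draws."""
    q75, q25 = np.percentile(values, [75, 25])
    return q75 - q25


strong = simulate_slopes(loading=1.0)    # instrument clearly related to the regressor
weak = simulate_slopes(loading=0.02)     # nearly irrelevant instrument
print(f"strong IQR: {iqr(strong):.3f}, weak IQR: {iqr(weak):.3f}")
```

The interquartile range is used instead of the standard deviation because, as noted above, moments of the weak-instrument estimator may not exist.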
and hence it is smaller than the unadjusted market risk premium.
Nelson and Startz (1990) show that weak instruments would be a concern if
$$\frac{1}{\hat\rho_{xz}^{2}} \gg N, \qquad (5.10)$$
where $\hat\rho_{xz}$ is the correlation between the independent variable and the corresponding
instrument (which in our method is the correlation between betas estimated from different
observations) and N is the number of observations in the cross-sectional regression, i.e.,
the number of individual stocks. For example, in the CAPM and Fama-French model tests
of section 5.3, there are 1934 stocks per month on average and the minimum number of
stocks is 305. From (5.10), there would be a weak instrument concern based on the
minimum (average) number of stocks, if the correlation were less than 0.057 (0.023) in
absolute value.
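The two thresholds quoted above follow from setting $1/\hat\rho_{xz}^2 = N$ in (5.10) and solving for the correlation. A minimal check, in which the function name is an assumption of this sketch:

```python
import math


def critical_correlation(n_stocks):
    """Absolute correlation at which 1 / rho^2 equals N, i.e. 1 / sqrt(N)."""
    return 1.0 / math.sqrt(n_stocks)


for n_stocks in (305, 1934):
    print(n_stocks, round(critical_correlation(n_stocks), 3))
```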
Table 10 presents average correlations between the odd and even month beta estimates.
The correlation for market beta under the CAPM is .67. The market beta of the Fama-French
three-factor (five-factor) model is less precisely estimated and the correlation is smaller
at .52 (.42). The market beta in the q-factor asset pricing model and the LCAPM betas
exhibit levels of correlation similar to those of the Fama-French three-factor market betas. The
average correlations for SMB, HML, RMW, CMA, I/A, and ROE betas range from .14
to .44. Although these correlations are smaller than those for market betas, they are all
comfortably above the Nelson and Startz (1990) critical value.
Nelson and Startz (1990) and Staiger and Stock (1997) also show that the conventional
IV standard error estimator based on asymptotic theory is not reliable in small samples if
the instruments are weak. However, this concern is not relevant in our application because
we use the Fama-MacBeth approach to estimate standard errors and do not use the
asymptotic estimator in our empirical analyses in the previous subsections. Nevertheless,
we find, in the tests proposed by Staiger and Stock (1997), that the instruments give no
cause for concern. 24
24
Staiger and Stock (1997) regress the independent variable against the instrumental variable and develop
a test based on the goodness of fit for this regression. In unreported results, we find that their test statistics
in our applications were well above critical values for all instruments in all months.
To provide further insights into the strength of the instruments, we also estimate the
correlation between the instruments that we use and the corresponding true but
unobservable factor betas. Although the true beta is unobservable, we can estimate this
correlation based on the correlation between the odd- and even-month betas as we show in
the following proposition:
Proposition 2: Let $\beta_k^i$ be stock i's true unobservable sensitivity to factor k and let
$\hat\beta_{odd,k}^i$ and $\hat\beta_{even,k}^i$ be the odd- and even-month estimates of the corresponding
betas, respectively. Then:
$$\mathrm{correlation}(\beta_k^i, \hat\beta_{even,k}^i) = \mathrm{correlation}(\beta_k^i, \hat\beta_{odd,k}^i) = \sqrt{\mathrm{correlation}(\hat\beta_{odd,k}^i, \hat\beta_{even,k}^i)}.$$
Proof: See Appendix 4.
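The relation can be checked numerically. The sketch below assumes, for illustration only, that the odd- and even-month estimation errors have equal variance and are independent of the true beta and of each other; the parameter values are hypothetical. Under these assumptions the correlation between an estimate and the true beta equals the square root of the odd-even correlation, which is how an odd-even correlation of .67 maps into a correlation with the true beta of about .82.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000                                   # hypothetical cross-section size
beta_true = rng.normal(1.0, 0.4, n)
beta_odd = beta_true + rng.normal(0.0, 0.3, n)    # odd-month estimate
beta_even = beta_true + rng.normal(0.0, 0.3, n)   # even-month estimate


def corr(a, b):
    """Pearson correlation between two vectors."""
    return np.corrcoef(a, b)[0, 1]


corr_odd_even = corr(beta_odd, beta_even)
corr_true_odd = corr(beta_true, beta_odd)
corr_true_even = corr(beta_true, beta_even)
implied = np.sqrt(corr_odd_even)              # estimate of the unobservable correlation

print(round(corr_odd_even, 3), round(implied, 3), round(corr_true_odd, 3))
```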
Table 10 also presents the mean correlation between estimated betas and true betas.25
The average correlation between even- and odd-month market betas is .67 and the average
correlation between estimated market beta and the unobserved true market beta is .82. We
find smaller correlations for SMB, HML, RMW, and CMA betas, but even for CMA the
average correlation between estimated beta and unobservable true beta is .38. The
correlations for the I/A and ROE betas are about the same as that for the HML beta. All
these estimates are significantly above the cutoff prescribed by Nelson and Startz (1990).
6. Conclusion
We propose a method for estimating risk premiums using individual stocks as test
25
To compute the mean correlation between estimated betas and true betas, we first compute the square root
of the correlation between odd- and even-month betas each month and then compute the average across
months. Because the variability of correlation between odd- and even-month betas is relatively small, the
square root of average correlation is about the same as the mean of the square root of the correlation.
assets. It sidesteps concerns about risk premiums estimated with portfolios, which have been
employed in almost all previous research to mitigate an inherent errors-in-variables
problem. Estimated βs from different observations can serve as effective instruments for
estimated βs from other observations that serve as the explanatory variables in
second-stage cross-sectional regressions. We prove the consistency and provide the asymptotic
theory of the proposed risk premium estimator when the size of the cross-section and the length
of the time-series grow simultaneously without bound. In simulations, our instrumental
variables (IV) method estimates risk premiums accurately even for relatively short
time-series and also provides valid tests of statistical significance. Our simulations also indicate
that our tests are reliable under time-varying betas.
We use the new IV method to test whether risk premiums suggested by several popular
factor models are different from zero. These models include the CAPM, the Fama-French
three- and five-factor models, the q-factor asset pricing model proposed by Hou, Xue, and
Zhang (2015), and the liquidity-adjusted CAPM proposed by Acharya and Pedersen
(2005). Previous empirical examinations, employing portfolios as test assets, find strong
support for these models, but Lewellen, Nagel and Shanken (2010) suggest caution about
the low dimensionality issue when portfolios are used. We find that none of the factor risks
in these asset pricing models commands a significant risk premium in the cross-section of
individual stock returns after controlling for firm characteristics. Simulation results
indicate that this failure cannot be attributed to a lack of test power, so it represents a
puzzle that calls for further research.
References
Acharya Viral and Lasse Heje Pedersen, 2005, Asset Pricing with Liquidity Risk,
Journal of Financial Economics 77, 375-410.
Aharoni Gil, Bruce Grundy, and Qi Zeng, 2013, Stock returns and the Miller
Modigliani valuation formula: Revisiting the Fama French analysis, Journal of Financial
Economics 110(2), 347-357.
Ait-Sahalia Yacine, Jonathan A. Parker, and Motohiro Yogo, 2004, Luxury Goods
and the Equity Premium, Journal of Finance 59, 2959-3004.
Amihud Yakov, 2002, Illiquidity and Stock Returns: Cross-section and Time-series
Effects, Journal of Financial Markets 5, 31-56.
Amihud Yakov and Haim Mendelson, 1986, Asset Pricing and the Bid-ask Spread,
Journal of Financial Economics 17, 223-249.
Arnott Robert, Jason Hsu, and Philip Moore, 2005, Fundamental Indexation,
Financial Analysts Journal 61, 83-99.
Bai Jushan, 2003, Inferential theory for factor models of large dimensions,
Econometrica 71(1), 135-171.
Black Fisher, Michael C. Jensen, and Myron Scholes, 1972, The capital asset
pricing model: Some empirical tests, Michael C. Jensen, ed: Studies in the Theory of
Capital Markets, 79–121.
Blume Marshall, and Irwin Friend, 1973, A New Look at the Capital Asset Pricing
Model, Journal of Finance 28, 19-34.
Brennan Michael, Tarun Chordia, and Avanidhar Subrahmanyam, 1998,
Alternative Factor Specifications, Security Characteristics, and the Cross-Section of
Expected Returns, Journal of Financial Economics 49, 345-373.
Campbell John and Tuomo Vuolteenaho, 2004, Bad Beta, Good Beta, American
Economic Review 94, 1249-1275.
Chan K.C., and Nai-Fu Chen, 1988, An Unconditional Asset Pricing Test and the
Role of Firm Size as an Instrumental Variable for Risk, Journal of Finance 43, 309-325.
Chen Long, Robert Novy-Marx, and Lu Zhang, 2011, An Alternative Three-Factor
Model, working paper, Ohio State University.
Chordia Tarun, Amit Goyal, and Jay Shanken, 2015, Cross-Sectional Asset Pricing
with Individual Stocks: Betas versus Characteristics, Working paper, Emory University.
Cochrane John, 1991, Production-based Asset Pricing and the Link Between Stock
Returns and Economic Fluctuations, Journal of Finance 46, 209-237.
Cochrane John, 1996, A Cross-Sectional Test of an Investment-Based Asset Pricing
Model, Journal of Political Economy 104, 572-621.
Cochrane John, 2011, Presidential Address: Discount Rates, Journal of Finance
66, 1047–1108.
Daniel Kent, and Sheridan Titman, 1997, Evidence on the Characteristics of Cross
Sectional Variation in Stock Returns, Journal of Finance 52 (1), 1-33.
Davis James, Eugene Fama, and Kenneth French, 2000, Characteristics,
Covariances, and Average Returns: 1929-1997, Journal of Finance 55, 389–406.
Dimson Elroy, 1979, Risk Measurement When Shares Are Subject to Infrequent
Trading, Journal of Financial Economics 7, 197-226.
Eisfeldt Andrea, and Dimitris Papanikolaou, 2013, Organization Capital and the
Cross-Section of Expected Returns, Journal of Finance 68, 1365-1406.
Fama Eugene, and Kenneth R. French, 1992, The Cross-Section of Expected Stock
Returns, Journal of Finance 47, 427-465.
Fama Eugene and Kenneth French, 1993, Common Risk Factors in the Returns on
Stocks and Bonds, Journal of Financial Economics 33, 3–56.
Fama Eugene and Kenneth French, 2014, A Five-factor Asset Pricing Model,
Journal of Financial Economics 116, 1-22.
Fama Eugene F, and James D. MacBeth, 1973, Risk, Return and Equilibrium:
Empirical Tests, Journal of Political Economy 81, 607-636.
Gagliardini Patrick, Elisa Ossola and O. Scaillet, 2011, Time-Varying Risk
Premium in Large Cross-Sectional Equity Datasets, Swiss Finance Institute, working
paper.
Harvey Campbell, Yan Liu, and Heqing Zhu, 2015, …and the Cross-Section of
Expected Returns, forthcoming Review of Financial Studies.
Hou Kewei, Chen Xue, and Lu Zhang, 2014, A comparison of new factor models.
Ohio State University, working paper.
Jagannathan Ravi, and Zhenyu Wang, 1996, The conditional CAPM and the
cross-section of expected returns, Journal of Finance 51, 3-53.
Jagannathan Ravi, and Zhenyu Wang, 1998, An asymptotic theory for estimating
beta-pricing models using cross-sectional regression, Journal of Finance 53, 1285-1309.
Jegadeesh Narasimhan, 1992, Does Market Risk Really Explain the Size Effect?,
Journal of Financial and Quantitative Analysis 27, 337-351.
Kan Raymond, Cesare Robotti and Jay Shanken, 2013, Pricing model performance
and the two-pass cross-sectional regression methodology, Journal of Finance 68, 2617–
2649.
Kim Dongcheol, 1995, The Errors-In-Variables Problem in the Cross-Section of
Expected Stock Returns, Journal of Finance 50, 1605-1634.
Kim Soohun and Georgios Skoulakis, 2014, Estimating and Testing Linear Factor
Models using Large Cross Sections: The Regression-Calibration Approach, Working
paper, Georgia Institute of Technology.
Lettau Martin and Sydney Ludvigson, 2001, Resurrecting the (C)CAPM: A
Cross-sectional test when risk premia are time-varying, Journal of Political Economy 109,
1238-1287.
Lewellen Jonathan, Stefan Nagel and Jay Shanken, 2010, A skeptical appraisal of
asset pricing tests, Journal of Financial Economics 96, 175-194.
Li Qing, Maria Vassalou, and Yuhang Xing, 2006, Sector investment growth rates
and the cross section of equity returns, Journal of Business 79, 1637-1665.
Lintner John, 1965, The valuation of risk assets and the selection of risky
investments in stock portfolios and capital budgets, Review of Economics and Statistics 47
(1), 13–37.
Litzenberger Robert H, and Krishna Ramaswamy, 1979, The effect of personal
taxes and dividends on capital asset prices: The theory and evidence, Journal of Financial
Economics 7, 163-196.
Liu Laura Xiaolei, Toni Whited, and Lu Zhang, 2009, Investment-based Expected
Stock Returns, Journal of Political Economy 117, 1105-1139.
Merton Robert, 1973, An intertemporal asset pricing model, Econometrica 41, 867-
888.
Nelson Charles and Richard Startz, 1990, The Distribution of the Instrumental
Variable Estimator and its t ratio When the Instrument is a Poor One, Journal of Business
63, S125-S140.
Novy-Marx Robert, 2013, The other side of value: The gross profitability
premium, Journal of Financial Economics 108, 1–28.
Pastor Lubos and Robert Stambaugh, 2003, Liquidity risk and expected stock
returns, Journal of Political Economy 111, 642-685.
Ross Stephen A., 1976, The arbitrage theory of capital asset pricing, Journal of
Economic Theory 13, 341-360.
Shanken Jay, 1992, On the estimation of beta-pricing models, Review of Financial
Studies 5, 1-33.
Shanken Jay, and Guofu Zhou, 2007, Estimating and testing beta pricing models:
alternative methods and their performance in simulations, Journal of Financial Economics
84, 40-86.
Sharpe William, 1964, Capital asset prices: A theory of market equilibrium under
conditions of risk, Journal of Finance 19 (3), 425–442.
Staiger Douglas and James Stock, 1997, Instrumental Variables Regression with
Weak Instruments, Econometrica 65, 557–586.
Theil Henri, 1971, Principles of Econometrics, 1st Edition, John Wiley & Sons, New
York.
Appendix 1: Asymptotic Theorems and Proofs
A1.1. The N-consistency and asymptotic distribution of the IV method

In this section, we show the consistency of the estimator and obtain its asymptotic
distribution. To simplify the exposition, we make the following definitions and assumptions:
As in section 2.2, define $\hat\beta_{1e} \equiv [1; \hat\beta_{even}]$ and $\hat\beta_{1o} \equiv [1; \hat\beta_{odd}]$, where $\hat\beta_{even}$ and $\hat\beta_{odd}$ are
the estimated factor loadings for even and odd periods. Similarly, define $\hat\beta_{1sample} \equiv [1; \hat\beta_{sample}]$,
$\hat\beta_{1EV} \equiv [1; \hat\beta_{EV}]$, and $\hat\beta_{1IV} \equiv [1; \hat\beta_{IV}]$, where sample, EV, or IV = odd or even. In this paper,
column vectors represent time-series variables (such as the returns for one stock),
and row vectors represent cross-sectional variables (returns and factors in one period).
Moreover, the risk premium is a row vector.
Let $f_1, \ldots, f_{T-1}$ be the factors for each period. They are row vectors (K-vectors). The
estimation error in the first pass is
$$\hat\beta_{sample} - \beta = (F_{sample}^{d\prime}F_{sample}^{d})^{-1}F_{sample}^{d\prime}\,\Omega_{sample}^{d},$$
where $F_{sample}^{d} = F_o^d \equiv [f_1^d; \ldots; f_{T-1}^d]$ when the sample contains odd periods and
$F_{sample}^{d} = F_e^d \equiv [f_2^d; \ldots; f_T^d]$ when the sample contains even periods. The superscript d
indicates a demeaned factor or residual (constructed by subtracting the sample average
from the factor or residual). Thus, $f_1^d, \ldots, f_{T-1}^d$ are the demeaned factors for each period.
Moreover, the operator ";" stacks the first row vector on top of the second row vector.
The dependent variable in the second-pass cross-sectional regression could be any
return vector $r_s$ for a disjoint period s not in the sample periods used in the first pass. This
regression can be written as $r_s = \hat\gamma\hat\beta_{1sample} + \xi_s$. Since the true model is $r_s = (f_s + \gamma)\beta + \varepsilon_s$,
the cross-sectional residuals are
$$\xi_s = (f_s + \gamma)(\beta - \hat\beta_{sample}) + \varepsilon_s.$$
We show the T-consistency and the convergence rate in section 3 under relatively weak
assumptions. In order to show the N-consistency and the asymptotic distribution, we need to
make the following assumptions:
Assumptions: (1) The residual process $\varepsilon_s = [\varepsilon_s^1, \varepsilon_s^2, \ldots, \varepsilon_s^N]$ is stationary. The elements
of $\varepsilon_s$ are cross-sectionally uncorrelated, and $\varepsilon_s$ and $\varepsilon_t$ are uncorrelated when s is not
equal to t. Let $\Sigma$ be the covariance matrix of the residuals; the above assumption
implies that it is a diagonal matrix. (2) The factor process $f_s$ is stationary. With these
assumptions and several regularity conditions (stated in the theorem), the N-consistency and
the asymptotic distribution can be established through the following theorem.
Theorem A1 (a) Assume that $\beta_1^1\xi_t^1, \ldots, \beta_1^N\xi_t^N$ (where $[\beta_1^1, \ldots, \beta_1^N] = \beta_1$ and
$[\xi_t^1, \ldots, \xi_t^N] = \xi_t$) have finite variances and that, as $N \to \infty$, $\beta_1\beta_1'/N$ and $\beta_1\Sigma\beta_1'/N$
converge to invertible matrices (denoted by $bb'$ and $b\Sigma b'$). Then the estimated
risk premium $\hat\gamma_t'$ converges to $(0, \gamma + f_t)'$ in probability as N converges to infinity.
Thus, $\hat{\bar\gamma}'$ converges to $(0, \gamma + \bar f)'$ in probability as N converges to infinity (where
$\hat{\bar\gamma}'$ and $\bar f'$ are the sample averages of $\hat\gamma_t'$ and $f_t'$, respectively).
Define
$$\delta_s^i = -\Big(\frac{2}{T}\sum_{t\in EV} f_t + \gamma\Big)\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_s^{d\prime}\varepsilon_s^{d,i}\Big] + \varepsilon_s^i$$
for any s in EV. Assume that $\beta_1^1\delta_1^1, \ldots, \beta_1^N\delta_1^N, \ldots, \beta_1^N\delta_T^N$ have finite variances. Then $\hat{\bar\gamma}'$ converges to $(0, \gamma + \bar f)'$ in
probability as both N and T converge to infinity.
(b) In addition to the assumptions in (a), we further assume that (1) $L \equiv \lim_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}(\mathrm{var}(\varepsilon_t^i))^2$
exists, (2) $[\beta_1^1\xi_t^1, \ldots, \beta_1^N\xi_t^N]$ satisfies a Lindeberg condition, 26 and (3)
26
Assume that the covariance matrices of $\beta_1^1\xi_t^1, \ldots, \beta_1^N\xi_t^N$ are $V_t^1, \ldots, V_t^N$, and let $V = \sum_j V_t^j$. Then the Lindeberg
condition is that $\lim_{n\to\infty} V^{-1}\sum_j E\big((\beta_1^j\xi_t^j)^2\, 1_{\{|V^{-1}(\beta_1^j\xi_t^j)| > \varepsilon\}}\big) = 0$ for any $\varepsilon > 0$, where 1 is the indicator function.
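$[((F_{IV}^{d\prime}F_{IV}^{d})^{-1}F_{IV}^{d\prime}\varepsilon_{IV}^{d,1})(\xi_t^1)', \ldots, ((F_{IV}^{d\prime}F_{IV}^{d})^{-1}F_{IV}^{d\prime}\varepsilon_{IV}^{d,N})(\xi_t^N)']$ for any t not in IV satisfies a
Lindeberg condition. Then (A) the asymptotic distribution of the estimated risk premium
conditional on F is
$$\sqrt{N}\,\big(\hat\gamma_t' - (0, \gamma + f_t)'\big) \to N(0, A^{-1}BA^{-1}),$$
where $A = bb'$ and $B = c_0\,(b\Sigma b' + \tilde L_{0,IV})$, with
$$\tilde L_{0,IV} = \begin{pmatrix} 0 & 0_{1\times k} \\ 0_{k\times 1} & L\,\big((F_{IV}^{d\prime}F_{IV}^{d})^{-1}F_{IV}^{d\prime}\big)\, I^d\, \big((F_{IV}^{d\prime}F_{IV}^{d})^{-1}F_{IV}^{d\prime}\big)' \end{pmatrix},$$
$$c_0 = 1 + \big((\gamma + f_t)(F_{EV}^{d\prime}F_{EV}^{d})^{-1}F_{EV}^{d\prime}\big)\, I^d\, \big((\gamma + f_t)(F_{EV}^{d\prime}F_{EV}^{d})^{-1}F_{EV}^{d\prime}\big)' - 2\,(\gamma + f_t)(F_{EV}^{d\prime}F_{EV}^{d})^{-1}F_{EV}^{d\prime}\, l_t,$$
where $l_t = (-\frac{2}{T}, -\frac{2}{T}, \ldots, 1 - \frac{2}{T}, \ldots, -\frac{2}{T})'$ with the term $1 - \frac{2}{T}$ as the t'th entry of the vector,
and
$$I^d = \begin{pmatrix} \frac{T-2}{T} & -\frac{2}{T} & \cdots & -\frac{2}{T} \\ -\frac{2}{T} & \frac{T-2}{T} & \cdots & -\frac{2}{T} \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{2}{T} & -\frac{2}{T} & \cdots & \frac{T-2}{T} \end{pmatrix};$$
Similarly, $\sqrt{N}\,\big(\hat{\bar\gamma}' - (0, \gamma + \bar f)'\big) \to N(0, A^{-1}\bar B A^{-1})$, where
$$\bar B = \frac{1}{4}\big(c_o\,(b\Sigma b' + \tilde L_{0,e}) + c_e\,(b\Sigma b' + \tilde L_{0,o})\big),$$
$$c_e = \frac{2}{T} + \big((\gamma + \bar f)(F_e^{d\prime}F_e^{d})^{-1}F_e^{d\prime}\big)\, I^d\, \big((\gamma + \bar f)(F_e^{d\prime}F_e^{d})^{-1}F_e^{d\prime}\big)',$$
$$c_o = \frac{2}{T} + \big((\gamma + \bar f)(F_o^{d\prime}F_o^{d})^{-1}F_o^{d\prime}\big)\, I^d\, \big((\gamma + \bar f)(F_o^{d\prime}F_o^{d})^{-1}F_o^{d\prime}\big)';$$
(B) If $[\beta_1^1\delta_1^1, \ldots, \beta_1^N\delta_1^N, \ldots, \beta_1^N\delta_T^N]$ satisfies a Lindeberg condition and the other conditions in
(A) are satisfied, then as both T and N converge to infinity,
$$\sqrt{TN}\,\big(\hat{\bar\gamma}' - (0, \gamma + \bar f)'\big) \to N(0, A^{-1}DA^{-1}),$$
where
$$D = (1 + \gamma\Sigma_F^{-1}\gamma')\, b\Sigma b',$$
and $\Sigma_F$ is the covariance matrix of the factors.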
This condition implies that $\beta_1^1\xi_t^1, \ldots, \beta_1^N\xi_t^N$ have similar variances.
As the theorem shows, both asymptotic standard deviations have sandwich forms.
Following Shanken (1992), the formulas can be decomposed as follows:
In case (A), $A^{-1}BA^{-1} = A^{-1}(b\Sigma b')A^{-1} + A^{-1}\big((c_0 - 1)\,b\Sigma b' + c_0\tilde L_{0,IV}\big)A^{-1}$.
In case (B), $A^{-1}DA^{-1} = A^{-1}(b\Sigma b')A^{-1} + A^{-1}\big((\gamma\Sigma_F^{-1}\gamma')\,b\Sigma b'\big)A^{-1}$.
In the above decomposition, the first term is the asymptotic standard deviation of an OLS
estimator when there is no error in the factor loadings, i.e., β is known. The second term is
the EIV adjustment to the standard deviation. In particular, the standard deviation in case (B)
takes the same form as in Shanken (1992), although the rate of convergence for the IV
estimator is faster. Gagliardini, Ossola and Scaillet (2011) show that when both T and N are
large, the estimated risk premiums in the BJS method converge to their true values at the
speed of $O(1/\sqrt{NT})$ 27 only if $N < O(T^{\gamma})$ with $\gamma < 3$. The rate of convergence of the IV
estimator is $O(1/\sqrt{NT})$, and this rate does not depend on the relative size of T and
N.
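To give a sense of the size of the EIV adjustment in case (B), the single-factor version of the multiplier $1 + \gamma\Sigma_F^{-1}\gamma'$ can be evaluated directly. The sketch below uses the market factor's mean and standard deviation from Table A.1 (5.80% and 15.69% per annum) as illustrative inputs; treating the factor mean as the risk premium and rescaling to a monthly horizon by dividing the mean by 12 and the variance by 12 are assumptions of this example.

```python
import math


def eiv_multiplier(premium, factor_sd):
    """Single-factor version of 1 + gamma * Sigma_F^{-1} * gamma'."""
    return 1.0 + (premium / factor_sd) ** 2


# Annual inputs (percent per annum), taken from Table A.1 for illustration.
annual = eiv_multiplier(5.80, 15.69)

# Monthly inputs: mean scales with 1/12, standard deviation with 1/sqrt(12).
monthly = eiv_multiplier(5.80 / 12, 15.69 / math.sqrt(12))

print(round(annual, 4), round(monthly, 4))
```

The adjustment term shrinks in proportion to the length of the return interval, so at a monthly horizon it adds roughly one percent to the variance that would apply if β were known.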
Next, we prove Theorem A1.
Proof:
We first prove consistency. Note that
$$\hat\gamma_t' - (0, \gamma + f_t)' = \Big(\frac{1}{N}\hat\beta_{1IV}\hat\beta_{1EV}'\Big)^{-1}\Big(\frac{1}{N}\hat\beta_{1IV}\xi_t'\Big).$$
Consistency is established using Markov's Law of Large Numbers. Since (1)
$\beta_1^1\xi_t^1, \ldots, \beta_1^N\xi_t^N$ have finite variances; (2) for any i, $E(\beta_1^i\xi_t^i) = 0$; (3) the regression residuals
$\varepsilon_t^i$ are neither cross-sectionally nor serially correlated; and (4) it is clear that
$((F_{IV}^{d\prime}F_{IV}^{d})^{-1}F_{IV}^{d\prime}\varepsilon_{IV}^{d,1})(\xi_t^1)', \ldots, ((F_{IV}^{d\prime}F_{IV}^{d})^{-1}F_{IV}^{d\prime}\varepsilon_{IV}^{d,N})(\xi_t^N)'$ have finite variances ($\varepsilon_{IV}^{i}$ is the
residual vector for stock i), then
27
Here, for any real number X, O(X) is defined as follows: there exist two positive numbers M and N, such that MX<O(X)<NX.
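$$\frac{1}{N}\hat\beta_{1IV}\xi_t' = \frac{1}{N}\sum_{i=1}^{N}\big(\beta_1^i + (F_{IV}^{d\prime}F_{IV}^{d})^{-1}F_{IV}^{d\prime}\varepsilon_{IV}^{d,i}\big)(\xi_t^i)' \to 0,$$
and $\frac{1}{N}(\hat\beta_{1IV}\hat\beta_{1EV}') \to bb'$. This implies that the estimation error $\hat\gamma_t' - (0, \gamma + f_t)'$ converges to zero as N grows, i.e., the estimator is N-consistent.
When T is finite, the sample average of these estimation errors also converges to
zero as N grows. Hence, $\hat{\bar\gamma}' - (0, \gamma + \bar f)'$ is also an N-consistent estimator of zero.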
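For the same reason, when (1) $\beta_1^1\delta_1^1, \ldots, \beta_1^N\delta_1^N, \ldots, \beta_1^N\delta_T^N$ have finite variances, (2) for any
i and t, $E(\beta_1^i\delta_t^i) = 0$, (3) the regression residuals $\varepsilon_s = [\varepsilon_s^1, \varepsilon_s^2, \ldots, \varepsilon_s^N]$ are neither cross-sectionally
nor serially correlated, and (4) $(F_{sample}^{d\prime}F_{sample}^{d})^{-1}F_{sample}^{d\prime}\Omega_{sample}^{d} \to 0$ as $T \to \infty$,
$$\hat{\bar\gamma}' - (0, \bar f + \gamma)' = \frac{1}{2}\Big[\Big(\frac{1}{N}\hat\beta_{1o}\hat\beta_{1e}'\Big)^{-1}\Big(\frac{1}{N}\hat\beta_{1o}\,\frac{2}{T}\sum_{t\ \mathrm{even}}\xi_t'\Big) + \Big(\frac{1}{N}\hat\beta_{1e}\hat\beta_{1o}'\Big)^{-1}\Big(\frac{1}{N}\hat\beta_{1e}\,\frac{2}{T}\sum_{t\ \mathrm{odd}}\xi_t'\Big)\Big]$$
converges to 0 when both N and T go to infinity (i.e., the estimator is NT-consistent),
since, by Markov's Law of Large Numbers,
$$\frac{1}{N}\Big(\hat\beta_{1o}\,\frac{2}{T}\sum_{t\ \mathrm{even}}\xi_t'\Big) = \frac{2}{NT}\sum_{i=1}^{N}\sum_{t\ \mathrm{even}}\big(\beta_1^i + (F_o^{d\prime}F_o^{d})^{-1}F_o^{d\prime}\varepsilon_o^{d,i}\big)\,\delta_t^i \to 0,$$
$$\frac{1}{N}\Big(\hat\beta_{1e}\,\frac{2}{T}\sum_{t\ \mathrm{odd}}\xi_t'\Big) = \frac{2}{NT}\sum_{i=1}^{N}\sum_{t\ \mathrm{odd}}\big(\beta_1^i + (F_e^{d\prime}F_e^{d})^{-1}F_e^{d\prime}\varepsilon_e^{d,i}\big)\,\delta_t^i \to 0,$$
and $\big(\frac{1}{N}(\hat\beta_{1o}\hat\beta_{1e}')\big)^{-1}$ and $\big(\frac{1}{N}(\hat\beta_{1e}\hat\beta_{1o}')\big)^{-1}$ are bounded given
$\frac{1}{N}(\hat\beta_{1o}\hat\beta_{1e}') \to bb'$ and $\frac{1}{N}(\hat\beta_{1e}\hat\beta_{1o}') \to bb'$.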
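Next, we show the asymptotic distribution of $\hat\gamma_t' - (0, \gamma + f_t)'$. Since (1) $\beta\beta'/N$
converges to $bb'$ as $N \to \infty$, and (2) $[\beta_1^1\xi_t^1, \ldots, \beta_1^N\xi_t^N]$ and
$[((F_{IV}^{d\prime}F_{IV}^{d})^{-1}F_{IV}^{d\prime}\varepsilon_{IV}^{d,1})(\xi_t^1)', \ldots, ((F_{IV}^{d\prime}F_{IV}^{d})^{-1}F_{IV}^{d\prime}\varepsilon_{IV}^{d,N})(\xi_t^N)']$ both satisfy the Lindeberg
condition, one can apply the Lindeberg-Feller Central Limit Theorem to show
normality.
It remains to calculate the asymptotic covariance of the estimator. We first define
$u_t = u_{sample} = (F_{sample}^{d\prime}F_{sample}^{d})^{-1}F_{sample}^{d\prime}\Omega_{sample}^{d}$ for any $t \in$ sample, where sample can be
either IV or EV. To obtain the asymptotic covariance, notice that as $N \to \infty$,
$$\frac{1}{N}(\hat\beta_{1IV}\hat\beta_{1EV}') \to bb',$$
and for $t \in EV$, the numerator of the variance can be written as:
$$E\Big(\frac{1}{N}\,\hat\beta_{1IV}\,\xi_t'\xi_t\,\hat\beta_{1IV}' \,\Big|\, F\Big)$$
$$= E\Big(\frac{1}{N}\,\beta_1\big(-(\gamma + f_t)u_t + \varepsilon_t\big)'\big(-(\gamma + f_t)u_t + \varepsilon_t\big)\beta_1' \,\Big|\, F\Big) + E\Big(\frac{1}{N}\,u_{IV}\big(-(\gamma + f_t)u_t + \varepsilon_t\big)'\big(-(\gamma + f_t)u_t + \varepsilon_t\big)u_{IV}' \,\Big|\, F\Big)$$
$$= \frac{1}{N}\,\beta_1\,E\Big(\big(-(\gamma + f_t)u_t + \varepsilon_t\big)'\big(-(\gamma + f_t)u_t + \varepsilon_t\big) \,\Big|\, F\Big)\,\beta_1' + \frac{1}{N}\,E\Big(\tilde u_{IV}\,E_{IV}\Big(\big(-(\gamma + f_t)u_t + \varepsilon_t\big)'\big(-(\gamma + f_t)u_t + \varepsilon_t\big) \,\Big|\, F\Big)\,\tilde u_{IV}' \,\Big|\, F\Big).$$
$E_{IV}(\cdot \mid F)$ is the expected value of a random variable conditional on all the information in F
and the residuals in IV periods. Moreover, $\tilde u_{IV}' = [0_{1\times N}', u_{IV}']$. One can show that
$$E_{IV}\Big(\big(-(\gamma + f_t)u_t + \varepsilon_t\big)'\big(-(\gamma + f_t)u_t + \varepsilon_t\big) \,\Big|\, F\Big) = E\Big(\big(-(\gamma + f_t)u_t + \varepsilon_t\big)'\big(-(\gamma + f_t)u_t + \varepsilon_t\big) \,\Big|\, F\Big) = c_0\Sigma;$$
hence,
$$E\Big(\frac{1}{N}\,\hat\beta_{1IV}\,\xi_t'\xi_t\,\hat\beta_{1IV}' \,\Big|\, F\Big) \to c_0\, b\Sigma b' + c_0\,\tilde L_{0,IV}.$$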
Then the asymptotic covariance matrix can be written as
$$\mathrm{Acov}\big(\hat\gamma_t' - (0, \gamma + f_t)' \mid F\big) = c_0\,(bb')^{-1}(b\Sigma b' + \tilde L_{0,IV})(bb')^{-1}.$$
Note that the key step is to show
$$E\Big(\big(-(\gamma + f_t)u_t + \varepsilon_t\big)'\big(-(\gamma + f_t)u_t + \varepsilon_t\big) \,\Big|\, F\Big) = c_0\Sigma.$$
This result follows the proof in Shanken (1992). The details are shown below:
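Since $u_t = (F_{EV}^{d\prime}F_{EV}^{d})^{-1}F_{EV}^{d\prime}\Omega_{EV}^{d}$,
$$u_t'(\gamma + f_t)' = \big(I \otimes (\gamma + f_t)(F_{EV}^{d\prime}F_{EV}^{d})^{-1}F_{EV}^{d\prime}\big)\,\mathrm{Vec}(\Omega_{EV}^{d}),$$
where $\mathrm{Vec}(\Omega_{EV}^{d})$ reshapes the $\frac{T}{2} \times N$ matrix $\Omega_{EV}^{d}$ into a column vector of length $\frac{T}{2}N$, i.e.,
$\mathrm{Vec}(\Omega_{EV}^{d}) = (\varepsilon_1^{d,1}, \ldots, \varepsilon_{T-1}^{d,1}, \varepsilon_1^{d,2}, \ldots, \varepsilon_{T-1}^{d,2}, \ldots, \varepsilon_1^{d,N}, \ldots, \varepsilon_{T-1}^{d,N})'$ when EV are estimated using
data in odd periods, and $\mathrm{Vec}(\Omega_{EV}^{d}) = (\varepsilon_2^{d,1}, \ldots, \varepsilon_T^{d,1}, \varepsilon_2^{d,2}, \ldots, \varepsilon_T^{d,2}, \ldots, \varepsilon_2^{d,N}, \ldots, \varepsilon_T^{d,N})'$ when
EV are estimated using data in even periods.
Applying this formula, one has
$$E\big(u_t'(\gamma + f_t)'(\gamma + f_t)u_t \mid F\big)$$
$$= \big(I \otimes (\gamma + f_t)(F_{EV}^{d\prime}F_{EV}^{d})^{-1}F_{EV}^{d\prime}\big)\,(\Sigma \otimes I^d)\,\big(I \otimes (\gamma + f_t)(F_{EV}^{d\prime}F_{EV}^{d})^{-1}F_{EV}^{d\prime}\big)'$$
$$= \big((\gamma + f_t)(F_{EV}^{d\prime}F_{EV}^{d})^{-1}F_{EV}^{d\prime}\big)\, I^d\, \big((\gamma + f_t)(F_{EV}^{d\prime}F_{EV}^{d})^{-1}F_{EV}^{d\prime}\big)'\,\Sigma.$$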
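Using the same method, one can show that:
$$E\Big(\frac{1}{N}\,\beta_1\big(-(\gamma + f_t)u_t + \varepsilon_t\big)'\big(-(\gamma + f_t)u_t + \varepsilon_t\big)\beta_1' \,\Big|\, F\Big) = c_0\, b\Sigma b'.$$
Similarly, one can show that
$$E\Big(\frac{1}{N}\,\tilde u_{IV}\,E_{IV}\Big(\big(-(\gamma + f_t)u_t + \varepsilon_t\big)'\big(-(\gamma + f_t)u_t + \varepsilon_t\big) \,\Big|\, F\Big)\,\tilde u_{IV}' \,\Big|\, F\Big) = c_0\,\tilde L_{0,IV}.$$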
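When T is finite, the asymptotic distribution of $\hat{\bar\gamma}' - (0, \gamma + \bar f)'$ is obtained in the same way as that
of $\hat\gamma_t' - (0, \gamma + f_t)'$, using the Lindeberg-Feller Central Limit Theorem.
The asymptotic covariance is calculated following the same logic. Notice that
$$E\Big(N\Big(\frac{1}{N}\,\hat\beta_{1IV}\,\frac{2}{T}\sum_{t\in EV}\xi_t'\Big)\Big(\frac{1}{N}\,\hat\beta_{1IV}\,\frac{2}{T}\sum_{t\in EV}\xi_t'\Big)'\Big)$$
$$= E\Big(\frac{1}{N}\,\beta_1\big(-(\gamma + \bar f)u_{EV} + \bar\varepsilon\big)'\big(-(\gamma + \bar f)u_{EV} + \bar\varepsilon\big)\beta_1' \,\Big|\, F\Big) + E\Big(\frac{1}{N}\,u_{IV}\big(-(\gamma + \bar f)u_{EV} + \bar\varepsilon\big)'\big(-(\gamma + \bar f)u_{EV} + \bar\varepsilon\big)u_{IV}' \,\Big|\, F\Big)$$
$$= \frac{1}{N}\,\beta_1\,E\Big(\big(-(\gamma + \bar f)u_{EV} + \bar\varepsilon\big)'\big(-(\gamma + \bar f)u_{EV} + \bar\varepsilon\big) \,\Big|\, F\Big)\,\beta_1' + \frac{1}{N}\,E\Big(\tilde u_{IV}\,E_{IV}\Big(\big(-(\gamma + \bar f)u_{EV} + \bar\varepsilon\big)'\big(-(\gamma + \bar f)u_{EV} + \bar\varepsilon\big) \,\Big|\, F\Big)\,\tilde u_{IV}' \,\Big|\, F\Big).$$
Here, $\bar\varepsilon$ and $\bar f$ are the sample averages of the residuals and factors for t in EV. Following
the same derivation as before, we can show that the above expression converges to
$c_{EV}\, b\Sigma b' + c_{EV}\,\tilde L_{0,IV}$. Here EV = o when IV = e and EV = e when IV = o. Thus, the asymptotic
covariance takes the form in Theorem A1.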
Finally, we prove the asymptotic distribution of $\hat{\bar\gamma}' - (0, \gamma + \bar f)'$ when both T and
N are large. Since (1) $\beta\beta'/N$ converges to $bb'$ as $N \to \infty$, (2)
$[\beta_1^1\delta_1^1, \ldots, \beta_1^N\delta_1^N, \ldots, \beta_1^N\delta_T^N]$ satisfies a Lindeberg condition, and (3)
$(F_{sample}^{d\prime}F_{sample}^{d})^{-1}F_{sample}^{d\prime}\Omega_{sample}^{d} \to 0$ as $T \to \infty$, one can apply the Lindeberg-Feller
Central Limit Theorem to show normality.
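It remains to calculate the asymptotic covariance of the estimator. The asymptotic
covariance for the numerator of the estimator can be written as:
$$E\Big(NT\Big(\frac{1}{N}\,\hat\beta_{1o}\,\frac{2}{T}\sum_{t\ \mathrm{even}}\xi_t'\Big)\Big(\frac{1}{N}\,\hat\beta_{1o}\,\frac{2}{T}\sum_{t\ \mathrm{even}}\xi_t'\Big)'\Big)$$
$$= E\Big(\frac{4}{NT}\Big(\sum_{i=1}^{N}\sum_{t\ \mathrm{even}}\big(\beta_1^i + (F_o^{d\prime}F_o^{d})^{-1}F_o^{d\prime}\varepsilon_o^{d,i}\big)\delta_t^i\Big)\Big(\sum_{i=1}^{N}\sum_{t\ \mathrm{even}}\big(\beta_1^i + (F_o^{d\prime}F_o^{d})^{-1}F_o^{d\prime}\varepsilon_o^{d,i}\big)\delta_t^i\Big)'\Big)$$
$$\to E\Big(\frac{4}{NT}\Big(\sum_{i=1}^{N}\sum_{t\ \mathrm{even}}\beta_1^i\delta_t^i\Big)\Big(\sum_{i=1}^{N}\sum_{t\ \mathrm{even}}\beta_1^i\delta_t^i\Big)'\Big)$$
$$= E\Big(\frac{4}{NT}\Big(\sum_{i=1}^{N}\sum_{t\ \mathrm{even}}\beta_1^i\Big(-\Big(\frac{2}{T}\sum_{t\in EV}f_t + \gamma\Big)\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}\varepsilon_t^{d,i}\Big] + \varepsilon_t^i\Big)\Big)\Big(\sum_{i=1}^{N}\sum_{t\ \mathrm{even}}\beta_1^i\Big(-\Big(\frac{2}{T}\sum_{t\in EV}f_t + \gamma\Big)\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}\varepsilon_t^{d,i}\Big] + \varepsilon_t^i\Big)\Big)'\Big)$$
$$\to E\Big(\frac{4}{NT}\sum_{i=1}^{N}\sum_{t\ \mathrm{even}}\beta_1^i\Big(-\gamma\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}\varepsilon_t^{d,i}\Big] + \varepsilon_t^i\Big)\Big(-\gamma\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}\varepsilon_t^{d,i}\Big] + \varepsilon_t^i\Big)'(\beta_1^i)'\Big).$$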
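The last step holds because the factors are assumed to have zero means and the regression
residuals are both cross-sectionally and serially uncorrelated. Since
$$E\Big(\Big(-\gamma\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}\varepsilon_t^{d,i}\Big] + \varepsilon_t^i\Big)\Big(-\gamma\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}\varepsilon_t^{d,i}\Big] + \varepsilon_t^i\Big)'\Big)$$
$$= E\Big(E\Big(\Big(-\gamma\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}\varepsilon_t^{d,i}\Big] + \varepsilon_t^i\Big)\Big(-\gamma\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}\varepsilon_t^{d,i}\Big] + \varepsilon_t^i\Big)' \,\Big|\, F\Big)\Big)$$
$$= E\Big(\gamma\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}(\sigma_t^i)^2F_t^{d}\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}\Big]\gamma'\Big) + E\Big(\gamma\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}\Big](\sigma_t^i)^2\Big) + E\big((\sigma_t^i)^2\big)$$
$$= [1 + \gamma\Sigma_F^{-1}\gamma'](\sigma_t^i)^2,$$
we can show that:
$$E\Big(\frac{4}{NT}\sum_{i=1}^{N}\sum_{t\ \mathrm{even}}\beta_1^i\Big(-\gamma\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}\varepsilon_t^{d,i}\Big] + \varepsilon_t^i\Big)\Big(-\gamma\Big[\Big(\frac{2}{T}F_{EV}^{d\prime}F_{EV}^{d}\Big)^{-1}F_t^{d\prime}\varepsilon_t^{d,i}\Big] + \varepsilon_t^i\Big)'(\beta_1^i)'\Big)$$
$$\to \frac{4}{NT}\sum_{i=1}^{N}\sum_{t\ \mathrm{even}}\big(\beta_1^i\,(1 + \gamma\Sigma_F^{-1}\gamma')\,(\sigma_t^i)^2\,(\beta_1^i)'\big)$$
$$\to 2\,(1 + \gamma\Sigma_F^{-1}\gamma')\, b\Sigma b'.$$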
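For the same reason, we can show that
$$E\Big(NT\Big(\frac{1}{N}\,\hat\beta_{1e}\,\frac{2}{T}\sum_{t\ \mathrm{odd}}\xi_t'\Big)\Big(\frac{1}{N}\,\hat\beta_{1e}\,\frac{2}{T}\sum_{t\ \mathrm{odd}}\xi_t'\Big)'\Big) \to 2\,(1 + \gamma\Sigma_F^{-1}\gamma')\, b\Sigma b'.$$
Finally, notice that as $N \to \infty$,
$$\frac{1}{N}(\hat\beta_{1IV}\hat\beta_{1EV}') \to bb'.$$
With these results, we obtain the final asymptotic distribution:
$$\sqrt{TN}\,\big(\hat{\bar\gamma}' - (0, \gamma + \bar f)'\big) \to N(0, A^{-1}DA^{-1}).$$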
Appendix 2: Details of Simulation Experiments
A2.1. Simulation Parameters

Table A.1 presents the parameters that we use with constant betas in sections 4.A
and 4.B. We set these parameters equal to the mean risk premiums of the common factors
and their covariance structure during the 1956 to 2012 sample period (Panel A). We
determine the cross-sectional means and standard deviations of betas and the volatility
of firm-specific returns by running time-series regressions during this sample period and
shrinking the betas with a simple adjustment rule: adjusted beta = 2/3×beta estimate +
1/3 (Panel B). All simulations use a risk-free rate of 0.9996% per annum.
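The shrinkage rule pulls each estimated beta one third of the way toward one. A minimal sketch, in which the function name is an assumption of this example:

```python
def shrink_beta(beta_estimate):
    """Adjusted beta = 2/3 x beta estimate + 1/3 (shrinkage toward one)."""
    return (2.0 / 3.0) * beta_estimate + 1.0 / 3.0


# A beta of 1 is left unchanged; extreme estimates move toward 1.
examples = {b: shrink_beta(b) for b in (-0.5, 1.0, 2.5)}
```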
TABLE A.1
Simulation Parameters

Panel A: Time-series means and standard deviations of common factors (per annum)

                 Single Factor Model         Fama-French Three-Factor Model
Factor           Mean (%)    StdDev (%)      Mean (%)    StdDev (%)
MKT              5.80        15.69           5.80        15.69
SMB                                          2.64        7.89
HML                                          4.36        7.56

Panel B: Cross-sectional means and standard deviations of constant betas

                                     Single Factor Model     Fama-French Three-Factor Model
                                     Mean      StdDev        Mean      StdDev
MKT beta                             0.95      0.42          0.95      0.42
SMB beta                                                     0.80      0.50
HML beta                                                     0.19      0.51
Idiosyncratic volatility (per day)   0.036     0.015         0.037     0.015
48
A2.2. Simulations with Time-varying Betas
This section describes the procedure that we use for the simulations with time-varying betas discussed in section 4.C. We assume that $\beta_t^i$, the beta of stock i in month t, follows an AR(1) process. Specifically:
$$\beta_t^i - \bar\beta^i = \rho\,(\beta_{t-1}^i - \bar\beta^i) + e_t^i,$$
where $e_t^i$ is the shock to beta and $\bar\beta^i$ is the mean of beta. We set $\rho$ equal to the average autocorrelation coefficient during our sample period. To estimate the AR(1) coefficients, we estimate three-year rolling betas for each stock. We then trim these estimates at the 2.5% and 97.5% levels, and shrink them by applying a simple adjustment rule: adjusted beta = 2/3×beta estimate + 1/3. We compute the average autocorrelation of the betas across stocks, which equals 0.96. Table A.2 also presents the average time-series standard deviations for single-factor and three-factor betas.
To generate time-varying betas, we first randomly generate the time-series mean of each beta as we did for the constant-beta simulations. We next draw $e_t^i$ from a normal distribution with mean zero and standard deviation equal to $\sqrt{1-\rho^2}$ times the average time-series standard deviation of the corresponding beta. We then compute $\beta_t^i$ through the AR(1) specification above. We assume that $\beta_t^i$ stays constant for the 22 trading days of a given month. Finally, using this time-varying factor sensitivity, we generate daily returns by following the same simulation procedure described in section 4.A. We conduct the same IV estimation procedure for risk premiums as in the simulations with constant betas. This simulation procedure is used for the single-factor CAPM and the Fama-French three-factor model. Table A.3 presents the biases and RMSEs of IV risk premium estimates with time-varying betas, and Table A.4 presents the size and power of IV tests with time-varying betas. The results here are similar to the corresponding results in Tables 1 and 2. 28
28 Back et al. (2015) report somewhat different results for the small sample distribution of the test statistic compared to the results in Table A.4. We are not able to replicate their results.
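The beta-generation steps in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the time-series means are drawn from a normal distribution with the Table A.1 parameters, the first month's deviation is drawn from the stationary distribution, and the function names `simulate_monthly_betas` and `expand_to_daily` are hypothetical.

```python
import numpy as np

def simulate_monthly_betas(mean_beta, rho, ts_std, n_months, rng):
    """Simulate AR(1) betas around each stock's time-series mean.

    Shocks have standard deviation sqrt(1 - rho^2) * ts_std, so the
    stationary time-series standard deviation of each beta equals ts_std.
    Returns an array of shape (n_months, N).
    """
    mean_beta = np.asarray(mean_beta, dtype=float)
    shock_std = np.sqrt(1.0 - rho**2) * ts_std
    betas = np.empty((n_months, mean_beta.size))
    # Assumption: start each deviation from the stationary distribution.
    dev = rng.normal(0.0, ts_std, mean_beta.size)
    for t in range(n_months):
        dev = rho * dev + rng.normal(0.0, shock_std, mean_beta.size)
        betas[t] = mean_beta + dev
    return betas

def expand_to_daily(monthly_betas, days_per_month=22):
    """Hold each monthly beta constant for the trading days of that month."""
    return np.repeat(monthly_betas, days_per_month, axis=0)

rng = np.random.default_rng(1)
mean_beta = rng.normal(0.95, 0.42, size=500)  # assumed normal cross-section
monthly = simulate_monthly_betas(mean_beta, rho=0.96, ts_std=0.15,
                                 n_months=660, rng=rng)
daily = expand_to_daily(monthly)
print(monthly.shape, daily.shape)
```

Scaling the shock by $\sqrt{1-\rho^2}$ is what keeps the unconditional dispersion of each simulated beta equal to the target time-series standard deviation.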
Table A.2
Time-varying Beta Parameters
Average time-series standard deviations of time-varying betas and their AR(1) coefficients (= ρ)

            Single Factor Model     Fama-French Three-Factor Model
Beta        ρ        StdDev         ρ        StdDev
MKT         0.96     0.15           0.96     0.15
SMB                                 0.96     0.19
HML                                 0.96     0.21
TABLE A.3
Small Sample Properties of IV Risk Premium Estimates
with Time-varying Betas

Panel A: Single-factor CAPM

Risk Factor   Estimator   Ex-ante Bias (%)   Ex-post Bias (%)   Ex-ante RMSE   Ex-post RMSE
MKT           OLS         -25.8              -25.4              0.191          0.139
MKT           IV          -2.98              -2.56              0.180          0.079

Panel B: Fama-French Three-factor Model

Risk Factor   Estimator   Ex-ante Bias (%)   Ex-post Bias (%)   Ex-ante RMSE   Ex-post RMSE
MKT           OLS         -39.4              -39.3              0.234          0.204
MKT           IV          -3.76              -3.71              0.194          0.090
SMB           OLS         -59.6              -62.4              0.144          0.148
SMB           IV          0.83               -1.96              0.132          0.096
HML           OLS         -61.9              -62.9              0.232          0.234
HML           IV          -3.29              -4.21              0.125          0.097
TABLE A.4
Size and Power of the IV Tests
with Time-varying Betas

                Theoretical Percentiles
Risk Factor     1%       2.5%     5%       7.5%     10%
Panel A: Size for Single-factor Model (CAPM)
MKT             1.2%     2.9%     4.7%     7.3%     11.4%
Panel B: Size for Fama-French Three-factor Model
MKT             0.9%     2.5%     5.3%     7.8%     10.7%
SMB             0.8%     2.8%     5.1%     7.6%     10.4%
HML             1.3%     2.0%     4.2%     8.0%     9.5%

Risk Factor             Test Power
Panel C: Power for Single-factor Model (CAPM)
MKT                     84.3%
Panel D: Power for Fama-French Three-factor Model
MKT                     80.4%
SMB                     56.3%
HML                     89.6%
MKT or SMB or HML       99.4%
Appendix 3: Innovations in Illiquidity Costs
We follow Acharya and Pedersen (2005) and fit the following time-series regression to estimate the expected and unexpected components of market-wide illiquidity cost, where the unexpected component is $\tilde c_{MKT,\tau} = c_{MKT,\tau} - E_{\tau-1}[c_{MKT,\tau}]$:
$$0.25 + 0.3\,ILLIQ_{MKT,\tau}\,P_{MKT,\tau-1} = \alpha_0 + \sum_{l=1}^{L} \alpha_l \times \big(0.25 + 0.3\,ILLIQ_{MKT,\tau-l}\,P_{MKT,\tau-1}\big) + \tilde c_{MKT,\tau},$$
where $ILLIQ_{MKT,\tau}$ is the value-weighted average of $\min\Big(ILLIQ_\tau^i,\ \dfrac{30 - 0.25}{0.30\,P_{MKT,\tau-1}}\Big)$, which Acharya and Pedersen define as un-normalized illiquidity, truncated for outliers. We cannot reject the hypothesis that the residuals are white noise based on the Durbin-Watson tests for L=2. The results we report are based on the application of the AR(2) model to estimate expected and unexpected components of illiquidity for the market as well as for individual stocks. We repeat the tests with L ranging from 2 to 6 and find that the results are not sensitive to the choice of L.
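The regression above can be sketched with ordinary least squares, taking the residual as the illiquidity innovation. This is a minimal illustration, not the authors' implementation: the input series are synthetic, the truncation at 30 is assumed to have been applied upstream, and the function name `illiquidity_innovations` is hypothetical.

```python
import numpy as np

def illiquidity_innovations(illiq, p, n_lags=2):
    """Fit the AR(L) regression by OLS and return (alpha, residuals).

    illiq : market illiquidity series ILLIQ_MKT (assumed already truncated)
    p     : price-ratio series P_MKT
    Every term at date tau is scaled by the same P_{MKT, tau-1}.
    """
    illiq = np.asarray(illiq, dtype=float)
    p = np.asarray(p, dtype=float)
    taus = np.arange(n_lags, illiq.size)  # dates with all L lags available
    p_lag = p[taus - 1]
    y = 0.25 + 0.3 * illiq[taus] * p_lag
    lagged = [0.25 + 0.3 * illiq[taus - l] * p_lag for l in range(1, n_lags + 1)]
    X = np.column_stack([np.ones(taus.size)] + lagged)
    alpha, *_ = np.linalg.lstsq(X, y, rcond=None)
    return alpha, y - X @ alpha

# Synthetic example: a persistent illiquidity series and a flat price ratio.
rng = np.random.default_rng(3)
T = 5000
illiq = np.empty(T)
illiq[0] = 0.5
for t in range(1, T):
    illiq[t] = 0.1 + 0.8 * illiq[t - 1] + rng.normal(0.0, 0.01)
p = np.ones(T)

alpha, c_tilde = illiquidity_innovations(illiq, p, n_lags=2)
print(alpha)
```

Because the regression includes an intercept, the estimated innovations have mean zero in sample; the residual series is what would be passed to a white-noise check such as the Durbin-Watson test mentioned above.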
Appendix 4: Proof of Proposition 2
For expositional convenience, assume that the even-month beta is the independent variable and the odd-month beta is its instrument. We need to show that the correlation of the true beta ($x$) and the estimated beta ($x^*$) from even months is the square root of the correlation of the estimated beta ($x^*$) and its instrument ($z$), i.e.,
$$\text{correlation}(x, x^*) = \sqrt{\text{correlation}(x^*, z)},$$
where
$$x^* = x + u_{even}, \qquad z = x + u_{odd},$$
and $x$, $u_{even}$ and $u_{odd}$ are mutually independent and $\sigma_u^2 = \sigma_{u_{even}}^2 = \sigma_{u_{odd}}^2$.
By the definition of correlation,
$$\text{correlation}(x^*, z) = \frac{\text{cov}(x^*, z)}{\sqrt{\text{var}(x^*)\,\text{var}(z)}} = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2},$$
$$\text{correlation}(x, x^*) = \frac{\text{cov}(x, x^*)}{\sqrt{\text{var}(x)\,\text{var}(x^*)}} = \frac{\sigma_x^2}{\sqrt{\sigma_x^2\,(\sigma_x^2 + \sigma_u^2)}} = \sqrt{\text{correlation}(x^*, z)}.$$
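The result can be checked numerically. The sketch below draws a true beta and two independent estimation errors with equal variance, then compares the two correlations; the distributions, standard deviations, and sample size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
x = rng.normal(0.0, 0.42, n)       # true beta (assumed normal)
u_even = rng.normal(0.0, 0.40, n)  # even-month estimation error
u_odd = rng.normal(0.0, 0.40, n)   # odd-month estimation error, same variance

x_star = x + u_even                # estimated beta from even months
z = x + u_odd                      # instrument: estimated beta from odd months

corr_instrument = np.corrcoef(x_star, z)[0, 1]
corr_true = np.corrcoef(x, x_star)[0, 1]
print(corr_true, np.sqrt(corr_instrument))
```

The two printed numbers should agree up to sampling error, which is the relation used to infer the correlation between true and estimated betas from the observable odd-even correlation.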
Figure 1
Percentage Bias versus Time-series Length T
Single Risk Factor
This figure presents the ex-ante and ex-post biases in the estimated market risk premiums as percentages of the true market risk premium from two sets of simulations, one in which risk premiums are estimated using ordinary least squares (=OLS) regressions and the other using Instrumental Variables (=IV) regressions. All simulations are based on a risk-free rate of 0.9996% and a true market risk premium of 5.8008% per annum. The simulations use 2000 individual stocks in the cross-section. Appendix 2 describes the details of the simulation experiments. The horizontal axis is the number of days (T) used to estimate betas. We run 1,000 repetitions for each T.
[Figure: Percentage Bias vs Time-series Length T. Vertical axis: Bias / True MKT Premium (%). Horizontal axis: Length of Firm Time-series (T in days, 0 to 3000). Series: OLS (ex-ante), IV (ex-ante), OLS (ex-post), IV (ex-post).]
Table 1
Small Sample Properties of Risk Premium Estimates
Panel A presents the bias in the slope coefficients when the second-stage regressions are fitted using the OLS and Instrumental Variable (=IV) methods under the single-factor CAPM, and Panel B presents the results for the Fama-French three-factor model. Appendix 2 describes the simulations in more detail. There are 2000 stocks in the cross-section, and the results are based on 1,000 repetitions. The sample period for the simulations is 660 months. Rolling betas are estimated each month using daily return data over the previous 24 months. The IV method uses data over 12 months to estimate the independent variables (betas) and data over the other 12 months to estimate the instrumental variables. Ex-ante bias is the difference between the mean risk premium estimate and the corresponding true parameter. Ex-post bias is the difference between the mean risk premium estimate and the sample mean of the corresponding risk factor realizations. Ex-ante and ex-post biases are expressed as percentages of the true parameters.
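Panel A: Single-factor CAPM

Risk Factor   Estimator   Ex-ante Bias (%)   Ex-post Bias (%)   Ex-ante RMSE   Ex-post RMSE
MKT           OLS         -28.4              -28.8              0.194          0.156
MKT           IV          -0.30              -0.70              0.193          0.088

Panel B: Fama-French Three-factor Model

Risk Factor   Estimator   Ex-ante Bias (%)   Ex-post Bias (%)   Ex-ante RMSE   Ex-post RMSE
MKT           OLS         -42.7              -40.4              0.241          0.208
MKT           IV          -2.10              0.20               0.191          0.095
SMB           OLS         -64.7              -66.1              0.152          0.157
SMB           IV          0.80               -0.70              0.131          0.102
HML           OLS         -66.3              -65.9              0.246          0.246
HML           IV          -0.40              0.00               0.134          0.109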
Table 2
Size and Power of the IV Test
Panels A and B of this table present the test sizes under the null hypotheses that the risk premiums equal zero, using the t-statistics of the corresponding slope coefficients. The slope coefficients are obtained from the IV estimator and the t-statistics are based on their Fama-MacBeth standard errors. Panels A and B present the results for the CAPM and for the Fama-French three-factor model, respectively. Appendix 2 describes the details of the simulation experiments. The number of stocks in the cross-section is set to N=2000, and the results are based on 1,000 repetitions. The sample period for the simulations is 660 months. Rolling betas are estimated each month using daily return data over the previous 24 months, with data over 12 months used to estimate the independent variables (betas) and data over the other 12 months used to estimate the instrumental variables. The simulations in Panel C (CAPM) set the market risk premium (MKT) equal to 5.8%, and those in Panel D (Fama-French three-factor model) set MKT, SMB and HML equal to 5.8%, 2.64% and 4.36%, respectively. Panels C and D present the percentage of simulations that reject the null hypothesis that the respective factor risk premiums are less than or equal to zero at the 5% significance level. The row labeled “MKT or SMB or HML” presents the percentage of simulations that reject the null hypothesis that at least one of the risk premiums is less than or equal to zero at the 5% significance level.
                Theoretical Percentiles
Risk Factor     1%       2.5%     5%       7.5%     10%
Panel A: Size for Single-factor Model (CAPM)
MKT             0.9%     2.4%     5.1%     7.7%     9.8%
Panel B: Size for Fama-French Three-factor Model
MKT             1.1%     2.3%     4.7%     7.9%     9.9%
SMB             1.0%     2.4%     4.9%     7.6%     9.8%
HML             0.9%     2.7%     5.3%     7.3%     10.3%

Risk Factor             Test Power
Panel C: Power for Single-factor Model (CAPM)
MKT                     82.8%
Panel D: Power for Fama-French Three-factor Model
MKT                     78.1%
SMB                     50.1%
HML                     84.2%
MKT or SMB or HML       98.2%
Table 3
Summary Statistics for Stock Test Data
Summary statistics include the mean, median, standard deviation, and first and third quartiles. All other rows are based on the time series of the corresponding statistics. Market capitalization is price multiplied by the number of shares outstanding. We compute book-to-market ratios as in Davis et al. (2000). Excess return is relative to the one-month T-bill rate. Return volatility is the standard deviation of daily returns. The sample period is from January 1956 through December 2012.
                               Mean      Median    Standard Deviation    Q1        Q3
Number of stocks each month    1934      1980      900                   1368      2697
Time-series length             176       136       134                   76        231
Capitalization, $ billion      1.498     0.191     6.509                 0.052     0.751
Book-to-market ratio           0.904     0.750     0.674                 0.463     1.151
Excess return (%)              0.888     0.044     11.202                -5.391    5.996
Return volatility (%)          2.711     2.336     1.652                 1.628     3.351
Table 4
Correlations Among Estimated Factor Sensitivities (Betas)
and Size and Book-to-Market Characteristics
This table presents the average cross-sectional correlations among betas and size and book-
to-market ratios. Betas are estimated for each month using daily returns data from the
previous 36 months. SIZE is the natural logarithm of market capitalization and BM is the
book-to-market ratio. Panel A reports correlations for the CAPM and Panels B and C report
analogous correlations for the Fama-French three-factor model. The sample period is from
January 1956 to December 2012.
Panel A: CAPM

                                    SIZE      BM
Individual stocks           MKT     -0.18     -0.20
25 Fama-French portfolios   MKT     -0.56     -0.44

Panel B: Fama-French three-factor model: Individual stocks

        MKT      SMB      HML      SIZE     BM
MKT     1
SMB     0.35     1
HML     0.14     0.13     1
SIZE    0.15     -0.44    -0.15    1
BM      -0.12    0.06     0.28     -0.35    1

Panel C: Fama-French three-factor model: 25 size and BM sorted portfolios

        MKT      SMB      HML      SIZE     BM
MKT     1
SMB     -0.08    1
HML     -0.08    -0.15    1
SIZE    0.19     -0.97    -0.01    1
BM      0.07     -0.03    0.88     -0.08    1
Table 5
Risk Premium Estimates with Individual Stocks
CAPM and Fama-French Three-Factor Model
The IV method estimates risk premiums, in percent per month, using individual stocks as
test assets. Rows labelled MKT, SMB and HML are risk premiums for the market, SMB
and HML factors, respectively and the corresponding t-statistics are in parentheses (bold if
significant at the 5% level). SIZE is the natural logarithm of market capitalization and BM
is the book-to-market ratio at the end of the previous month. Betas for each month are
estimated using daily returns data over the previous 36 months and cross-sectional
regressions are fitted using the IV method. The sample period is from January 1956 through
December 2012. N is the mean number of stocks in the monthly cross-sections.
(1) (2) (3) (4)

Panel A: 1956-2012, N=1936
Const 1.050 0.784 3.544 3.512
(7.80) (5.66) (5.07) (5.77)
MKT -0.189 -0.315 0.010 0.113
(-1.00) (-1.65) (0.05) (0.62)
SMB 0.311 -0.077
(2.09) (-0.71)
HML 0.504 0.259
(3.22) (1.77)
SIZE -0.152 -0.161
(-4.31) (-5.19)
BM 0.163 0.134
(3.50) (3.13)
Panel B: 1956-1985, N=1239
Const 1.171 0.769 3.350 3.663
(6.39) (4.25) (3.41) (3.94)
MKT -0.386 -0.394 -0.163 0.061
(-1.67) (-1.55) (-0.70) (0.24)
SMB 0.358 -0.082
(1.75) (-0.60)
HML 0.594 0.317
(2.64) (1.56)
SIZE -0.144 -0.167
(-2.80) (-3.52)
BM 0.204 0.171
(3.00) (2.65)
Panel C: 1986 to 2012, N=2710
Const 0.871 0.759 3.687 3.462
(4.57) (3.96) (3.77) (4.24)
MKT 0.030 -0.254 0.201 0.301
(0.10) (-0.85) (0.62) (1.07)
SMB 0.322 -0.155
(1.37) (-0.84)
HML 0.321 0.248
(1.47) (1.19)
SIZE -0.160 -0.157
(-3.32) (-3.91)
BM 0.116 0.092
(1.84) (1.67)
Table 6
Risk Premium Estimates with Individual Stocks
Fama-French Five-Factor Model
This table reports the risk premiums estimated using the IV method, in percent per month,
using individual stocks as test assets and the corresponding t-statistics in parentheses (bold
if significant at the 5% level). Rows labelled MKT, SMB, HML, RMW, and CMA are
risk premiums for the market, SMB, HML, RMW, and CMA factors, respectively. SIZE is
the natural logarithm of market capitalization and BM is the book-to-market ratio at the
end of the previous month. OP and INV are the operating profitability and investment/total
asset, respectively. Betas for each month are estimated using daily returns data over the
previous 36 months. Panels A, B and C report results with the IV-method. The sample
period is from January 1964 through December 2012. N is the mean number of stocks in
the cross-sections.
(1) (2) (3) (4) (5) (6)

Panel A: 1964-2012, N=2256
Const 0.891 0.886 0.901 0.891 0.954 2.753
(3.38) (3.40) (4.87) (3.50) (3.89) (4.07)
MKT -0.736 -0.125
(-2.81) (-0.52)
SMB 0.508 0.034
(2.20) (0.20)
HML 0.522 0.231
(2.12) (0.99)
RMW 0.247 -0.071 0.183 0.067
(1.45) (-0.32) (1.11) (0.33)
CMA 0.121 0.445 -0.032 0.275
(0.52) (1.60) (-0.14) (1.01)
Size -0.127
(-3.53)
BM 0.167
(3.79)
OP -0.172 0.242
(-1.16) (2.55)
INV -0.659 -0.592
(-5.42) (-7.32)
Panel B: 1964-1988, N=1747
Const 0.765 0.457 0.869 0.871 0.653 1.520
(2.02) (1.10) (2.81) (2.11) (1.49) (1.39)
MKT -0.741 -0.373
(-1.66) (-0.86)
SMB 0.316 0.147
(0.82) (0.66)
HML 0.464 0.270
(1.18) (0.79)
RMW 0.233 -0.437 0.17 -0.082
(1.10) (-1.48) (0.80) (-0.29)
CMA 0.794 0.619 0.583 0.062
(2.34) (1.53) (1.71) (0.16)
Size -0.08
(-1.12)
BM 0.298
(3.86)
OP -0.160 0.612
(-0.53) (3.02)
INV -0.991 -1.006
(-3.56) (-5.28)
Panel C: 1989-2012, N=2741
Const 0.983 1.053 0.884 0.957 1.081 3.453
(3.08) (3.36) (3.67) (3.13) (3.46) (4.07)
MKT -0.567 0.190
(-1.70) (0.65)
SMB 0.502 -0.046
(1.68) (-0.19)
HML 0.856 0.429
(2.68) (1.35)
RMW 0.218 -0.052 0.204 0.068
(0.84) (-0.18) (0.87) (0.26)
CMA -0.337 0.102 -0.439 0.033
(-1.16) (0.28) (-1.54) (0.09)
Size -0.152
(-3.79)
BM 0.070
(1.25)
OP -0.202 0.086
(-1.48) (0.93)
INV -0.453 -0.351
(-5.23) (-5.56)
Table 7
Correlations Among Estimated Betas
and SIZE and Book-to-market Ratios: The q-factor Asset Pricing Model
This table reports time series averages of cross-sectional correlations among I/A and ROE
betas and size and book-to-market ratios. Betas are estimated for each month using daily
returns data from the previous 36 months. SIZE is the natural logarithm of market
capitalization and BM is the book-to-market ratio. Panel A reports the results for individual
stocks and Panel B reports the results for 25 Fama-French size and book-to-market sorted
portfolios. The sample period is January 1972 to December 2012.
Panel A: Individual stocks

        MKT      I/A      ROE      SIZE     BM
MKT     1
I/A     0.04     1
ROE     -0.03    0.33     1
SIZE    0.24     -0.05    0.12     1
BM      -0.17    0.09     -0.07    -0.32    1

Panel B: 25 Fama-French size and BM sorted portfolios

        MKT      I/A      ROE      SIZE     BM
MKT     1
I/A     -0.70    1
ROE     -0.69    0.52     1
SIZE    -0.44    0.04     0.74     1
BM      -0.48    0.88     0.29     -0.08    1
Table 8
Risk Premium Estimates and Characteristics
The q-factor Asset Pricing Model
if significant at the 5% level). Rows labeled MKT, ME, I/A, and ROE report factor risk
premium estimates. Rows labeled SIZE and BM contain mean cross-sectional slope
coefficients on the natural logarithm of market capitalization and the book-to-market ratio
at the end of the previous month. Betas for each month are estimated using daily returns
over the previous 36 months. The sample period is January 1972 through December 2012.
N is the average number of stocks.
(1) (2) (3) (4) (5) (6) (7) (8)

Panel A: 1972-2012, N=2431
Const 1.157 0.764 0.958 0.881 1.153 0.563 4.253 3.465
(7.11) (4.45) (3.65) (3.65) (6.17) (2.35) (5.23) (4.45)
MKT -0.150 -0.639 -0.064
(-0.56) (-2.30) (-0.27)
ME 0.263 0.252 -0.157
(1.24) (1.12) (-0.97)
I/A 0.320 -0.074 0.246 -0.114
(1.73) (-0.34) (1.36) (-0.54)
ROE -0.313 -0.796 -0.170 -0.508
(-1.33) (-2.80) (-0.75) (-1.86)
Size -0.173 -0.143
(-4.66) (-3.75)
BM 0.286 0.199
(4.41) (4.07)
Panel B: 1972-1992, N=2082
Const 1.081 0.794 0.991 0.812 1.051 0.575 4.270 3.152
(4.21) (2.91) (2.67) (2.30) (3.99) (1.56) (3.22) (2.61)
MKT 0.070 -0.412 0.114
(0.21) (-1.19) (0.35)
ME 0.390 0.430 -0.083
(1.44) (1.34) (-0.41)
I/A 0.831 -0.205 0.526 -0.214
(3.32) (-0.72) (2.19) (-0.76)
ROE -0.207 -0.832 -0.249 -0.561
(-0.76) (-2.35) (-0.94) (-1.61)
Size -0.178 -0.138
(-2.84) (-2.19)
BM 0.289 0.178
(3.07) (2.60)
Panel C: 1993-2012, N=2768
Const 1.351 0.875 0.853 0.921 1.250 0.601 4.055 3.601
(5.86) (3.43) (2.47) (2.79) (4.62) (1.56) (4.41) (3.59)
MKT -0.352 -0.848 -0.200
(-0.84) (-2.00) (-0.57)
ME 0.235 0.196 -0.323
(0.70) (0.62) (-1.30)
I/A 0.056 0.105 0.066 -0.073
(0.22) (0.31) (0.26) (-0.23)
ROE -0.206 -0.517 -0.064 -0.476
(-0.54) (-1.11) (-0.17) (-1.12)
Size -0.165 -0.144
(-4.01) (-3.12)
BM 0.335 0.228
(3.74) (3.31)
Table 9
Risk Premium Estimates with Individual Stocks: Liquidity-adjusted CAPM
if significant at the 5% level). The row labeled LMKT reports estimated illiquidity risk
premiums under the liquidity-adjusted CAPM (LCAPM), and the row labeled Amihud
illiquidity reports the slope coefficient on firm-specific Amihud illiquidity. The slope
coefficients are in percent per month, and the corresponding t-statistics are in parentheses
(bold at the 5% level). N is the average number of stocks.
Sample Period
1956-2012, N=1265 1956-1985, N=1192 1986-2012, N=1344
Constant 0.641 0.551 0.811 0.657 0.431 0.358
(4.64) (4.07) (4.38) (3.84) (1.83) (1.92)
LMKT 0.140 0.075 -0.086 -0.136 0.462 0.300
(0.63) (0.34) (-0.27) (-0.44) (1.47) (0.97)
Amihud 0.184 0.310 0.040
Illiquidity (3.89) (3.53) (1.91)
Table 10
Strength of Instruments
This table presents the average correlations between odd- and even-month factor loading estimates (betas) under the models indicated in the panel headings. The critical value for the weak instruments test proposed by Nelson and Startz (1990) is 0.06, based on the smallest number of stocks in the sample in any month. The square root of the odd- and even-month correlation is the correlation between the unobservable “true” betas and the corresponding beta estimates (see section 5.F and Appendix 4).
Panel A: CAPM

                 Corr(Odd, Even)    Corr(True Beta, Beta Est.)
Sample period    MKT                MKT
1956-2012        0.67               0.82

Panel B: Fama-French Three-factor Model

                 Corr(Odd, Even)            Corr(True Beta, Beta Est.)
Sample period    MKT     SMB     HML        MKT     SMB     HML
1956-2012        0.52    0.44    0.30       0.71    0.66    0.54

Panel C: Fama-French Five-factor Model

                 Corr(Odd, Even)                            Corr(True Beta, Beta Est.)
Sample period    MKT     SMB     HML     RMW     CMA        MKT     SMB     HML     RMW     CMA
1964-2012        0.42    0.35    0.19    0.18    0.14       0.65    0.59    0.44    0.43    0.38

Panel D: q-factor Asset Pricing Model

                 Corr(Odd, Even)                    Corr(True Beta, Beta Est.)
Sample period    MKT     ME      I/A     ROE        MKT     ME      I/A     ROE
1972-2012        0.48    0.38    0.20    0.21       0.69    0.62    0.45    0.46

Panel E: Liquidity-adjusted CAPM (LCAPM)

                 Corr(Odd, Even)    Corr(True Beta, Beta Est.)
Sample period    LMKT               LMKT
1956-2012        0.58               0.76