

Asymmetric Information in the Subprime Mortgage Market

Article  in  The Journal of Real Estate Finance and Economics · November 2011


DOI: 10.1007/s11146-010-9288-6


4 authors:

James B. Kau, University of Georgia
Donald Keenan, University of Georgia
Constantine Lyubimov, FNMA
V. Carlos Slawson, Louisiana State University


All content following this page was uploaded by V. Carlos Slawson on 20 May 2014.



Asymmetric Information in the Subprime
Mortgage Market

James B. Kau∗
Department of Insurance, Legal Studies and Real Estate
University of Georgia

Donald C. Keenan
Department of Economics and Management
Université de Cergy-Pontoise & THEMA

Constantine Lyubimov
Department of Insurance, Legal Studies and Real Estate
University of Georgia

V. Carlos Slawson
Department of Finance
Louisiana State University

Feb. 25, 2010


∗Corresponding author: Brooks Hall 298a, Department of Insurance, Legal Studies, and Real Estate, Terry College of Business,
University of Georgia, Athens, Georgia 30602-6255, (706) 542-3805, fax (706) 542-4295, jkau@uga.edu

1 Introduction

1.1 Asymmetric Information and Mortgages

Asymmetric information is generally distinguished by whether there is adverse selection or
moral hazard, often called hidden knowledge or hidden action, respectively.1 In the case of
hidden knowledge, at least one party to a transaction has knowledge not available to another,
whereas with hidden action one party acts in a way that is not observed by the other. According
to this distinction, knowledge is an unalterable fact about which of various types a person or
object happens to be, while an action, instead, is a conscious choice by the party concerned.
In the case of mortgage markets, which we will be dealing with, adverse selection is perhaps
paramount, since most acts are observable by all parties, though as will be seen, the distinction
between the two forms of asymmetry is not always clear cut, even in theory, while the empirical
implications of the two forms of asymmetric information are often quite similar.

Information asymmetry would be unimportant if it did not affect the parties and the function-
ing of the market in a substantial way. However, beginning with the work of Akerlof (1970),
economists have convincingly argued that not only can asymmetric information in a potential
transaction affect the participants but, if systematically present, can destroy the usual effi-
ciency properties of markets. Indeed, in extreme cases, adverse selection can cause a market to
completely cease to function. A case of much interest to this paper, the mortgage crisis and the
subsequent seizing up of world credit markets in late 2008, has commonly been attributed to
severe adverse selection problems, when potential purchasers of credit instruments came to the
abrupt realization that they had significantly less information about the prospect of repayment
of such loans than did the issuers.2

1
Hidden knowledge is also commonly referred to as hidden or private information.
2
The credit crisis is sometimes attributed, instead, more to the drying up of overall liquidity than to asymmetry
of any given transaction, so that even if the potential purchaser had a realistic notion of the worth
of an asset he could not count on being able to resell it at a price reflecting this knowledge. However, this
is, of course, just evidence of adverse selection problems in this market further down the line impinging on a
particular transaction, even though there would be no asymmetric information to the given transaction, and
thus an example of how adverse selection in some potential transactions can indeed propagate and distort the
entire market. One could take the alternative view that all that happened was that everyone discovered credit
instruments were more risky than theretofore thought, but in such a case of symmetric information, the assets
should simply have lost value in a mutually understood way, with no resulting seizing up of markets.

While it is an oversimplification of a complex set of interactions among numerous parties, we
regard there as being three main sorts of parties in the origination and eventual
securitization of mortgages. The first two parties are the borrower and the originator, or
primary lender, who face one another in the primary mortgage market. The third sort of
participant is the ultimate investor, or secondary lender, who then faces the primary lender in
the private secondary mortgage market. The principal distinction among the three groups,
from our point of view, is that the borrower has the most information about the prospects for
payment, prepayment, or default of the mortgage in the future, since of course it is his loan
and his actions to come. Next comes the primary lender, who presumably has some
personal familiarity with the borrower and who must approve the loan. The primary lender is
not necessarily privy, though, to all the knowledge the borrower has of his own characteristics.
However this may be, the parties with the least amount of information are certainly the ultimate
investors in the secondary mortgage security market. It is widely accepted that they pay little
attention to personal characteristics, other than such things as the borrower’s FICO score.3

1.2 Adverse Selection and Moral Hazard with Hidden Information

The primary market can be adequately thought of in terms of the adverse selection problem
faced by the originator, who does not know the borrower’s full characteristics. The originator
offers a menu of contracts, and according to his hidden nature, the borrower chooses the
particular contract, partly revealing his nature after the fact. Thus, for example, borrowers more
prone to default tend to take larger loans, given the size of their house. On the other hand, the
secondary mortgage market is usually thought of in terms of a moral hazard problem between
a principal, who is the ultimate investor, and his agent, who is the primary lender providing
the loan. This viewpoint, though, would seem to be in conflict with our earlier observation
that all acts are observed by all parties, so in the language of game theory, while one
has a situation of incomplete information (hidden knowledge), one apparently does not have a
situation of imperfect information (hidden action). However, while primary lenders’ acts may
be easily observed, the strategies on which they are based are not so easily observed, since
what they choose to do certainly depends on what they know, and part of what they know
is hidden. One may thus regard their strategies, that is their contingent plans, mapping from
their hidden knowledge to their acts, as being the actions which are hidden, and under this
interpretation of what constitutes the hidden action, one can indeed regard the situation as one
of moral hazard. This viewpoint is sometimes termed “moral hazard with hidden information”
(see Macho-Stadler & Pérez-Castrillo (2001) or Gjesdal (2007)).4 The primary example would
be the unseen rule by which a lender decides to approve or disapprove a loan, partly based
on information not available to the secondary mortgage market. The moral hazard problem
is, of course, that originators will be motivated not to follow a rule of only approving loans of
adequate quality, given that the loans will simply be passed on to investors in the secondary
mortgage market not so able to ascertain that quality. The buyers in the secondary market,
by definition, observe that the loan was approved by the originator, but do not observe the
considerations that went into this approval; it is this sense of hidden action that gives rise to
the moral hazard problem.5 Another example of such an unseen rule of conduct arises when the
lender does not entirely originate to distribute (o.t.d.), and so must decide by some unobserved
criterion whether to retain the loan or sell it to the secondary market.6

3
As with the literature on insurers, it is often regarded as a bit of a puzzle why the private secondary market
does not make use of more information than it does, when as we will see there is good evidence that some of it is
relevant, and where this information is potentially available to these final investors. (See, e.g., Finkelstein and
Poterba (2006), who describe the U.K. annuities market, where insurers fail to use policyholders’ residential
addresses.) Some of the reason, besides the obvious costs of such determinations, may involve wariness of
running afoul of various anti-discrimination statutes or suffering regulatory disfavor. For our purposes, it is
adequate that, for whatever reason, the secondary market does not in fact appear to exploit this information.

While the classical solution ameliorating the adverse selection problem in the primary market
is to offer a menu of contracts with borrowers sorting themselves among the contracts according
to their types, the classical solution to the moral hazard in the secondary market would be to
have the originators continue to subsequently share in the risk of the loan. Here, the market has
not so clearly followed the prescriptions of theory. In the o.t.d. case, the originator ceases to
have much interest in the loan after selling it, whereas in the originate-to-hold (o.t.h.) model,
where originators do retain some loans in their portfolio, this merely aggravates the problem
for the loans they do pass on, which are the ones we are observing in the secondary market.
Some relief from the moral hazard problem may be offered by the practice of the originator
continuing to service the loan, but fear of loss of servicing fees appears to be a somewhat weak
incentive to properly assess a loan at origination, particularly given the origination fees to be
had by approval.
4
One could argue that the distinction from the classical moral hazard problem is unimportant, and that the
situation is, in fact, rather like the classical one, where an employer sees the output of a worker, but not his
effort. But in that situation, the moral hazard depends on there being noise in the relation of effort to output,
since otherwise the employer could indeed infer the effort. While there may well be noise in loan approval,
nonetheless, the moral hazard dilemma arising does not depend on it: even without noise, the buyer will not
be able to infer the quality of the loan, since knowledge of this depends on factors known to the originator but
hidden from the buyer. Thus, hidden information plays a critical factor in the secondary market’s moral hazard
problem absent from the classical, pure moral hazard problem.
5
Of course, if the rule is applied by the primary lender to enough loans, the ultimate investors will eventually
acquire enough information to statistically infer the rule, at least in retrospect, which is to say that a
longstanding primary lender will then acquire a reputation, so that credible commitment to a continued presence
in the market creates incentives to avoid exploiting the immediate advantage to be had from hidden knowledge.
6
Note that, in both examples, while buyers in the secondary market see what is offered to them, they do not
know what was not offered, in the former case presumably because the potential loan was not good enough and
in the latter case presumably because the loan was too good.
1.3 Previous Work on Asymmetric Information in Mortgage Markets

Literature testing for asymmetric information in mortgage markets is of recent origin. Edelberg
(2004), using data from the Survey of Consumer Finances, finds evidence of adverse selection
in the primary mortgage market, in the sense that borrowers who ex-post were riskier put
less money into downpayments, as well as paying higher contract rates. Examples of empiri-
cal studies presenting indirect evidence of moral hazard by primary lenders in the secondary
mortgage market are Dell’Ariccia, Igan, and Laeven (2008), Keys et al. (2010), and Mian and
Sufi (2009). Dell’Ariccia et al. (2008), using individual loan data obtained from HMDA files,
take denial rates to be their primary measure of the quality of lending, and find that declines
in lending standards over time were more pronounced in regions where a larger share of loans
were resold in the secondary market. Keys et al. (2010) compare the performance of securitized
subprime loans originated by 48 banks against those of 57 independent lenders, and find the
quality of the former loans to be poorer. Mian and Sufi (2009) determine that the supply of
home credit to non-prime borrowers expanded most rapidly in those areas with the highest
rates of house price appreciation that also experienced negative income growth. They largely
attribute the increase in mortgage originations in such areas to the rising demand for mortgages
to be securitized, and find that areas in which a greater share of originated mortgages were sold
on the secondary non-agency market exhibited relatively larger default rates in the subsequent
period, 2005-2007. On the other hand, Elul (2009), who unlike Keys et al. (2010), is able to
directly compare loans that are securitized with those that are retained in the lenders’ portfolios,
does not find any marked difference in the performance of the securitized and retained
non-prime loans, though securitized prime loans of later vintages are indeed found to perform
worse than retained prime loans, with prime ARMs exhibiting a worse delinquency pattern
than the prime fixed-rate mortgages (FRMs). This is to be compared with earlier findings of
Ambrose, LaCour-Little and Sanders (2005), who analyzed a portfolio of FRMs originated by
a single lender in the late 90s, many of whose loans were subsequently securitized. Ambrose et
al. found that the loans retained in the originator’s portfolio were of disproportionately higher
risk.7,8
7
See DeMarzo and Duffie (1999) for a model in which the informed lender would prefer to sell loans with
the lowest degree of asymmetry.
8
Downing, Jaffee and Wallace (2009), in their study of mortgage-backed securities backed by Freddie Mac
participation certificates, find that pools underlying multi-class mortgage-backed securities (MBSs) allocated to
Real Estate Mortgage Investment Conduits (REMICs) fare worse than non-REMIC pools, which they interpret
as evidence of Government Sponsored Enterprise (GSE) exploitation of information asymmetries. An analysis of
the pricing of commercial mortgage-backed security (CMBS) pools by An, Deng and Gabriel (2009) suggests that
“conduit lending”, which is analogous to the originate-to-distribute model, may have mitigated the potential

2 The Empirical Analysis

2.1 The Empirical Methodology

Empirical work on such asymmetric information is a delicate enterprise, particularly given
that what is hidden to the uninformed party is often hidden to the empirical investigator as
well. While there has been extensive work on asymmetry in various fields, the empirical work
on the insurance industry, inspired by the ground-breaking theoretical work of Rothschild &
Stiglitz (1976), bears the most obvious relation to our analysis.9 In the much studied case of
automobile insurance, the event in question, y, is the report of a claim, which would correspond,
in our context, to a default (or a prepayment). The well-known prediction, coming out of the
Rothschild & Stiglitz model when there is hidden knowledge by the insuree of his own private
characteristics, b, is that of a “positive correlation” between the contractual features, c, of the
insurance policy and the probability, h(y), of an accident, after having taken into account the
insurer’s knowledge of the insuree’s non-private characteristics, a.10 The principle is that if
adverse selection is absent, so that the contractual features are chosen by the insuree solely as
c(a), and not c(a, b), then one should obtain h(y|a) = h(y|a, c), since with self-selection, the
known characteristics of the borrower would permit one to infer their choice c(a). Conversely,
if c does provide explanatory power in the presence of a, then this would be taken as evidence
that private characteristics b are also entering into the determination of c. The simplest way to
check the conditional independence of c is to run a regression of h(y) against both a and c to
see whether or not c has a zero coefficient.
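
The conditional-independence check described above can be sketched on simulated data. This is only an illustration under stated assumptions, not the hazard specification estimated later in the paper: it uses a linear probability model fitted by ordinary least squares, and the function names (`ols`, `simulate`, `coef_on_c`) and all numeric parameters are hypothetical. When the contract choice depends on the private characteristic b, the coefficient on c remains positive after controlling for a; when c depends on a alone, it is close to zero.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * p for x, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate(gamma, n=5000, seed=0):
    """Toy data: gamma controls how strongly the hidden b enters the choice c."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        a = rng.gauss(0, 1)                          # public characteristic
        b = rng.gauss(0, 1)                          # private characteristic
        c = 0.5 * a + gamma * b + rng.gauss(0, 1)    # contract choice c(a, b)
        p = min(max(0.2 + 0.05 * a + 0.10 * b, 0.0), 1.0)  # assumed event probability
        y.append(1.0 if rng.random() < p else 0.0)   # event indicator (e.g. default)
        X.append([1.0, a, c])
    return X, y


def coef_on_c(gamma):
    """Coefficient on c in a regression of y on (1, a, c)."""
    X, y = simulate(gamma)
    return ols(X, y)[2]


with_hidden_type = coef_on_c(1.0)   # c = c(a, b): positive correlation expected
without_hidden = coef_on_c(0.0)     # c = c(a): coefficient should be near zero
```

In practice one would also report a standard error for the coefficient on c; the sketch leaves that out to keep the logic of the test visible.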

The first work to empirically test the Rothschild-Stiglitz model of adverse selection appears to
be that of Puelz & Snow (1994), who ran a regression of the contractual feature of
the automobile insurance policy (the deductible) against the occurrence of an accident, as well
as various characteristics of the insuree known to the insurer. The positive coefficient of the
occurrence of an accident obtained in the regression was taken as confirmation of the presence
of adverse selection.11 Dionne et al (1997) subsequently criticized the work of Puelz & Snow
on statistical grounds and proposed that the expected probability of an accident h(y|a), based
on a regression of the insuree’s commonly known type a against the occurrence of a claim y,
be included in the regression as well, to guard against specification error. Upon doing this,
they showed that the earlier positive correlation between the occurrence of an accident and the
deductible completely disappeared. Subsequent empirical work, notably that of Chiappori &
Salanie (2000), has tended to confirm the absence of evidence of adverse selection in the
automobile insurance industry, as well as in life insurance markets (Cawley & Phillipson (1999),
McCarthy & Mitchell (2003)). On the other hand, Cutler & Zeckhauser (2000) review an
extensive literature finding positive correlation in the health insurance industry, while Finkelstein
& Poterba (2004) produce evidence for such positive correlation when examining annuities.

adverse selection problem associated with choosing mortgages for securitization from a portfolio. They find
that portfolio-originated loans are priced at a discount in the CMBS market, after controlling for credit quality,
which they find to be lower than that of the conduit loans.
9
The one field, other than insurance, where empirical work on asymmetric information is the most developed
is that of auctions, where extensive applied techniques have been employed with considerable success (see
Paarsch & Hong (2006)).
10
The true type of the borrower, θ = (a, b), would then consist of his private as well as non-private
characteristics.
11
The fact that c is regressed against a and y rather than h(y) against a and c is unimportant to the conditional
independence test, though the fact that ex post claim occurrence y, instead of ex ante probability h(y), is used

An alternative form of test for adverse selection, that has recently received attention, is the
so-called “unused variables” test, which becomes possible when the researcher has available
covariates, unused by the uninformed party (the buyer in our case), which are associated with
the unobserved characteristics b (see Finkelstein and Poterba (2006)). When these variables
appear significantly in either the regression of c or h(y) in the presence of a, this may be taken as
evidence of asymmetric information. In the comparison of the borrower to the primary lender,
we have no more information than does either of the parties involved, so we must employ the
standard “positive correlation” test. However, when comparing the primary to the secondary
mortgage market, we have an informational advantage over the secondary mortgage market,
and so will be able to employ the unused variables test.12
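
The unused variables test can be illustrated in the same simulated setting. The sketch below is a rough illustration under stated assumptions, not the paper's estimation: it uses linear regressions rather than hazard models, obtains the partial coefficient on the unused variable u by first removing the influence of a from both sides (the Frisch-Waugh result), and uses hypothetical names (`residualize`, `partial_t`, `simulate`) and made-up parameters. A large t-statistic on u in either the outcome or the contract-choice regression, given a, is the signal the test looks for.

```python
import math
import random


def residualize(v, a):
    """Residuals from a simple regression of v on a (with intercept)."""
    n = len(v)
    mv, ma = sum(v) / n, sum(a) / n
    slope = (sum((x - ma) * (y - mv) for x, y in zip(a, v))
             / sum((x - ma) ** 2 for x in a))
    return [(y - mv) - slope * (x - ma) for x, y in zip(a, v)]


def partial_t(dep, u, a):
    """t-statistic on u in a regression of dep on (1, a, u)."""
    rd, ru = residualize(dep, a), residualize(u, a)
    suu = sum(x * x for x in ru)
    coef = sum(x * y for x, y in zip(ru, rd)) / suu
    rss = sum((y - coef * x) ** 2 for x, y in zip(ru, rd))
    s2 = rss / (len(dep) - 3)            # three estimated parameters
    return coef / math.sqrt(s2 / suu)


def simulate(link, n=5000, seed=1):
    """Toy data: link controls how strongly u proxies for the hidden b."""
    rng = random.Random(seed)
    a, u, c, y = [], [], [], []
    for _ in range(n):
        ai, bi = rng.gauss(0, 1), rng.gauss(0, 1)
        a.append(ai)                                  # known to the buyer
        u.append(link * bi + rng.gauss(0, 1))         # observed but unused by the buyer
        c.append(0.5 * ai + bi + rng.gauss(0, 1))     # contract choice
        p = min(max(0.2 + 0.05 * ai + 0.10 * bi, 0.0), 1.0)
        y.append(1.0 if rng.random() < p else 0.0)    # event indicator
    return a, u, c, y


a, u, c, y = simulate(0.7)
t_outcome = partial_t(y, u, a)   # u in the outcome regression, given a
t_choice = partial_t(c, u, a)    # u in the contract-choice regression, given a
```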

2.2 The Data

Our first source of data is the Bond and Loan Information System (BLIS) maintained by the
data provider BlackBox Logic, which provides information on a large base of non-agency
loan contracts from some 6000 pools. We use only first-lien, subprime 30-year adjustable-rate
mortgages (ARMs) collateralized by single-family houses, condominiums or townhouses, and
originated in the years 2006 or 2007, which were then observed until December 2008.13 Other
than the FICO scores, most of the data therein describes loan characteristics. This then
represents the body of knowledge typically available to the secondary mortgage market. Our second
main source of information is the Home Mortgage Disclosure Act (HMDA) Loan Application
Register, for the years 2006 and 2007, which, for nearly every high-risk loan in that period,
contains such individual borrower information as race, gender, ethnicity, and income of the
applicant. We also add Census data on incomes, racial composition and educational attainment
at the zip code tabulation area (ZCTA) level.14 This then represents further information not
available to the secondary lender, but which is presumably available to the primary lender.
Enough information is in common between the two data sets that we are able, with a high
degree of confidence, to match about 18,000 loans, for which we then have detailed
information on both loan and borrower characteristics,15 and which become the basis for our analysis.
Additional information on house price appreciation is obtained using OFHEO or Case-Shiller
house price index files, and on unemployment rates using Bureau of Labor Statistics indices,
for both states and MSAs.

does pose a difficulty for the Puelz & Snow analysis, as pointed out by Dionne et al (1997) and Chiappori &
Salanie (2000).
12
The unused variables test also opens the way for distinguishing moral hazard from adverse selection in those
cases where it can be convincingly argued that the known characteristic is not associated with
the latter. This is never possible with the positive correlation test, since one never knows what is the missing
factor causing the correlation.
13
We take as subprime, mortgages whose FICO scores fall below 720.

2.3 Empirical Results

The main approach we employ is a competing-risk proportional hazard model of the hazards
of default and prepayment, under the assumption that either payment, default, or prepayment
occurs monthly, as a payment comes due. At each payment date i, the discrete hazard of
termination by default d or prepayment p is parameterized as:
 
$$h_\ell(t(i); \beta_\ell) = h_0(t(i))\, e^{z(t(i))\beta_\ell}, \qquad \ell = d, p, \tag{2.1}$$

where $z(t)$ represents the vector of exogenous variables affecting termination (covariates), which
are discussed further below, $\beta_\ell$ is the vector of their coefficients to be estimated, and where $h_0(t)$ is
the nonparametric baseline hazard,16

$$h_0(t) = e^{d\Lambda_0(t)}, \qquad \Lambda_0(t) = \sum_{i=1}^{360} \lambda_i\, I(t(i) \le t). \tag{2.2}$$
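
To make the specification concrete, the following minimal sketch evaluates the two hazards along a loan's payment dates. It rests on assumptions that go beyond the text: the baseline hazard at payment date i is taken to be exp(λ_i), the same list of baseline mass points is used for both risks, and the probability of an ordinary payment is taken as one minus the two hazards. The names `hazard` and `monthly_path` and all numbers are hypothetical; nothing here is estimated.

```python
import math


def hazard(lam_i, z, beta):
    """Discrete hazard at one payment date: exp(lam_i) * exp(z . beta)."""
    return math.exp(lam_i) * math.exp(sum(zk * bk for zk, bk in zip(z, beta)))


def monthly_path(lams, zs, beta_d, beta_p):
    """Per payment date: (default hazard, prepayment hazard, survival to date)."""
    surv, out = 1.0, []
    for lam_i, z in zip(lams, zs):
        h_d = hazard(lam_i, z, beta_d)   # termination by default
        h_p = hazard(lam_i, z, beta_p)   # termination by prepayment
        surv *= 1.0 - h_d - h_p          # loan continues only if neither occurs
        out.append((h_d, h_p, surv))
    return out


# Two payment dates, one covariate held at zero so only the baseline matters.
path = monthly_path(
    lams=[math.log(0.01), math.log(0.02)],
    zs=[[0.0], [0.0]],
    beta_d=[0.5],
    beta_p=[-0.3],
)
```

With the covariate at zero, each hazard equals the baseline at that date, so the survival probability after the second date is 0.98 × 0.96.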

The covariates used may be divided into static variables available at the origination of the
loan and later dynamic environmental variables, which serve as controls to ensure a proper
specification of the hazard.17 The former covariates may be distinguished as properties of the
contract or the initial and projected economic environment, which are known by all parties
at origination,18 in contrast to personal and neighborhood characteristics, coming from our
matching, which are then unknown to the secondary lenders.19 As stated earlier, the FICO
score, FICOScoreOrig, is the only strictly personal characteristic in the BLIS data, and so
known to the secondary market, along possibly with low documentation, LoDocFlag, and the
various purposes of the loan - RefiFlag, InvestorFlag and CondoFlag - which can alternatively
be considered as either loan or personal characteristics.

14
Table 2 presents summary statistics, while Figure 1 displays the observed counts of default and prepayment.
15
We use the MABLE geographic data base (http://mcdc2.missouri.edu/websas/geocorr2k.html) to
match census tracts to zip codes. We rely on information concerning the location of the property, original loan
balance, purpose of the loan, lien status, and occupancy to match our loans.
16
The variables λ1 , λ2 , ..., λ360 are mass points of the discrete baseline cumulative hazard function Λ0 , and I
is the standard indicator function. See Kalbfleisch & Prentice (2002) for the statistical technique appropriate
to this specification.
17
The latter includes the current yield, Yield1CMT, current unemployment, Unempl, current house price
appreciation, HPNom, and the Spring Season indicator, SeasonSpring.

2.3.1 The Primary Market

In the case of the primary market, the result of asymmetric information can be seen in several
ways. Looking at Table 5, it can be seen that a number of contractual variables have significance
even in the presence of the observed personal characteristics, indicating the presence of either
hidden knowledge or hidden action.20 Most strikingly, loan size and margin, the two chosen
contractual variables of the greatest interest, both act to increase the probability of default,
despite known personal characteristics being accounted for. This is as would be expected,
though, given that these contractual properties of themselves encourage default and given
that choosing them at higher levels indicates borrowers with unobservably higher propensities
to default.21 The two other significant choice variables, for which the logic is the same, are
lifetime cap and floor: one can expect higher levels of these contractual properties to encourage
default, and if, as one anticipates, such higher levels are selected by persons with higher unseen
propensities to default, this will then result in the positive signs both these covariates exhibit in
the hazard regression of default, despite known personal characteristics having been accounted
for. Thus, all the results obtained are consistent with what can be expected when adverse
selection occurs in the primary mortgage market.22

18
Properties of the contract include OrigLoanSize, MargNM, Margin, Teaser, Ceiling, and Floor, while
properties of the economic environment at origination include TermStrOrig, TermStrSlope, HousePrice, and
Y2007Vint, where all variables are briefly described in Table 1.
19
Characteristics of the borrower include OtherVsWhite, HispVsWhite, BlVsWhite, FemAppl, IncomePersNM,
IncomePers, and IncPersLoDoc, while neighborhood characteristics include IncZcode, RaceZcode, and
EducZcode. Obviously neighborhood characteristics could just as well be considered as properties of the economic
environment at origination, but we tend to view them with regard to what they suggest about the borrower’s
unseen characteristics. However this may be, the important property is that they can be taken to be unobserved by
the secondary lenders.
20
It is notable that the addition of the six contractual variables does not change the signs of any of the
previous covariates in the hazard of default equation, and that only one of these 24 variables, PenaltyFlag,
changes in significance, becoming insignificant. For prepayment, only FICOScoreOrig apparently reverses sign,
but it is seen to be insignificantly different from zero in both cases.
21
One might argue that if one of margin or loan size is specified, then the other is redundant, since the
tradeoff of available choices must determine the other, but besides it being harmless to include both, this
argument ignores the role of other contractual properties that are part of the fuller menu.

In Tables 7 - 10 we provide reduced form regressions of the significant contractual choice
variables against the exogenous parameters at origination, including all known personal and
neighborhood characteristics. Then in Tables 11 & 12 we redo the regressions in Tables 5
& 6, but including predicted values of these endogenous choice variables, as a guard against
specification error, as suggested by Dionne et al (2001) in the setting of automobile insurance.
Unlike that case, the inclusion of the predicted values does nothing to alter the significance
or the signs of the actual contractual variables, suggesting the robustness of our asymmetric
result.23,24

A second sort of test of asymmetric information is provided by the correlation tests in Table
13 and Table 14. In the former, we simply calculate the correlation coefficients between the
residuals from the proportional hazard estimation of Table 3 and those of the residuals of
the various regressions of the contractual features in Tables 7 - 10. In all cases, the estimated
correlation of the residuals is positive and significant at the 95% confidence level. An alternative
test is provided in Table 14, which provides a statistic, introduced by Chiappori & Salanie
(2000), that would be zero were there conditional independence between the choice variables
and the hazard of default, but which is instead significantly positive in all cases. Together, all
these results make a strong case for asymmetric information, presumably adverse selection, in
the primary mortgage market.
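
The residual-correlation check can be sketched as follows. This is a simplified stand-in, not the procedure behind Tables 13 and 14: the residuals here are simulated rather than taken from a hazard estimation, significance is judged with the Fisher z-transformation against a two-sided 95% critical value, and the names `pearson`, `fisher_z` and `correlation_test` are hypothetical. It does not implement the Chiappori & Salanie statistic.

```python
import math
import random


def pearson(x, y):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r, n):
    """Approximately standard normal under zero correlation."""
    return math.atanh(r) * math.sqrt(n - 3)


def correlation_test(res_choice, res_default, critical=1.96):
    """Return (r, z, significant at the two-sided 95% level)."""
    r = pearson(res_choice, res_default)
    z = fisher_z(r, len(res_choice))
    return r, z, abs(z) > critical


# Simulated residuals sharing a hidden component, standing in for the
# contract-choice and default residuals described in the text.
rng = random.Random(2)
hidden = [rng.gauss(0, 1) for _ in range(2000)]
res_choice = [h + rng.gauss(0, 1) for h in hidden]
res_default = [0.3 * h + rng.gauss(0, 1) for h in hidden]
r, z, significant = correlation_test(res_choice, res_default)
```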
22
There is the problem, present in all such analyses, that we may not have included all the information avail-
able to the primary market, and that the information provided by contractual variables is merely information
available to the originator, but not to us. We can only say that we have gone to considerable lengths to get
what is a broad set of additional variables, and that it is not immediately obvious to us what is missing and not
adequately covered by what we do have. Further, as should have been made clear from the discussion of the
secondary market into which these mortgages are necessarily being sold, the originator has limited incentives
to engage in much costly acquisition of further information.
23
The present result is not very surprising given our earlier observation in regard to Tables 5 & 6 that the
introduction of the contractual variables did not alter the pattern of the previously used variables in Tables 3
& 4 and so apparently provide independent information.
24
As an alternative robustness check, we also used residuals from the models of the borrower’s choice variables
instead of the expected values of the choice variables in the estimation of default hazard in Table 11 (see, e.g.,
Richaudeau (1999)). Our results do not qualitatively change compared to the reported specification.

2.3.2 The Secondary Market

In the case of the secondary market, we employ an unused variable test. By comparing Table
5 with the predictive equation for margin just mentioned, it can be seen that neighborhood
education levels, EducZCode, significantly increases both the hazard of default and the selection
of margin, indicating the presence of an information asymmetry in the secondary mortgage
market, since it is assumed the primary lenders do have access to this information, while
the secondary lenders do not. It should be noted that higher levels of EducZCode are also
negatively associated with loan size, other things equal. This raises the interesting potential
for exploitation of the secondary market by primary lenders, since as we have seen, lower levels
of the mutually observable margin choice variable is, given neighborhood education levels, an
indication of a lower likelihood of default, but primary lenders are in a position to realize that
people from more educated neighborhoods, while picking such lower margins, will, all else equal,
actually be more prone to default. Thus asymmetric information, presumably moral hazard
driven by private knowledge, appears to be present in the secondary mortgage market as well.25

3 Conclusion

While the problem we have set for ourselves is simply the detection of moral hazard and adverse
selection, one is of course led to consider how these problems can be alleviated. The
solution is easiest to see in the case of the secondary mortgage market: the firms buying up loans
for collateralization must, to a larger extent than is current practice, collect and analyze as much
objective information about the loan as is possible, and then classify the mortgages accordingly.
There are other obvious practices that might be adopted, such as fully branding the loans by
the originator, or forcing such primary lenders to retain an interest in the loan, so that it will
pay them to acquire a reputation for originating quality loans. The asymmetry problem faced
by the originators themselves in the primary mortgage market seems more difficult to resolve.
The informational advantage of the borrower is innate, but the lender would have the incentive
and ability to take more care in the approval of loans by a return to more traditional practices,
where the originator already had a developed relationship with the borrower, and subsequently
retained the loan. By acquiring more personal knowledge of the borrower, the lender is then
not at such a great informational disadvantage, and once again, better performance can be
expected if the originator continues to bear a substantial portion of the risk arising from a
failure to act prudently with regard to the approval process.

25
Since the BLIS data well represents the information available to the secondary market, we can be confident
of their information set. On the other hand, while the primary lender clearly has access to the additional
information we then introduce, it is not clear that all this information does get used, and indeed, there are
anti-discriminatory statutes which might seem to prevent the use of some of these variables. In contrast to
the secondary market, however, it would be hard for regulators (who themselves then face a moral hazard
problem with hidden knowledge) to demonstrate that originators were not using such information, except by a
possibly inadmissible statistical analysis after the fact, and since originators have partial incentives to use this
information, which, in the case of neighborhood characteristics, they often know at no cost and would have
difficulty not using, we do not want to preclude the likelihood that they indeed do so. Further, regressing out
the variables that the primary market might not use only serves to confirm the role of the ones that the primary
market does use.

References

Akerlof, George (1970), The Market for Lemons: Quality Uncertainty and The Market Mechanism,
Quarterly Journal of Economics 84(3), 488-500.

Ambrose, Brent W., Michael LaCour-Little and Anthony B. Sanders (2005), Does Regulatory
Capital Arbitrage, Reputation, or Asymmetric Information Drive Securitization?
Journal of Financial Services Research 28, 113-133.

An, Xudong, Yongheng Deng and Stuart A. Gabriel (2009), Is Conduit Lending to Blame?
Asymmetric Information, Adverse Selection, and the Pricing of CMBS, IRES working
paper.

Cawley, John and Tomas Philipson (1999), An Empirical Examination of Information Barriers
to Trade in Insurance, American Economic Review 89, 827-846.

Chiappori, Pierre-Andre and Bernard Salanie (2000), Testing for Asymmetric Information in
Insurance Markets, Journal of Political Economy 108, 56-78.

Cutler, David M., and Richard J. Zeckhauser (2000), The Anatomy of Health Insurance, in A.
J. Culyer and J. P. Newhouse, eds., Handbook of Health Economics, V. 1, Elsevier.

Dionne, Georges, Christian Gourieroux and Charles Vanasse (2001), Testing for Evidence of
Adverse Selection in the Automobile Insurance Market: A Comment, Journal of Political
Economy 109(2), 444-451.

Dell’Ariccia, Giovanni, Deniz Igan, and Luc Laeven (2008), Credit Booms and Lending Standards:
Evidence from the Subprime Mortgage Market, IMF working paper WP/08/106.

DeMarzo, Peter M., and Darrell Duffie (1999), Liquidity-Based Model of Security Design,
Econometrica 67 (1), 65-99.

Downing, Chris, Dwight Jaffee and Nancy Wallace (2009), Is the Market for Mortgage-Backed
Securities a Market for Lemons? Review of Financial Studies, forthcoming.

Edelberg, Wendy (2004), Testing for Adverse Selection and Moral Hazard in Consumer Loan
Markets, Board of Governors of the Federal Reserve System, Finance and Economics
Discussion Series 2004-09.

Elul, Ronel (2009), Securitization and Mortgage Default: Reputation Versus Adverse Selection,
FRB of Philadelphia, working paper.

Finkelstein, Amy and James Poterba (2004), Adverse Selection in Insurance Markets: Policyholder
Evidence from the U.K. Annuity Market, Journal of Political Economy 112,
183-208.

Finkelstein, Amy, and James Poterba (2006), Testing for Adverse Selection with ‘Unused Observables’,
NBER working paper no. 12112.

Gjesdal, Frøystein (2007), Moral Hazard with Hidden Information, in R. Antle, F. Gjesdal, and
P. J. Liang, eds., Essays in Accounting in Honor of Joel S. Demski, Springer.

Kalbfleisch, John and Ross Prentice (2002), The Statistical Analysis of Failure Time Data, 2nd
edition, Wiley.

Keys, Benjamin, Tanmoy Mukherjee, Amit Seru and Vikrant Vig (2010), Did Securitization
Lead to Lax Screening? Evidence From Subprime Loans, Quarterly Journal of Economics
125, forthcoming.

Macho-Stadler, Inés and J. David Pérez-Castrillo (2001), Introduction to the Economics of
Information: Incentives and Contracts, 2nd edition, Oxford.

McCarthy, David and Olivia Mitchell (2010), International Adverse Selection in
Life Insurance and Annuities, in S. Tuljapurkar, N. Ogawa, and A. Gauthier, eds., Riding
the Age Wave: Responses to Aging in Industrial Societies, Elsevier, forthcoming.

Mian, Atif, and Amir Sufi (2009), The Consequences of Mortgage Credit Expansion: Evidence
from the U.S. Mortgage Default Crisis, Quarterly Journal of Economics, forthcoming.

Puelz, Robert, and Arthur Snow (1994), Evidence on Adverse Selection: Equilibrium Signaling
and Cross-Subsidization in the Insurance Market, Journal of Political Economy 102,
236-257.

Richaudeau, Didier (1999), Automobile Insurance Contracts and Risk of Accident: An Empirical
Test Using French Individual Data, The Geneva Papers on Risk and Insurance
Theory 24, 97-114.

Rothschild, Michael and Joseph Stiglitz (1976), Equilibrium in Competitive Insurance Markets:
An Essay on the Economics of Imperfect Information, Quarterly Journal of Economics 90, 629-649.

Figure 1
Termination by default and prepayment, by calendar month
[Figure not reproduced. Vertical axis: default and prepayment incidence. Series plotted: default incidence and prepayment incidence.]

Table 1
Variable definitions
Variable name   Definition
OrigLoanSize    Original loan size (deflated to 2000 $).
MargNM          1 if the contract margin is non-missing, 0 otherwise.
Margin          Contract margin, in percentage points.
Teaser          Teaser, in percentage points; the amount by which the introductory contract rate is lowered until the first adjustment.
Ceiling         Lifetime cap on the contract rate, in percentage points.
Floor           Lifetime floor on the contract rate, in percentage points.
TermStrOrig     Yield on a 1-year T-bill, at the time of origination of the mortgage.
TermStrSlope    Difference in yields on a 10-year T-bond and a 1-year T-bill, at the time of origination of the mortgage.
HousePrice      Price of the house at origination (deflated to 2000 $).
Y2007Vint       1 if the loan was originated in 2007, 0 otherwise.
FicoScoreOrig   FICO score of the borrower, at origination.
FemAppl         1 if applicant is female (HMDA).
OtherVsWhite    0 if applicant’s race is white and ethnicity is non-Hispanic, 1 otherwise (HMDA).
HispVsWhite     1 if OtherVsWhite equals one and applicant’s ethnicity is Hispanic (HMDA).
BlVsWhite       1 if OtherVsWhite equals one and applicant’s race is black (HMDA).
IncomePersNM    1 if applicant’s income at origination is non-missing, 0 otherwise (HMDA).
IncomePers      Applicant’s income at origination, in thousands of dollars (HMDA).
LoDocFlag       1 if less than full documentation, 0 otherwise.
IncPersLoDoc    Interaction of the previous two variables (HMDA).
PenaltyFlag     1 if a prepayment penalty is present.
RefiFlag        1 if the loan is taken out for refinancing.
InvestorFlag    1 if property is not a primary residence.
CondoFlag       1 if the property is a condominium.
IncZcode        Difference between the median household income, by Zip Code Area, and the median household income in the MSA (state), as a percentage of the median household income in that MSA (state) (2000 Census).
RaceZcode       Difference between the minority share in the population of the Zip Code Area and that of the MSA (state), as a percentage of the minority share in the population of that MSA (state) (2000 Census).
EducZcode       Share of the population with less than a high school diploma in the Zip Code Area, less that share in the MSA (state), as a percentage of the share of the population with less than a high school diploma in that MSA (state) (2000 Census).
Yield1CMT       Yield on 1-year Treasury bill, in the observation period.
Unempl          Natural log of the ratio of the unemployment rate in the observation period to the rate at the time of origination.
HPNom           Nominal house price appreciation, as measured by the Case-Shiller or OFHEO (purchase-only, statewide) index.
SeasonSpring    1 in the months April - July.

Table 2
Summary statistics for the complete sample

Variable N. obs. Mean St. dev. Min Max

TermStrOrig 17433 4.809 0.576 1.540 5.220
TermStrSlope 17433 -0.082 0.388 -0.410 1.970
Yield1CMT 17433 4.809 0.576 1.540 5.220
HousePrice 17433 240,345 154,645 22,881 729,834
OrigLoanSize 17433 157,022 102,108 16,452 616,060
Teaser 17433 1.021 1.535 0 10.755
Floor 14946 7.425 2.580 1.030 17.800
Ceiling 15387 13.658 2.919 3.450 29.200
Margin 14948 5.412 1.794 1 12
FicoScoreOrig 17433 613.38 58.99 377 720
FemAppl 17433 0.283 0.450 0 1
HispVsWhite 17433 0.131 0.337 0 1
OtherVsWhite 17433 0.496 0.500 0 1
BlVsWhite 17433 0.087 0.282 0 1
IncomePers 17397 99.98 104.28 11 4200
EducZcode 17433 -0.357 0.184 -0.953 0.255
IncZcode 17433 0.046 0.327 -0.725 2.219
RaceZcode 17433 0.121 1.134 -1.000 8.720
LoDocFlag 17433 0.574 0.495 0 1
PenaltyFlag 17433 0.548 0.498 0 1
RefiFlag 17433 0.558 0.497 0 1
CondoFlag 17433 0.034 0.181 0 1
InvestorFlag 17433 0.067 0.249 0 1
Unempl 17433 0.002 0.076 -0.377 0.598
HPNom 17433 181.78 43.60 96.60 255.91
SeasonSpring 17433 0.398 0.490 0 1

Table 3
Proportional hazard estimates of the default model
Variable Estimate Std. error p-value Hazard ratio
TermStrOrig -0.2601 0.0676 0.0001 0.7710
TermStrSlope -0.1842 0.0995 0.0642 0.8320
HousePrice 0.0014 0.0001 <.0001 1.0010
FicoScoreOrig -0.0057 0.0003 <.0001 0.9940
FemAppl -0.0080 0.0349 0.8195 0.9920
HispVsWhite 0.0637 0.0498 0.2009 1.0660
OtherVsWhite 0.0230 0.0396 0.5624 1.0230
BlVsWhite 0.0724 0.0657 0.2705 1.0750
IncomePersNM 0.4303 1.0009 0.6672 1.5380
IncomePers 0.0001 0.0002 0.5874 1.000
IncPersLoDoc 0.0005 0.0003 0.0803 1.000
IncZcode -0.1808 0.0806 0.0248 0.8350
RaceZcode 0.0156 0.0153 0.3085 1.0160
EducZcode 0.4069 0.1319 0.0020 1.5020
LoDocFlag 0.2280 0.0449 <.0001 1.2560
PenaltyFlag 0.0683 0.0337 0.0428 1.0710
RefiFlag -0.4807 0.0336 <.0001 0.6180
CondoFlag 0.0509 0.0851 0.5503 1.0520
InvestorFlag 0.2257 0.0634 0.0004 1.2530
Yield1CMT 0.1018 0.0248 <.0001 1.1070
Unempl 1.0981 0.1073 <.0001 2.9980
HPNom -0.0023 0.0006 <.0001 0.9980
SeasonSpring 0.0531 0.0345 0.1237 1.0540
Y2007Vint 0.0683 0.0625 0.2750 1.0710
No. of observations 300,219
Log-likelihood −37,247
House price scaled by 1E-3. Model was estimated using sample of 17,433 loans.
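
In the proportional hazard tables, the hazard-ratio column appears to be the exponential of the estimate column, which is the usual convention for such models: a one-unit rise in a covariate multiplies the hazard by exp(estimate). The following is a minimal Python sketch of that arithmetic, using four coefficients copied from Table 3; the helper name `hazard_ratio` is ours and purely illustrative.

```python
import math

# Estimates copied from Table 3 (default model); variable names follow the table.
coefficients = {
    "EducZcode": 0.4069,
    "LoDocFlag": 0.2280,
    "RefiFlag": -0.4807,
    "Unempl": 1.0981,
}


def hazard_ratio(beta: float) -> float:
    """Multiplicative change in the hazard for a one-unit rise in the covariate."""
    return math.exp(beta)


ratios = {name: hazard_ratio(beta) for name, beta in coefficients.items()}
for name, ratio in ratios.items():
    # Compare each printed value with the "Hazard ratio" column of Table 3.
    print(f"{name}: {ratio:.3f}")
```

A ratio above one (for example LoDocFlag) indicates a higher default hazard, and a ratio below one (for example RefiFlag) a lower one, holding the other covariates fixed.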

Table 4
Proportional hazard estimates of the prepayment model.
Variable Estimate Std. error p-value Hazard ratio
TermStrOrig 0.2554 0.0625 <.0001 1.2910
TermStrSlope 0.3068 0.0892 0.0006 1.3590
HousePrice -0.0012 0.0001 <.0001 0.9990
FicoScoreOrig -0.0003 0.0003 0.2416 1.000
FemAppl -0.0420 0.0335 0.2098 0.9590
HispVsWhite -0.0792 0.0512 0.1224 0.9240
OtherVsWhite -0.0472 0.0367 0.1979 0.9540
BlVsWhite -0.0206 0.0640 0.7479 0.9800
IncomePersNM -0.4422 0.7080 0.5322 0.6430
IncomePers 0.0001 0.0002 0.6140 1.000
IncPersLoDoc 0.0003 0.0003 0.3450 1.000
IncZcode 0.1335 0.0762 0.0800 1.1430
RaceZcode -0.0702 0.0160 <.0001 0.9320
EducZcode -0.1259 0.1249 0.3134 0.8820
LoDocFlag -0.0033 0.0432 0.9387 0.9970
PenaltyFlag 0.0491 0.0316 0.1198 1.0500
RefiFlag 0.2514 0.0327 <.0001 1.2860
CondoFlag -0.2201 0.0894 0.0138 0.8020
InvestorFlag -0.4142 0.0727 <.0001 0.6610
Yield1CMT -0.0228 0.0249 0.3592 0.9770
Unempl -1.8674 0.1213 <.0001 0.1550
HPNom 0.0063 0.0005 <.0001 1.0060
SeasonSpring 0.0073 0.0318 0.8197 1.0070
Y2007Vint -0.0547 0.0643 0.3948 0.9470
No. of observations 300,219
Log-likelihood −41,959
House price scaled by 1E-3. Model was estimated using sample of 17,433 loans.

Table 5
Proportional hazard estimates of the default model with borrower’s choice
variables
Variable Estimate Std. error p-value Hazard ratio
TermStrOrig -0.1903 0.0682 0.0053 0.8270
TermStrSlope -0.1576 0.1005 0.1169 0.8540
HousePrice -0.0022 0.0004 <.0001 0.9980
FicoScoreOrig -0.0041 0.0003 <.0001 0.9960
FemAppl -0.0089 0.0349 0.7989 0.9910
HispVsWhite 0.0791 0.0498 0.1125 1.0820
OtherVsWhite 0.0263 0.0397 0.5066 1.0270
BlVsWhite 0.0370 0.0657 0.5731 1.0380
IncomePersNM 0.5136 1.0012 0.6079 1.6710
IncomePers 0.0001 0.0002 0.8314 1.000
IncPersLoDoc 0.0005 0.0003 0.1121 1.000
IncZcode -0.2446 0.0809 0.0025 0.7830
RaceZcode 0.0051 0.0154 0.7388 1.0050
EducZcode 0.3074 0.1314 0.0193 1.3600
LoDocFlag 0.2847 0.0472 <.0001 1.3290
PenaltyFlag 0.0129 0.0366 0.7256 1.0130
RefiFlag -0.3513 0.0364 <.0001 0.7040
CondoFlag 0.0910 0.0852 0.2855 1.0950
InvestorFlag 0.3026 0.0641 <.0001 1.3530
Yield1CMT 0.1028 0.0249 <.0001 1.1080
Unempl 1.3276 0.1081 <.0001 3.7720
HPNom -0.0015 0.0006 0.0083 0.9990
SeasonSpring 0.0529 0.0345 0.1246 1.0540
Y2007Vint 0.0848 0.0627 0.1765 1.0880

OrigLoanSize 0.0064 0.0005 <.0001 1.0060


Margin 0.0773 0.0176 <.0001 1.0800
MargNM -0.3244 0.1155 0.0050 0.7230
Teaser -0.0091 0.0121 0.4509 0.9910
Floor 0.0806 0.0111 <.0001 1.0840
Ceiling 0.0412 0.0069 <.0001 1.0420
No. of observations 300,219
Log-likelihood −36,992
House price and loan size scaled by 1E-3. Model was estimated using
sample of 14,946 loans.

Table 6
Proportional hazard estimates of the prepayment model with borrower’s
choice variables
Variable Estimate Std. error p-value Hazard ratio
TermStrOrig 0.1657 0.0634 0.009 1.1800
TermStrSlope 0.3762 0.0898 <.0001 1.4570
HousePrice 0.0018 0.0002 <.0001 1.0020
FicoScoreOrig 0.0004 0.0003 0.24 1.000
FemAppl -0.0481 0.0335 0.1507 0.9530
HispVsWhite -0.0694 0.0513 0.1764 0.9330
OtherVsWhite -0.0421 0.0367 0.2512 0.9590
BlVsWhite -0.0153 0.0640 0.811 0.9850
IncomePersNM -0.6921 0.7081 0.3283 0.5010
IncomePers 0.0002 0.0002 0.4021 1.000
IncPersLoDoc 0.0005 0.0003 0.1042 1.000
IncZcode 0.1591 0.0761 0.0365 1.1720
RaceZcode -0.0635 0.0160 <.0001 0.9380
EducZcode -0.1121 0.1249 0.3695 0.8940
LoDocFlag -0.0051 0.0443 0.9082 0.9950
PenaltyFlag 0.0009 0.0333 0.9784 1.0010
RefiFlag 0.2786 0.0366 <.0001 1.3210
CondoFlag -0.2241 0.0896 0.0124 0.7990
InvestorFlag -0.3591 0.0731 <.0001 0.6980
Yield1CMT -0.0207 0.0248 0.4042 0.9800
Unempl -1.8637 0.1219 <.0001 0.1550
HPNom 0.0056 0.0005 <.0001 1.0060
SeasonSpring 0.0116 0.0319 0.7168 1.0120
Y2007Vint -0.0673 0.0642 0.2944 0.9350

OrigLoanSize -0.0048 0.0004 <.0001 0.9950


Margin 0.0463 0.0173 0.0074 1.0470
MargNM -0.6085 0.1173 <.0001 0.5440
Teaser 0.1598 0.0098 <.0001 1.1730
Floor 0.0425 0.0107 <.0001 1.0430
Ceiling -0.0299 0.0060 <.0001 0.9710
No. of observations 300,219
Log-likelihood −41,353
House price and loan size scaled by 1E-3. Model was estimated using
sample of 14,946 loans.

Table 7 OLS estimates of the model for the loan size at origination

Variable Estimate Std. error t -stat. p-value


Intercept -0.0324 0.1640 -0.22 0.8432
HousePrice 0.6041 0.0022 280 <.0001
TermStrOrig -0.0186 0.0112 -1.66 0.0979
TermStrSlope 0.0086 0.0167 0.51 0.6069
FicoScoreOrig 0.0699 0.0052 13.32 <.0001
FemAppl -0.0147 0.0064 -2.29 0.0223
HispVsWhite 0.0060 0.0096 0.62 0.5332
OtherVsWhite 0.0005 0.0068 0.08 0.9374
BlVsWhite 0.0249 0.0112 2.22 0.0267
IncomePersNM -0.1222 0.1528 -0.8 0.4239
IncomePers 0.0002 4.14E-05 4.45 <.0001
IncPersLoDoc 0.0002 5.5E-05 3.41 0.0007
IncZcode 0.0765 0.0142 5.38 <.0001
RaceZcode 0.0057 0.0028 2.02 0.0432
EducZcode -0.0007 0.0234 -0.03 0.9757
LoDocFlag -0.0890 0.0082 -10.92 <.0001
RefiFlag -0.0760 0.0061 -12.51 <.0001
CondoFlag -0.0250 0.0157 -1.59 0.1119
InvestorFlag -0.0653 0.0117 -5.58 <.0001
No. of observations 17,433
Adjusted R2 0.8658
Heteroscedasticity consistent standard errors and t-statistics are reported.
House price and loan size are scaled by 1E-5, FICO scaled by 1E-2.

Table 8 OLS estimates of the model for the contractual margin at origination

Variable Estimate Std. error t -stat. p-value


Intercept 14.6143 0.7503 19.48 <.0001
HousePrice -0.2878 0.0088 -32.61 <.0001
TermStrOrig -0.1821 0.0417 -4.36 <.0001
TermStrSlope -0.4063 0.0617 -6.58 <.0001
FICO -1.1242 0.0219 -51.28 <.0001
FemAppl -0.0001 0.0240 0.00 0.9974
HispVsWhite -0.0760 0.0372 -2.04 0.0413
OtherVsWhite 0.0235 0.0262 0.90 0.3701
BlVsWhite 0.1010 0.0414 2.44 0.0148
IncomePersNM -0.2601 0.7141 -0.36 0.7157
IncomePers 0.0002 0.0001 2.04 0.0412
IncPersLoDoc -0.0005 0.0002 -2.36 0.0182
IncZCode 0.1088 0.0565 1.93 0.0539
RaceZCode 0.0080 0.0103 0.78 0.4363
EducZCode 0.2056 0.0906 2.27 0.0233
LoDocFlag -0.2631 0.0290 -9.08 <.0001
RefiFlag -0.2800 0.0245 -11.45 <.0001
CondoFlag -0.3144 0.0700 -4.49 <.0001
InvestorFlag -0.3999 0.0498 -8.03 <.0001
No. of observations 14,946
Adjusted R2 0.3649
Heteroscedasticity consistent standard errors and t-statistics are reported.
House price scaled by 1E-5, FICO scaled by 1E-2.

Table 9 OLS estimates of the model for the lifetime cap on the contract rate

Variable Estimate Std. error t -stat. p-value


Intercept 25.8853 1.3345 19.41 <.0001
HousePrice -0.2612 0.0194 -13.45 <.0001
TermStrOrig -0.9355 0.1110 -8.43 <.0001
TermStrSlope -0.5043 0.1715 -2.94 0.0033
FICO -1.2135 0.0462 -26.27 <.0001
FemAppl -0.0064 0.0597 -0.11 0.9149
HispVsWhite -0.1064 0.0864 -1.23 0.2179
OtherVsWhite 0.0678 0.0626 1.08 0.2791
BlVsWhite 0.1722 0.1031 1.67 0.095
IncomePersNM 0.4584 1.2157 0.38 0.7061
IncomePers 6.14E-05 0.0003 0.2 0.8419
IncPersLoDoc 0.0001 0.0005 0.23 0.8209
IncZCode -0.1527 0.1303 -1.17 0.2412
RaceZCode 0.0503 0.0248 2.03 0.0422
EducZCode -0.0940 0.2157 -0.44 0.6629
LoDocFlag -0.0402 0.0744 -0.54 0.5892
RefiFlag -0.8193 0.0552 -14.85 <.0001
CondoFlag -0.1538 0.1360 -1.13 0.258
InvestorFlag 0.1823 0.1053 1.73 0.0835
No. of observations 15,387
Adjusted R2 0.091
Heteroscedasticity consistent standard errors and t-statistics are reported.
House price scaled by 1E-5, FICO scaled by 1E-2.

Table 10 OLS estimates of the model for the lifetime floor on the contract rate

Variable Estimate Std. error t -stat. p-value


Intercept 21.4802 0.6935 30.97 <.0001
HousePrice -0.4569 0.0148 -30.86 <.0001
TermStrOrig -0.2701 0.0705 -3.83 0.0001
TermStrSlope -0.0504 0.1048 -0.48 0.6309
FicoScoreOrig -1.8684 0.0390 -47.91 <.0001
FemAppl 0.0448 0.0402 1.11 0.2654
HispVsWhite -0.1390 0.0607 -2.29 0.022
OtherVsWhite 0.0619 0.0434 1.43 0.1532
BlVsWhite 0.1790 0.0702 2.55 0.0108
IncomePersNM 0.3069 0.5646 0.54 0.5868
IncomePers 0.0005 0.0002 2.68 0.0074
IncPersLoDoc -0.0007 0.0003 -2.06 0.0394
IncZCode 0.1641 0.0933 1.76 0.0786
RaceZCode 0.0509 0.0178 2.85 0.0043
EducZCode 0.4503 0.1514 2.97 0.0029
LoDocFlag -0.2552 0.0506 -5.04 <.0001
RefiFlag -0.5333 0.0407 -13.11 <.0001
CondoFlag -0.4030 0.1149 -3.51 0.0005
InvestorFlag -0.4452 0.0879 -5.07 <.0001
No. of observations 14,946
Adjusted R2 0.3467
Heteroscedasticity consistent standard errors and t-statistics are reported.
House price scaled by 1E-5, FICO scaled by 1E-2.

Table 11
Proportional hazard estimates of the default model with predicted values
of borrower’s choice variables
Variable Estimate Std. error p-value Hazard ratio
TermStrOrig 0.0176 0.1768 0.9208 1.0180
TermStrSlope 0.8440 0.4613 0.0673 2.3260
HousePrice -0.0066 0.0022 0.0031 0.9930
Y2007Vint 0.0675 0.0721 0.3495 1.0700
FicoScoreOrig -0.0022 0.0078 0.7724 0.9980
FemAppl -0.0176 0.0433 0.6837 0.9830
HispVsWhite 0.0998 0.0774 0.1976 1.1050
OtherVsWhite -0.0038 0.0474 0.9362 0.9960
BlVsWhite -0.0532 0.0904 0.5562 0.9480
IncomePersNM 0.3885 1.0377 0.7081 1.4750
IncomePers -0.0002 0.0003 0.4226 1.000
IncPersLoDoc 0.0005 0.0004 0.1914 1.0010
LoDocFlag 1.1528 0.4097 0.0049 3.1670
PenaltyFlag -0.9538 0.4723 0.0434 0.3850
RefiFlag -0.7560 0.3253 0.0201 0.4700
CondoFlag 0.1962 0.1806 0.2773 1.2170
InvestorFlag 0.4225 0.1965 0.0315 1.5260
IncZCode -0.2846 0.1174 0.0153 0.7520
RaceZCode -0.0028 0.0282 0.9219 0.9970
EducZCode 0.2858 0.2325 0.2189 1.3310
Yield1CMT 0.1092 0.0278 <.0001 1.1150
Unempl 1.2973 0.1220 <.0001 3.6600
HPNom -0.0019 0.0006 0.0035 0.9980
SeasonSpring 0.0635 0.0389 0.1022 1.0660


OrigLoanSize (predicted) 0.0053 0.0013 0.0001 1.0050
Margin (predicted) 0.8337 0.4132 0.0436 2.3020
Floor (predicted) 0.1811 0.4152 0.6627 1.1990
Ceiling (predicted) -0.4299 0.2653 0.1051 0.6510

OrigLoanSize 0.0097 0.0007 <.0001 1.0100


Margin 0.0433 0.0196 0.0274 1.0440
MargNM -0.1530 0.7185 0.8313 0.8580
Floor 0.0687 0.0121 <.0001 1.0710
Ceiling 0.0338 0.0071 <.0001 1.0340
Teaser -0.0030 0.0124 0.8101 0.9970
No. of observations 281,278
Log-likelihood −36,055
House price and loan size scaled by 1E-3. Model was estimated using
sample of 14,946 loans.
Table 12
Proportional hazard estimates of the prepayment model with predicted
values of borrower’s choice variables
Variable Estimate Std. error p-value Hazard ratio
TermStrOrig 0.8404 0.1542 <.0001 2.3170
TermStrSlope 1.9740 0.4168 <.0001 7.1990
HousePrice -0.0189 0.0039 <.0001 0.9810
Y2007Vint -0.0923 0.0656 0.1597 0.9120
FicoScoreOrig 0.0014 0.0003 <.0001 1.0010
FemAppl -0.0322 0.0349 0.3570 0.9680
HispVsWhite -0.0674 0.0566 0.2336 0.9350
OtherVsWhite -0.0862 0.0390 0.0270 0.9170
BlVsWhite -0.0743 0.0705 0.2919 0.9280
IncomePersNM 2.9897 0.9501 0.0017 19.880
IncomePers -0.0003 0.0002 0.1711 1.000
IncPersLoDoc 0.0004 0.0003 0.2026 1.000
LoDocFlag 1.0206 0.3586 0.0044 2.7750
PenaltyFlag -1.2017 0.4302 0.0052 0.3010
RefiFlag 0.0588 0.2861 0.8371 1.0610
CondoFlag 0.0057 0.1109 0.9589 1.0060
InvestorFlag -0.3741 0.0750 <.0001 0.6880
IncZCode 0.1330 0.0781 0.0888 1.1420
RaceZCode -0.0709 0.0163 <.0001 0.9320
EducZCode -0.1027 0.1425 0.4713 0.9020
Yield1CMT -0.0284 0.0251 0.2572 0.9720
Unempl -1.8481 0.1245 <.0001 0.1580
HPNom 0.0053 0.0005 <.0001 1.0050
SeasonSpring 0.0246 0.0325 0.4486 1.0250


OrigLoanSize (predicted) 0.0228 0.0035 <.0001 1.0230
Margin (predicted) -0.1928 0.0561 0.0006 0.8250
Floor (predicted) -0.2270 0.2283 0.3200 0.7970
Ceiling (predicted) 1.1340 0.3766 0.0026 3.1080

OrigLoanSize -0.0083 0.0006 <.0001 0.9810


Margin 0.0368 0.0177 0.0377 1.0380
MargNM -0.6907 0.1199 <.0001 0.5010
Floor 0.0670 0.0113 <.0001 1.0690
Ceiling -0.0237 0.0061 0.0001 0.9770
Teaser 0.1626 0.0100 <.0001 1.1770
No. of observations 281,278
Log-likelihood −39,375
House price and loan size scaled by 1E-3. Model was estimated using
sample of 14,946 loans.
Table 13
Correlation tests

Variable name Pearson ρ 95% conf. inter. Spearman stat. 95% conf. inter.
OrigLoanSize 0.1289 (0.1141, 0.1437) 0.0952 (0.0803, 0.1101)
Margin 0.0906 (0.0743, 0.1068) 0.0741 (0.0578, 0.0904)
Ceiling 0.0639 (0.0483, 0.0794) 0.0819 (0.0664, 0.0975)
Floor 0.1123 (0.0960, 0.1287) 0.1060 (0.0891, 0.1220)
Pearson and Spearman correlation statistics and confidence intervals for correlation between $\hat{\varepsilon}_i$ (residuals from
the regression of the contractual variable) and $\hat{\eta}_i$ (residuals from the proportional hazard model of default).
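
As an illustration of the kind of calculation summarized in Table 13, the sketch below computes a Pearson correlation between two residual series and an approximate 95% confidence interval. The interval uses the Fisher z-transformation, which is our assumption: the table does not state how its intervals were constructed, and the Spearman statistic is not covered here. The function names and the toy data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation via Fisher's z (assumed method)."""
    z = math.atanh(r)                         # variance-stabilizing transform
    half_width = z_crit / math.sqrt(n - 3)    # approximate standard error is 1/sqrt(n-3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Toy residuals, purely illustrative (not taken from the paper's data).
contract_resid = [1.0, 2.0, 3.0, 4.0, 5.0]
default_resid = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(contract_resid, default_resid)
low, high = fisher_ci(r, len(contract_resid))
print(f"r = {r:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

A positive correlation whose interval excludes zero is what the table reports for all four contractual variables.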

Table 14
Parametric test for the presence of asymmetric information
Variable W -stat p-value
OrigLoanSize 181 <0.01
Margin 40.6 <0.01
Ceiling 34.5 <0.01
Floor 121 <0.01
Null hypothesis: conditional independence of the choice of the specific contractual feature (such as loan size)
and the probability of default (conditional on the variables observed by the lender). Our test statistic is
constructed as $W = \left(\sum_{i=1}^{n} w_i \hat{\varepsilon}_i \hat{\eta}_i\right)^2 \big/ \sum_{i=1}^{n} w_i^2 \hat{\varepsilon}_i^2 \hat{\eta}_i^2$, where $\hat{\varepsilon}_i$ are residuals from the regression of the contractual
variable and $\hat{\eta}_i$ are residuals from the proportional hazard model of default, with $w_i$ being weights that reflect
the length of observation of an individual contract. Under the null, $W$ is $\chi^2(1)$ distributed.
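
The W statistic described in the note to Table 14 can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' code: the residuals from the contractual-variable regression, the residuals from the default hazard model, and the observation-length weights are passed in as plain lists, the function names are ours, and the toy numbers are invented. The p-value uses the fact that a chi-square(1) variable is the square of a standard normal.

```python
import math


def chi2_1_sf(w: float) -> float:
    """Upper-tail probability P(X > w) for a chi-square variable with 1 degree of freedom."""
    # If Z is standard normal, Z**2 is chi-square(1), so P(Z**2 > w) = erfc(sqrt(w / 2)).
    return math.erfc(math.sqrt(w / 2.0))


def w_statistic(eps_hat, eta_hat, weights):
    """Return (W, p-value), where W = (sum of w*eps*eta)^2 / sum of (w*eps*eta)^2."""
    if not (len(eps_hat) == len(eta_hat) == len(weights)):
        raise ValueError("inputs must have equal length")
    cross = [w * e * h for w, e, h in zip(weights, eps_hat, eta_hat)]
    denom = sum(c * c for c in cross)
    if denom == 0.0:
        raise ValueError("all weighted residual products are zero")
    w_stat = sum(cross) ** 2 / denom
    return w_stat, chi2_1_sf(w_stat)


# Toy inputs, purely illustrative (not taken from the paper's data).
eps_hat = [0.5, -1.2, 0.3, 0.9, -0.4]   # contractual-variable regression residuals
eta_hat = [0.2, -0.7, 0.1, 0.6, -0.5]   # default hazard model residuals
weights = [12, 30, 8, 24, 18]           # e.g. months each contract is observed
w_stat, p_value = w_statistic(eps_hat, eta_hat, weights)
print(f"W = {w_stat:.3f}, p-value = {p_value:.4f}")
```

Large values of W, such as those reported in Table 14, lead to rejection of conditional independence between the contract choice and default.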
