
Journal of Economic Behavior and Organization 170 (2020) 286–300

Contents lists available at ScienceDirect

Journal of Economic Behavior and Organization


journal homepage: www.elsevier.com/locate/jebo

Don’t blame the messenger. The delivery method of a message matters☆
Daniel Ortega a, Carlos Scartascini b,∗
a Development Bank of Latin America (CAF) and IESA, Avenida Luis Roche, Caracas 1060, Miranda, Venezuela
b Research Department, Inter-American Development Bank, 1300 New York Ave. NW, Washington, DC 20577, USA

Article info

Article history:
Received 14 August 2018
Revised 10 December 2019
Accepted 11 December 2019
Available online 11 January 2020

JEL classification: C93, D03, D83, H26

Keywords: Information; Methods of communication; Messages; Tax compliance; Development

Abstract

Sending messages and providing information to individuals tends to affect their behavior (e.g., reduce energy consumption, increase donations, reduce tax evasion). Most studies to date have concentrated on evaluating the effect of different types of messages but they have avoided discussing if the method to communicate those messages could affect the effectiveness of the intervention. This study shows that the effectiveness of a message is not independent of the communication method used. We conducted a field experiment in Colombia that varies the way the National Tax Agency contacts delinquent taxpayers. More than 20,000 taxpayers were randomly assigned to a control or one of three delivery mechanisms (letter, email, and personal visit). Conditional on delivery of the treatment, a personal visit is more effective than an email, and both are more effective than a letter (the more traditional method used by tax administrations and researchers worldwide). The findings show that identifying the mechanisms through which policies are informed and publicized should be fully incorporated in the literature to assess the effectiveness of an intervention. Tax administrations should explicitly evaluate what is the right vector of communication methods that maximizes their policy objectives.

© 2020 Published by Elsevier B.V.

1. Introduction

Does the method a government uses to communicate public policies to its citizens affect the effectiveness of those poli-
cies? Are impersonal methods such as a letter or an email as effective as those methods that include personal interactions,

☆ We would like to thank the editor, two anonymous reviewers, the discussants and participants at the National Tax Association Meeting in Boston,
the 3rd TARC workshop at University of Exeter, the CeATS Meeting, the CIAT Tax Studies and Research Network Meeting in Montevideo, and seminars at
DIAN, George Mason University, George Washington University, the Inter-American Development Bank, King’s College London, Universitat de Barcelona, and
University of Virginia for their comments and suggestions, and to Martín Ardanaz, Raquel Bernal, Anne Brockmeyer, Matías Busso, Paul Carrillo, Phil Keefer,
Christos Kotsogiannis, Giulia Mascagni, Pablo Sanguinetti, and Christian Traxler for very fruitful discussions about this project. We would also like to thank
the staff at DIAN and the Government of Colombia for their collaboration, and Lesbia Maris, María Franco Chuaire, Edgar Castro, Mónica Mogollón, and
Andrea Lopez-Luzuriaga for their research assistance at different stages of the project. We duly appreciate the funding provided by the Public Capacity
Building Fund (KPC) of the IDB, funded by the Government of Korea, for this project. The opinions presented herein are those of the authors and do not
represent the official position of their institutions. The policy evaluated in this document was executed under the norms and regulations of the Government
of Colombia, and according to the procedures imposed by the National Tax Agency in its own code of ethics.

∗ Corresponding author.
E-mail addresses: dortega@caf.com (D. Ortega), carlossc@iadb.org (C. Scartascini).

https://doi.org/10.1016/j.jebo.2019.12.008
0167-2681/© 2020 Published by Elsevier B.V.

such as a phone call or a visit? Do taxpayers react to electronic communication by a tax agency? The literature usually tends
to avoid discussing the issue but, if the method of communication matters, the results of an intervention that provides in-
formation to people cannot be evaluated independently of the method of communication used. The delivery method may
matter for individual decisions in several ways: (i) it may convey additional information about the government strategy or
commitment to the policy; (ii) it may encourage compliance according to the personal connection it creates (Kessler and
Zhang, 2014); (iii) it may generate different levels of trust in the information received; and (iv) it may affect whether
and how it reaches people and how salient the message is for the individual.
The lack of a systematic analysis of the consequences of choosing one method over another and interpreting those results
in light of the method is relatively common. For example, the empirical literature on tax compliance has advanced steadily
in the last few years in trying to explain what motivates individuals to pay their taxes in full and on time, and what is
the best way to deal with those who do not declare the full tax amount or are late with their payments (Blumenthal et
al., 2001; Castro and Scartascini, 2015; Chirico et al., 2015, 2019; Dwenger et al., 2016; Fellner et al., 2013; Hallsworth et
al., 2017; Kleven et al., 2011; Meiselman, 2018; Ortega and Sanguinetti, 2013; Perez-Truglia and Troiano, 2018; Slemrod et
al., 2001; Del Carpio, 2014).1 In almost every case, the results, be they positive or null, have been taken for granted
irrespective of the method of communication used. This has not been exclusive to tax compliance. To name just one example,
Karlan and List (2007) evaluate whether price matters in charitable giving without considering whether using direct mail
solicitations instead of other methods could have affected the effectiveness of the intervention.2 However, as the extant political
science literature on “get-out-the-vote” has shown, personalized methods may be more effective than more impersonal
methods, such as letters (Green and Gerber, 2004).
In this paper, we evaluate if the delivery method matters by conducting a field experiment in Colombia in which we send
the same message to taxpayers with tax delinquencies (taxpayers who presented a tax declaration, had a tax to pay, but
had not deposited the payment) but using different communication methods. Tax delinquencies are not a problem only in
developing countries but in developed ones too. In 2006, according to an estimate by the United States Treasury Department,
Americans failed to pay about $110 billion, or around 25 percent of the estimate of the total amount underpaid in that year
(Perez-Truglia and Troiano, 2018).
Around 21,000 individuals with tax delinquencies were randomly assigned to one of three different treatments (physical
letter, email, personal visit) or to a control group. We chose these methods because they could allow us to ascertain: (i)
if there are differences between impersonal and personal methods of communication; (ii) if there are differences between
physical and electronic delivery. The first comparison would inform the tax authority whether spending additional resources
for contacting taxpayers pays off. The second comparison would inform the tax authority about the role that new tech-
nologies could have in its enforcement strategies. Both comparisons would inform researchers about how to interpret the
results coming from interventions that use only one type of communication – for example, whether finding null results in
an intervention using letters could be explained by low power.
The results in the paper show that differences across communication methods are significant. Among those assigned to
a letter (ITT results), the probability of making a payment is 4 percentage points higher than doing nothing (control group).
Given that the underlying probability for the control group of paying what they had declared is about 5%, sending a letter
almost doubles the probability that the taxpayer would pay part of the debt. Sending an email and scheduling a personal
visit have an impact that is three times larger. These results show that messages matter for reducing tax delinquencies and
that emails seem to be particularly effective given that they have a large effect while costing basically nothing. Consequently,
tax authorities could consider switching to electronic communication whenever possible.
Unfortunately, tax agencies sometimes have a hard time locating some taxpayers or do not have enough resources to send as
many agents to the street as planned. For example, in fiscal year 2012, the IRS closed about half a million cases (involving
almost US$7 billion) because it could not locate delinquent taxpayers (Treasury General Inspector for Tax Administration,
2014). Consequently, it is also important to evaluate the policy conditional on attempting to contact the taxpayers and
according to the effective delivery of the messages. Each of these sets of results provides different insights into the
effectiveness of the policy. Restricting to the group of taxpayers the agency attempted to contact (LATE results), the effects
were larger and the differences across methods were broader: 4 percentage points (letter), 15 percentage points (email),
and about 67 percentage points (personal visit). These estimates increase even more, to 8, 17, and 88 percentage points, when
we consider only those who were actually treated. Overall, the economic relevance of the exercise was highly significant. The Agency
recovered about 4.5 times more from the people it contacted than from the people in the control group (on average,
about US$1100 ppp vs. US$250 ppp). These differences are almost ten times larger for those in the group of personal visits.
Moreover, we find positive spillover effects across taxes for the same individual: those taxpayers who received a message
regarding their unpaid income, wealth, and value added taxes made payments of other arrears too.
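
The link between the ITT and LATE figures can be illustrated with a simple Wald ratio: the ITT effect divided by the difference in contact rates between those assigned and those not assigned. The sketch below is only an illustration on made-up data, not the estimation carried out in the paper; the function name, the group sizes, the one-third contact rate, and the payment counts are all assumptions chosen to make the arithmetic easy to follow.

```python
def mean(values):
    return sum(values) / len(values)

def itt_and_wald(assigned, contacted, paid):
    """Return (ITT, first stage, Wald ratio) for one binary treatment.

    assigned  -- 1 if randomly assigned to the treatment, else 0
    contacted -- 1 if contact was actually attempted, else 0
    paid      -- 1 if the taxpayer made any payment, else 0
    """
    y1 = mean([y for z, y in zip(assigned, paid) if z == 1])
    y0 = mean([y for z, y in zip(assigned, paid) if z == 0])
    d1 = mean([d for z, d in zip(assigned, contacted) if z == 1])
    d0 = mean([d for z, d in zip(assigned, contacted) if z == 0])
    itt = y1 - y0            # effect of being assigned
    first_stage = d1 - d0    # share of the assigned who were contacted
    return itt, first_stage, itt / first_stage

# Hypothetical data: 300 assigned (100 contacted, 200 not) and 300 controls.
assigned = [1] * 300 + [0] * 300
contacted = [1] * 100 + [0] * 200 + [0] * 300
paid = (
    [1] * 60 + [0] * 40      # contacted: 60% pay (assumed)
    + [1] * 10 + [0] * 190   # assigned but not contacted: 5% pay (assumed)
    + [1] * 15 + [0] * 285   # control: 5% pay (assumed)
)

itt, first_stage, wald = itt_and_wald(assigned, contacted, paid)
print(round(itt, 3), round(first_stage, 3), round(wald, 3))
```

In this made-up example, an ITT of roughly 18 percentage points combined with a one-third contact rate implies an effect of 55 percentage points among those contacted, which is why partial contact makes ITT and LATE estimates diverge.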
To check for the robustness of our results, given that there is sizable non compliance with assignment to the treatment
– in particular, the tax agency attempted to visit only about 1/3 of the taxpayers assigned to treatment –, and there is some

1 The tax evasion literature is far too vast to be summarized in this paper. For comprehensive overviews of the theoretical literature, see Traxler (2010),
Hashimzade et al. (2013), and Dell’Anno (2009). Luttmer and Singhal (2014) review the literature on the moral determinants of compliance. Hallsworth (2015),
Mascagni (2018), and Slemrod (2017) present broad overviews of the use of field and laboratory experiments for increasing tax compliance.
2 Della Vigna et al. (2012) show that social pressure matters in charitable giving; consequently, personal interaction could be more effective and this
effect could be heterogeneous in pricing.

very small contamination across treatments, we perform additional exercises using nearest neighbor matching and inverse
probability weighting, and the plausibly exogenous approach of Conley et al. (2012). The results hold. The effectiveness of the
communication methods is significantly different: personalized methods have a higher effect than impersonal methods, and
the electronic communication surpasses the physical letter. Differences in the coefficients for the visit between the LATE and
the matching estimations may indicate some targeting of those more likely to repay.
This document contributes to the literature in several ways. First, the results show that the delivery mechanism matters.
In particular, how much the tax agency can reduce delinquencies is not independent of the method of communication
the agency uses to inform the taxpayer. Therefore, the paper opens up the discussion in the tax compliance literature and
other fields, such as the growing literature on charitable fund-raising (Della Vigna et al., 2012; Landry et al., 2006) and
financial markets (Bertrand et al., 2010), about the relevance of the delivery mechanism for affecting behavior. It may be
relevant to consider this explicitly in the design and analysis of experimental interventions, and it may be worth including
the method explicitly in the theoretical models. Additionally, the method of communication should be considered when
comparing results across papers and even more so when discussing null results of interventions. By comparing a set of
methods in the same setting, this article provides researchers a good benchmark from which to evaluate the potential power
of future interventions and choose the most appropriate method (particularly if sample sizes are small).
Second, we show that contacting taxpayers and warning them about their outstanding debt on one tax can have spillover
effects on the payment of other taxes, and this effect also varies according to the delivery method. It highlights the
importance of observing the full portfolio of taxes when evaluating the impact of an intervention. Otherwise, under- or
overestimations are possible, and policy recommendations may be misguided. As such, researchers should incorporate more
explicitly the existence of potential spillover effects in their analyses, including conditions under which spillovers could be
negative (Carrillo et al., 2017; Slemrod et al., 2017) or positive (Lopez-Luzuriaga and Scartascini, 2019).
The paper has relevant policy implications. First, it provides information to tax agencies that may help them to choose
the delivery method that recovers the most revenue at the lowest cost. Second, it stresses the relevance of
getting the basic things right first: having accurate, valid, and up-to-date ways to contact taxpayers may be as important in
the longer run as developing other, more sophisticated enforcement strategies. In the case of this intervention, the National
Tax Agency of Colombia (DIAN) could have recovered at least an additional US$8 million if it had been able to contact all
the taxpayers in the treatment group.
The paper is organized as follows. Section 2 presents a summary of the related literature and describes the analytical
framework. Section 3 describes the experiment, and Section 4 presents the empirical results. Section 5 concludes.

2. Why might the delivery method matter?

Field experiments that rely on providing information to subjects as the main treatment tend to avoid discussing the
implications of the chosen method of communicating that information and how it affects the estimates of the intervention.
This is particularly true in the burgeoning tax compliance literature, which has relied on the use of letters as the main
delivery mechanism for messages. While evaluating systematically the role of different delivery technologies has been absent
from the tax compliance literature, it has been more common in others, such as in the ‘get-out-the-vote’ (GOTV) literature.
Existing randomized experiments have provided relevant information on the effect of campaigning and voter mobilization
on election outcomes. It has been shown that impersonal methods of voter turnout communication such as robotic calls
(Ramirez, 2005; Shaw et al., 2012; Green and Karlan, 2006) and emails (Nickerson, 2006; Stollwerk, 2006) are recurrently
ineffective. On the other hand, non-partisan face-to-face canvassing (Gerber and Green, 2000) and phone calls (Imai, 2005;
Arceneaux, 2007; Nickerson, 2006; Arceneaux and Nickerson, 2006) are more effective than non-personalized methods such
as flyers. This result is also confirmed by Barton et al. (2014), who look at the role of candidate door-to-door canvassing. In
the experiment, voters are persuaded by personal contact (the delivery method), but no evidence was found for the content
of the message. An emerging result from this literature, quite relevant for the research we pursue here, is that the content
of the message may not be as relevant as the type and quality of its delivery for nudging people. Doerrenberg and Schmitz
(2017) offer a first glimpse at the effect on compliance by comparing the effect of audit probability letters sent to firms
through the postal office with that of letters delivered by tax agency personnel.
One reason why the methods may have different impacts is that “actions may speak louder than words.” Taxpayers
understand that the tax agency has a menu of options for warning them about the consequences of evasion. If the agency
decides to visit the taxpayer to inform her of outstanding liabilities and warn her about the consequences of not paying,
the taxpayer may update the probability of being prosecuted if she does not comply more than if she receives a letter —
which she may assume was less selective and reached more taxpayers. This argument can be embedded in the traditional
tax evasion model (Allingham and Sandmo, 1972; Yitzhaki, 1974).3 We present a simple model of decision making by a
taxpayer when there are different delivery mechanisms available to the tax agency in the Online Appendix. The model
shows that more selective methods of communication (such as personal visits) should generate a larger response than less
selective methods (such as a letter).

3 Hashimzade et al. (2013) and Traxler (2010) constitute broad and comprehensive surveys of this literature.

A second reason for finding differences across methods is that receiving the visit of a tax agent may generate different
behavior than the more impersonal methods because of social forces that make people behave differently when confronted
with other people. Individuals try to take actions that make others view them more favorably (Harbaugh, 1998; Lacetera and
Macis, 2010), and individuals will be more likely to take action when asked to do so by someone else (Kessler and Zhang,
2014). For example, there is evidence that people are more likely to donate and volunteer when called, visited, or asked by
a friend (Card et al., 2011; Freeman, 1997; Meer and Rosen, 2011), and more likely to vote under personal canvassing than
under more impersonal methods (Imai, 2005).
Finally, there is a mechanical reason. Each method might have different probabilities of actually reaching the taxpayer and
delivering the message for several reasons. The first is data quality. Not every entry in the taxpayers’ record may have been
updated at the same time, which can generate a different probability for reaching the taxpayer electronically or physically. A
second consideration is human effort. While electronic methods are quite impersonal, physical and personal methods require
the effort and dedication of mail carriers and public employees. Therefore, the effectiveness of the intervention may depend
on how much human effort each treatment requires, and whether the appropriate incentives are in place.4 A third issue to
consider is taxpayer attention. Some methods require different levels of attention by the taxpayers. While a personal visit
may be very salient for the taxpayer, a letter or an email may go unnoticed even if received.

3. The experiment

With the objective of increasing tax collection and evaluating the effectiveness of different delivery mechanisms for
sending messages to taxpayers, the National Tax Agency of Colombia (DIAN) agreed to randomly assign the method used to
contact a sample of taxpayers with due liabilities (self-declared but unpaid taxes).
The agency randomized a subset of taxpayers with due tax liabilities into four main groups. One group was assigned to
be contacted via email, another via physical letter, and a third by means of a visit by an agent. The fourth group was left
as a control group. This way we can compare in a unified exercise the relative importance of personal versus impersonal
methods, and physical versus electronic ones.
The sample of this experiment includes all taxpayers with declared but unpaid income, wealth, or sales tax liabilities for
the years 2011 to 2013.5 Taxpayers with relatively low (lower than COP 20,000, about US$ 20 in ppp) and high (more than
COP 50 million, about US$ 46,000 in ppp) debts were not included.6 Those who did not have a physical address, telephone
number, and email on file were also left out.7 At this point, 20,818 taxpayers from the universe of taxpayers with unpaid
but declared obligations remained eligible. Among them, 5000 taxpayers were assigned to standard mail, 5000 taxpayers to
email, and 4042 to a personal visit; the remaining 6776 taxpayers were assigned to the control group. The randomization
was performed in six strata according to the size of debt and whether the debt was recent or not.8
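
Randomization within strata of this kind can be sketched as follows. This is an illustrative sketch only, not DIAN's actual procedure: the record layout (the `id` and `stratum` keys), the function name, the seed, and the toy sample of 600 taxpayers are assumptions, and the regional capacity constraints on visits mentioned in the footnotes are ignored. The arm shares are simply the group sizes reported above divided by 20,818.

```python
import random
from collections import defaultdict

def assign_within_strata(taxpayers, arm_shares, seed=0):
    """Randomly assign taxpayers to arms, separately within each stratum.

    taxpayers  -- list of dicts with (hypothetical) keys "id" and "stratum"
    arm_shares -- dict mapping arm name to its share; shares sum to 1
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for t in taxpayers:
        by_stratum[t["stratum"]].append(t["id"])

    arms = list(arm_shares)
    assignment = {}
    for stratum in sorted(by_stratum):
        ids = by_stratum[stratum]
        rng.shuffle(ids)                      # random order within the stratum
        start = 0
        for i, arm in enumerate(arms):
            if i == len(arms) - 1:
                end = len(ids)                # last arm takes the remainder
            else:
                end = start + round(arm_shares[arm] * len(ids))
            for taxpayer_id in ids[start:end]:
                assignment[taxpayer_id] = arm
            start = end
    return assignment

# Shares implied by the reported group sizes (5000, 5000, 4042, 6776).
ARM_SHARES = {
    "letter": 5000 / 20818,
    "email": 5000 / 20818,
    "visit": 4042 / 20818,
    "control": 6776 / 20818,
}

# Toy sample: 600 hypothetical taxpayers spread evenly over six strata.
taxpayers = [{"id": i, "stratum": i % 6} for i in range(600)]
assignment = assign_within_strata(taxpayers, ARM_SHARES, seed=1)
```

Shuffling inside each stratum, rather than over the whole sample, is what guarantees that every arm has the same mix of debt sizes and debt ages by construction.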
As shown in Table A1 of the Appendix, the main variables of interest are balanced across treatments using the baseline
data. That is, treatment groups were balanced according to the number of unpaid obligations and the amount of standing
debt with the tax authority, which is the information provided to the taxpayer to affect the taxpayer’s choice variable (the
taxpayers decide whether to pay the informed amount of outstanding debt, a fraction of it, or nothing).
There are a few imbalances for some of the treatments for some of the individual’s characteristics such as being a firm
or an individual. We include them as controls in the empirical analysis below and show that their inclusion does not affect
the size or significance of the coefficients of interest.
The experiment was implemented between September and October of 2013.9 We collected the information about payments
made by the taxpayer at the end of the year. The message included in both the physical letter and the email was
exactly the same. The message stated the account balance on 31 July 2013, the type of tax, and the year or month it had
not been paid. It also included information on methods of payment and the cost that the taxpayer was incurring by not
paying (interest and penalties, potential legal action, and possible effect on credit history). Finally, it provided a moral sua-
sion message (“Colombia, a commitment we can’t evade”). The message concluded with the contact information of a tax
agency authority.10 This way, even though the content of the messages was not the subject of the evaluation, careful steps

4 The problem can only be corrected in the estimations if there is accurate information about who received the treatment and who didn’t.
5 As Hallsworth (2015) identifies, focusing on the payment decision of a predetermined amount reduces many of the measurement problems that the
papers focused on declaration have. See also Castro and Scartascini (2015) for a discussion of this point.
6 To convert from COP to US$ in PPP terms, we use World Development Indicators’ data for exchange rate (about COP1800 per dollar during the period)
and PPP conversion factor (about 0.6). Data available at: http://data.worldbank.org/indicator.
7 Originally, we planned to use phone calls as an additional delivery method. Unfortunately, it could not be accomplished in the context of this
experiment. Mogollon et al. (2019) summarize the results of a subsequent experiment that used only phone calls as the delivery method.
8 This way we can balance on variables that may proxy economic activity and payment history. This strategy is similar to Dwenger et al. (2016).
Constraints on the total capacity to deliver visits by each regional office were also considered, thus making the probability of assignment to visit slightly
different across regional offices.
9 Personal visits were carried out on 10 September 2013, emails were sent on 2 October 2013, and physical letters were sent out between 30 September
and 4 October 2013. Outcome data were collected at the end of the year. According to data from a complementary experiment (Mogollon et al., 2019), 2/3 of
people pay within 30 days of receiving a notification and the rest within 60 days. As such, the difference in time length between treatment and collection
does not seem to be a binding factor.
10 The actual letter is included in the Online Appendix.

were taken to include all the components that have been identified in the literature to matter for increasing compliance
(Behavioral Insights Team (BIT), 2012; Hallsworth, 2015).
Personal visits had a unique protocol that the Agency personnel assigned to the task were supposed to follow. At the
time of the visit, if the taxpayer was present at the physical address, the agents identified themselves and proceeded with
the protocol (included in the Online Appendix). While the agents identified themselves as DIAN personnel, at no point did they
present themselves as auditors or enforcement agents; they were there only to provide a notification. Moreover, they
had a script that provided the same information as the letter, and they were not able to answer any additional questions
from the taxpayers. As such, the treatment followed the same logic and text as the written messages: the taxpayer was
informed about his or her standing tax delinquencies and urged to pay, was told about the penalties being incurred and the
possibility of further legal actions in case of noncompliance, and the message ended with the verbal delivery of a moral
suasion message.
In the case the taxpayer was not present at the address but there was some certainty that the address was correct, the
agents left a citation informing that the agents had been there. In this case, no detailed information (such as the amount of
debt) was left in the citation because of privacy concerns; the taxpayer was asked to visit the Tax Agency offices instead to
obtain information regarding his or her standing liabilities. If the taxpayer was not present at the domicile and there was
no certainty that the address was correct, no notification was left behind.
The intervention was designed to reduce the influence of two of the three potential mechanisms we described before
(deterrence, moral suasion, technological). For reducing the moral suasion channel, we provided agents with a written script
they had to read and follow, and they were not supposed to answer any additional questions. Additionally, unlike in
most of the literature that finds a higher effect of personal visits because of a personal connection (e.g., the donations
literature), payment could not be made on the spot. If the visit and the payment are disconnected actions, the personal
connection should lose some of its relevance because the payment action is unobservable by the person visiting the taxpayer.
For reducing the effect of the technological channel, we spent significant resources in making sure we could collect very
detailed information about who was assigned to treatment, who the agency attempted to contact, and who was finally
contacted. This way, while ITT results could be affected by this channel, we could reduce its effect by looking at the IV
estimations. Importantly, this effort also led to policy recommendations to the tax agency regarding how to manage
its taxpayer database, the benefits of updating addresses often, and the necessary mechanisms for ensuring a better
registration of email addresses.

4. Empirical results

The general model we estimate is presented in the following equation:

$$Y = \alpha + T\beta + X\lambda + B\gamma + D\theta + \varepsilon \qquad (1)$$
where T is the vector of treatments (email, physical letter, and personal visit), X a vector of control variables, B the blocks
(or strata), and D district-level fixed effects. The six strata are defined according to the size and maturity of the debt, and
the district-level fixed effects correspond to the geographic district the taxpayer belongs to and the tax agency jurisdiction
she reports to.
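
Written out for an individual taxpayer, Eq. (1) can be read as below. This is only an expanded restatement under notational assumptions that are not in the original: the subscripts $i$, $s$, and $d$, the names of the treatment dummies, the indicator notation, and the symbol $\varepsilon_i$ for the error term are introduced here for illustration, with one stratum and one district taken as omitted reference categories.

```latex
Y_i = \alpha
    + \beta_{L}\,\mathrm{Letter}_i
    + \beta_{E}\,\mathrm{Email}_i
    + \beta_{V}\,\mathrm{Visit}_i
    + X_i'\lambda
    + \sum_{s=2}^{6} \gamma_s\,\mathbf{1}[\mathrm{stratum}_i = s]
    + \sum_{d \neq d_0} \theta_d\,\mathbf{1}[\mathrm{district}_i = d]
    + \varepsilon_i
```

Under this reading, each $\beta$ measures the difference in the outcome between the corresponding treatment arm and the control group (the omitted category), holding the controls, strata, and district fixed.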
We use several dependent variables to measure compliance. Paid is a dummy that takes value 1 if the taxpayer made
any payment to reduce liabilities after the experiment. Full payment is a dummy that takes value 1 if the taxpayer paid
the liabilities reported in the message in full. Total Payment is the amount (in logs) paid by the taxpayer after the experi-
ment.11 Payment share is the share of liabilities paid by the taxpayer. Other payments is a dummy that takes value 1 when
the taxpayer made a payment of liabilities not included in the communication.
The set of control variables includes all the observable characteristics we had access to: Liabilities, which is the amount
informed to the taxpayers in the messages; Number of debts, which is the number of tax obligations the taxpayers did
not pay on time; Tax, which is a set of dummy variables that indicate the type of tax the taxpayer had liabilities for
(wealth, income, VAT); Taxpayer type, which indicates whether the taxpayer is a firm or an individual; Pre-payments, which
is the amount of liabilities paid by the taxpayer between the moment of the randomization and the experiment; Wrong
information, which takes a value 1 when the amount of debt informed to the taxpayer was different from his or her actual
liabilities with the tax authority because of the prepayments; and Over-payments, which takes a value 1 in those cases when
the taxpayer made a payment higher than his or her standing liabilities before the experiment took place.

4.1. Effectiveness of the intervention

The first analysis we perform is to evaluate whether conducting the revenue collection exercise was worthwhile for the
Agency. As shown in Table 2, during the campaign the Agency collected about COP1,800M from payments made by 335 out
of the almost 7000 taxpayers in the control group. Therefore, absent any effort by the agency (which we could call the zero
deterrence scenario), only approximately 5% of the taxpayers would have paid a part of what they owed and only 2% would

11 We also run the regressions with the full payment – no logs.

have paid their liabilities in full. This is consistent with data available in a companion paper where we find that, absent
any action by the agency, from the universe of taxpayers with debts at the beginning of the year, only 5.4% cancel their
debts by the end of the year (Mogollon et al., 2019).12 As such, those who decide not to pay would tend to maintain that
behavior unless the agency acts.
Contrary to that scenario, the exercise had a large revenue collection effect for the Tax Agency. The amount it collected
from taxpayers assigned to the treatment group was much higher: about COP8,800M (or around COP0.63M per taxpayer –
about US$583 ppp, almost two-and-a-half times higher than in the zero warning scenario). In the case of this group, 2774
taxpayers (about 20%) made payments and 11% paid their debt in full. The share paid amounts to about 15%, which may
indicate that the payment probability could be marginally higher for those with lower debts. Importantly, there were large
and significant spillovers for the same individual across taxes, as 15% of the taxpayers paid other obligations not included
in the messages too.
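
The per-taxpayer amounts and PPP conversions quoted above can be checked with a few lines of arithmetic. The sketch below uses only figures reported in the text and in footnote 6 (an exchange rate of about COP 1,800 per dollar and a PPP conversion factor of about 0.6). The function and variable names are illustrative, and the way the two factors are combined (dividing by both) is an assumption; it is used here because it reproduces the reported figure of about US$583.

```python
COP_PER_USD = 1800   # approximate exchange rate, from footnote 6
PPP_FACTOR = 0.6     # approximate PPP conversion factor, from footnote 6

def cop_to_usd_ppp(amount_cop):
    """Convert Colombian pesos to PPP-adjusted US dollars (assumed formula)."""
    return amount_cop / COP_PER_USD / PPP_FACTOR

# Group sizes from the experimental design.
n_control = 6776
n_treated = 5000 + 5000 + 4042          # letter + email + personal visit

# Approximate totals collected, in COP, as reported in the text.
collected_control = 1_800e6
collected_treated = 8_800e6

per_control = collected_control / n_control
per_treated = collected_treated / n_treated

print(round(per_treated / 1e6, 2))             # COP millions per treated taxpayer
print(round(cop_to_usd_ppp(0.63e6)))           # about 583, as reported
print(round(per_treated / per_control, 2))     # ratio of treated to control
print(round(335 / n_control, 3))               # share of control group paying
print(round(2774 / n_treated, 3))              # share of treated group paying
```

The same conversion applied to the sample thresholds (COP 20,000 and COP 50 million) gives roughly US$19 and US$46,300, in line with the rounded values of about US$20 and US$46,000 given in the sample description.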
A summary of the regression results (OLS) is included in Table 3. In the Online Appendix we include the full set of regres-
sions, including weighted OLS results (results are basically the same).13 Here, the treatment variables indicate assignment
to the treatment (ITT estimates). The upper panel of the table shows the regression results when we pool those assigned
to treatment. The lower panel shows the regression results considering each treatment separately. Even-numbered columns show the
results including the control variables. As can be observed, point estimates change little from one specification to the other.
As shown in the upper panel, taxpayers included in the treatment group had a positive and significantly higher proba-
bility of paying their liabilities (paid) compared to the taxpayers in the control group (10 percentage points higher) and a
higher probability of paying the full amount (full payment), 8 percentage points higher. The share paid with regard to the
informed debt (payment share) is 9 percentage points higher than the share paid by those in the control group, and people
in the treatment group paid more than twice the amount paid by those in the control group (total payment). Interestingly,
there are large spillover effects, as 13% of those in the treatment groups made payments to other liabilities they also had
but that had not been part of the warning sent by the tax agency.

4.2. Relative effectiveness of each delivery method

The overall program executed by the Agency was successful in terms of revenue collection (the revenue collected per
taxpayer more than doubled). As can be observed in the bottom panel of Table 3, the effectiveness of the intervention
cannot be evaluated in isolation from the method used. Sending a letter generates a 55% larger amount paid (total payment,
column [8]) and increases the share of the amount paid with respect to liabilities by 3 percentage points when compared to
the control group (payment share, column [6]). Sending a letter also favors higher compliance. On average, taxpayers in the
group that were sent a letter are 4 percentage points more likely to make a payment than those in the control group (paid,
column [2]) and also 3 percentage points more likely to pay their debt in full (full payment, column [4]). These taxpayers
are also 12 percentage points more likely to make payments on other arrears they may have with the tax authority (other
payments, column [10]).
Sending an email has an even larger effect when compared to the control group. Those contacted by this method pay
a share of their liabilities that is 13 percentage points higher (payment share, column [6]), and they are 15 percentage
points more likely to make any type of payment (paid, column [2]), 11 percentage points more likely to pay in full (full
payment, column [4]), and 13 percentage points more likely to make payments on other arrears not included in the
experiment (spillover effects, column [10]).14
Scheduling a personal visit has a similarly large effect (as we show later, results are much higher when we condition on
delivery).15 Taxpayers contacted by this method pay a share of their liabilities that is 11 percentage points higher (payment share,
column [6]), and they are 13 percentage points more likely to make any type of payment (paid, column [2]), 10 percentage
points more likely to pay in full (full payment, column [4]), and 14 percentage points more likely to make payments on other
arrears not included in the experiment (other payments, column [10]).
These results, which measure the effectiveness of the campaign the tax agency planned to conduct, show that the method
of communication matters. A letter (the most usual method used by tax agencies and evaluated in the literature) is not as
effective as other methods. Given that most papers to date, including very recent ones for the USA (Chirico et al., 2015; 2019;
Meiselman, 2018) rely on letters, this is a very relevant result. It provides a benchmark for the effectiveness of the delivery

12
Unpaid debts are an important problem for the Colombian Tax authority (DIAN). Every year more than half of the outstanding tax debt is written off.
DIAN does not have many legal instruments to collect unpaid debts. For example, DIAN cannot offset tax debt with outstanding tax credits of the same
taxpayer and cannot initiate bankruptcy procedures, and businesses are not required to pay all their taxes to be contractors of the public sector (Daude
et al., 2015).
13
Because the probability of being assigned to the control and treatment groups is not uniform across blocks we also estimate the models using weighted
least squares (weights are the inverse of the probability of being selected to the control or treatment groups) even though the results are basically the
same.
14
This result, higher effect of email than letter, is relatively constant across specifications even after controlling for non-compliance with assignment
(LATE results). While these results may seem counter-intuitive for those outside Colombia, they can be explained by a strategic decision of the Government
of Colombia to move transactions to electronic means. For example, Colombia has been one of the countries at the forefront of providing legal status to
electronic communications. Additionally, given the fact that payments can be made online, the act of paying may have been more spontaneous than after
receiving a letter (the person was already sitting at the computer).
15
The email and the personal visit are statistically different at the 10% level only for payment share.

Table 1
Compliance with the experiment design.

                           Treatment
                           Letter            Email             Visit             Control group

Assignment (randomized)    5,000             5,000             4,042             6,776

Attempted treatment        4,394   87.9%     4,982   99.6%     1,270   31.4%     0
Failed treatment           2,511   50.2%       584   11.7%       263    6.5%     0
Actual treatment           1,883   37.7%     4,398   88.0%     1,007   24.9%     0

Note: each column presents the number that had been assigned to each treatment, the number that the Agency attempted to contact, the
number of times they failed, and finally the number actually treated. For example, out of 5,000 assigned to a letter, the Agency only sent
4,394 letters (87.9%). Of those, only 1,883 reached the taxpayers (37.7%) while 2,511 (50.2%) were returned by the mail carriers because of
problems locating the taxpayers.
Source: Authors’ calculations.
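
The percentages in Table 1 are all expressed relative to the number of taxpayers assigned to each arm. As an illustrative sketch (the variable names are ours), the attempt, failure and delivery rates can be recovered from the raw counts:

```python
# Counts copied from Table 1.
assigned = {"letter": 5_000, "email": 5_000, "visit": 4_042}
attempted = {"letter": 4_394, "email": 4_982, "visit": 1_270}
delivered = {"letter": 1_883, "email": 4_398, "visit": 1_007}

rates = {}
for method, n in assigned.items():
    failed = attempted[method] - delivered[method]  # attempted but not delivered
    rates[method] = {
        "attempted": 100 * attempted[method] / n,   # % of assigned
        "failed": 100 * failed / n,                 # % of assigned
        "delivered": 100 * delivered[method] / n,   # % of assigned
    }
    print(method, {k: round(v, 1) for k, v in rates[method].items()})
```

The delivered share of each arm is the take-up rate that matters when moving from assignment (ITT) to effective treatment.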

Table 2
Summary Statistics of intervention results.

                              Tax payers   Paid     Full payment   Payment ratio (a)   Total payments (in US$ PPP)   Other payment

Control group
  Total                       6,776        335      102            5.6%                1,660,185                     0
  Per taxpayer                6,776        5%       2%                                 245

Overall treatments (pooling)
  Total                       14,042       2,774    1,519                              8,181,481                     2,163
  Per taxpayer                14,042       20%      11%            14.6%               583                           15%
  Per contacted taxpayer      7,457        37%      20%            15%                 1,097                         29%

Source: Authors’ calculations.


a
Fraction of the total debt that was paid. Equals "total amount paid" over "total debt".

mechanism that should be taken into account by researchers in the future. The email seems to be particularly effective given
its very low cost; tax agencies should consider the use of electronic communication as an alternative delivery mechanism
whenever possible.

4.3. Taking into account non-compliance with assignment: LATE estimations

As shown in Table 1 (and in the full table in the Online Appendix), there were several sources of one-sided non-compliance
with the random assignment because the agency did not have the personnel time to send all the letters and carry out all
the personal visits that it had originally planned. This restricted the number of letters sent (88%) but restricted even
more the number of personal visits attempted (31%). The agency attempted to maintain the balance when deciding
which taxpayers to visit out of the sample assigned by the lottery. Still, it could be the case that some agents
preferred to schedule visits in a non-random fashion. For example, they could have preferred to visit those with higher
debts or those who were closer to the tax agency. As such, running an OLS regression would provide biased results. Running an
instrumental variable (2SLS) estimation using the variation generated solely by the random assignment should attenuate
that bias.
Therefore, to correct for this difference between assignment (e.g., letter assigned to be sent) and attempted treatment
(e.g., letters actually sent), we instrument first the attempted treatment variable with the assignment to the treatment. In
some of the regressions we also use the distance from the domicile to the DIAN as an instrument to control for the fact
that agents may be marginally more likely to attempt to contact a taxpayer who was relatively closer.16 Distance should
be a good instrument in the context of this intervention that deals with tax delinquencies instead of tax declarations or
compliance. There is no reason to expect that a taxpayer would choose a location to live based on having declared a tax
obligation but not paying it and there are no obvious reasons why distance would affect paying the outstanding debt other
than through the probability of being treated. The case would be different if the intervention were dealing with informality
– in which case the taxpayer may want to locate far away from the tax agency to avoid detection – or if taxpayers had to
pay their taxes personally at the agency instead of being able to pay them online or at any bank. It is important to note
that these assumptions are supported by the data. For example, distance explains neither the amount owed nor the
number of missing payments at baseline. Also, distance is not significant in explaining actual payment decisions within the
sample of those selected by the agency to visit. It is also important to note that the fact that tax agents may have been
slightly more likely to attempt to visit taxpayers with higher debt (doubling the amount of debt increases the chances of an
attempted visit by 2 pp) would not necessarily drive our results upwards. The evidence shows that taxpayers with higher
debt are not more likely to pay their taxes than taxpayers with lower debts.
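
To illustrate the logic of instrumenting treatment with random assignment, the following sketch uses simulated data only; it is not the specification estimated in this paper, which uses 2SLS with three treatment arms, strata, district and controls. We assume a single treatment, one-sided non-compliance, a homogeneous true effect of 0.5, and agents who are more likely to contact taxpayers with a higher (unobserved) propensity to pay. All names and parameter values are hypothetical. With one binary instrument, the IV estimate reduces to the Wald ratio of the ITT effect to the first-stage difference in treatment rates.

```python
import numpy as np


def wald_estimate(y, d, z):
    """Wald (IV) estimate with a single binary instrument z."""
    itt = y[z == 1].mean() - y[z == 0].mean()          # effect of assignment
    first_stage = d[z == 1].mean() - d[z == 0].mean()  # effect of assignment on treatment
    return itt, first_stage, itt / first_stage


rng = np.random.default_rng(0)
n = 20_000
z = rng.integers(0, 2, size=n)                  # random assignment (0/1)
propensity = rng.normal(size=n)                 # unobserved propensity to pay
p_contact = 1.0 / (1.0 + np.exp(-(propensity - 0.5)))
d = z * (rng.random(n) < p_contact)             # only assigned taxpayers can be treated
y = 0.1 + 0.5 * d + 0.3 * propensity + rng.normal(scale=0.2, size=n)

naive = y[d == 1].mean() - y[d == 0].mean()     # treated vs. untreated, biased by targeting
itt, first_stage, late = wald_estimate(y, d, z)

print(f"naive difference: {naive:.3f}")
print(f"ITT: {itt:.3f}, first stage: {first_stage:.3f}, Wald/LATE: {late:.3f}")
```

In this simulation the naive comparison overstates the effect because contacted taxpayers were selected on their propensity to pay, while the Wald ratio uses only the variation created by the lottery.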
A summary of second-stage results is included in Table 4. Again, the top panel shows the results for the overall treatment
(pooling) and the bottom panel shows the results considering each treatment individually. In the odd columns, we report

16
Unfortunately, we could only geo-code about four-fifths of the sample because of limitations of the software in converting some of the addresses.

Table 3
ITT results.

Dependent variable
                                Paid                  Full payment          Payment share         Total payment (logs)  Other payments
                                [1]        [2]        [3]        [4]        [5]        [6]        [7]        [8]        [9]        [10]

Overall treatment               0.109***   0.105***   0.076***   0.078***   0.092***   0.091***   1.469***   1.410***   0.136***   0.129***
                                (0.00)     (0.00)     (0.00)     (0.00)     (0.01)     (0.01)     (0.06)     (0.06)     (0.00)     (0.00)

N                               20,818     20,818     20,818     20,818     20,818     20,818     20,818     20,818     20,818     20,818
Controls                        No         Yes        No         Yes        No         Yes        No         Yes        No         Yes

Letter                          0.042***   0.039***   0.026***   0.027***   0.031**    0.031**    0.591***   0.550***   0.126***   0.120***
                                (0.01)     (0.01)     (0.00)     (0.00)     (0.01)     (0.01)     (0.08)     (0.08)     (0.01)     (0.01)

Email                           0.153***   0.148***   0.110***   0.111***   0.135***   0.133***   2.042***   1.967***   0.139***   0.133***
                                (0.01)     (0.01)     (0.01)     (0.01)     (0.01)     (0.02)     (0.09)     (0.09)     (0.01)     (0.00)

Personal visit                  0.136***   0.133***   0.095***   0.099***   0.110***   0.110***   1.839***   1.792***   0.148***   0.138***
                                (0.01)     (0.01)     (0.01)     (0.01)     (0.01)     (0.01)     (0.12)     (0.12)     (0.01)     (0.01)

Pvalue of joint significance    0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.02       0.04
Letter=Email                    0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.05       0.06
Letter=Visit                    0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.01       0.02
Email=Visit                     0.10       0.18a      0.08       0.12a      0.07       0.09       0.14a      0.25a      0.36a      0.52a

N                               20,818     20,818     20,818     20,818     20,818     20,818     20,818     20,818     20,818     20,818
Controls                        No         Yes        No         Yes        No         Yes        No         Yes        No         Yes

Notes: each row shows the regression coefficients and the standard error in parentheses corresponding to an OLS regression that includes strata and
district. Standard errors are robust. *** p < 0.01, ** p < 0.05, * p < 0.1. The top section of the table shows the results for a regression that includes the overall
treatment variable. The bottom section shows the results for regressions that include each treatment individually. Overpayments as additional controls.
Source: Authors’ calculations.
a indicates that Email and Personal Visit coefficients are not statistically different.

the results using assignment to treatment as the instrument. In the even columns, we report the results using both assignment
to treatment and distance as instruments. Full regression tables (including estimations with and without controls) and
first-stage results are included in the Online Appendix. As expected, once we control for the fact that the Agency did not attempt
to contact all the individuals assigned to treatment, the estimates are now substantially larger than before. Moreover, the
differences across delivery methods have become even more noticeable. The spread across methods in the probability that
people would make any payment (column [1]) has broadened: 0.04 for letter, 0.15 for email, and 0.67 for personal visits.
The same holds for the rest of the dependent variables. Results change little when we also use distance as an instrument (even
though the sample size drops).
In addition to the fact that the agency did not attempt to treat all the taxpayers, some of the taxpayers the agency did
try to reach could not be located because either their physical or electronic address was outdated (about 50% of the letters).
While this number seems large, it is not uncommon even for countries with higher levels of compliance. For example,
in fiscal year 2012, the IRS closed about 500 thousand cases (involving almost US$7 billion of tax debt) because it could
not locate delinquent taxpayers (Treasury General Inspector for Tax Administration, 2014). Consequently, only about 38% of
those assigned to the letter actually received a letter, 88% of those assigned to the email received an email, and 25% of those
assigned to the personal visit were actually visited by an agent. We present a full analysis in the Online Appendix evaluating
the characteristics of those who could not be located by regressing having the wrong address on taxpayers’ observables.
The empirical analysis indicates that there seems to be no relevant selection into providing a wrong address (be that
physical or electronic) given that there is no correlation between having a wrong or outdated address on file and the number
of missed payments or the size of the accumulated debt.17 Consequently, for evaluating the impact of actual treatment, we
instrument the effective treatment variable with the assignment to the treatment (and with distance in some regressions).
Once more, the IV estimation should attenuate any bias from selection given that the instruments comply with the exclusion
restriction.
A summary of second-stage results is included in Table 5. Again, the top panel shows the results for the overall treatment
and the bottom panel shows the results considering each treatment individually. First-stage results and full regression tables
are included in the Online Appendix. As expected, the differences between the estimates for each method are now substantially
larger than before. The probability that people would make any payment (column [1]) is now: 0.085 for letter, 0.17 for
email, and 0.88 for personal visits; the probability that they would pay the full amount of debt (column [3]): 0.06, 0.13, and
0.65 respectively. The share of payments with respect to liabilities (column [5]): 0.07 for letter, 0.15 for email, and 0.73 for

17
There is a negative correlation with owing income and VAT taxes instead of wealth taxes, but that is expected given that the tax base for the tax would
usually be the residential property, which makes it more important to maintain accurate records.

Table 4
LATE (IV) results – attempted treatment.

Dependent variable
                                Paid                  Full payment          Payment share         Total payment (logs)  Other payments
                                [1]        [2]        [3]        [4]        [5]        [6]        [7]        [8]        [9]        [10]

Overall treatment               0.121***   0.124***   0.090***   0.090***   0.106***   0.116***   1.629***   1.679***   0.149***   0.160***
                                (0.01)     (0.01)     (0.00)     (0.00)     (0.01)     (0.01)     (0.07)     (0.10)     (0.00)     (0.00)

Observations                    20,818     16,376     20,818     16,376     20,818     16,376     20,818     16,376     20,818     16,376
Instrument
  Assignment                    Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
  Distance                      No         Yes        No         Yes        No         Yes        No         Yes        No         Yes

IV tests:
LM test statistic for           11,051     11,048     11,051     11,048     11,051     11,048     11,051     11,048     11,051     11,048
  underidentification (Anderson
  or Kleibergen–Paap)
p-value of underidentification  0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00
  LM statistic
F statistic for weak            59,953     60,075     59,953     60,075     59,953     60,075     59,953     60,075     59,953     60,075
  identification (Cragg–Donald
  or Kleibergen–Paap)

Letter                          0.038**    0.043***   0.027***   0.028***   0.030***   0.042***   0.540***   0.619***   0.127***   0.139***
                                (0.01)     (0.01)     (0.00)     (0.00)     (0.01)     (0.01)     (0.08)     (0.10)     (0.01)     (0.01)
Email                           0.148***   0.153***   0.111***   0.112***   0.133***   0.144***   1.973***   2.048***   0.133***   0.145***
                                (0.01)     (0.01)     (0.01)     (0.01)     (0.01)     (0.01)     (0.09)     (0.11)     (0.01)     (0.01)
Personal visit                  0.675***   0.691***   0.502***   0.505***   0.560***   0.604***   9.075***   9.330***   0.658***   0.711***
                                (0.05)     (0.05)     (0.04)     (0.04)     (0.06)     (0.05)     (0.68)     (0.76)     (0.05)     (0.05)

Observations                    20,818     16,376     20,818     16,376     20,818     16,376     20,818     16,376     20,818     16,376
Instrument
  Assignment                    Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
  Distance                      No         Yes        No         Yes        No         Yes        No         Yes        No         Yes

IV tests:
LM test statistic for           607.4      615.1      607.4      615.1      607.4      615.1      607.4      615.1      607.4      615.1
  underidentification (Anderson
  or Kleibergen–Paap)
p-value of underidentification  0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00
  LM statistic
F statistic for weak            231.3      170.0      231.3      170.0      231.3      170.0      231.3      170.0      231.3      170.0
  identification (Cragg–Donald
  or Kleibergen–Paap)
Notes: Each row shows the regression coefficients and robust standard errors in parentheses corresponding to the second stage of IV regressions that include
strata and district. The top section of the table shows the results for a regression that includes the overall treatment variable. The bottom section shows
the results for regressions that include each treatment individually. All estimations are OLS and include the following controls: Liabilities (in logs), Taxpayer
type (firm), Type of tax dummies, Pre-payments (in logs), Wrong Information, and Overpayments. *** p < 0.01, ** p < 0.05, * p < 0.1.

personal visits. The same pattern of higher compliance also exists in terms of total payments and other payments, once
more confirming the spillover effect for the same taxpayer onto taxes other than the ones targeted by the intervention.
Results in Mogollon et al. (2019), which look only at the effect of phone calls in a similar experimental setting,
complement these results. Phone calls have an intermediate effect between the impersonal methods and the visit, which is
consistent with the framework in this paper. Results are also in line with the evidence coming from the GOTV literature
(summarized in Section 2), where personal canvassing has usually been more important than other mechanisms. For ex-
ample, according to Imai (2005), personal canvassing was six times more effective than regular mail for getting people out
to vote. Our results indicate that a personal visit can be up to 10 times more effective than regular mail. The difference in
magnitude between these results could be explained at least in part by the deterrence component, which is not present in
the GOTV case.18
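
The relative magnitudes mentioned above can be checked against the point estimates for the probability of making any payment in Table 5, column [1]. This is a simple arithmetic sketch (variable names are ours) that ignores sampling uncertainty:

```python
# LATE point estimates for "Paid", effective treatment (Table 5, column [1]).
late_paid = {"letter": 0.085, "email": 0.169, "visit": 0.879}

visit_vs_letter = late_paid["visit"] / late_paid["letter"]
email_vs_letter = late_paid["email"] / late_paid["letter"]
visit_vs_email = late_paid["visit"] / late_paid["email"]

print(f"visit / letter: {visit_vs_letter:.1f}")
print(f"email / letter: {email_vs_letter:.1f}")
print(f"visit / email:  {visit_vs_email:.1f}")
```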

18
So far, the GOTV and related literature have focused on moral/behavioral response to personal interactions. The results here show that rational reactions
matter too and should be incorporated into the analysis (e.g., personal canvassing has an effect through personal interaction but it may also provide a signal
that may affect the stakes for the individual in the electoral results).

Table 5
LATE (IV) results – effective treatment.

Dependent variable
                                Paid                  Full payment          Payment share         Total payment (logs)  Other payments
                                [1]        [2]        [3]        [4]        [5]        [6]        [7]        [8]        [9]        [10]

Overall treatment               0.174***   0.180***   0.130***   0.130***   0.152***   0.168***   2.346***   2.433***   0.215***   0.232***
                                (0.01)     (0.01)     (0.01)     (0.01)     (0.02)     (0.01)     (0.11)     (0.14)     (0.01)     (0.01)

Observations                    20,818     16,376     20,818     16,376     20,818     16,376     20,818     16,376     20,818     16,376
Instrument
  Assignment                    Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
  Distance                      No         Yes        No         Yes        No         Yes        No         Yes        No         Yes

IV tests:
LM test statistic for           6,803      3,042      6,803      3,042      6,803      3,042      6,803      3,042      6,803      3,042
  underidentification (Anderson
  or Kleibergen–Paap)
p-value of underidentification  0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00
  LM statistic
F statistic for weak            13,715     6,092      13,715     6,092      13,715     6,092      13,715     6,092      13,715     6,092
  identification (Cragg–Donald
  or Kleibergen–Paap)

Letter                          0.085***   0.100***   0.060***   0.064***   0.067**    0.096***   1.214***   1.430***   0.290***   0.321***
                                (0.01)     (0.02)     (0.01)     (0.01)     (0.03)     (0.02)     (0.19)     (0.24)     (0.01)     (0.02)
Email                           0.169***   0.176***   0.127***   0.129***   0.152***   0.166***   2.250***   2.352***   0.152***   0.167***
                                (0.01)     (0.01)     (0.01)     (0.01)     (0.02)     (0.01)     (0.10)     (0.12)     (0.01)     (0.01)
Personal visit                  0.879***   0.907***   0.653***   0.662***   0.729***   0.792***   11.801***  12.244***  0.841***   0.917***
                                (0.07)     (0.07)     (0.05)     (0.06)     (0.08)     (0.07)     (0.90)     (1.02)     (0.06)     (0.07)

Observations                    20,818     16,376     20,818     16,376     20,818     16,376     20,818     16,376     20,818     16,376
Instrument
  Assignment                    Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
  Distance                      No         Yes        No         Yes        No         Yes        No         Yes        No         Yes

IV tests:
LM test statistic for           489.1      450.4      489.1      450.4      489.1      450.4      489.1      450.4      489.1      450.4
  underidentification (Anderson
  or Kleibergen–Paap)
p-value of underidentification  0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00
  LM statistic
F statistic for weak            164.3      132.3      164.3      132.3      164.3      132.3      164.3      132.3      164.3      132.3
  identification (Cragg–Donald
  or Kleibergen–Paap)
Notes: Each row shows the regression coefficients and robust standard errors in parentheses corresponding to the second stage of IV regressions that include
strata and district. The top section of the table shows the results for a regression that includes the overall treatment variable. The bottom section shows
the results for regressions that include each treatment individually. All estimations are OLS and include the following controls: Liabilities (in logs), Taxpayer
type (firm), Type of tax dummies, Pre-payments (in logs), Wrong Information, and Overpayments. *** p < 0.01, ** p < 0.05, * p < 0.1.

4.4. Robustness

As we have argued before, while there has been one-sided non-compliance with the assignment, the IV estimates should
attenuate any bias introduced by selection. Still, to check the robustness of our results we have run a set of different spec-
ifications to make sure that we are comparing taxpayers of similar characteristics in the treatment and control groups by
performing matching estimations of treatment and control groups. First, we use inverse probability weighting where the
variables used for matching are the type of tax (sales, corporate or wealth tax), the size of the debt, and the number of
obligations. The inverse probability weighting allows us to estimate the average treatment effect of all treatment arms si-
multaneously, so we can make a comparison across them. Second, we use the nearest neighbor matching between each
treatment arm and the control group. Again, we use all the observable characteristics we have. As we describe in detail
in the Online Appendix, we get perfectly balanced samples. We present a summary of the regressions for the variable Paid
in the Appendix for both the attempted and effective treatment, and full regressions in the Online Appendix. Under both
estimations, the same pattern of results appears: differences between the methods remain significantly large. In the case
of effective treatment, the email is almost three times more effective than the letter and the personal visit is about five times
more effective than the letter and twice as effective as the email. Differences in the coefficients for the visit between the LATE
and the matching estimations may indicate some targeting of those more likely to repay.
Another potential concern with the LATE estimates is that the tax agency not only attempted to treat fewer taxpayers
but they also treated a few of them using a different method than the one assigned (as it can be observed in the Online
Appendix), which could affect the exclusion restriction. To check whether our results are robust to relaxing the exclusion
restriction we follow the approach suggested by Conley et al. (2012), and we conduct some sensitivity analysis to test the
extent to which our estimates survive after allowing for plausible amounts of exogeneity of our instrument (assignment).19
We describe the method and assumptions in detail in the Online Appendix. In Table A4 we allow for plausible exogeneity
and report the lower and upper confidence intervals of the coefficient of interest for different plausible non-zero values of
the coefficient we call γ.20 Our presentation of the results follows Buonanno and Vargas (2019) in this journal. The results are
sensitive to the assumptions for γ but show that the range of potential values for β for the email is larger than the range
for the letter, and the range for the visits is larger than the range for both of them.
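
To make the sensitivity exercise more concrete, the sketch below illustrates one simple variant of the idea on simulated data: for each assumed value of γ (the direct effect of the instrument on the outcome), the outcome is adjusted by subtracting Zγ and β is re-estimated by IV. This is not the implementation described in the Online Appendix, which reports confidence bounds; here we compute point estimates only, with one endogenous variable and one instrument, and all names and parameter values are hypothetical.

```python
import numpy as np


def iv_slope(y, x, z):
    """Just-identified IV slope: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


def beta_under_gamma(y, x, z, gamma):
    """IV estimate of beta after removing an assumed direct effect gamma of z on y."""
    return iv_slope(y - gamma * z, x, z)


# Simulated data: z is random assignment, x is effective treatment.
rng = np.random.default_rng(1)
n = 10_000
z = rng.integers(0, 2, size=n).astype(float)
x = z * (rng.random(n) < 0.4)                 # one-sided non-compliance, 40% take-up
y = 0.5 * x + rng.normal(scale=0.2, size=n)   # true beta = 0.5, true gamma = 0

gammas = np.array([0.0, 0.01, 0.02, 0.05])    # assumed violations of the exclusion restriction
betas = np.array([beta_under_gamma(y, x, z, g) for g in gammas])

for g, b in zip(gammas, betas):
    print(f"gamma = {g:.2f} -> beta = {b:.3f}")
```

In the just-identified case the adjusted estimate equals the standard IV estimate minus γ·var(Z)/cov(Z, X), so a weaker first stage makes the estimate more sensitive to any given γ.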

4.5. Discussion of results

Are the results different for different types of taxpayers? We check for potential heterogeneous effects by interacting the
treatments with the control variables that proxy observable differences across taxpayers. We include the full set of results
and discussion in the Online Appendix because we don’t find patterns that provide novel evidence or help us shed light on
mechanisms. Basically, we find that taxpayers with standing liabilities on the income tax and VAT seem to react more to
the treatments than those who owe wealth taxes. We believe this may be explained by the fact that wealth taxes affect an
asset, which may not be liquid, while the VAT and income taxes tax the flow of revenues. Additionally, the level of debt
seems to have little effect on the probability of paying overall, but individuals with higher debts are less likely to pay than
the rest in the visit treatment. Finally, there seems to be little difference between firms and individuals. Firms seem to react
less than individuals when they receive a letter but they are more responsive when visited by an agent.
Overall, the set of results offers some lessons. The main one, and relevant for the literature at large, is that different
delivery methods seem to have different effects. These differences indicate that the methods chosen should enter into the
analysis of effectiveness of an informational intervention. For the more specific tax compliance literature, we can first con-
clude that enforcement matters. Contacting taxpayers in a personalized and detailed manner to inform them of their debts
and the consequences of maintaining unpaid liabilities is effective for eliciting payments, at least in the short run. The
effectiveness of these interventions is not independent of the method of communication chosen. Second, these interventions
can have positive spillover effects for the same taxpayer by increasing payment of other obligations not included in the
intervention. The evaluation of an intervention should consider the full tax portfolio. Looking only at the tax under
treatment could under- or overestimate the actual effect. Third, the large levels of non-compliance with the assignment show that
there are plenty of gains to be made by simple strategies such as keeping databases up to date. Finally, results are
informative about the benefits and costs of each intervention, and optimal warning strategies. Fixed costs are the same for every
intervention (keeping databases up to date, registering taxpayers, etc.). Variable costs are different but relatively low. The
tax agency has calculated them to be about US$ 0 per email, US$ 0.85 ppp per letter, and US$ 13 ppp per personal visit.
The average amount collected per attempted letter was around US$ 550 ppp, US$ 590 ppp for the email, and more than
US$ 2,000 ppp for the attempted visits. Consequently, the tax agency could use this information to determine the optimal
enforcement strategy. Letters have a small and constant variable cost per unit, which is determined exogenously by the Post
Office. Emails have a zero marginal cost. Personal visits, however, have an increasing variable cost given by labor market
conditions, distance from the agency, ability to find the taxpayer, etc. That is, while the first visits are relatively cheap, the
cost of each additional visit is higher. The optimal enforcement vector depends on the marginal benefit of each method
such that in equilibrium the marginal dollar per treatment should equalize across methods. Optimal portfolio choices could
change with fixed costs, restrictions (such as not having everybody’s electronic addresses), and the
ability of the agency to convey information about its strategy to the taxpayer, or to hide it. For example, if the agency could
commit to using the impersonal methods sparingly, those who receive a letter or an email would update the enforcement
probability much more than they would do otherwise. On the contrary, publicizing massive visit campaigns could work in
the opposite direction.
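
As a rough illustration of the cost-benefit comparison above, the following sketch combines the variable costs and the average amounts collected per attempted contact reported in the text. Two caveats apply: the figure for visits is a lower bound ("more than US$ 2,000"), and these are average collections rather than marginal or incremental ones, so they do not by themselves identify the optimal enforcement vector. Variable names are ours.

```python
# Figures from the text, in US$ PPP per attempted contact.
variable_cost = {"email": 0.0, "letter": 0.85, "visit": 13.0}
avg_collected = {"email": 590.0, "letter": 550.0, "visit": 2000.0}  # visit: lower bound

# Average collection net of variable cost.
net = {m: avg_collected[m] - variable_cost[m] for m in variable_cost}

# Average collection per dollar of variable cost (undefined for email, whose cost is zero).
per_dollar = {
    m: avg_collected[m] / variable_cost[m]
    for m in variable_cost
    if variable_cost[m] > 0
}

for m in variable_cost:
    ratio = f"{per_dollar[m]:.0f}" if m in per_dollar else "n/a (zero cost)"
    print(f"{m}: net = {net[m]:.2f}, collected per dollar spent = {ratio}")
```

On these averages, visits collect the most per contact while letters collect more per dollar of variable cost than visits, which is consistent with the argument that the optimal mix depends on the rising marginal cost of visits.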

5. Conclusions

Most empirical papers that provide information to individuals have usually avoided evaluating the effect that different
communication methods could have on the effectiveness of the intervention. The literature on tax compliance is a good
example: it has shown that sending messages has an effect on compliance, and that different messages in terms of both
the content (e.g., deterrence, moral) and the characteristics of the messages (e.g., whether they are signed by a tax agency
authority or not) have different impacts. Evaluating different delivery mechanisms for the messages has been mostly absent.

19
We thank one of our reviewers for suggesting this.
20
γ is the coefficient in the regression

Y = X β + Zγ + ε (2)
where Y is the outcome, X is a matrix of endogenous variables and Z is a set of instrumental variables. γ equals zero if Z is exogenous but different from
zero in a context of plausible exogeneity.

Incorporating different delivery methods seems to matter according to the results in this article. This evidence is relevant
because it sheds light on the fact that some of the results in the literature are method specific (and cannot be evaluated
separately from the method used to deliver the message). The same could be occurring in other areas. As such, the
mechanisms through which policies are informed and publicized should not be neglected in the literature.
In the future, it may make sense to randomize both the message and the method to isolate each effect. In
particular, some types of messages may be more effective when delivered by some methods than by others. For example,
moral suasion messages may be relatively more effective when delivered by an individual in a personalized manner than
through an impersonal method such as a letter, which has usually been the norm. Future experiments could also
include explicit statements in the letters declaring that the number of taxpayers being contacted is fixed. This way, researchers
could separate the enforcement effect from the moral effect. Finally, the study has shown that spillover effects across taxes for
the same taxpayer can be substantial. Whenever possible, studies should consider the full tax portfolio when evaluating the
impact of an intervention.
The policy implications of the results are clear. Tax agencies have much to gain by contacting
taxpayers regarding their outstanding liabilities, particularly if they keep a clean and up-to-date contact information
database. The agency collected two-and-a-half times the amount it would have collected if it had done nothing. The
campaign increased payment of other pending obligations, too. The policy recommendations that come out of this document
are in line with the recommendations made by Slemrod (2017) for the IRS: it would be optimal for the agency to increase the size of
its labor force, with a focus on deterrence, and to use a portfolio of delivery methods. Electronic communications seem to be
a viable and cheap instrument that tax agencies should consider using more explicitly if they have not done so already. So
far, most tax agencies have been leaving plenty of money on the table.

Appendix A

A.1. Description of the variables

Randomization was stratified by taxpayers’ liabilities, which was the information provided in the messages,
using six blocks defined by size of debt and maturity. As can be observed in Table A1, the samples balance on that variable.
Unfortunately, they do not balance on some of the other covariates; we include them as controls in the empirical analysis.
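To make the blocking step concrete, the following is a minimal sketch of randomization within blocks. It is not the procedure used by the tax agency: the equal shares across arms, the function name `block_randomize`, and the toy blocks (three debt-size bins by two maturity bins) are assumptions made only for illustration.

```python
import random
from collections import defaultdict

# The four arms of the experiment; equal shares per arm are an assumption.
ARMS = ["control", "letter", "email", "visit"]

def block_randomize(units, block_of, arms=ARMS, seed=0):
    """Assign each unit to an arm, shuffling separately within each block."""
    rng = random.Random(seed)
    by_block = defaultdict(list)
    for unit in units:
        by_block[block_of(unit)].append(unit)
    assignment = {}
    for block in sorted(by_block):
        members = by_block[block]
        rng.shuffle(members)  # random order within the block
        for i, unit in enumerate(members):
            assignment[unit] = arms[i % len(arms)]  # deal arms in turn
    return assignment

# Hypothetical data: 6 blocks (3 debt-size bins x 2 maturity bins), 40 units each.
units = [(size, maturity, k)
         for size in range(3) for maturity in range(2) for k in range(40)]
assignment = block_randomize(units, block_of=lambda u: (u[0], u[1]))
```

Because arms are dealt in turn after shuffling, every block contributes the same number of units to each arm, which is what delivers balance on the blocking variable by construction.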

Table A1
Random assignment to treatment.

                                       Difference w.r.t. control (coeff. and s.e.)      p-value, Wald test of equality of coefficients
                           Average     Overall     Individual treatments
                           and s.d.    treatment   Letter      Email       Visit        [3]=[4]   [3]=[5]   [4]=[5]   [3]=[4]=[5]   Sample size
                           [1]         [2]         [3]         [4]         [5]          [6]       [7]       [8]       [9]           [10]

Liabilities (in millions)  4.440       0.026       −0.024      0.019       0.172        0.723     0.135     0.277     0.32          20,818
                           (7.731)     (0.098)     (0.120)     (0.113)     (0.144)
Liabilities (in logs)      13.998      0.009       0.001       0.01        0.023        0.524     0.195     0.489     0.425         20,818
                           (1.820)     (0.012)     (0.015)     (0.014)     (0.019)
Number of debts            1.753       0.015       0.001       0.031       0.002        0.267     0.981     0.345     0.491         20,818
                           (1.421)     (0.022)     (0.027)     (0.025)     (0.0316)
Tax (Wealth)               0.105       −0.004      −0.001      −0.012**    0.007        0.067     0.273     0.011     0.03          20,818
                           (0.307)     (0.005)     (0.006)     (0.006)     (0.00758)
Tax (Income tax)           0.229       −0.002      0.002       −0.007      0.001        0.27      0.902     0.437     0.518         20,818
                           (0.420)     (0.006)     (0.008)     (0.008)     (0.00986)
Tax (VAT)                  0.666       0.007       −0.001      0.019**     −0.008       0.032     0.55      0.02      0.032         20,818
                           (0.472)     (0.007)     (0.009)     (0.009)     (0.0113)
Taxpayer type (firms)      0.616       0.055***    0.046***    0.049***    0.095***     0.764     0         0         0             20,818
                           (0.486)     (0.008)     (0.009)     (0.009)     (0.0116)

Notes: Each row shows statistics for a different variable. Column [1] shows the sample average, with the standard deviation in parentheses. Column [2] shows
the regression coefficient, with the standard error in parentheses, from an OLS regression that includes controls for strata and district.
Columns [3]–[5] show the regression coefficients, with standard errors in parentheses, from an OLS regression that includes controls for
strata and district. Standard errors are robust. *** p < 0.01, ** p < 0.05, * p < 0.1. Columns [6]–[9] show the p-value of a test of equality of coefficients.
Column [10] shows the sample size. Source: Authors’ calculations.

Table A2
First-stage regression table.

                            Overall attempted     Attempted letter      Attempted email       Attempted visit
                            treatment
                            [1]        [2]        [3]        [4]        [5]        [6]        [7]        [8]

Assignment to treatment     0.601***   0.592***
                            (0.01)     (0.01)
Assignment to letter                              0.400***   0.400***   0.009***   0.007***   0.004**    −0.002
                                                  (0.01)     (0.01)     (0.00)     (0.00)     (0.00)     (0.00)
Assignment to email                               0.004***   0.004**    0.879***   0.877***   −0.001     −0.007***
                                                  (0.00)     (0.00)     (0.00)     (0.00)     (0.00)     (0.00)
Assignment to visit                               0.052***   0.054***   0.011***   0.009***   0.144***   0.136***
                                                  (0.00)     (0.00)     (0.00)     (0.00)     (0.01)     (0.01)
Distance                               −0.001                0.001                 −0.000                −0.003**
                                       (0.00)                (0.00)                (0.00)                (0.00)

Adjusted R-squared          0.107      0.0878     0.0327     0.00169    0.0327     0.00169    0.0327     0.00169
Observations                20,818     16,376     20,818     16,376     20,818     16,376     20,818     16,376

LM test statistic for       6,803      3,042      450.4      489.1      450.4      489.1      450.4      489.1
  underidentification (Anderson
  or Kleibergen–Paap)
p-value of                  0.00       0.00       0.00       0.00       0.00       0.00       0.00       0.00
  underidentification LM statistic
F statistic for weak        13,715     6,092      164.3      132.3      164.3      132.3      164.3      132.3
  identification (Cragg–Donald
  or Kleibergen–Paap)
Notes: Each row shows the regression coefficients, with robust standard errors in parentheses, corresponding to the first stage of an IV regression that includes
strata- and district-specific dummies. All estimations are OLS and include the following controls: Liabilities (in logs), Taxpayer type (firm), Type of tax
dummies, Pre-payments (in logs), Wrong Information, and Overpayments. *** p < 0.01, ** p < 0.05, * p < 0.1.

A.2. Robust estimations

Table A3
Average treatment effect on the treated on the probability of paying.

Attempted treatment Effective treatment

Matching nearest neighbor Inverse probability weighting Matching nearest neighbor Inverse probability weighting
Letter vs. control 0.067∗ ∗ ∗ 0.067∗ ∗ ∗ 0.069∗ ∗ ∗ 0.069∗ ∗ ∗
(0.005) (0.006) (0.008) (0.008)
Email vs. control 0.169∗ ∗ ∗ 0.170∗ ∗ ∗ 0.187∗ ∗ ∗ 0.187∗ ∗ ∗
(0.006) (0.006) (0.007) (0.007)
Visit vs. control 0.297∗ ∗ ∗ 0.296∗ ∗ ∗ 0.351∗ ∗ ∗ 0.350∗ ∗ ∗
(0.014) (0.014) (0.016) (0.016)
Email vs. letter 0.103∗ ∗ ∗ 0.119∗ ∗ ∗
(0.008) (0.010)
Visit vs. letter 0.230∗ ∗ ∗ 0.281∗ ∗ ∗
(0.014) (0.017)

The average treatment effect on the treated was calculated using nearest-neighbor matching for the first column and inverse probability
weighting for the second column. In both cases, a logit was used in the first stage. Controls for the amount of the debt, type of tax, and number of obligations
were included. All monetary values are in Colombian pesos $COL. Standard errors are in parentheses. ∗ p < 0.10, ∗ ∗ p < 0.05, ∗ ∗ ∗ p < 0.01.
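As an illustration of the inverse probability weighting estimator named in the notes to Table A3, here is a minimal sketch of the standard ATT formula with normalized control weights. It is not the authors' code: the propensity scores are taken as given rather than fitted with a logit, and the function name `att_ipw` and the toy data are hypothetical.

```python
def att_ipw(outcomes, treated, pscores):
    """ATT by inverse probability weighting with normalized control weights.

    outcomes: observed outcome per unit (e.g., 1 if the taxpayer paid)
    treated:  1 if the unit was treated, 0 otherwise
    pscores:  estimated probability of treatment per unit (assumed given)
    """
    n_treated = sum(treated)
    mean_treated = sum(y for y, d in zip(outcomes, treated) if d) / n_treated
    # Controls are reweighted by the odds p / (1 - p) so that they
    # resemble the treated group in terms of the covariates behind p.
    weights = [p / (1.0 - p) for d, p in zip(treated, pscores) if not d]
    control_y = [y for y, d in zip(outcomes, treated) if not d]
    mean_control = sum(w * y for w, y in zip(weights, control_y)) / sum(weights)
    return mean_treated - mean_control

# Made-up data for illustration only.
outcomes = [1, 1, 0, 1, 0, 0, 1, 0]
treated = [1, 1, 1, 0, 0, 0, 0, 0]
pscores = [0.5, 0.5, 0.5, 0.5, 0.5, 0.2, 0.2, 0.2]
att = att_ipw(outcomes, treated, pscores)
```

When every unit has the same propensity score, the weights are constant and the estimator reduces to a simple difference in means between treated and control units.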

A.3. Relax exclusion restriction

Table A4
LATE estimation relaxing the exclusion restriction assumption.

                        Attempted letter    Attempted email     Attempted visit
                        Lower     Upper     Lower     Upper     Lower     Upper

Paid
γ ∈ (−0.005, 0.005)     0.022     0.063     0.133     0.174     0.539     0.821
γ ∈ (−0.010, 0.010)     0.016     0.068     0.127     0.180     0.509     0.853
γ ∈ (−0.015, 0.015)     0.011     0.074     0.122     0.185     0.479     0.885
γ ∈ (−0.020, 0.020)     0.005     0.079     0.117     0.190     0.448     0.917
γ ∈ (−0.025, 0.025)     −0.000    0.085     0.112     0.195     0.418     0.949

Full payment
γ ∈ (−0.005, 0.005)     0.012     0.044     0.096     0.130     0.383     0.614
γ ∈ (−0.010, 0.010)     0.007     0.049     0.091     0.135     0.353     0.645
γ ∈ (−0.015, 0.015)     0.001     0.055     0.086     0.141     0.323     0.677
γ ∈ (−0.020, 0.020)     −0.005    0.061     0.081     0.146     0.292     0.709
γ ∈ (−0.025, 0.025)     −0.010    0.066     0.075     0.151     0.262     0.741

Payment share
γ ∈ (−0.005, 0.005)     0.013     0.065     0.112     0.174     0.448     0.723
γ ∈ (−0.010, 0.010)     0.007     0.071     0.107     0.179     0.418     0.754
γ ∈ (−0.015, 0.015)     0.002     0.076     0.101     0.184     0.388     0.786
γ ∈ (−0.020, 0.020)     −0.004    0.082     0.096     0.189     0.358     0.818
γ ∈ (−0.025, 0.025)     −0.009    0.087     0.091     0.195     0.328     0.850

Other payments
γ ∈ (−0.005, 0.005)     0.115     0.151     0.123     0.156     0.567     0.821
γ ∈ (−0.010, 0.010)     0.110     0.156     0.118     0.161     0.538     0.854
γ ∈ (−0.015, 0.015)     0.104     0.162     0.113     0.166     0.508     0.886
γ ∈ (−0.020, 0.020)     0.099     0.168     0.107     0.172     0.478     0.918
γ ∈ (−0.025, 0.025)     0.093     0.173     0.102     0.177     0.448     0.950

Payment (log)
γ ∈ (−0.100, 0.100)     0.271     0.917     1.720     2.366     6.983     11.271
γ ∈ (−0.200, 0.200)     0.160     1.029     1.614     2.472     6.380     11.908
γ ∈ (−0.300, 0.300)     0.048     1.141     1.508     2.578     5.776     12.547
γ ∈ (−0.400, 0.400)     −0.063    1.253     1.403     2.684     5.170     13.187
γ ∈ (−0.500, 0.500)     −0.175    1.366     1.297     2.791     4.562     13.827

We calculate the minimum (β min) and maximum (β max) average treatment effect, relaxing
the exclusion restriction assumption following Conley et al. (2012), at the 95% confidence level. The
direct effect of the assignment on the outcome is given by γ. The control variables are
indicators for the kind of tax, the number of debts, the log of the original debt, the log of the
amount previously paid, and strata and district fixed effects. All monetary values are in Colombian
pesos. Robust standard errors are in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
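The logic behind these bounds can be sketched in the simplest case of one endogenous variable, one instrument, and no controls. Under Y = Xβ + Zγ + ε, the reduced-form coefficient of Y on Z equals (first stage × β) + γ, so for a given γ the IV estimate is β(γ) = (reduced form − γ) / first stage, and the bounds are the union of the intervals obtained as γ moves over its support. The sketch below is only an illustration of that idea, not the estimation behind Table A4: the function name, the input values, and the assumption that the standard error is the same for every γ are all hypothetical.

```python
def plausibly_exogenous_bounds(reduced_form, first_stage, delta,
                               se=0.0, z=1.96, steps=101):
    """Union of intervals for beta when gamma ranges over [-delta, delta].

    reduced_form: coefficient of the outcome on the instrument
    first_stage:  coefficient of the treatment on the instrument
    se:           standard error of beta, held fixed across gamma
                  (a simplifying assumption of this sketch)
    """
    lower, upper = float("inf"), float("-inf")
    for i in range(steps):
        gamma = -delta + 2.0 * delta * i / (steps - 1)
        beta = (reduced_form - gamma) / first_stage  # IV estimate given gamma
        lower = min(lower, beta - z * se)
        upper = max(upper, beta + z * se)
    return lower, upper

# Made-up inputs for illustration only.
narrow = plausibly_exogenous_bounds(0.15, 0.88, delta=0.005, se=0.01)
wide = plausibly_exogenous_bounds(0.15, 0.88, delta=0.025, se=0.01)
```

With `delta=0` and `se=0` the interval collapses to the usual Wald ratio. Widening the support of γ widens the interval by `delta / first_stage` on each side, which is why bounds are much wider for a delivery method with a weak first stage than for one with a strong first stage.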

Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.jebo.2019.12.008.

References

Allingham, M.G., Sandmo, A., 1972. Income tax evasion: a theoretical analysis. J. Publ. Econ. 1 (3–4), 323–338. doi:10.1016/0047-2727(72)90010-2.
Arceneaux, K., 2007. I’m asking for your support: the effects of personally delivered campaign messages on voting decisions and opinion formation. Q. J.
Polit. Sci. 2 (1), 43–65. doi:10.1561/100.00006003.
Arceneaux, K., Nickerson, D., 2006. Even if you have nothing nice to say, go ahead and say it: two field experiments testing negative campaign tactics. In:
Proceedings of the 2005 Meeting of the American Political Science Association. Washington, DC, USA
Barton, J., Castillo, M., Petrie, R., 2014. What persuades voters? A field experiment on political campaigning. Econ. J. 124 (574). doi:10.1111/ecoj.12093.
Behavioral Insights Team (BIT), 2012. Applying Behavioural Insights to Reduce Fraud, Error and Debt. Cabinet Office, London, UK.
Bertrand, M., Karlan, D., Mullainathan, S., Shafir, E., Zinman, J., 2010. What’s advertising content worth? Evidence from a consumer credit marketing field
experiment. Q. J. Econ. 125 (1), 263–305. doi:10.1162/qjec.2010.125.1.263.
Blumenthal, M., Christian, C., Slemrod, J., 2001. Do normative appeals affect tax compliance? Evidence from a controlled experiment in Minnesota. Natl. Tax
J. 54 (1), 125–138. doi:10.17310/ntj.2001.1.06.
Card, D., DellaVigna, S., Malmendier, U., 2011. The role of theory in field experiments. J. Econ. Perspect. 25 (3), 39–62. doi:10.1257/jep.25.3.39.

Carrillo, P., Pomeranz, D., Singhal, M., 2017. Dodging the taxman: firm misreporting and limits to tax enforcement. Am. Econ. J. Appl. Econ. 9 (2), 144–164.
Castro, L., Scartascini, C., 2015. Tax compliance and enforcement in the pampas evidence from a field experiment. J. Econ. Behav. Org. 116, 65–82.
doi:10.1016/j.jebo.2015.04.002.
Chirico, M., Inman, R., Loeffler, C., Macdonald, J., Sieg, H., 2015. An experimental evaluation of notification strategies to increase property tax compliance:
free-riding in the city of Brotherly Love. In: Tax Policy and the Economy, 30. University of Chicago Press.
Chirico, M., Inman, R., Loeffler, C., Macdonald, J., Sieg, H., 2019. Deterring property tax delinquency in Philadelphia: an experimental evaluation of nudge
strategies. NBER Working Papers. National Bureau of Economic Research.
Conley, T.G., Hansen, C.B., Rossi, P.E., 2012. Plausibly exogenous. Rev. Econ. Stat. 94 (1), 260–272.
Daude, C., Perret, S., Brys, B., 2015. Making Colombia’s Tax Policy More Efficient, Fair and Green. OECD Economics Department Working Paper 1234. OECD
Publishing.
Del Carpio, L., 2014. Are the neighbors cheating? Evidence from a social norm experiment on property taxes in Peru. Unpublished Manuscript. Princeton
University
DellaVigna, S., List, J.A., Malmendier, U., 2012. Testing for altruism and social pressures in charitable giving. Q. J. Econ. 127 (1), 1–56. doi:10.1093/qje/qjr050.
Dell’Anno, R., 2009. Tax evasion, tax morale and policy maker’s effectiveness. J. Socio Econ. 38 (6), 988–997. doi:10.1016/j.socec.2009.06.005.
Doerrenberg, P., Schmitz, J., 2017. Tax compliance and information provision. A field experiment with small firms. J. Behav. Econ. Policy 1 (1), 47–54.
Dwenger, N., Kleven, H., Rasul, I., Rincke, J., 2016. Extrinsic and intrinsic motivations for tax compliance: evidence from a field experiment in Germany. Am.
Econ. J. Econ. Policy 8 (3), 203–232. doi:10.1257/pol.20150083.
Fellner, G., Sausgruber, R., Traxler, C., 2013. Testing enforcement strategies in the field: threat, moral appeal and social information. J. Eur. Econ. Assoc. 11
(3), 634–660. doi:10.1111/jeea.12013.
Freeman, R.B., 1997. Working for nothing: the supply of volunteer labor. J. Labor Econ. 15 (1), S140–S166. doi:10.1086/209859.
Gerber, A.S., Green, D.P., 2000. The effects of canvassing, telephone calls, and direct mail on voter turnout: a field experiment. Am. Polit. Sci. Rev. 94 (3),
653–663. doi:10.2307/2585837.
Green, D.P., Gerber, A.S., 2004. Get Out the Vote!: How to Increase Voter Turnout. Brookings Institution Press.
Green, D., Karlan, D., 2006. Effects of robotic calls on voter mobilization. Unpublished Manuscript. Columbia University
Hallsworth, M., 2015. The use of field experiments to increase tax compliance. Oxford Rev. Econ. Policy 30 (4), 658–679. doi:10.1093/oxrep/gru034.
Hallsworth, M., List, J.A., Metcalfe, R.D., Vlaev, I., 2017. The behavioralist as tax collector: Using natural field experiments to enhance tax compliance. J.
Public Econ. 148, 14–31. doi:10.1016/j.jpubeco.2017.02.003.
Harbaugh, W.T., 1998. What do donations buy?: A model of philanthropy based on prestige and warm glow. J. Publ. Econ. 67 (2), 269–284.
doi:10.1016/S0047-2727(97)00062-5.
Hashimzade, N., Myles, G.D., Tran-Nam, B., 2013. Applications of behavioural economics to tax evasion. J. Econ. Surv. 27 (5), 941–977.
doi:10.1111/j.1467-6419.2012.00733.x.
Imai, K., 2005. Do Get-Out-the-Vote Calls Reduce Turnout? The Importance of Statistical Methods for Field Experiments. Am. Polit. Sci. Rev. 99 (2), 283–300.
doi:10.1017/S0003055405051658.
Karlan, D., List, J.A., 2007. Does Price Matter in Charitable Giving? Evidence from a Large-Scale Natural Field Experiment. Am. Econ. Rev. 97 (5), 1774–1793.
Kessler, J.B., Zhang, C.Y., 2014. Behavioral economics and health. In: Oxford Textbook of Public Health. Oxford University Press.
Kleven, H.J., Knudsen, M.B., Kreiner, C.T., Pedersen, S.S., Saez, E., 2011. Unwilling or unable to cheat? Evidence from a tax audit experiment in Denmark.
Econometrica 79 (3), 651–692. doi:10.2307/41237767.
Lacetera, N., Macis, M., 2010. Social image concerns and prosocial behavior: Field evidence from a nonlinear incentive scheme. J. Econ. Behav. Org. 76 (2),
225–237. doi:10.1016/j.jebo.2010.08.007.
Landry, C.E., Lange, A., List, J.A., Price, M.K., Rupp, N.G., 2006. Toward an understanding of the economics of charity: evidence from a field experiment.
Q. J. Econ. 121 (2), 747–782. doi:10.1162/qjec.2006.121.2.747.
Lopez-Luzuriaga, A., Scartascini, C., 2019. Compliance spillovers across taxes: The role of penalties and detection. J. Econ. Behav. Org. 164, 518–534.
doi:10.1016/j.jebo.2019.06.015.
Luttmer, E.F.P., Singhal, M., 2014. Tax morale. J. Econ. Perspect. 28 (4), 149–168. doi:10.1257/jep.28.4.149.
Mascagni, G., 2018. From the lab to the field: a review of tax experiments. J. Econ. Surv. 32 (2), 273–301. doi:10.1111/joes.12201.
Meer, J., Rosen, H.S., 2011. The ABCs of charitable solicitation. J. Publ. Econ. 95 (5–6), 363–371. doi:10.1016/j.jpubeco.2010.07.009.
Meiselman, B., 2018. Ghostbusting in Detroit: Evidence on nonfilers from a controlled field experiment. J. Publ. Econ. 158, 180–193.
doi:10.1016/j.jpubeco.2018.01.005.
Mogollon, M., Ortega, D., Scartascini, C., 2019. Who’s calling? The effect of phone calls as a deterrence mechanism. Inter-American Development Bank.
Nickerson, D.W., 2006. Volunteer phone calls can increase turnout: evidence from eight field experiments. Am. Polit. Res. 34 (3), 271–292.
doi:10.1177/1532673X05275923.
Ortega, D., Sanguinetti, P., 2013. Deterrence and reciprocity effects on tax compliance: experimental evidence from Venezuela. In: Proceedings of the CAF
Working papers, 2013/08 doi:10.1017/CBO9781107415324.004.
Perez-Truglia, R., Troiano, U., 2018. Shaming tax delinquents. Journal of Public Economics 167, 120–137. doi:10.1016/j.jpubeco.2018.09.008.
Ramirez, R., 2005. Giving voice to Latino voters: a field experiment on the effectiveness of a national nonpartisan mobilization effort. ANNALS Am. Acad.
Polit. Soc. Sci. 601 (1), 66–84. doi:10.1177/0002716205278422.
Shaw, D.R., Green, D.P., Gimpel, J.G., Gerber, A.S., 2012. Do robotic calls from credible sources influence voter turnout or vote choice? Evidence from a
randomized field experiment. J. Polit. Market. 11 (4), 231–245. doi:10.1080/15377857.2012.724305.
Slemrod, J., 2017. Tax compliance and enforcement. An overview of new research and its policy implications. In: The economics of tax policy, Auerbach and
Smetters (eds). Chapter 4
Slemrod, J., Blumenthal, M., Christian, C., 2001. Taxpayer response to an increased probability of audit: evidence from a controlled experiment in Minnesota.
J. Publ. Econ. 79 (3), 455–483. doi:10.1016/S0047-2727(99)00107-3.
Slemrod, J., Collins, B., Hoopes, J.L., Reck, D., Sebastiani, M., 2017. Does credit-card information reporting improve small-business tax compliance? Journal of
Public Economics 149, 1–19.
Stollwerk, A.F., 2006. Does e-mail affect voter turnout? An experimental study of the New York City 2005 election. Unpublished Manuscript, Yale University.
Traxler, C., 2010. Social norms and conditional cooperative taxpayers. Eur. J. Polit. Econ. 26 (1), 89–103. doi:10.1016/j.ejpoleco.2009.11.001.
Treasury General Inspector for Tax Administration, 2014. Delinquent taxes may not be collected because required research was not always completed prior
to closing some cases as currently not collectible. Audit report 2014-30-052014. Washington, DC, USA.
Yitzhaki, S., 1974. A note on income tax evasion: a theoretical analysis. J. Publ. Econ. 3 (2), 201–202.
