
Philos. Technol. (2017) 30:209–238
DOI 10.1007/s13347-016-0222-6

RESEARCH ARTICLE

A Plea for Ecological Argument Technologies

Fabio Paglieri 1

Received: 29 December 2015 / Accepted: 6 May 2016 / Published online: 5 June 2016
© Springer Science+Business Media Dordrecht 2016

Abstract In spite of significant research efforts, argument technologies do not seem
poised to scale up as much as most commentators would hope or even predict. In this
paper, I discuss what obstacles bar the way to more widespread success of argument
technologies and venture some suggestions on how to circumvent such difficulties:
doing so will require a significant shift in how this research area is typically understood
and practiced. I begin by exploring a much broader yet closely related question: To
what extent are people natively good at arguing? This issue has always been central to
philosophical reflection and it has become even more urgent nowadays, with the
explosion of persuasive technologies and unprecedented opportunities for large-scale
social influence. The answer hinges on what aspect of argumentation is taken into
consideration: evidence suggests that people are relatively bad at analyzing the struc-
ture of arguments, especially when these are presented out of context and in abstract
terms; in contrast, data show that even laymen tend to excel in the interactive practice
of argumentation, in particular when motivation is high and something significant is at
stake. Unfortunately, current argument technologies are more closely tailored to the
former type of activity than to the latter, which is the main reason behind their relative
lack of success with the general public. Changing this state of affairs will require a
commitment to ecological argument technologies: that is, technologies designed to
support real-time, engaging and meaningful argumentative interactions performed by
laypeople in their ordinary life, instead of catering to the highly specific needs of a
minority of niche users (typically, argumentation scholars).

Keywords Argumentation · Argument technologies · Psychology of reasoning · Ecological rationality

* Fabio Paglieri
fabio.paglieri@istc.cnr.it; http://www.istc.cnr.it/people/fabio-paglieri

1 Istituto di Scienze e Tecnologie della Cognizione, Consiglio Nazionale delle Ricerche (ISTC-CNR), Goal-Oriented Agents Lab (GOAL), Via San Martino della Battaglia 44, 00185 Rome, Italy

1 Introduction

The ordinary definition of argumentation describes it as “the act or process of giving
reasons for or against something, the act or process of making and presenting argu-
ments”.1 This is in line with the understanding of argument developed in the scholarly
community, where an argument is typically defined as a claim-reason complex that
presents some premises as supporting a given conclusion (Hitchcock 2006). In fact, the
dictionary definition even hints at an important distinction made in argumentation
theory in a variety of guises: argument vs. argumentation (Habermas 1984), argument
1 vs. argument 2 (O’Keefe 1977), and the nowadays more common labels argument-as-
product vs. argument-as-process (for discussion, see Reed and Walton 2003; Walton
and Godden 2007). In all these instances, a difference is noted between the proposed
inferential structure tying some premises to a certain conclusion (argument, argument 1,
argument-as-product) and the dialogical interaction where such inferential structure is
proposed and discussed (argumentation, argument 2, argument-as-process), whether
such dialogue is interpersonal, as it is most often the case, or internal to the reasoning
process of a single agent.
Argument technologies are tools designed to support the argumentative practices of
users and/or further elaborate their resulting output—typically, textual arguments. In
order to make this working definition precise enough to be useful, it is crucial to specify
what “supporting argumentative practices” exactly means. Most Internet technologies
(as well as the Internet itself) afford people previously inaccessible opportunities
for various types of social interactions, including argumentation. Most of these technol-
ogies, however, are not designed to specifically promote argumentation as a preferable
type of engagement, and some of them may even tend to deter users from it. Consider
Facebook as a case in point: while the volume of dialogical engagements happening
there is nothing short of staggering, only a tiny fraction of them would count as “arguments”
in the sense discussed above, whereas a much larger proportion would exemplify a
different meaning of “argument,” i.e., a verbal fight with a back and forth of mutual
accusations and inflammatory language. Moreover, interactions on Facebook are mostly
non-argumentative because of how the platform works, not in spite of it (as discussed in
Kirschner 2015). Thus, Facebook and other social networking sites (SNSs) do not count
as argument technologies per se, although it is of course possible to develop SNS
platforms designed to that end (e.g., Klein 2012) or to use existing SNSs to promote
learning via argumentation (for discussion, see Tsovaltzi et al. 2015a, 2015b).
Argument technologies are a thriving multi-disciplinary research field, and a new
synthesis of its main directions of interest is yet to appear—although a rich but somewhat
outdated picture can be found in the collections edited by Reed and Norman (2004), Bench-
Capon and Dunne (2007), Rahwan and McBurney (2007), and Rahwan and Simari (2009).
For the purposes of this contribution, it suffices to distinguish between approaches that use
argumentation as a technical tool to facilitate effective interaction among a community of
software agents (e.g., in argumentation-based negotiation, see Rahwan et al. 2004;
Karunatillake et al. 2009) and technologies aimed at facilitating argumentation among
human users (e.g. in the context of CMC and CSCW) or at creating usable interfaces for
argumentative interaction between users and computer programs (e.g., in decision support
systems). Only the latter, i.e., human-oriented argument technologies, will be further
discussed in this paper. The success of this research area is attested not only by a growing
number of scholarly publications but also by several international workshops,2 a biennial
conference,3 various large-scale research projects,4 and, since 2010, the creation of an
international journal specifically devoted to argument technologies, Argument &
Computation (http://www.iospress.nl/journal/argument-computation/). Recently, the links
between argument technologies and other related research fields, such as negotiation,
reputation, trust, and norms, have been explicitly acknowledged, leading to the creation
of a new umbrella concept, so-called agreement technologies (Ossowski 2012), which in
turn attracted research funding5 and originated its own international conference.6 Instead,
cross-fertilization between argument technologies and persuasive technologies has been
negligible so far—although the recent studies on SNSs just mentioned have the potential to
start changing that. The Stanford Persuasive Technology Lab (http://captology.stanford.
edu/) has gained worldwide recognition for its work on captology, that is, the study of
computers as persuasive devices (Fogg 2003). However, in spite of the obvious topics of
common interest between argument technologies and persuasive technologies, these two
areas rarely overlap: technological solutions are either oriented towards the application of
rational models of argumentation or based on persuasion strategies that remain largely
independent from the former. This lack of communication is regrettable and also
symptomatic of a key problem in argument technologies—namely, too little interest in
how arguers engage in ordinary discourse in real life (see Section 3).
Given the amount of attention and research efforts devoted to argument technolo-
gies, one would expect them to be having a transformative effect on how people argue,
especially on those media for which such technologies were natively designed, i.e.,
online. However, no such transformation is yet apparent, certainly not in the direction
of enhanced argumentative skills being deployed by Internet users. Again, let us take
SNSs as a case in point, given their widespread adoption. While getting robust
empirical evidence on such large-scale processes is hard at best, the predominant view
is that SNSs increase opportunities for interaction but often at the price of favoring
unproductive or even hostile engagements, with potentially adverse social conse-
quences. SNSs allow the preservation and development of social relations in spite of
time and distance constraints (Ellison et al. 2007), as well as providing both an
alternative and a remedy to lack of social opportunities in one’s offline environments
(Steinfield et al. 2008); at the same time, descriptive statistics from the Pew Research
Center reveal a marked spreading across SNSs of “online incivility,” a complex phenomenon including aggressive and disrespectful behaviors, vile comments, harassment, and hate speech (Rainie et al. 2012; Duggan 2014), and recent studies suggest that the exposure to online incivility may be detrimental to SNS users’ trust and well-being (Sabatini and Sarracino 2014; Antoci et al. 2015). None of this is to be blamed on argument technologies, of course: however, it does not look like arguments “in the wild” are getting any better due to technological advancements—in fact, some data suggest the opposite is the case.

1. “Argument”, retrieved December 28, 2015, from http://www.merriam-webster.com/dictionary/argument
2. CMNA, Computational Models of Natural Arguments, since 2001 (http://www.cmna.info/); ArgMAS, Argumentation in Multi-Agent Systems, since 2004 (http://www.mit.edu/~irahwan/argmas/); UM4Motivation, User Models for Motivational Systems, in 2011 and 2012 (http://cgi.csc.liv.ac.uk/~floriana/UM4Motivation2/Home.html)
3. COMMA, Computational Model of Arguments, since 2006 (http://www.comma-conf.org/)
4. Just to mention a few: ASPIC, Argumentation Service Platform with Integrated Components (http://cordis.europa.eu/ist/kct/aspic_synopsis.htm); DAM, Dialectical Argumentation Machines (http://www.arg-tech.org/index.php/projects/dialectical-argumentation-machines/); DYNARG, The Dynamics of Argumentation (http://icr.uni.lu/dynarg/DYNARG/Home.html); and the recent EPSRC large grant on Argument Mining (http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/N014871/1)
5. The EU COST action Agreement Technologies (http://www.agreement-technologies.eu/)
6. AT, International Conference on Agreement Technologies, since 2012 (http://www.agreement-technologies.eu/resources/at-conference-series)
Thus, it is safe to say that the argument technology revolution has yet to happen, at
least in the ways its proponents envisioned it, i.e., as a technology-based improvement
of users’ argumentative skills and practices. However, perhaps it is just a matter of time:
Rome was not built in a day, and what argument technologies aim to accomplish is a
transformation on a grand scale, which will certainly take some time. But with renewed
efforts, we will get there — or so the argument goes. This is by and large the view of
proponents of argument technologies, even those who happen to be very realistic about their
challenges and prospects. Without intending to curb the current enthusiasm for argu-
ment technologies, this paper proposes a slightly more pessimistic view of their
prospects. Importantly, this pessimism is not unconditional: the suggestion is not that
argument technologies are destined to fail no matter what—instead, it is argued that, for
these technologies to scale up as desired, a change of course in their development is
needed, as well as a reconsideration of what principles should inspire their design.
Addressing these concerns first requires facing a more fundamental problem, one that
has attracted philosophical interest since ancient times: How good (or bad) are our
native argumentative skills? Are we natural born arguers or hopeless argumentative
dummies? The answer, not surprisingly, depends on what kind of argumentative skills
one focuses on: Section 2 summarizes evidence suggesting that arguers are relatively
bad at analyzing the structure of arguments in highly abstract terms, whereas they are
quite proficient at intuitively engaging in argumentative interaction; in addition, a
conceptual framework to reconcile these two facts is outlined. The implications for
argument technologies are discussed in Section 3, to show how most current methods
focus on the first type of argumentative skills, trying to promote and/or exploit abstract
analysis of argument structure among users: this approach, it is argued, is partly
responsible for the relative lack of success of argument technologies. In Section 4, an
alternative proposal is presented, based on a plea for ecological argument technologies.
To be ecological,7 a technology needs, among other things, to be self-sustaining: that is,
users must be able and willing to use it without requiring either external incentives or
constant monitoring from third parties. If an argument visualization tool for students is
used by pupils only when forced to do so by their teacher, it is not ecological. If a SNS
designed to support proper argumentative interaction is populated only during a
university course in return for credits, and abandoned immediately afterwards, it is
not ecological. If an annotation system to crowdsource argument analysis is adopted
only by paid users, it is not ecological. And without this type of sustainability, human-oriented argument technologies will never scale up effectively enough to make a difference in the current information ecology.8

7. The notion of ecological argument technologies detailed in this paper refers to the broader concept of ecological rationality, as developed by Gigerenzer and Selten (2001) and Gigerenzer et al. (2011). The ties between the present proposal and this line of work will be further discussed later on: for now, let us just say that an argumentative technology, in order to be ecological in the required sense, ought to be designed to match the argumentative skills human users developed in response to the ecological pressures they face in everyday life.

2 The Limits of Our Mistakes: the Bad and the Good in Lay Arguers

The assumption that laypeople are not particularly good at arguing, at least not according
to the desired normative standards, is deeply entrenched in argumentation theory.
Consider for instance the notion of a fallacy, popular in the philosophy of argument
since ancient times, e.g., in Aristotle’s Sophistical Refutations, and revived in contem-
porary argumentation theory by the seminal work of Charles Hamblin (1970). A fallacy
is not just an inferential error but also one that people are supposed to commit often,
mostly because they (mistakenly) consider its underlying logic to be compelling: it is
precisely the alleged frequency of these errors that justifies the use of specific labels to
name them, such as argumentum ad hominem (supporting or criticizing a claim based on
the personal characteristics of whoever is proposing it), straw man fallacy (attacking a
claim by targeting a position that actually misrepresents it), post hoc ergo propter hoc
(confusing temporal succession with causation), and many more. This has been dubbed
“the standard conception of a fallacy in the Western tradition” (Hitchcock 2007), and it is
indeed easy to find examples of its influence on how fallacies are defined and analyzed in
contemporary argumentation theory. The following are two notable examples:

Fallacies are the attractive nuisances of argumentation, the ideal types of improp-
er inference. They require labels because they are thought to be common enough
or important enough to make the costs of labels worthwhile (Scriven 1987, p.
333).

By definition, a fallacy is a mistake in reasoning, a mistake which occurs with
some frequency in real arguments and which is characteristically deceptive
(Govier 1987, p. 177).

The conception expressed in these and other popular definitions of fallacies is
summarized by the acronym EAUI (Woods 2013): on this view, fallacies are supposed
to be inferential errors (E) that are attractive (A), universal (U), and incorrigible (I)—
because, if they were corrigible, at some point people would stop committing them, yet
fallacy theorists are convinced fallacies are as frequent nowadays as they ever were.9

8. Non-ecological argument technologies may still be very valuable for dedicated purposes, e.g., education, and may even provide guidance on how to design more productive and sustainable platforms and tools. What they cannot do, however, is to scale up, as long as they fail to meet the ecological challenge. This is the claim articulated in this paper, which of course does not deny the potential usefulness of argument technologies also in more restricted domains.
9. Both the universality and the incorrigibility claims are tied to the tradition of classifying fallacies at a high level of abstraction, as idealized inference patterns—a tradition criticized by van Eemeren and Grootendorst (1995), among others. In contrast, situated learning theory (Lave and Wenger 1991) has provided evidence that otherwise common errors occur less frequently (and are more varied) in concrete problem-solving situations, as opposed to abstractly defined problems. An echo of this line of work can be found in recent attempts to articulate more nuanced versions of fallacy theory, which will be discussed in Section 2.2 (see also Boudry et al. 2015; Paglieri 2016).

The EAUI conception of fallacies has often attracted criticism, both in the past
(Massey 1981; Finocchiaro 1981; Hintikka 1987; Hitchcock 1995; van Eemeren and
Grootendorst 1995) and in recent years (Woods 2013; Boudry et al. 2015; Paglieri
2016), yet it remains prominent in the literature, and it certainly inspires the treatment
of fallacies in most logic textbooks, where the most common, quick-and-dirty defini-
tion of fallacies is “an argument that seems to be valid but is not.” What matters for
present purposes is not whether the standard EAUI conception of fallacies is satisfac-
tory but rather the assumption on human rationality upon which that conception is
based: defining fallacies as frequent mistakes of reasoning implies a view of laypeople
as highly defective arguers.
The same, negative outlook on the argumentative skills of the general population is
mirrored in the urgency and emphasis given to the issue of critical thinking education
across the world. Historically, critical thinking and argumentation theory have always
been intertwined: for instance, the approach to arguments championed by the so-called
informal logic movement (Johnson and Blair 1977) had a profound effect in shaping
critical thinking curricula in North America. Thematically, the fundamental connection
between these areas is twofold: on the one hand, teaching people how to think critically
requires arming them with the ability to understand, reconstruct, and evaluate argu-
ments; on the other hand, the defining features of what makes someone a good arguer
are precisely the type of traits that critical thinking education intends to foster—
analytical rigor, precision, open-mindedness, and fairness, to mention but a few. As
noted, these educational needs are acknowledged almost universally: whether the aim
of teaching critical thinking is effectively supported by nation-wide policies and
concrete investments (as it has been happening, to some degree, in the USA and in
Canada since the 90s) or just paid lip service in institutional documents (as it has been
the case recently in Italy10), it is almost impossible to find a country in the world where
no mention is made of the importance of critical thinking in some official educational
guidelines. In the scholarly literature, there is widespread agreement on the central role
of critical thinking in education, as witnessed by the consensus statement reached by
the expert panel sponsored by the American Philosophical Association (Facione 1990).
While the methods to teach critical thinking effectively are hotly debated, alongside the
issue of whether critical thinking skills are domain-specific or domain-independent
(Ennis 1989; McPeck 1990; Sá et al. 1999), the fact that critical thinking education is
both important and urgent is agreed upon by all parties. This emphasis on critical thinking is
even magnified when it comes to educational technologies, insofar as critical discourse
and reflection constitute a key component in the community of inquiry model for
distance education (Garrison et al. 2001): if learning to reason well is crucial at the
individual level, to avoid error, it is considered even more essential at the group level, to avoid error and a host of social problems—misunderstandings, polarization, flaming, loss of interest for the least argumentative participants, etc.

10. The key documents to consult, in order to get a sense of the dominant institutional view on critical thinking education in Italy, are the current guidelines on the curriculum of Italian kindergartens and primary and secondary schools (“Indicazioni nazionali per il curricolo della scuola dell’infanzia e del primo ciclo di istruzione,” http://www.indicazioninazionali.it/documenti_Indicazioni_nazionali/indicazioni_nazionali_infanzia_primo_ciclo.pdf), issued by the Italian Ministry of Education and Research (MIUR) in September 2012. In that document, critical thinking is identified as a priority across all disciplines, in spite of the utter lack of (i) dedicated critical thinking training for teachers and (ii) resources to support the proclaimed shift towards more rigorous critical thinking education in Italian schools.
Regardless of many important theoretical and practical differences, much of the
discussion on critical thinking education is thus predicated on a shared premise: left to
their own devices, laypeople would be terrible arguers—exactly the same assumption
underlying the standard version of fallacy theory. But is this assumption truly warrant-
ed? In what follows, it is argued that the evidence suggests a different, more nuanced,
and very instructive answer to this central issue.

2.1 We, the Arguers: Bad at Abstract Analysis…

Much of the alarm on critical thinking skills originates from critical thinking tests
themselves: since the average scores typically observed are remarkably low, this seems
to confirm the need for intervention. In some instances, these worrying findings can be
imputed to the poor design of testing instruments. Consider for instance the following
two questions, taken from a very popular standardized test, the California Critical
Thinking Skills Test (CCTST):

(A) Local recreational soccer leagues can be very exciting to follow, since teams are
often evenly matched. In two recent games, the Sparklers beat the Wildflowers,
and the Wildflowers beat the Mustangs. Based on this knowledge, what can be
concluded on the upcoming match between the Sparklers and the Mustangs?
(B) What is the exact meaning of the sentence: “Erzenians tell lies”?

According to the test, the correct answer to (A) is “The Sparklers will probably beat the
Mustangs, although they may lose,” while the appropriate meaning of the sentence in (B) is
“If someone is Erzenian, then that person is a liar.” However, as noted by Groarke (2009),
regarding (A), a true critical thinker may be perfectly justified in withholding judgment, thus
rejecting the allegedly correct answer, based on lack of evidence: without further knowledge
of the previous matches (e.g., the circumstances and the extent of each victory), and given
that teams are “evenly matched,” we simply do not know enough to consider one outcome
more likely than another, unless we accept a very lousy standard of evidence—which is not
something critical thinking education ought to encourage, to be sure. The problem with (B)
is even more blatant: in natural language, generic statements like “Erzenians tell lies” are
typically (and correctly) interpreted as generalizations valid for the majority of cases, not the
totality—in short, they are not interpreted as universal statements, in contrast with the answer
proposed by the test. For instance, “Italians love pizza” is not falsified by the fact that a
minority of Italians in fact dislike pizza, since the intended meaning, well understood by all
competent speakers, is not a universal regularity but rather a generalization on a vast
majority. The upshot is that, according to Groarke, we should not put too much faith in
CCTST scores, since it is unclear whether the test measures critical thinking at all.
However, other standardized tests do not seem to suffer from the same basic flaws
and thus can be used (and are indeed routinely used) as a rough-and-ready measure of
critical thinking competence. An instrument widely applied in academic settings is the
Thinking Skills Assessment test (TSA),11 developed by Cambridge University and used
for admission there, as well as in other universities (e.g., Oxford, UCL, Leiden; for
discussion on the skills the test is supposed to measure, see Butterworth and Thwaites
2013). The critical thinking section of the TSA is based on short verbal arguments, and
each exercise requires the test taker to focus only on a single feature of the underlying argument: its
main conclusion, its hidden assumption, its worst flaw, etc. There are only seven types
of exercise, and it is a multiple-choice test with no hard penalty for wrong answers: that is,
trying to guess is not punished, and it is in fact the rational strategy to adopt, especially
if the candidate is able to exclude one or more possible answers as patently mistaken (as
is often the case). Moreover, the verbal stimuli are stylized argumentative texts: they
are based on real-life texts but without most of the usual textual noise. In other words,
the test makes things easier, in comparison with the complexity of argumentative
engagements in real life. The following is an example of a test item12:

Vegetarian food can be healthier than a traditional diet. Research has shown that
vegetarians are less likely to suffer from heart disease and obesity than meat
eaters. Concern has been expressed that vegetarians do not get enough protein in
their diet but it has been demonstrated that, by selecting foods carefully, vege-
tarians are able to amply meet their needs in this respect.

Which of the following best expresses the main conclusion of the above
argument?

A A vegetarian diet can be better for health than a traditional diet.

B Adequate protein is available from a vegetarian diet.

C A traditional diet is very high in protein.

D A balanced diet is more important to health than any particular food.

E Vegetarians are unlikely to suffer from heart disease and obesity.

In spite of the apparent simplicity of the test, most people find the TSA very
challenging. At Cambridge University, the average TSA score of an applicant for
admission (scores range approximately over the interval 0–100, factoring in question and
overall test difficulty) is in the high 50s, with only around 10% of applicants scoring
over 70.13 In the 2014–2015 admissions round at Oxford University, the average TSA
overall mark for summoned candidates (i.e., the top 25% of all applicants) was 67.1.14
These figures show that even highly skilled, strongly motivated, and well-prepared
candidates still fail more than 3 out of 10 TSA exercises. Things look even worse when
the TSA is taken by people with no prior knowledge of its structure, as opposed to admission candidates, who typically invest at least some time preparing for the test on practice versions of it.

11. See http://www.admissionstestingservice.org/for-test-takers/thinking-skills-assessment/.
12. Source: http://www.admissionstestingservice.org/images/47832-tsa-test-specification.pdf (last consulted on December 27, 2015)
13. Source: http://www.admissionstestingservice.org/for-test-takers/thinking-skills-assessment/tsa-cambridge/about-tsa-cambridge/ (last consulted on December 27, 2015)
14. Source: http://www.merton.ox.ac.uk/admissions-feedback-economics-and-management (last consulted on December 27, 2015)
The following data are taken from my own experience in teaching critical thinking to
various audiences, often using a TSA-based diagnostic test to assess the starting level of
proficiency in the class. Figure 1 shows the average percentage of correct answers
registered in four different groups of people: students in primary and secondary schools
(N = 637), their own teachers (N = 43), undergraduates just enrolled at a private uni-
versity (N = 101), and participants in the Second Interdisciplinary Graduate School on
Argumentation and Rhetoric (IGSAR), held in Warsaw in 2014 (N = 15). The latter
sample included PhD students working on argumentation and rhetoric as well as their
tutors for the graduate school, marking them as bona fide experts on argument analysis.
In spite of the numerous differences across these four groups, none of them had any
prior training on the TSA at the time of testing. All data were collected between
February 2014 and September 2015. Three out of four groups were tested on a 10-
item version of the TSA critical thinking test, whereas one group (university students)
used a 30-item version of it with the same difficulty level, based on previously recorded
scores on individual items. The battery administered to students of primary and
secondary school was simplified in terms of language while preserving the same under-
lying argument structure and the same difficulty level. All tests were administered in
Italian to native Italian speakers, except to 49 university students (who took the critical
thinking course in English, as part of their English-based graduate studies) and to
IGSAR 2014 participants, who were all fluent in English.15
These data were collected in the course of my teaching activity, not as part of a
controlled experiment: thus, their validity is to be taken with a modicum of caution.
Nonetheless, a couple of trends are just too marked and suggestive to be ignored.
Firstly, none of these groups managed to score above 50% correct responses on
average, not even actual experts at argument analysis: this should not be taken to
indicate poor competence in this particular group but rather as evidence that the type of
argument analysis required by the TSA is not something we are well suited to perform,
not even after spending a considerable part of our lives preparing for it. Secondly, no
significant difference was observed between school teachers and their own pupils (the
mean of correct responses was 30.5% for teachers and 28.2% for students; t test,
t(45.83) = −0.96, p = 0.34, corrected for unequal sample variance, NS), with both
groups performing close to chance level (20% in this test). Given that teacher training
in Italy does not include any type of critical thinking instruction, this is less surprising
than it may seem. But considering the central role assigned to critical thinking
education by current institutional documents in Italy, these data should nevertheless
cause great concern, since they suggest teachers are extremely ill-equipped to instruct
their pupils on critical thinking. This latter worry is confirmed by comparing student
performance at different stages of schooling (see Fig. 2): in this case, statistical
comparison is legitimate, since all students undertook the same test under analogous
conditions, although the difference in sample size suggests caution in interpreting the results. The outcome does not speak in favor of marked improvements in critical thinking competence, at least as measured by the TSA, between the fourth year of elementary school (approximately age 9) and the fifth year of high school (around age 18). Whereas an analysis of variance does reveal a main effect of year of schooling (one-way ANOVA, F(9, 637) = 3.4, p = 0.0004), this is actually an artifact of the unusually good performance of students in the third year of high school (p < 0.05 in all pairwise comparisons except two, Tukey HSD post hoc test), which was most likely due to a quirk in the sampling procedures (i.e., we chanced upon an unusually bright class of third years), since this result was not confirmed for students in the fourth and fifth year of high school. Further data will be needed to confirm these preliminary findings, especially regarding high school students, but the first impression is that 10 years of schooling do little to improve critical thinking competence in Italian pupils.

15. Given all these differences across groups, in terms of test materials, sample size, methodology, context, etc., running statistical analyses on these data would be rather uninformative. Thus, statistical details will be kept to a minimum and provided only for those comparisons that are methodologically appropriate. For the same reason, these findings are meant here only to illustrate the difficulties that various sorts of people encounter in dealing with abstract argument analysis.

Fig. 1 Average performance on a TSA-based critical thinking test in four groups of people without prior training in this particular test. Data refer to primary and secondary school students (N = 637), their own teachers (N = 43), undergraduates just enrolled at a private university (N = 101), and experts on argumentation at various stages of training (N = 15)
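For readers who wish to run comparable analyses on their own classroom data, the two tests reported above (Welch's t test for the teacher/student comparison, one-way ANOVA across years of schooling) can be computed as in the following sketch; Python is used for this and all later sketches, and the score arrays below are placeholders, not the data discussed in this section.

```python
# Sketch of the reported comparisons, with placeholder data (not the
# study's raw scores): Welch's t test and one-way ANOVA via SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
teachers = rng.binomial(10, 0.305, size=43) / 10   # fake 10-item scores
students = rng.binomial(10, 0.282, size=637) / 10  # fake 10-item scores

# Welch's t test: equal_var=False corrects for unequal sample variance,
# as in the reported t(45.83) = -0.96, p = 0.34 comparison.
t, p = stats.ttest_ind(teachers, students, equal_var=False)
print(t, p)

# A one-way ANOVA across years of schooling, as in the reported
# F(9, 637) = 3.4, would be: F, p = stats.f_oneway(year1, ..., year10)
```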
While these data should preoccupy anyone interested in improving critical thinking
skills in the general population, it is also important to carefully assess their scope. The
TSA test is meant to measure a specific aspect of critical thinking, rather than critical
thinking in general: by its very nature, the test evaluates one’s skill at analyzing
argument structure, based on simple texts abstracted away from any relevant con-
text—in fact, in preparatory sessions TSA candidates are typically instructed to stick to
the letter of the text and avoid any over-interpretation, no matter how well justified by
pragmatic considerations. Such skill is not without value, but it certainly does not
exhaust the wide spectrum of abilities needed to qualify someone as a critical thinker
and/or a competent arguer (for a tentative list in relation to popular standardized tests,
see Ennis 1993; Possin 2008). In fact, some of the very abilities the TSA requires us to
bracket (e.g., interpreting texts based on shared pragmatic conventions) are actually
essential to smooth interaction in everyday argumentation. Thus, the data on critical
thinking education reviewed so far, rather than justifying the assumption that people are
poor arguers in general, only support a more circumscribed claim: laypeople, and
apparently even experts, struggle to engage successfully in the abstract analysis of
argument structures, even under relatively simplified conditions.

Fig. 2 Average performance on a 10-item TSA critical thinking test of Italian students in different years of elementary school (white columns, fourth and fifth year), middle school (gray columns, from first to third year), and high school (black columns, from first to fifth year), without prior training. All comparisons are between-subjects
Another important source of evidence on the alleged weakness of our reasoning
powers is the so-called heuristics and biases program in cognitive psychology, which
gained worldwide renown in the wake of the Nobel Prize awarded to Daniel Kahneman
in 2002 for his shared work with Amos Tversky. The correct interpretation of the notion
of heuristics and its relationship with Herbert Simon’s ideas on bounded rationality
(e.g., Simon 1956) are at the core of a heated debate between two opposite fronts, the
“heuristics as biases” approach favored by Kahneman himself (for a recent non-
technical exposition, see Kahneman 2011) and the “heuristics as adaptive tools” theory
championed by Gerd Gigerenzer and others (e.g., Gigerenzer and Selten 2001;
Gigerenzer et al. 2011). Let us first focus on the former view of heuristics: according
to Kahneman, our judgments are biased by a series of automatic (intuitive) responses
that are rigid, i.e., unable to adapt to the specific features of the task at hand and thus
likely to lead us into error. With respect to reasoning tasks, this view is best illustrated
by discussing some notable examples. Consider the following exercise in disjunctive
reasoning, adapted from Levesque (1986):

Paul is looking at Linda and Linda is looking at Patrick.

Paul is married but Patrick is not married.

Is a person who is married looking at a person who is not married?

Yes/No/We cannot tell

The intuitive response to this puzzle is, for the vast majority of people, “We cannot
tell” (Trouche et al. 2014). However, upon reflection, the correct response is clearly
“Yes,” as a simple application of disjunctive reasoning shows: Linda is either married
or not married; if she is married, then someone who is married (Linda herself) is
looking at someone who is not (Patrick); if, on the other hand, Linda is not married,
then someone who is married (Paul) is still looking at someone who is not (Linda, in
this case). Either way, the answer to the question is “Yes.” However, the first response
provided by people is biased by the intuition that “We need to know something more
about Linda to answer either yes or no,” an intuition that happens to be mistaken: it is
enough to know that Linda is either married or not to be able to answer the test
question.
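Incidentally, the case analysis just rehearsed is mechanical enough to be verified in a few lines of code: the following sketch (purely illustrative) enumerates the two possible marital statuses for Linda and confirms that the correct answer is "Yes" in both cases.

```python
# Brute-force check of the Levesque puzzle: whatever Linda's unknown
# marital status, a married person is looking at an unmarried one.
looking = [("Paul", "Linda"), ("Linda", "Patrick")]  # who looks at whom
known = {"Paul": True, "Patrick": False}             # married or not

answers = set()
for linda_married in (True, False):                  # case analysis on Linda
    married = {**known, "Linda": linda_married}
    answers.add(any(married[a] and not married[b] for a, b in looking))

print(answers)  # {True}: the correct answer is "Yes" in every case
```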
Similar results are observed using the Cognitive Reflection Test (CRT), proposed by
Shane Frederick in 2005. The CRT includes just three (deceptively) simple items, as follows:

(A) A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How
much does the ball cost?
(B) If it takes five machines 5 min to make 5 widgets, how long would it take 100
machines to make 100 widgets?
(C) In a lake, there is a patch of lily pads. Every day, the patch doubles in size. If it
takes 48 days for the patch to cover the entire lake, how long would it take for the
patch to cover half of the lake?

Each question elicits a strong intuitive response (10 cents, 100 min, 24 days,
respectively) that happens to be systematically mistaken, since the correct answers
are, respectively, 5 cents, 5 min, and 47 days. As in the previous case, a moment of
reflection is enough to see why the intuitive response is wrong, but the relevant fact is
that even smart people rarely take the time to engage in such reflective practice—or so
the story goes. Frederick (2005) reported mean CRT scores (range 0–3) obtained with
undergraduates at a variety of US universities, and results were not flattering, not even
for top-ranking institutions such as Harvard (1.43), Carnegie Mellon (1.51), and
Princeton (1.63); even MIT students, who were the best performers in Frederick’s
sample, only averaged 2.18 in the CRT, with 52 % of the respondents failing on at least
one of the three items. Subsequent studies with the CRT confirmed this pattern of
results and usually reported even lower performance in less highly qualified samples of
respondents: for instance, Toplak et al. (2011) found a mean performance of 0.7 correct
answers in a sample of 363 undergraduates, with as many as 55.8 % of participants
failing to solve even one of the three problems and only 6.6 % being able to answer
correctly all of them.
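For the record, each CRT item reduces to a one-step computation once the misleading cue is set aside; the following illustrative sketch spells out the arithmetic behind the correct answers.

```python
# Worked arithmetic for the three CRT items: the intuitive answers
# violate the stated constraints, the reflective ones satisfy them.

# (A) ball + bat = 1.10 and bat = ball + 1.00, so 2*ball + 1.00 = 1.10
ball = round((1.10 - 1.00) / 2, 2)
print(ball)        # 0.05 dollars, not the intuitive 0.10

# (B) 5 machines -> 5 widgets in 5 min means each machine makes one
# widget in 5 min; 100 machines make 100 widgets in those same 5 min.
print(5)           # 5 minutes, not the intuitive 100

# (C) the patch doubles daily and fills the lake on day 48, so it
# covered half the lake one doubling earlier.
print(48 - 1)      # day 47, not the intuitive 24
```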
These findings on poor reasoning performance should not be ignored by anyone
interested in improving our argumentative skills, including scholars engaged in the
development of argument technologies. Yet, once again, their true scope first has to be
assessed. After all, the puzzles used to demonstrate our systematic tendency to make
intuitive mistakes are not just a garden variety of problems we face in everyday life: not
only do they tend to give relevant information in a rather convoluted fashion, they are also
deliberately designed to include a red herring, i.e., a highly salient but ultimately
irrelevant (or even misleading) feature of the problem. Moreover, people are asked to
mull over these puzzles as a feat of solitary reasoning, instead of facing them in the
context of a social act of collective reasoning, which is what argumentation is all about
in real life. Would it make a difference in performance if participants had to consider
this type of problem in the context of an argumentative exchange? The next section
summarizes evidence that bears on that question and further illuminates the nature of
our argumentative skills or lack thereof.

2.2 …But Good at Social Engagement

Hugo Mercier and Dan Sperber have recently proposed an argumentative theory of
reasoning (2011), according to which reasoning did not evolve as a correction mech-
anism for mistaken intuitions or as a support system for individual decisions, contrary
to the view championed by Kahneman and others, but rather as a tool to effectively
engage in a special type of social activity—namely, argumentation. On this view, the
function of reasoning is to find and evaluate reasons in dialogic contexts, not through
solitary introspection. The evolutionary rationale of the theory has been extensively
discussed elsewhere (Sperber et al. 2010; Mercier and Sperber 2009, 2011; Mercier
2013) and it is of secondary importance here. What matters is that the argumentative
theory of reasoning offers a strikingly different interpretation of the findings summa-
rized in the previous section. Mercier and collaborators argue that if the function of
reasoning was indeed to correct mistaken intuitions, then the high incidence of errors
observed in laboratory tasks would be hard to account for, especially considering tasks
in which reaching the correct solution only requires the application of basic knowledge
in logic (Wason 1966), statistics (Tversky and Kahneman 1982), probability theory
(Tversky and Kahneman 1983), or mathematics (Frederick 2005). Moreover, according
to Mercier and Sperber, the reason why people on their own fail to achieve better
performance in such tasks is not because they try to reason correctly and fail but rather
because reasoning itself leads them further into error, by strengthening, rather than
questioning, their initial, mistaken intuitions, in line with the literature on motivated
reasoning (Kunda 1990).
Consider again the disjunctive puzzle of Paul, Linda, and Patrick presented in the
previous section: when Trouche et al. (2014) asked people to think about their answer
individually and provide a justification for it, not one of the 166 participants (out of
185) who had intuitively given the wrong answer changed their mind, since all of them
were ready to provide plenty of justifications for their intuitive mistake (e.g., “It is not
stated whether Linda is married or not”), instead of coming up with ways of challeng-
ing their initial response. Crucially, this self-confirmation pattern was not replicated
when the very same subjects were presented with reasons in support of the correct
conclusion, as provided by the minority of participants (19 out of 185) who had reached
the right solution from the onset: in this case, 65 of the initially mistaken answers were
later corrected, and 62 of these participants were also able to provide the appropriate
justification to their newly adopted solution by the end of the test; in contrast, none of
the 19 early adopters of the correct solution changed their answers, after being exposed
to flawed justifications for the wrong solution. This shows that (i) people are remark-
ably blind to their own mistakes but (ii) much better at critically discriminating the
value (or lack thereof) of information provided by third parties.
Such results align well with the predictions made by the argumentative theory of
reasoning, according to which the very same features that make solitary reasoning so
flawed, e.g., myside bias (Nickerson 1998; Stanovich and West 2007; Mercier 2010)
and reasoning laziness (Kuhn 1991; Perkins et al. 1991; Kahneman 2011), make perfect
sense in a social context, where unilaterally making your case as strong as possible is (i)
personally advantageous and (ii) collectively counterbalanced by the critical scrutiny
others will direct against your claims, so that (iii) directing your cognitive efforts
towards the evaluation of the arguments of others, rather than your own, is just a
sensible allocation of your limited resources, provided this laziness is selective, i.e., it
affects self-evaluations and not the cross-examination of the arguments made by others
(as shown in Trouche et al. in press). In support of this interpretation, Mercier and his
collaborators cite the significant benefits of group discussion on performance in a
variety of tasks that individuals fail to master alone, such as the Wason selection task
(Moshman and Geil 1998), in which group performance was on average four times
higher than individual results (data analyzed in Mercier et al. 2015). Recently, similar
results were reported also with respect to the CRT, where simply exposing participants
to each other’s solutions, with no opportunity for actual dialogue among them, was
sufficient to rapidly converge on the correct answer (Rahwan et al. 2014). More
generally, the benefits of group discussion have been observed for “problems or
decisions for which there exists a demonstrably correct answer within a verbal or
mathematical conceptual system” (Laughlin and Ellis 1986, p. 177). The relevant fact
seems to be the ability of the correct answer to assert itself: as long as one participant
gets it right, other members converge on that solution, even if it is originally held only
by a minority, or even by a single individual, and independently from how confident
the original “truth-bearer” is (Trouche et al. 2014). This does not happen by magic but
rather due to the demonstrable nature of the solution, which Laughlin and Ellis (1986,
pp. 179–180) define according to four criteria: presence of a consensus on the vocab-
ulary, syntax, and permissible relationships of the system (e.g., the axioms and rules of
mathematics); sufficient information to reach a solution within the system (e.g., a
system of two simultaneous equations in two unknowns, as opposed to one equation
in two unknowns); group members who are not themselves able to solve the problem
must have sufficient knowledge of the system to recognize and accept a correct
solution, when proposed by someone else; and the member who first finds the correct
solution must have sufficient ability, motivation, and time to demonstrate it to others.
The fact that people, including experts, systematically underestimate the benefits of
group discussion on reasoning performance (Mercier et al. 2015) is unfortunate and
helps explain the widespread picture of laypeople as “bad arguers,” but it does
not alter what the evidence suggests: provided with the appropriate social context in
which argumentation is meant to take place, we do not argue (nor reason) nearly as
poorly as the standard interpretation of so-called fallacies would lead us to believe.
Importantly, this is not just an emergent effect at the collective level but rather the
consequence of the nature of our argumentative competence at the individual level: we
are individually geared to pay closer attention to arguments produced by others than to
those we produce ourselves, and this is why groups outperform individuals
on a variety of argumentative tasks. It would be misleading to take this as evidence of
the fact that groups are smarter than their members or that they somehow manage to
“make” their members smarter. Instead, what happens is at the same time simpler and
more interesting: the social context is the kind of ecology that our argumentative skills
have adapted for; thus, it is not surprising that they operate better in that situation.
Hence, the correct conclusion to draw is that in groups we are individually smarter—
not because the group magically affects our cognitive processes but simply because
those processes are evolved to function best in social interaction. So, it is the social
intelligence of individuals that makes groups smart (under certain conditions, as
discussed), not the other way around.
Independent confirmation of this moderately optimistic view on our native argu-
mentative competence comes from studies on the intuitive evaluation of allegedly
fallacious arguments. As discussed at length elsewhere (Boudry et al. 2015; Paglieri
2016), contemporary studies on fallacy theory can be seen as an attempt to be more
discriminating in launching accusations of fallacious reasoning against arguers: scholars
from different traditions have come to realize that so-called fallacies often do not
instantiate any actual reasoning error and thus do not deserve such a pejorative
label—although the label itself is still widely used, somewhat paradoxically. Detailed
theories on how to distinguish between actual fallacies and their legitimate cousins
have been proposed for a garden variety of so-called informal fallacies, such as ad
hominem (van Eemeren and Grootendorst 1992; Walton 1998; Mizrahi 2010), ad
baculum (Woods 1998; Levi 1999; Walton 2000), ad verecundiam (Woods and
Walton 1974; Mackenzie 1980; Walton 1997; Goodwin 1998), and ad ignorantiam
arguments (Woods and Walton 1978; Walton 1992, 1999), among others. In spite of
their differences, all these approaches have in common the intuition that the same
superficial argument structure may or may not result in fallacious reasoning. The
following pairs of examples will help illustrate the point, with respect to appeals to
ignorance (ad ignorantiam, A1 and A2) and appeals to expert opinion (ad verecundiam,
B1 and B2):

A1: Ghosts exist, since no one has proved that they do not.

A2: My flight tomorrow from Schiphol will be on time, since I have received no
indication to the contrary from the airline.

B1: Hollywood celebrities claim that homeopathy is effective, so it is likely to be effective.

B2: Medical practitioners claim that homeopathy is effective, so it is likely to be effective.

In contemporary argumentation theories, the fact that there is a marked difference in
presumptive strength (Walton 1996) among these arguments is uncontroversial: A1 and
B1 are weak to the point of fallaciousness, whereas A2 and B2 are strong enough to be
considered perfectly acceptable in most contexts. Various theories may provide diverging
explanations of the difference, but the existence of this strange philosophical animal,
non-fallacious fallacies, is not in dispute16: there are arguments that fit the bill of the
standard conception of a certain fallacy and yet do not lack presumptive strength. This
is true even for formal fallacies, the standard bad boys of logic textbooks: affirming
the consequent (AC) and denying the antecedent (DA). As Luciano Floridi pointed out,
AC and DA can be coherently interpreted as “Bayesian ‘quick and dirty’ informational
shortcuts [which] assume […] that there are no false positives (double implication), or
that, if there are, they are so improbable as to be disregardable (degraded Bayes’
theorem)” (Floridi 2009, p. 322; see also Stone 2012). Take the following reasoning
pattern: “When I have the flu, I have a fever. Today I have a fever, therefore I have the
flu.” This is certainly not deductively valid, but is it also blatantly mistaken on any
other legitimate standard of inference? Ultimately, whether the argument is a fallacious
instance of AC or an acceptable inference to the best explanation depends on the
likelihood of alternative reasons for the observed symptoms, i.e., my fever: if this
likelihood is low enough, in relation to the conditional probability of having a fever
given the flu, the inference goes through without particular problems. Granted, it is still
defeasible, and necessarily so, since no amount of probabilistic considerations can
produce deductive validity out of thin air. But defeasibility, in and by itself, is not
enough to brand it as fallacious.
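Floridi's point is easy to make concrete with numbers. In the following sketch, every probability is an invented assumption (neither data from this paper nor from Floridi's): when alternative causes of fever are rare, the flu/fever inference is strong despite instantiating affirming the consequent; when such causes are common, the very same pattern is weak.

```python
# Illustrative Bayes computation for "flu => fever; fever; so flu".
# All numbers are made-up assumptions for the sake of the example.
def p_flu_given_fever(p_flu, p_fever_given_flu, p_fever_given_not_flu):
    p_fever = (p_fever_given_flu * p_flu
               + p_fever_given_not_flu * (1 - p_flu))
    return p_fever_given_flu * p_flu / p_fever

# Few "false positives": other causes of fever are rare -> strong inference
print(round(p_flu_given_fever(0.10, 0.90, 0.05), 2))  # ~0.67

# Many alternative causes of fever -> the same pattern is weak
print(round(p_flu_given_fever(0.10, 0.90, 0.50), 2))  # ~0.17
```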
Obviously, even allowing for non-fallacious fallacies is not enough to exempt people
from the accusation of being lousy arguers: if laypeople were unable to reliably
discriminate between strong and weak instances of the same argument scheme, then
they would be blind to the nuances of argumentation, and all the scholarly insistence on
non-fallacious fallacies (formal or informal) would be, ultimately, much ado about
nothing. However, this is not the case: as the work of Ulrike Hahn, Mike Oaksford, and others has demonstrated in recent years, people are intuitively good at discriminating between strong and weak instances of the same, potentially fallacious argument schemes, and they do so in ways consistent with a bona fide inference standard (Bayesian update) and showing sensitivity to rationally relevant factors, such as prior belief, argument strength, and nature of the evidence (Hahn and Oaksford 2007; Hahn et al. 2009; Corner and Hahn 2012; Harris et al. 2012; Hahn et al. 2012, 2013; Collins et al. 2015).17 These findings complement well the argumentative theory of reasoning, suggesting that people are relatively good at appraising strengths and weaknesses in each other’s arguments, in spite of their deficiencies in critically evaluating their own beliefs and claims.

16. Of course, the fact that we are forced to admit the existence of non-fallacious fallacies is just another indication of the theoretical inadequacy of the standard notion of a fallacy: a category designed to capture erroneous reasoning ends up including in its extension also acceptable forms of inference. But what to do about the problematic status of fallacy theory is a topic for another day (for discussion, see Woods 2013; Paglieri 2016).
17. These results should not be confused with the claim that people are “good with probabilities,” since there is ample evidence they are not—witness the garden variety of well-known biases of probabilistic reasoning, e.g., gambler’s fallacy (Tversky and Kahneman 1971), conjunction fallacy (Tversky and Kahneman 1983), and base rate neglect (Tversky and Kahneman 1982). What these findings show, instead, is that people track the quality of arguments based on the same factors that would be relevant if Bayesian update was used: the observed similarity is at the level of the outcome, with no claim being made on a corresponding similarity of mechanisms—indeed, people are certainly not doing Bayesian computation as part of their explicit reasoning. The fact that we get the outcome right when evaluating each other’s arguments, while we do not in solitary reasoning tasks of the sort used by Tversky and Kahneman, provides further support to the argumentative theory of reasoning.
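Returning to the B1/B2 pair above, a toy Bayesian reading along the lines of this literature (with invented reliability figures, and—as footnote 17 stresses—as a model of outcomes, not of mental mechanisms) shows why the same appeal-to-authority scheme is strong with a reliable source and nearly worthless with an unreliable one.

```python
# Toy model: how much should hearing "source says the treatment works"
# move a prior credence of 0.2? Reliability numbers are invented.
def posterior(prior, p_assert_if_true, p_assert_if_false):
    evidence = p_assert_if_true * prior + p_assert_if_false * (1 - prior)
    return p_assert_if_true * prior / evidence

# B2: medical practitioners, assumed fairly reliable on medical claims
print(round(posterior(0.2, 0.8, 0.1), 2))    # ~0.67: strong ad verecundiam

# B1: Hollywood celebrities, assumed barely better than chance
print(round(posterior(0.2, 0.5, 0.45), 2))   # ~0.22: weak, near the prior
```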

3 Practical Implications: Argument Technologies for Real Arguers

The empirical evidence reviewed in the previous section can be roughly summarized as
follows: in the appropriate dialogical context, laypeople have sound intuitions on what
arguments to accept and what to reject, leading to a collective ability to converge on
appropriate solutions to difficult reasoning problems; but once the social scaffolding is
stripped away, individual performance drops at an alarming rate, and the abstract
analysis of argument structure in particular seems to be a daunting task, even for
intelligent and well-educated people. For the sake of brevity, let us further distill these
results in three stylized facts:

1. The valuable argumentative skills people exhibit in real-life argumentation, i.e., their ability to correctly appraise arguments in terms of their strength, are intuitive, rather than reflective—contrary to the standard dual systems view of rationality, according to which reflective reasoning is a corrective mechanism of biased intuitions.
2. These argumentative intuitions are not triggered outside of social interaction, but within it they prove socially adequate: in solo reasoning, myside bias and reasoning laziness tend to lead to error; in group discussion, they are often effective in promoting sound collective reasoning, since the myside biases of different parties tend to cancel out and reasoning laziness does not apply to the evaluation of arguments put forward by others (Trouche et al. in press).
3. In general, lay arguers are structural misers, rather than just cognitive misers (Fiske and Taylor 1984; Toplak et al. 2011)—more precisely, they are structural misers because processing argument structure would be very demanding in terms of cognitive effort and thus contrary to their miserly inclinations. This means that people may very well be sensitive to the general structure and the specific nuances of each other’s arguments, while at the same time remaining unable to spell out such structure if required.

For the development of argument technologies, these facts should immediately inspire three design principles, in order to create tools that match the strengths and weaknesses of their intended users:

• Intuition first: Argument technologies should tap into users' argumentative intuitions and leverage their native skills, rather than asking them to perform tasks (e.g., abstract argument analysis) at which they are demonstrably deficient.
• Social engagement: Argument technologies should ask users to work together on whatever task they support, rather than in isolation, since it is mostly in that context that laypeople's argumentative intuitions are conducive to sound results.
• No structure: Argument technologies should not ask users to explicitly provide information on argument structure; rather, the system itself should be designed to structure the argumentative interaction, keeping the structural information already embedded in the system as invisible to users as possible.18

18 Similar considerations apply also to (and have received much more attention in) instructional design in computer-supported collaborative learning, as documented by scripting studies; for a discussion of several applications to argument education, see Weinberger et al. (2007).

At present, most argument technologies seem set to disregard some or even all of these design principles. The intuition first principle is ignored mostly due to an equivocation on the nature of our argumentative competence—an equivocation computer science partly inherited from philosophy. For a long time, argumentation
has been regarded as the hallmark of rational thinking, and rational thinking was
invariably assumed to be all about explicit reflection and careful deliberation. This
misrepresents how people think when they argue, though: far from dwelling on the
abstract structure of each other’s arguments, they instead react instinctively to those
arguments. More often than not, they have good instincts—that is, their intuitive
appraisal of an argument is adequately tuned to those features that make the argument
objectively good or bad (as suggested by the experimental data just reviewed). But this
attunement is not based on any underlying theory of argumentation, nor is it accessible
to explicit reflection, no more than our intuitive ability to discriminate between
grammatical and ungrammatical sentences is based upon a theoretical understanding
of the rules of grammar.
Moving beyond this hyper-rationalist view of argumentation will require argument
technologies to start putting more emphasis on users’ intuitive appraisal (e.g.,
prompting them to express simple agreement or disagreement and to assign a specific
target for it—what do you agree/disagree about?), while giving less weight to the
articulation of explicit reasons behind that appraisal (e.g., no longer pestering users
with requests for further justification—why do you agree/disagree with this?).19 This
suggestion may strike many argumentation scholars as bordering on blasphemy: after all,
how can we argue effectively without an appropriate emphasis on justification? Is
argumentation not, by definition, the process through which reasons are put forward in
support of a claim? Of course it is, but argument technologies, to have any hope of
scaling up, should aim to engage users in the practice of arguing, rather than in the
theory of argumentation: the former is a vibrant concern for everybody, and the latter is
the professional interest of a tiny minority, i.e., argumentation scholars. Tailoring
argument technologies to the needs and habits of that minority would be a false start
of monumental proportions. In real-life argumentation, lay arguers do not make
justification explicit, unless the dialogical engagement requires it. For instance, we
do not say something like “Tomorrow we should go hiking, because the weather
forecasts are excellent”; instead, we just claim “Tomorrow we should go hiking”, and
only if a potential fellow hiker inquires “Why?”, then we counter with “The weather
forecasts are excellent, it’s an opportunity not to be missed!”, or something along those
lines. Thus, in most applications, we do not want technology to force us to spell out the
reasons behind our claims, unless and until some other user prompts us to. Doing
otherwise would turn technologies designed to be helpful into an annoying Talking
Cricket, at risk of meeting the same fate of its literary counterpart (on the Talking
Cricket problem in argument technologies, see Paglieri and Castelfranchi 2010).
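As a purely hypothetical illustration of what such intuition-first prompts might look like as a data model, consider tagging options with a fixed argumentative semantics, in contrast with the ambiguity of Facebook likes discussed in note 19 below (all names in this sketch are invented for the example):

# Hypothetical sketch: one click expresses one unambiguous argumentative
# act, by pairing a stance with an explicit target.
from dataclasses import dataclass
from enum import Enum

class Stance(Enum):
    AGREE = "agree"
    DISAGREE = "disagree"

class Target(Enum):
    CLAIM = "claim"          # the conclusion itself
    REASON = "reason"        # a premise offered in its support
    INFERENCE = "inference"  # the link between premise and conclusion

@dataclass(frozen=True)
class ArgTag:
    user_id: str
    post_id: str
    stance: Stance
    target: Target

# "I reject the link, not the premise": a nuance a generic like cannot express.
tag = ArgTag(user_id="u42", post_id="p7", stance=Stance.DISAGREE,
             target=Target.INFERENCE)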
19 While this may seem reminiscent of Facebook likes, it is actually not—or, more precisely, it aims to achieve the same level of intuitive appeal but for entirely different purposes. What prevents Facebook likes from being used as argumentative indicators is their ambiguity of meaning: by liking a post, a comment, a photo, or anything else, a Facebook user may express a variety of communicative intentions—approval of the contents, support or solidarity for the author of the contents, approval of the fact that the author of the post decided to make it public, hilarity prompted by the contents or their posting, and more. Argument technologies should strive to provide users with tagging options that are as appealing as Facebook likes but with a much better defined semantics.

The second design principle, social engagement, may seem to be fulfilled by argument technologies as a matter of course: insofar as these technologies aim to support our argumentative practices, how can they fail to put users in contact with
each other? However, the actual practice of most tools is centered around an individual
user operating in relative isolation from others. Take argument diagramming tools as a
case in point, both when they are dedicated to this specific purpose, such as Araucaria20
(Reed and Rowe 2004; Rowe et al. 2006) and its more recent successor, Online
Visualization of Argument (OVA) 21 (Janier et al. 2014), and when they include
argument diagramming as part of a more comprehensive suite of applications, as is
the case with Carneades22 (Gordon 2010; Walton and Gordon 2012). Even if some of
these tools support some degree of interaction among different users, their standard
goal is to engage an individual user in analyzing a natural language text according to
some theoretical framework, e.g., Toulmin’s model of argument (Toulmin 1958) or
Walton’s argumentation schemes (Walton et al. 2008). From the user perspective, the
software is asking her to perform an analytical task on a written text, with little or no
interaction with other users (for further discussion, see Kirschner et al. 2003). Impor-
tantly, this is a design decision (albeit often unconscious), not a constraint of the
technology itself: as work done on mixed-initiative argumentation has demonstrated
(Chang et al. 2010; Lawrence et al. 2012), it is possible to collaboratively engage
groups of human and artificial agents in argumentation analysis and evaluation, as
exemplified by the Arvina 23 application for navigating complex debates or by the
AnalysisWall24 installation for supporting collaborative analysis in real time.
However, many existing applications force user interactions to be mediated by
expert super-users with various degrees of veto power, even when social engagement
is part of the core business of argumentation technologies, i.e., when mapping tools are
designed to support collective deliberation in large groups, as happens, for instance,
with Compendium 25 (Conklin et al. 2001; Buckingham Shum et al. 2006) and
Deliberatorium 26 (Klein 2012). Thus, for instance, every new contribution to an
argument map is moderated in Deliberatorium and will not be visible to other users
until some tutor validates it. In this age of immediate communication on SNSs, such
delays and constraints are simply unacceptable to most users; hence, the inability of
such systems to scale up beyond relatively small niches (typically, the institution where
they were originally developed)—which is ironic, considering these applications are
intended to harness effectively the so-called wisdom of crowds (Surowiecki 2004).
Granted, these constraints are not without a rationale: as Klein and Convertino (2014)
convincingly argued, a key concern to make collective deliberation work is filtering the
best ideas out of the super-abundance of suggestions proposed by large groups of
people. Nonetheless, there is a delicate balance between constraining interaction
enough to make it productive and constraining it too much, thus scaring users away.
Given the current problems in scaling up collective deliberation systems, that balance is
yet to be found, and future attempts should pay greater attention to supporting a satisfactory social experience for users (for comprehensive discussion of large-scale argumentation technologies, their limits and potential, see Buckingham Shum 2008; Rahwan 2008; Bex et al. 2013; on their connections with the Semantic Web, see Rahwan et al. 2011; Schneider et al. 2012a).

20 See http://staff.computing.dundee.ac.uk/creed/araucaria/.
21 See http://www.arg.dundee.ac.uk/index.php/ova/.
22 See https://carneades.github.io/Carneades/.
23 See http://www.arg.dundee.ac.uk/?p=492.
24 See http://www.arg-tech.org/index.php/projects/argument-analysis-wall/.
25 See http://compendium.open.ac.uk/.
26 See http://cci.mit.edu/klein/deliberatorium.html.
The last design principle derived from laypeople's native argumentative inclinations,
the no structure imperative, is also the hardest to incorporate in argument technologies
and not by happenstance. Argument technologies are structure-hungry: they literally
feed on structured information, in order to operate. This is true both for technologies
based on abstract argumentation frameworks, inspired by Dung’s seminal work (1995)
and later blossomed into a thriving research area (Dung et al. 2006; Baroni and
Giacomin 2007; Caminada and Amgoud 2007; Modgil 2009; Modgil and Caminada
2009), and for approaches dealing with so-called structured argumentation, in which
the relationship between premises and conclusion is explicitly represented (Besnard and
Hunter 2001; Prakken 2010; Garcia and Simari 2014; Modgil and Prakken 2014): in
the first case, the relevant structure is the network of attack relationships among various
atomic arguments (nodes), whereas in the second case, what matters are the structural
features internal to each argument, typically expressed in terms of premise-conclusion
compounds. Either way, structural information is presupposed as a conditio sine qua
non for any argument technology. To parrot a famous tagline, “No structure, no
argument!”—nor is this fact surprising, since an argument is constitutively defined
by a certain structural relationship between premises and conclusion. The problem,
however, is that most argument technologies turn to users to provide the relevant
structural information they so crucially need in order to function adequately. But users,
as discussed in the previous section, are structural misers, who are at best unwilling,
and at worst unable, to deliver detailed structural information on their argumentative
engagements. And the more fine-grained the structural information required by a
given argument technology, the less hope there is that users will actively provide it.27
Asking users to discriminate between, say, premises and conclusion, without further
qualification, is well within their cognitive powers: so, provided we also give them a
reason to care about this kind of activity (more on that later), it is not impossible to harness
large-scale collaboration in creating a huge database of premise-conclusion structures.
But if we start demanding too much from users, e.g., asking them to classify arguments
according to a taxonomy with dozens of entries, such as Walton’s argumentation
schemes, then all hope for their active engagement is lost—and understandably so.
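To see just how structure-hungry even the most abstract approaches are, consider a naive sketch of Dung-style grounded semantics (a toy illustration of my own, not the algorithms behind the cited frameworks): nothing can be computed until the attack graph, which is precisely the structural information users are reluctant to supply, is given as input.

# Toy sketch: grounded semantics for an abstract argumentation framework.
# The input is pure structure: a set of arguments and an attack relation.
def grounded_extension(arguments, attacks):
    """Iteratively accept arguments all of whose attackers are defeated."""
    accepted, defeated = set(), set()
    changed = True
    while changed:
        changed = False
        for a in arguments - accepted - defeated:
            attackers = {x for (x, y) in attacks if y == a}
            if attackers <= defeated:  # all attackers already out: accept a
                accepted.add(a)
                defeated |= {y for (x, y) in attacks if x == a}
                changed = True
    return accepted

# A attacks B, B attacks C: A is unattacked, so A is in; B is out; and C
# is reinstated, because its only attacker is defeated.
print(grounded_extension({"A", "B", "C"}, {("A", "B"), ("B", "C")}))  # {'A', 'C'}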
27 Interestingly, expecting users to dwell too much on argument structure may be not only unrealistic but even undesirable. A recent study on using Facebook for learning purposes (Tsovaltzi et al. 2015b) showed that participants who had carefully prepared their own arguments on a given topic were less likely to interact productively with other SNS users on that topic, compared to people without that sort of individual preparation—yet another finding fully compatible with the argumentative theory of reasoning. That led the authors to conclude that "directly interacting with the support of argumentation scripts and without long individual preparation and reflection may be preferable to carefully preparing arguments before joining discussion in a SNS" (Tsovaltzi et al. 2015b, p. 588).

An area where this concern is especially vivid is argumentation mining, where the aim is "to automatically detect, classify and structure argumentation in text" (Mochales Palau and Moens 2011), possibly complementing argument analysis with other closely related techniques, independently developed in computational linguistics (Peldszus and Stede 2013a). The main challenge is twofold: on the one hand, as is customary in corpus linguistics, natural language texts are full of complexities, ambiguities, and other forms of textual noise; on the other hand, the type of information to be
automatically detected (arguments and their structure) does not have a unique textual
representation, making its automatic retrieval highly problematic. The most developed
approach to the task (Moens et al. 2007; Mochales Palau and Moens 2009) first
segments the text into sentences and then uses them as atomic components of argu-
ments: although reasonably successful in terms of accuracy, this method is problematic
for a variety of reasons, including the fact that the same sentence often plays different
roles in more than one argument (e.g., in chains of arguments, the conclusion of an
argument serves as the premise of another) and that the same sentence may contain
various propositions (the actual atomic elements of argument), or on the contrary, a
single proposition may be expressed by more than one sentence (Lawrence et al. 2014).
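To make the sentence-level approach concrete, here is a deliberately toy sketch of my own (it requires scikit-learn and does not reproduce the cited systems) of coarse-grained classification of sentences as premise, conclusion, or neither. Note that even this minimal setup presupposes hand-labeled training data, anticipating the bottleneck discussed next.

# Toy sketch of sentence-level argument mining: classify each sentence
# as premise, conclusion, or neither, from a (tiny) labeled sample.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_sentences = [
    "We should go hiking tomorrow.",             # conclusion
    "The weather forecasts are excellent.",      # premise
    "Therefore the proposal must be rejected.",  # conclusion
    "Costs exceed the available budget.",        # premise
    "The meeting starts at nine.",               # neither
]
train_labels = ["conclusion", "premise", "conclusion", "premise", "neither"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(train_sentences, train_labels)
print(model.predict(["Hence we ought to leave early."]))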
But an even more basic obstacle to the development of successful argumentation
mining is the need for manually annotated corpora, to be used as training samples for
machine learning algorithms or as a gold standard to evaluate the performance of
automatic systems. As noted by Lawrence et al., “the availability of training data is a
major hurdle. Developing these training sets is demanding and extremely labour
intensive [and] as scale increases, quality management (e.g. over crowdsourced con-
tributions) becomes an increasing challenge” (Lawrence et al. 2014, p. 85). My
suggestion is that the hope of harnessing collaborative tagging to create “argumentation
folksonomies” (Buckingham Shum 2008) in support of argumentation mining will be
realistic only insofar as (i) tags are kept at a very general level of abstraction, e.g.,
conclusion, premise, argument/non-argument, etc., and therefore (ii) argumentation
mining also focuses mostly on such coarse-grained level of analysis, before (and
possibly instead of) aiming at extracting more fine-grained argumentative information
from natural language texts. This is consistent with recent proposals (e.g., Schneider
2014) for a joint community research agenda in argumentation mining, in which part of
what needs to be agreed upon is the level of granularity of the analysis (coarse,
according to the present argument) and the most appropriate models to be used as
theoretical framework (simple, as discussed).28
28 Crucially, complex argument annotation systems are problematic also for experts, not only for laypeople. When using a simplified version of Walton's argumentation scheme taxonomy (including 14 schemes out of 60), with coders trained in argumentation and four iterations of the annotation procedure, the final level of inter-coder agreement can still be as low as 0.48, measured by Cohen's kappa (Schneider et al. 2013). This figure rises to more acceptable levels only when simpler annotation schemes are adopted (Schneider et al. 2012b), and it is further improved by isolating ex post the most reliable sub-groups of annotators (Peldszus and Stede 2013b).
29 See http://dbp.idebate.org/en/index.php/Welcome_to_Debatepedia%21.
30 See http://www.argublogging.com/.

Indeed, a corollary of the no structure principle, one that argument technologies can and should meet, is the following: Keep It Simple! When it comes to querying users for input on structural relationships within and across arguments, only the bare minimum should be required, lest the demand become both puzzling and annoying to users. Even better, such structural information should be collected as a by-product of the regular usage of that particular technology, without any explicit request being addressed to users—very much like targeted advertising tries to infer our tastes from online navigation patterns, without asking us to fill in a survey. Examples of how this idea can be put into practice are Debatepedia29 (Lindsay 2009; Cabrio and Villata 2013) and, more recently, ArguBlogging30 (Bex et al. 2014): these systems ask users to do
something very natural and, in the context of online discussion, even quite entertaining,
i.e., identifying pros or cons on a certain public issue (Debatepedia) and expressing
agreement or disagreement on any textual segment present online (ArguBlogging), and
then the systems store the structural information implicitly conveyed by these argu-
mentative acts—in fact, ArguBlogging, unlike Debatepedia, even codifies the
information in the Argument Interchange Format (AIF; see Chesnevar et al. 2006),
thus making it reusable by all those tools committed to the vision of the World Wide
Argument Web (Rahwan et al. 2007; Rahwan et al. 2011). In terms of user experience,
this is far superior to any explicit query, and thus, this approach has the potential to
appeal to a much larger segment of online arguers—provided enough attention is given
to motivate their contribution.
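By way of illustration, the structural by-product of a simple "disagree" act could be stored along the following lines. This is a hypothetical sketch loosely inspired by AIF's distinction between information nodes (I-nodes) and conflict application nodes (CA-nodes); the field names are invented for the example and are not the official AIF schema.

# Hypothetical sketch: recording a disagreement click as an AIF-style
# graph fragment, without ever asking the user about "structure."
import json
import uuid

def record_disagreement(original_text, reply_text):
    i1 = {"nodeID": str(uuid.uuid4()), "type": "I", "text": original_text}
    i2 = {"nodeID": str(uuid.uuid4()), "type": "I", "text": reply_text}
    ca = {"nodeID": str(uuid.uuid4()), "type": "CA"}  # conflict application
    edges = [{"from": i2["nodeID"], "to": ca["nodeID"]},
             {"from": ca["nodeID"], "to": i1["nodeID"]}]
    return {"nodes": [i1, i2, ca], "edges": edges}

graph = record_disagreement("Tomorrow we should go hiking.",
                            "The forecast says heavy rain.")
print(json.dumps(graph, indent=2))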
A concern for proper motivation, indeed, is too often absent from the design of argument technologies. User participation seems to be taken for granted, as if it were guaranteed as soon as users are given access to the joys of online argumentation. Once again, the fact that most argument technologies are designed by argumentation scholars may generate an attribution error: if the developers project onto the masses their own unending desire to argue, and to argue rationally and orderly, they will not even consider the possibility that users may find the proposed system uninteresting, obscure, hard to use, or plainly boring. On this view, the fact that arguments on Facebook are relatively rare and comparatively weak and/or nasty is to be blamed only on the platform, which is not conducive to rational discussion, and not on arguers themselves. It would then follow that, as soon as a more civil outlet for discussion is offered, users will flock to it. Unfortunately, although Facebook is indeed not designed to support argumentation (Kirschner 2015), that is probably part of the reason why users like to spend so much time on it: those users find it gratifying to peek into each other's lives and chat away the hours, yet they may not consider rigorous argumentation and in-depth debate equally rewarding. More precisely, for most people, argumentation is not an end in itself but rather a means towards some other goal of practical relevance (Paglieri and Castelfranchi 2010): if argument technologies are to promote well-structured discussion and careful reasoning, they must do so while serving other relevant needs of their users.
Argumentation-based decision support systems (Carbogim et al. 2000; Karacapilidis
and Papadias 2001; Morge 2008; Amgoud and Prade 2009; Introne and Iandoli 2014)
are a good example of how this concern for motivating users can be satisfied, without
forsaking the overarching pedagogical aims of argument technologies, i.e., to improve
the quality of argumentation among users. In expert systems, there is often the need to
provide reasons for the recommendations proposed to users, and argumentation-based
techniques are ideally suited for this purpose: so it is not surprising that many of the
first applications of argument technologies were developed in this area (for discussion,
see Rahwan and McBurney 2007). When using these tools, people are engaged in an
activity of practical interest that, in and by itself, has nothing to do with arguing with
others; instead, users benefit from the argument structure already built into the system
to learn how to better navigate the decision they are facing, be it a clinical diagnosis for
medical practitioners or a difficult case to adjudicate for legal experts. By providing
feedback on the proposed decision and its supporting reasons, users further expand and
validate the argument database underlying the system, creating a virtuous cycle. And
the fact that these systems typically target relatively well-structured and circumscribed
knowledge domains (e.g., medicine, law) allows populating the original argument
database without excessive difficulties. But what matters the most for present purposes
is the fact that argumentation here is a means to an end, not an end in itself, and this
goes a long way in making these systems attractive to users.
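The pattern just described can be caricatured in a few lines of code (a purely illustrative sketch, not any of the cited systems): the user's goal is the decision, the arguments come bundled with it, and feedback on individual reasons enriches the underlying database as a by-product of deciding.

# Illustrative sketch of argumentation as a means to an end: options are
# recommended together with their supporting reasons, and user feedback
# on those reasons flows back into the argument database.
argument_db = {
    "option_A": ["reason 1 for A", "reason 2 for A"],
    "option_B": ["reason 1 for B"],
}

def explain(option):
    return f"Recommended: {option}, because: " + "; ".join(argument_db[option])

def give_feedback(option, reason, endorsed):
    # Disputed reasons are flagged for review: users validate arguments
    # simply by reacting to the recommendations they care about.
    argument_db[option] = [r if (r != reason or endorsed)
                           else f"[disputed] {r}" for r in argument_db[option]]

print(explain("option_A"))
give_feedback("option_A", "reason 2 for A", endorsed=False)
print(explain("option_A"))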
More generally, a concern for properly motivating users should inform all aspects of
argument technologies, from attention to details in designing graphical interfaces and
user experience (Buckingham Shum 2008) to making sure these systems are as widely
and as easily accessible to potential users as possible. Regarding ease of access, a
current problem in argument technologies is the tendency to “reinvent the wheel,” that
is, to develop new systems from scratch, instead of creating apps that can seamlessly
interact with existing systems. This is problematic because, all other things being equal,
a Facebook app that makes people argue well is much better than a standalone software
that does exactly the same, since the former will immediately become available to the
millions of people already using the pre-existing system. Simply put, when it comes to
making new technologies quickly available to large groups of people, parasites do it
better—thus argument technologies should learn to become more and more parasitic on
pre-existing platforms and infrastructures. A positive example is, again, the
ArguBlogging tool, which “requires no local installation, as it exists as a bookmarklet
in the user’s browser. Once the bookmarklet is installed […], a user can respond to an
opinion on a web page by highlighting the relevant piece of text and clicking the
bookmarklet. The ArguBlogging widget is then rendered on the page, providing
options with which to respond” (Bex et al. 2014, p. 10). Another good application of
the same principle is the MicroDebates 31 app for Android (Gabbriellini and Torroni
2012; Yaglikci and Torroni 2014), which allows users to rapidly argue with each other
from a handheld device, using Twitter. Ideally, all argument technologies should have
the same ease of access, as a key component in motivating user adoption.

31 See https://play.google.com/store/apps/details?id=it.unibo.ai.microdebates&hl=en.

4 Conclusions: Moving Towards Ecological Argument Technologies

Recent years have witnessed several proposals for a coordinated effort in bringing argument technologies to the general public, such as the Argument Web initiative (Rahwan et al. 2007; Bex et al. 2013), so that the yet untapped potential of these tools may be fully unleashed (Modgil et al. 2013). To make concrete steps towards this ambitious goal, suggestions have been made for improving the design of argument technologies, both in general (de Moor and Aakhus 2006) and with special emphasis on online deliberation systems (Towne and Herbsleb 2012). The considerations presented in this paper are meant to complement such proposals, by connecting design principles for argument technologies with the socio-cognitive features of their intended users, i.e., lay arguers. As discussed in Section 2, our native argumentative skills are less flawed than popular wisdom and mainstream theories of reasoning would lead us to believe. Nevertheless, the argumentative practices of laypeople have their own quirks, which argument technologies must take into account as part of their design space. In particular, the intuitive nature of argument appraisal, the importance of providing an adequate social context, and the reluctance/inability of arguers to explicitly articulate structural information on their own argumentative practices should serve as key constraints in designing and deploying argument technologies. Importantly, it is not just a matter of complementing users' weaknesses but also of exploiting their strengths: thus, for instance, the selective nature of reasoning laziness (Trouche et al. in press) should be leveraged by argument technologies, asking users to critically scrutinize arguments proposed by others, instead of expecting them to further articulate their own dialogical contributions.
The notion of sustainability, while not sufficient, certainly acts as a necessary
condition for any argument technology that aims at satisfying the design constraints
projected onto argument technologies by our argumentative setup, and thus, it can
serve as a useful litmus test. In ecology, sustainability is described as the capacity to
endure, typically in the face of change: in the context of argument technologies, this
requires the proposed technological solutions to be self-sustaining—that is, users must
keep on using the new tools, possibly while further innovating them, even in the
absence of any external prodding. Whenever an argument technology violates one of
the design principles discussed in Section 3, that technology will end up failing the
sustainability challenge. If an argument diagramming system frames arguments as
abstract entities in need of theoretical articulation, without promoting their intuitive
appraisal in terms of agreement/disagreement (violation of the Intuition First principle),
most users will find it disconnected from their everyday argumentative practice, and
thus the system will end up being adopted only by niche users, if not abandoned
altogether. If an online deliberation system prevents users from having frequent real-
time interactions with each other, subjecting instead all new contributions to the
bottleneck of expert supervision (violation of the Social Engagement principle), then
users will quickly and gladly abandon the platform, in favor of less constrained and less
well-structured systems. If a crowdsourced tagging initiative expects contributors to
adopt an overly rich and highly technical argument taxonomy, instead of limiting the
number and complexity of the proposed tags (violation of the No Structure principle),
this will result not only in low inter-coder agreement but also in strong aversion
towards the task, thus severely limiting the possibility of scaling up. The list of possible
violations could continue, but the outcome would remain the same: a failure to make
argument technologies meaningful and attractive to large groups of potential users.
This type of failure is still too frequent, and this paper illustrated how many
argument technologies mismanage the sustainability challenge; at the same time, great
care was taken in pointing out virtuous exceptions, i.e., tools that, on the contrary,
successfully embody one or more of the design principles advocated in these pages.
Hopefully, this will clarify the exact extent of my skepticism on the future of argument
technologies: as mentioned in Section 1, there is no reason to be unconditionally
pessimistic on their prospects, and I would not want to be perceived as the Cassandra
of the argument revolution. On the contrary, the very fact that much could be done in
principle justifies a concern for what is being done in practice, insofar as some current
trends may turn out to be misguided and ultimately unproductive. Thus, this paper
attempts to steer the course of argument technologies in more promising directions.
In conclusion, if the present assessment of some argument technologies may have
seemed harsh at times, it should now be clear it was born out of love, not spite—tough
love, to be sure, but love nonetheless. The intended audience consists of argumentation
scholars who dream big, believing argument technologies to have the potential to
transform the ways we interact online, “to encourage debate, facilitate good argument,
and promote a new online critical literacy,” as stated in a recent manifesto of the
Argument Web vision (Bex et al. 2013). These people should keep having big dreams,
and we should all join forces to help them turn this vision into reality. For that purpose,
ecological sustainability must become a key concern for argument technologies. At the
same time, this is not meant to rule out other, more modest aims for argument
technologies. Not all tools are meant to change the world, yet many of them turn out
to be useful nonetheless. The same applies to argument technologies. Stigmatizing the
tendency of argumentation scholars to develop technologies tailored to their own needs
served to emphasize that those needs do not necessarily align with those of lay arguers
in non-scholarly contexts. But of course, there is nothing wrong in developing tech-
nologies aimed at satisfying the needs of specific target groups. In fact, these projects
should be welcome, as long as we refrain from advertising them as something they are
not—to wit, large-scale socio-technical transformations on the brink of revolutionizing
online interactions. Take argument diagramming software as a case in point: its
application for teaching argumentation has led to interesting results over the years
(for discussion, see Scheuer et al. 2010; Reed et al. 2011), and further developing
similar tools is without doubt a worthy enterprise. Yet there is no indication that such
technology will radically change how education works, neither in the short term nor in
the long run. In contrast, the successful development of, say, engaging social network-
ing platforms capable of promoting good argumentation and sound collective deliber-
ation would indeed have a major impact on our current and future society. It is in the
name of this ambitious vision that I submit this plea for ecological argument
technologies.

References

Amgoud, L., & Prade, H. (2009). Using arguments for making and explaining decisions. Artificial
Intelligence, 173(3–4), 413–436.
Antoci, A., Sabatini, F., & Sodini, M. (2015). Online and offline social participation and social poverty traps.
Journal of Mathematical Sociology, forthcoming.
Baroni, P., & Giacomin, M. (2007). On principle-based evaluation of extension-based argumentation seman-
tics. Artificial Intelligence, 171(10), 675–700.
Bench-Capon, T., & Dunne, P. (2007). Argumentation in artificial intelligence. Artificial Intelligence, 171(10),
619–641.
Besnard, P., & Hunter, A. (2001). A logic-based theory of deductive arguments. Artificial Intelligence, 128(1–
2), 203–235.
Bex, F., Lawrence, J., Snaith, M., & Reed, C. (2013). Implementing the argument web. Communications of the
ACM, 56(10), 66–73.
Bex, F., Snaith, M., Lawrence, J., & Reed, C. (2014). ArguBlogging: an application for the argument web.
Web Semantics: Science, Services and Agents on the World Wide Web, 25, 9–15.
Boudry, M., Pigliucci, M., & Paglieri, F. (2015). The fake, the flimsy, and the fallacious: demarcating
arguments in real life. Argumentation, 29(4), 431–456.
Buckingham Shum, S. (2008). Cohere: towards web 2.0 argumentation. In P. Besnard, S. Doutre, & A. Hunter
(Eds.), Computational models of argument: proceedings of COMMA 2008 (pp. 97–108). Amsterdam:
IOS Press.
Buckingham Shum, S., Selvin, A., Sierhuis, M., Conklin, J., Haley, C., & Nuseibeh, B. (2006). Hypermedia
support for argumentation-based rationale. In A. Dutoit, R. McCall, I. Mistrík, & B. Paech (Eds.),
Rationale management in software engineering (pp. 111–132). Berlin: Springer.
Butterworth, J., & Thwaites, G. (2013). Thinking skills: critical thinking and problem solving (2nd ed.).
Cambridge: Cambridge University Press.
Cabrio, E., & Villata, S. (2013). A natural language bipolar argumentation approach to support users in online
debate interactions. Argument & Computation, 4(3), 209–230.
Caminada, M., & Amgoud, L. (2007). On the evaluation of argumentation formalisms. Artificial Intelligence,
171(5–6), 286–310.
Carbogim, D., Robertson, D., & Lee, J. (2000). Argument-based applications to knowledge engineering. The
Knowledge Engineering Review, 15(2), 119–149.
Chang, C. F., Miller, A., & Ghose, A. (2010). Mixed-initiative argumentation: group decision support in
medicine. In P. Kostkova (Ed.), Electronic healthcare: proceedings of eHealth 2009 (pp. 43–50). Berlin:
Springer.
Chesnevar, C., McGinnis, J., Modgil, S., Rahwan, I., Reed, C., Simari, G., South, M., Vreeswijk, G., &
Willmott, S. (2006). Towards an argument interchange format. Knowledge Engineering Review, 21(4),
293–316.
Collins, P., Hahn, U., von Gerber, Y., & Olsson, E. (2015). The bi-directional relationship between source
characteristics and message content. In D. Noelle, R. Dale, A. Warlaumont, J. Yoshimi, T. Matlock, C.
Jennings & P. Maglio (Eds.), Proceedings of the 37th Annual Meeting of the Cognitive Science Society
(pp. 423–428). Austin, TX: Cognitive Science Society.
Conklin, J., Selvin, A., Buckingham Shum, S., & Sierhuis, M. (2001). Facilitated hypertext for collective
sensemaking: 15 years on from gIBIS. In K. Grønbæk, H. Davis &Y. Douglas (Eds.), Hypertext’01:
Proceedings of the 12th ACM Conference on Hypertext and Hypermedia (pp. 123–124). New York:
ACM.
Corner, A., & Hahn, U. (2012). Normative theories of argumentation: are some norms better than others?
Synthese, 190(16), 3579–3610.
de Moor, A., & Aakhus, M. (2006). Argumentation support: from technologies to tools. Communications of
the ACM, 49(3), 93–98.
Duggan, M. (2014). Online harassment. Washington: Pew Research Internet Project.
Dung, P. (1995). On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic
programming, and n-person games. Artificial Intelligence, 77, 321–357.
Dung, P., Kowalski, R., & Toni, F. (2006). Dialectic proof procedures for assumption-based, admissible
argumentation. Artificial Intelligence, 170(2), 114–159.
Ellison, N., Steinfield, C., & Lampe, C. (2007). The benefits of Facebook friends: social capital and college
students' use of online social network sites. Journal of Computer-Mediated Communication, 12(4),
1143–1168.
Ennis, R. (1989). Critical thinking and subject specificity: clarification and needed research. Educational
Researcher, 18(3), 4–10.
Ennis, R. (1993). Critical thinking assessment. Theory Into Practice, 32(3), 179–186.
Facione, P. (Ed.) (1990). Critical thinking: a statement of expert consensus for purposes of educational
assessment and instruction. American Philosophical Association: ERIC document ED 315 423.
Finocchiaro, M. (1981). Fallacies and the evaluation of reasoning. American Philosophical Quarterly, 18(1),
13–22.
Fiske, S., & Taylor, S. (1984). Social cognition. Reading: Addison-Wesley.
Floridi, L. (2009). Logical fallacies as informational shortcuts. Synthese, 167, 317–325.
Fogg, B. J. (2003). Persuasive technology: using computers to change what we think and do. San Francisco:
Morgan Kaufmann.
Frederick, S. (2005). Cognitive reflection and decision making. Journal of Economic Perspectives, 19(4), 25–
42.
Gabbriellini, S., & Torroni, P. (2012). Large-scale agreements via microdebates. In S. Ossowski, G. Vouros &
F. Toni (Eds.), AT 2012: Proceedings of the 1st International Conference on Agreement Technologies (pp.
366–377). Tilburg: CEUR-WS.org.
Garcia, A., & Simari, G. (2014). Defeasible logic programming: DeLP-servers, contextual queries, and
explanations for answers. Argument & Computation, 5(1), 63–88.
Garrison, D. R., Anderson, T., & Archer, W. (2001). Critical thinking, cognitive presence, and computer
conferencing in distance education. American Journal of Distance Education, 15(1), 7–23.
Gigerenzer, G., & Selten, R. (Eds.). (2001). Bounded rationality: the adaptive toolbox. Cambridge: The MIT
Press.
Gigerenzer, G., Hertwig, R., & Pachur, T. (Eds.). (2011). Heuristics: the foundations of adaptive behavior.
New York: Oxford University Press.
Goodwin, J. (1998). Forms of authority and the real ad verecundiam. Argumentation, 12(2), 267–280.
Gordon, T. (2010). An overview of the Carneades argumentation support system. In C. Tindale & C. Reed
(Eds.), Dialectics, dialogue and argumentation. An examination of Douglas Walton’s theories of reason-
ing (pp. 145–156). London: College Publications.
Govier, T. (1987). Problems in argument analysis and evaluation. Dordrecht: Foris.
Groarke, L. (2009). What’s wrong with the California critical thinking skills test? CT testing and account-
ability. In J. Sobocan & L. Groarke (Eds.), Critical thinking education and assessment: can higher order
thinking be tested? (pp. 35–54). London: The Althouse Press.
Habermas, J. (1984). The theory of communicative action. Boston: Beacon.
Hahn, U., & Oaksford, M. (2007). The rationality of informal argumentation: a Bayesian approach to
reasoning fallacies. Psychological Review, 114, 704–732.
Hahn, U., Harris, A. J. L., & Corner, A. (2009). Argument content and argument source: an exploration.
Informal Logic, 29(4), 337–367.
Hahn, U., Oaksford, M., & Harris, A. J. L. (2012). Testimony and argument: a Bayesian perspective. In F.
Zenker (Ed.), Bayesian argumentation (pp. 15–38). Dordrecht: Springer.
Hahn, U., Oaksford, M., & Harris, A. J. L. (2013). Rational inference, rational argument. Argument &
Computation, 4, 21–35.
Hamblin, C. (1970). Fallacies. London: Methuen.
Harris, A. J. L., Hsu, A. S., & Madsen, J. K. (2012). Because Hitler did it! Quantitative tests of Bayesian
argumentation using ad hominem. Thinking and Reasoning, 18(3), 311–343.
Hintikka, J. (1987). The fallacy of fallacies. Argumentation, 1(3), 211–238.
Hitchcock, D. (1995). Do the fallacies have a place in the teaching of reasoning skills or critical thinking? In
H. V. Hansen & R. C. Pinto (Eds.), Fallacies: classical and contemporary readings (pp. 319–327).
University Park: Penn State University Press.
Hitchcock, D. (2006). Informal logic and the concept of argument. In D. Jacquette (Ed.), Philosophy of logic
(Handbook of the philosophy of science, Vol. 5, pp. 101–129). Amsterdam: Elsevier.
Hitchcock, D. (2007). Why there is no argumentum ad hominem fallacy. In F. H. van Eemeren & B. Garssen
(Eds.), Proceedings of the Sixth Conference of the International Society for the Study of Argumentation
(Volume 1, pp. 615–620). Amsterdam: Sic Sat.
Introne, J., & Iandoli, L. (2014). Improving decision-making performance through argumentation: an
argument-based decision support system to compute with evidence. Decision Support Systems, 64, 79–89.
Janier, M., Lawrence, J., & Reed, C. (2014). OVA+: an argument analysis interface. In S. Parsons, N. Oren, C.
Reed & F. Cerutti (Eds.), Computational Models of Argument: Proceedings of COMMA 2014 (pp. 463–
464). Amsterdam: IOS Press
Johnson, R., & Blair, A. (1977). Logical self-defense. Toronto: McGraw-Hill Ryerson.
Kahneman, D. (2011). Thinking, fast and slow. New York: Farrar, Straus and Giroux.
Karacapilidis, N., & Papadias, D. (2001). Computer supported argumentation and collaborative decision
making: the HERMES system. Information Systems, 26(4), 259–277.
Karunatillake, N., Jennings, N., Rahwan, I., & McBurney, P. (2009). Dialogue games that agents play within a
society. Artificial Intelligence, 173(9–10), 935–981.
Kirschner, P. (2015). Facebook as learning platform: argumentation superhighway or dead-end street?
Computers in Human Behavior, 53, 621–625.
Kirschner, P., Buckingham Shum, S., & Carr, C. (Eds.). (2003). Visualizing argumentation. Software tools for
collaborative and educational sense-making. Berlin: Springer.
Klein, M. (2012). Enabling large-scale deliberation using attention-mediation metrics. Journal of Computer-
Supported Cooperative Work, 21(4), 449–473.
Klein, M., & Convertino, G. (2014). An embarrassment of riches. Communications of the ACM, 57(11), 40–
42.
Kuhn, D. (1991). The skills of argument. Cambridge: Cambridge University Press.
Kunda, Z. (1990). The case for motivated reasoning. Psychological Bulletin, 108, 480–498.
Laughlin, P., & Ellis, A. (1986). Demonstrability and social combination processes on mathematical intellec-
tive tasks. Journal of Experimental Social Psychology, 22, 177–189.
Lave, J., & Wenger, E. (1991). Situated learning: legitimate peripheral participation. Cambridge: Cambridge
University Press.
Lawrence, J., Bex, F., & Reed, C. (2012). Dialogues on the argument web: mixed initiative argumentation
with Arvina. In B. Verheij, S. Szeider & S. Woltran (Eds.), Computational Models of Argument:
Proceedings of COMMA 2012 (pp. 513–514). Amsterdam: IOS Press.
Lawrence, J., Reed, C., Allen, C., McAlister, S., Ravenscroft, A., & Bourget, D. (2014). Mining arguments
from 19th century philosophical texts using topic based modelling. In N. Green, K. Ashley, D. Litman, C.
Reed & V. Walker (Eds.), Proceedings of the First Workshop on Argumentation Mining (pp. 79–87).
Stroudsburg, PA: ACL.
Levesque, H. J. (1986). Making believers out of computers. Artificial Intelligence, 30(1), 81–108.
Levi, D. S. (1999). The fallacy of treating the ad baculum as a fallacy. Informal Logic, 19(2–3), 145–159.
Lindsay, B. (2009). Creating “the Wikipedia of pros and cons”. In D. Riehle & A. Bruckman (Eds.),
WikiSym’09: Proceedings of the 5th International Symposium on Wikis and Open Collaboration (n.
36). New York: ACM.
Mackenzie, P. T. (1980). Ad hominem and ad verecundiam. Informal Logic, 3(3), 9–11.
Massey, G. (1981). The fallacy behind fallacies. Midwest Studies In Philosophy, 6(1), 489–500.
McPeck, J. (1990). Critical thinking and subject specificity: a reply to Ennis. Educational Researcher, 19(4),
10–12.
Mercier, H. (2010). The social origins of folk epistemology. Review of Philosophy and Psychology, 1(4), 499–
514.
Mercier, H. (2013). Our pigheaded core: how we became smarter to be influenced by other people. In K.
Sterelny, R. Joyce, B. Calcott, & B. Fraser (Eds.), Cooperation and its evolution (pp. 373–398).
Cambridge: MIT Press.
Mercier, H., & Sperber, D. (2009). Intuitive and reflective inferences. In J. S. B. T. Evans & K. Frankish
(Eds.), In two minds: dual processes and beyond (pp. 149–170). New York: Oxford University Press.
Mercier, H., & Sperber, D. (2011). Why do humans reason? Arguments for an argumentative theory.
Behavioral and Brain Sciences, 34(2), 57–74.
Mercier, H., Trouche, E., Yama, H., Heintz, C., & Girotto, V. (2015). Experts and laymen grossly underes-
timate the benefits of argumentation for reasoning. Thinking and Reasoning, 21(3), 341–355.
Mizrahi, M. (2010). Take my advice—I am not following it: ad hominem arguments as legitimate rebuttals to
appeals to authority. Informal Logic, 30(4), 435–456.
Mochales Palau, R., & Moens, M.-F. (2009). Argumentation mining: the detection, classification and structure
of arguments in text. In P. Casanovas & C. Hafner (Eds.), Proceedings of the 12th International
Conference on Artificial intelligence and Law (pp. 98–107). New York: ACM.
Mochales Palau, R., & Moens, M.-F. (2011). Argumentation mining. Artificial Intelligence and Law, 19(1), 1–
22.
Modgil, S. (2009). Reasoning about preferences in argumentation frameworks. Artificial Intelligence, 173(9–
10), 901–934.
Modgil, S., & Caminada, M. (2009). Proof theories and algorithms for abstract argumentation frameworks. In
I. Rahwan & G. Simari (Eds.), Argumentation in artificial intelligence (pp. 105–129). Berlin: Springer.
Modgil, S., & Prakken, H. (2014). The ASPIC+ framework for structured argumentation: a tutorial. Argument
and Computation, 5(1), 31–62.
Modgil, S., Toni, F., Bex, F., Bratko, I., Chesñevar, C., Dvorák, W., & Woltran, S. (2013). The added value of
argumentation. In S. Ossowski (Ed.), Agreement technologies (pp. 357–403). Berlin: Springer.
Moens, M.-F., Boiy, E., Mochales Palau, R., & Reed, C. (2007). Automatic detection of arguments in legal
texts. In A. Gardner & R. Winkels (Eds.), Proceedings of the 11th International Conference on Artificial
intelligence and Law (pp. 225–230). New York: ACM.
Morge, M. (2008). The hedgehog and the fox. An argumentation-based decision support system. In I.
Rahwan, S. Parsons & C. Reed (Eds.), Argumentation in Multi-Agent Systems: Proceedings of ArgMAS
2007 (pp. 114–131). Berlin: Springer.
Moshman, D., & Geil, M. (1998). Collaborative reasoning: evidence for collective rationality. Thinking and
Reasoning, 4(3), 231–248.
Nickerson, R. (1998). Confirmation bias: a ubiquitous phenomenon in many guises. Review of General
Psychology, 2, 175–220.
O’Keefe, D. (1977). Two concepts of argument. Journal of the American Forensic Society, 13, 121–128.
Ossowski, S. (Ed.). (2012). Agreement technologies. Berlin: Springer.
Paglieri, F. (2016). Don’t worry, be gappy! On the unproblematic gappiness of fallacies. In F. Paglieri, L.
Bonelli, & S. Felletti (Eds.), The psychology of argument: cognitive approaches to argumentation and
persuasion (pp. 153–172). London: College Publications.
Paglieri, F., & Castelfranchi, C. (2010). Why argue? Towards a cost–benefit analysis of argumentation.
Argument and Computation, 1(1), 71–91.
Peldszus, A., & Stede, M. (2013a). From argument diagrams to argumentation mining in texts: a survey.
International Journal of Cognitive Informatics and Natural Intelligence, 7(1), 1–31.
Peldszus, A., & Stede, M. (2013b). Ranking the annotators: An agreement study on argumentation structure.
In S. Dipper, M. Liakata & A. Pareja-Lora (Eds.), Proceedings of the 7th Linguistic Annotation Workshop
& Interoperability with Discourse (pp. 196–204). Stroudsburg, PA: ACL.
Perkins, D., Farady, M., & Bushey, B. (1991). Everyday reasoning and the roots of intelligence. In J. Voss, D.
Perkins, & J. Segal (Eds.), Informal reasoning and education (pp. 83–105). Hillsdale: Lawrence Erlbaum
Associates.
Possin, K. (2008). A field guide to critical-thinking assessment. Teaching Philosophy, 31(3), 201–228.
Prakken, H. (2010). An abstract framework for argumentation with structured arguments. Argument and
Computation, 1(2), 93–124.
Rahwan, I. (2008). Mass argumentation and the semantic web. Web Semantics: Science, Services and Agents
on the World Wide Web, 6(1), 29–37.
Rahwan, I., & McBurney, P. (2007). Argumentation technology. IEEE Intelligent Systems, 22(6), 21–23.
Rahwan, I., & Simari, G. (Eds.). (2009). Argumentation in artificial intelligence. Berlin: Springer.
Rahwan, I., Ramchurn, S., Jennings, N., McBurney, P., Parsons, S., & Sonenberg, L. (2004). Argumentation-
based negotiation. The Knowledge Engineering Review, 18(4), 343–375.
Rahwan, I., Zablith, F., & Reed, C. (2007). Laying the foundations for a world wide argument web. Artificial
Intelligence, 171(10–15), 897–921.
Rahwan, I., Banihashemi, B., Reed, C., Walton, D., & Abdallah, S. (2011). Representing and classifying
arguments on the semantic web. The Knowledge Engineering Review, 26(4), 487–511.
Rahwan, I., Krasnoshtan, D., Shariff, A., & Bonnefon, J.-F. (2014). Analytical reasoning task reveals limits of
social learning in networks. Journal of the Royal Society, Interface, 11(93), 20131211.
Rainie, L., Lenhart, A., & Smith, A. (2012). The tone of life on social networking sites. Washington: Pew
Internet Research Center.
Reed, C., & Norman, T. (Eds.). (2004). Argumentation machines. Berlin: Springer.
Reed, C., & Rowe, G. (2004). Araucaria: software for argument analysis, diagramming and representation.
International Journal on Artificial Intelligence Tools, 13(4), 961–980.
Reed, C., & Walton, D. (2003). Argumentation schemes in argument-as-process and argument-as-product. In
J. A. Blair, D. Farr, H. Hansen, R. Johnson and C. Tindale (Eds.), Informal Logic @ 25: Proceedings of
the 5th OSSA Conference. Windsor, Ontario: OSSA.
Reed, C., Wells, S., Snaith, M., Budzynska, K., & Lawrence, J. (2011). Using an argument ontology to
develop pedagogical tool suites. In P. Blackburn, H. van Ditmarsch, M. Manzano, & F. Soler-Toscano
(Eds.), Tools for teaching logic: proceedings of TICTTL 2011 (pp. 207–214). Berlin: Springer.
Rowe, G., Macagno, F., Reed, C., & Walton, D. (2006). Araucaria as a tool for diagramming arguments in
teaching and studying philosophy. Teaching Philosophy, 29(2), 111–124.
Sá, W., West, R., & Stanovich, K. (1999). The domain specificity and generality of belief bias: searching for a
generalizable critical thinking skill. Journal of Educational Psychology, 91(3), 497–510.
Sabatini, F., & Sarracino, F. (2014). Online networks and subjective well-being. arXiv:1408.3550.
Scheuer, O., Loll, F., Pinkwart, N., & McLaren, B. (2010). Computer-supported argumentation: a review of
the state of the art. International Journal of Computer-Supported Collaborative Learning, 5(1), 43–102.
Schneider, J. (2014). An informatics perspective on argumentation mining. In E. Cabrio, S. Villata, & A.
Wyner (Eds.), Proceedings of the workshop on frontiers and connections between argumentation theory
and natural language processing (pp. 1–4). Aachen: CEUR-WS.org.
Schneider, J., Groza, T., & Passant, A. (2012a). A review of argumentation for the Social Semantic Web.
Semantic Web-Interoperability, Usability, Applicability, 4(2), 159–218.
Schneider, J., Passant, A., & Decker, S. (2012b). Deletion discussions in Wikipedia: decision factors and
outcomes. In C. Lampe (Ed.), WikiSym2012: Proceedings of the 8th Annual International Symposium on
Wikis and Open Collaboration (n. 17). New York: ACM.
Schneider, J., Samp, K., Passant, A., & Decker, S. (2013). Arguments about deletion: how experience
improves the acceptability of arguments in ad-hoc online task groups. In A. Bruckman, S. Counts, C.
Lampe & L. Terveen (Eds.), CSCW2013: Proceedings of the 2013 Conference on Computer Supported
Cooperative Work (pp. 1069–1080). New York: ACM.
Scriven, M. (1987). Fallacies of statistical substitution. Argumentation, 1, 333–349.
Simon, H. A. (1956). Rational choice and the structure of the environment. Psychological Review, 63, 129–
138.
Sperber, D., Clément, F., Heintz, C., Mascaro, O., Mercier, H., Origgi, G., & Wilson, D. (2010). Epistemic
vigilance. Mind & Language, 25(4), 359–393.
Stanovich, K., & West, R. (2007). Natural myside bias is independent of cognitive ability. Thinking and
Reasoning, 13(3), 225–247.
Steinfield, C., Ellison, N., & Lampe, C. (2008). Social capital, self-esteem, and use of online social network
sites: a longitudinal analysis. Journal of Applied Developmental Psychology, 29, 434–445.
Stone, M. (2012). Denying the antecedent: its effective use in argumentation. Informal Logic, 32(3), 327–356.
Surowiecki, J. (2004). The wisdom of crowds. New York: Doubleday.
Toplak, M., West, R., & Stanovich, K. (2011). The cognitive reflection test as a predictor of performance on
heuristics-and-biases tasks. Memory & Cognition, 39, 1275–1289.
Toulmin, S. (1958). The uses of argument. Cambridge: Cambridge University Press.
Towne, W. B., & Herbsleb, J. (2012). Design considerations for online deliberation systems. Journal of
Information Technology & Politics, 9(1), 97–115.
Trouche, E., Sander, E., & Mercier, H. (2014). Arguments, more than confidence, explain the good perfor-
mance of reasoning groups. Journal of Experimental Psychology. General, 143(5), 1958–1971.
Trouche, E., Johansson, P., Hall, L., & Mercier, H. (in press). The selective laziness of reasoning. Cognitive
Science, doi: 10.1111/cogs.12303
Tsovaltzi, D., Greenhow, C., & Asterhan, C. (2015a). When friends argue: learning from and through social
network site discussions. Computers in Human Behavior, 53, 567–569.
Tsovaltzi, D., Judele, R., Puhl, T., & Weinberger, A. (2015b). Scripts, individual preparation and group
awareness support in the service of learning in Facebook: how does CSCL compare to social networking
sites? Computers in Human Behavior, 53, 577–592.
Tversky, A., & Kahneman, D. (1971). Belief in the law of small numbers. Psychological Bulletin, 76(2), 105–
110.
Tversky, A., & Kahneman, D. (1982). Evidential impact of base rates. In D. Kahneman, P. Slovic, & A.
Tversky (Eds.), Judgment under uncertainty: heuristics and biases (pp. 153–160). Cambridge:
Cambridge University Press.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: the conjunction fallacy in
probability judgment. Psychological Review, 90(4), 293–315.
van Eemeren, F., & Grootendorst, R. (1992). Relevance reviewed: the case of argumentum ad hominem.
Argumentation, 6(2), 141–159.
van Eemeren, F., & Grootendorst, R. (1995). The pragma-dialectical approach to fallacies. In H. V. Hansen &
R. C. Pinto (Eds.), Fallacies: classical and contemporary readings (pp. 130–144). University Park: Penn
State University Press.
Walton, D. (1992). Nonfallacious arguments from ignorance. American Philosophical Quarterly, 29(4), 381–
387.
Walton, D. (1996). Argumentation schemes for presumptive reasoning. Mahwah: Lawrence Erlbaum
Associates.
Walton, D. (1997). Appeal to expert opinion: arguments from authority. University Park: The Pennsylvania
State University Press.
Walton, D. (1998). Ad hominem arguments. Tuscaloosa: The University of Alabama Press.
Walton, D. (1999). The appeal to ignorance, or argumentum ad ignorantiam. Argumentation, 13(4), 367–377.
Walton, D. (2000). Scare tactics: arguments that appeal to fear and threats. Dordrecht: Kluwer.
Walton, D., & Godden, D. M. (2007). Informal logic and the dialectical approach to argument. In H. Hansen &
R. Pinto (Eds.), Reason reclaimed (pp. 3–17). Newport News: Vale Press.
Walton, D., & Gordon, T. (2012). The Carneades model of argument invention. Pragmatics & Cognition,
20(1), 1–31.
Walton, D., Reed, C., & Macagno, F. (2008). Argumentation schemes. Cambridge: Cambridge University
Press.
Wason, P. C. (1966). Reasoning. In B. Foss (Ed.), New horizons in psychology: I (pp. 106–137).
Harmondsworth: Penguin.
Weinberger, A., Stegmann, K., Fischer, F., & Mandl, H. (2007). Scripting argumentative knowledge con-
struction in computer-supported learning environments. In F. Fischer, I. Kollar, H. Mandl, & J. M. Haake
(Eds.), Scripting computer-supported collaborative learning (pp. 191–211). Berlin: Springer.
Woods, J. (1998). Argumentum ad baculum. Argumentation, 12(4), 493–504.
Woods, J. (2013). Errors of reasoning. Naturalizing the logic of inference. London: College Publications.
Woods, J., & Walton, D. (1974). Argumentum ad verecundiam. Philosophy and Rhetoric, 7(3), 135–153.
Woods, J., & Walton, D. (1978). The fallacy of ‘ad ignorantiam’. Dialectica, 32(2), 87–99.
Yaglikci, N., & Torroni, P. (2014). Microdebates app for Android: a tool for participating in argumentative
online debates using a handheld device. In A. Andreou & G. A. Papadopoulos (Eds.), Proceedings of
ICTAI 2014: IEEE 26th International Conference on Tools with Artificial Intelligence (pp. 792–799). Los
Alamitos, CA: IEEE.
