Lachmann's method at the time of Google

Sapienza University of Rome

Faculty of Literature and Philosophy
Bachelor of Arts in Italian Studies

The oxymoronic relationship between the stemmatic method, fundamental to
classical philology, and the most widely used web search engine lends itself to explaining the potential
of the future philologist and linguist, especially in fields other than literature and
narrative, such as investigative journalism or computational and cognitive linguistics.
The post-Lachmann debate could also be the foundation of future philology, as it was
for classical philology.

Prof. Pasquale Stoppelli
Paolo Maria Addabbo
Academic year: 2011-2012


INTRODUCTION: Past and future of philology and linguistics.

CHAPTER 1: Copied texts will be ascertained
1.1 Who copies and who quotes? The principle of authorship and plagiarism on the web
1.2 The indicative errors of Italian local authorities
CHAPTER 2: From Lachmann to Google, through the notebooks
2.1 The collations of the news, the bon manuscrit and the good source: the
most reliable
2.2 Digital variants and the recensio of lost pages
2.3 Blunders and cognitive-linguistics analyses: the case of Dante's last name
2.4 Googling and results in different posts or parts of web pages

INTRODUCTION: past and future of philology and linguistics

What are the challenges for future philologists and linguists? How can they strengthen the
principle of authorship in the ocean of texts on the web and the internet, especially for
contents that are not fiction or literature, such as newspapers, official acts, chats, blogs etc.?
Lachmann's method, which has been essential for teaching and understanding textual
criticism1, could also be a starting point for understanding how philology and linguistics can
engage with the universe of digital writing: indeed this work presents experiments and
projects connecting IT to the humanities, together with a few proposals and examples of their
application in the era of SEO and Google.
It is a matter of fact that some features of the philology of the past and the way of
searching for information online, both for narrative purposes and in daily life, are closely connected,
even without considering the subject of textual criticism in the broad sense, that is,
analysing texts in general...
Probably we should take a more careful approach to many new types of text, such as the
fragmentary sequences of words or the proper sentences in the statistics of the queries

1 Definition of textual criticism in the Encyclopaedia Britannica

The pictures above show the basic statistics of a blog created on Google's web platform, Blogger: websites, links, queries and web
search engines that have redirected readers to different posts of the blog. In addition there are extra data such as the users' country of
origin, their browser and their operating system.

used on a web search engine2 that lead readers to our website, for instance. Experts in
media and texts will analyse the stream of data and the habits of the people who type
words into Google, Bing etc.
For example, it could be possible to find out more about somebody with a specific IP address
and particular writing habits who clicks on the advertisements placed on their own articles. By doing so
they defraud the sponsors, who believe that the clicks come from genuine readers... Indeed
methodologies of this kind are already used to detect click fraud by companies providing
AdSense-style advertising: this is the system, mostly used by Google, in which publishers are
paid depending on the ways consumers interact with the advertisements, an
interaction that most of the time is measured by the number of clicks (pay-per-click).
Another method consists in paying simply for the length of time an
advertisement is displayed and for its size, almost as happens in off-line newspapers.
That is why the main idea of this work is that literary analysis should be more closely connected
to other fields of communication, from linguistics to IT through marketing and other more
specialized sectors, first of all SEO (search engine optimization)3.
I am firmly convinced that, apart from spatial and temporal disputes and differences about
the meaning of philology, we should consider the connotation of the definition of the
discipline that tries to solve attributive, linguistic, historical and critical problems of texts in
I strongly believe we have spent too long trying to retrace the authors' literary wills, whether
we want to know the evolution of a text or its original form. On the other hand we may
have invested insufficient resources in developing ideas that help people orientate themselves
in the contemporary ocean of information...
By doing so many humanists have lost sight of the historical and philosophical connotation
of words like textual criticism and philology (and the identification of this subject with
linguistics when the scientific study of language did not yet formally exist4). The same
2 At the beginning the Internet consisted of a group of websites used to transfer files through a specific File
Transfer Protocol, knowing in that way the exact address of every file. The web grew day by day, which is
why the first systems to classify those files were born, like Archie in 1990. They were basically lists of file names.
Later on came the systems to index textual contents, like Gopher (1991) (...) Only after further developments were
proper web search engines conceived. They were based on sophisticated searches relying on natural language.
Behind a search engine there is first of all the procedure to collect data from all the accessible web pages (...) and then a
series of formulas to find in the websites all the factors that give extra information about their content. This huge amount
of data can be collected by a software component (called crawler, spider, robot) or with the help of human
resources. (...) Then many complex analyses have to be made to classify and evaluate (with a relevance score)
everything that is stored in the database, depending on the query typed by the users.
Everything is covered by industrial secrecy and continuously changes following the users' preferences. (...)
The criteria considered are numerous: first of all the content and the keywords, or the ones that are semantically less
related to the search and considered by theories such as Zipf's Law (...). Other criteria are to be found outside the
website, like the links to the website (...) Daniele Fusi, Informatica per le scienze umane, Edizioni Nuova Cultura
2011, pag. 162
3 Search Engine Optimization: the activities of various professional figures, or of the specialized search engine optimizer,
aimed at making a website appear in the highest possible position in the list of results when somebody searches for
specific words. If this is not done in a lawful or fair way, SEO crosses the border into spam or into creating
content tailored just to be indexed at a high rank, rather than original and appropriate content.
4 Difference between linguistics and philology, Encyclopaedia Britannica, consulted 11/11/2014: The differences
were and are largely matters of attitude, emphasis, and purpose. The philologist is concerned primarily with the
historical development of languages as it is manifest in written texts and in the context of the associated literature and
culture. The linguist, though he may be interested in written texts and in the development of languages through time,

concepts can be found in the lesson of Giorgio Pasquali5, the philologist who made an
important contribution to the method, drawing also on the ideas of others such as Paul Maas6.
We can find some of these concepts of philology also in the definition "criticism of notes
and drafts" (the criticism of the scartafacci, as Benedetto Croce used to say
pejoratively), used by Gianfranco Contini7 to explain the aims of the critical exercise of
authorial philology: he dreamed of entering the writer's office and analysing all their texts,
including the discarded fragments, to understand their evolution.
This procedure, also known as criticism of variants, aimed to establish a very close
relationship with the text and to observe the creative technique live, although the border
between textual reconstruction and personal interpretation could be very thin8. However,
the idea of reconstructing the relationships and evolution of a text is surely applicable to all
kinds of works.
These concepts are connected to the paradoxical title of this work, which specifically
mentions Lachmann and is not, for instance, "The criticism of variants and Pasquali at
Steve Jobs' time": firstly, because of the most popular web search engine, which has made
typing a web address almost useless and which is a symbol of the focus on the universe of
digital texts, and in particular on the web and search engines; while the Lachmann method is
fundamental as a didactic model for humanists and, of course, a must for students who are
interested in manuscripts and modern books, or in literary criticism in general. Moreover the
debate that developed around the technique marked a turning point between classical and
modern philology, between the reconstruction of the stemma9, or the emblem that
represents the relationships between codices, and authorial philology10: the most
significant innovation is the inverted relation between the text and the apparatus (all the
texts and information about a book); with the invention of the press the main aim of the
philologist switches from rebuilding the will of the writer to putting order into the papers
of the author and showing (...) the evolution of the text11. In other words the critic's point of
view changed perspective, from the product (the text) to the consumer (the process), that
tends to give priority to spoken languages and to the problems of analyzing them as they operate at a given point in
Definition of philology, Encyclopaedia Britannica, consulted 11/11/2014: It has been largely supplanted by
modern linguistics, which studies historical data more selectively as part of the discussion of broader issues in linguistic
theory, such as the nature of language change. However, some philologists continue to work outside a linguistics frame
of reference, and their influence can be seen in the names of some university departments (e.g., Romance philology)
and journals.
5 For Pasquali it was not possible to reach a persuasive textual reconstruction without a scrupulous study of the
tradition and of all the documents it is made of. The conditions and the environments from which the tradition
emerged, and even the nature of the documents as material objects, also have to be taken into account.
In other words, the quality of textual criticism depended also on the extra information that could be retrieved
about a manuscript or a printed book. For this reason all the historical disciplines were called to help philologists. It
was an important lesson of historicism, fruitful for Italian philology. Indeed further improvements of the method
confirmed this way of thinking. Pasquale Stoppelli, Filologia della letteratura italiana, Carocci Editore, Urbino 2009 pg.
6 On Maas's theory and the contributions of Pasquali cfr. Pasquale Stoppelli, Filologia della letteratura
italiana, Carocci Editore, Urbino 2009 (Manuali universitari) pg. 94

7 Cfr. P. Stoppelli 2009 pg 118-119

8 Cfr. P. Stoppelli 2009 pg. 118
9 Cfr. note n. 15 about the stemmatic method and the stemma codicum
10 Cfr. P. Stoppelli 2009 pg. 117
11 Domenico Fiormonte, Scrittura e filologia nell'era digitale, Bollati Boringhieri 2003, p. 204

is also close to what Domenico Fiormonte says when talking about Contini: he was among the
first to include time in the horizon of the hermeneut12.
With the pre-eminent position of the apparatus, the coming philologist, in particular where
literature is concerned, has to take into account the risk of comfortably leaving the text in
its plurality, a danger inherent in the mechanism of hypertext. The next philology
should not be just a tool to keep in the toolbox, as Croce used to say, but the whole
bag of notions, the essential mentality for everybody to have in order to cope with a
For example, when a historian in 2200 researches the Arab Spring, he will find a
fertile field in which to gather information about the unknown soldiers, hidden behind nicknames, who
organized the protest on Twitter, the social network characterized by the brevity and the
speed of its messages. In a context like this historians could benefit from knowledge of the
stemmatic relationships among writings in general, but they will also need IT knowledge
to make all the necessary checks. Furthermore, with reference to authorial philology and
literary criticism, we have to add that web texts can show strong links with interviews,
video clips, audio tracks etc. These could reveal changes in the artists' or authors'
point of view, or misunderstandings between them and the interviewers.
It is a matter of fact that nowadays nobody would use Lachmann's method slavishly to
produce a critical edition, although it is taught in order to understand how critical editions were
or could be made. The philologists of the future should try to reconstruct or contest the
truths coming from all types of text: an example of an unusual text is the
transcription of videos, which is often not indexed (or simply not stored) by Google and which,
to give another clue, could be generated automatically from audio tracks. Furthermore
audio-textual, and even musical, plagiarism might be found... Future humanists will
determine the legitimacy of contents and the relationships among them. They might use
classic notions of philology and other disciplines, with new undertones or meanings: indeed
a contamination13 of news will be shown in this publication. Furthermore the creativity
and subjectivity inherent in ecdotics14 are fruitful especially for artists, as are
the scientific criteria and mindset that inspire ecdotic criticism: the humanists' role will be
more important for society if we try, creatively, to find objective criteria to apply to the
humanities. It is because of objective principles that, after decades, the method and some
of its phases are also used to create software or, vice versa, that some of the software used
in biology has been applied to literary research15.
12 Definition of hermeneut from the Merriam-Webster dictionary: an interpreter, especially in the early church
13 (...) the absence of the archetype -the original work from which the whole tradition comes- can cause problems, but
what really undermines the method is the contamination -in Latin contaminatio-. This phenomenon occurs when a
witness -jargon for copy- originated from different sources at the same time. In this way there are not only
vertical relationships -in the tree of the tradition- but also transverse ones. P. Stoppelli pg. 89-90.
14 (...) ecdotic critic examines the variants in order to find patterns of error, requiring to reconstruct the history of the
text. According to the principle that a community of error implies a unity of origin, the critic determines the relations
among the extant manuscripts, so as to place them in a family tree (stemma codicum). Rivista di filologia cognitiva,
Paolo Canettieri, Vittorio Loreto, Marta Rovetta, Giovanna Santini, Ecdotics and Information Theory, 2005
15 With the aid of Robert O'Hara, then at the University of Wisconsin at Madison, and later with Chris Howe, Adrian
Barbrook and Matthew Spencer at the University of Cambridge, we were able to show that phylogenetic software
developed for biological sciences gave useful results when applied to manuscript traditions. That is: we can turn our
lists of agreements and disagreements among the manuscripts into a form which can be input into a program used by
biologists to hypothesize trees of descent among species. Text Encoding Initiative Consortium (consulted on
11/11/2014), Electronic Textual Editing: The Canterbury Tales and other Medieval Texts, Peter Robinson, De Montfort University

To conclude this introduction, the lesson that we should learn from classical and modern
philology is clear: the north star of the future humanist is still the criterion of
probability... Eclecticism is the main road for approaching issues in literature, in the media and
in the information retrieval of our daily life. We should follow Pasquali's lesson,
confirmed many times in the history of ecdotics and very close to Michele Barbi's words: we
should approach the text humbly, condemning not criticism and its innate creative work,
but only 'bad criticism'.

CHAPTER 1: Copied texts will be ascertained

1.1 Who copies and who quotes? The principle of authorship and plagiarism on the web
It is widely accepted that the manipulable nature of digital text can undermine the
principle of authority16. On the other hand it is also true that in the global village we
live in it becomes harder to plagiarize without being discovered: in other words, I believe
there is a balance between the possibilities of copying and manipulating texts and the ways
of monitoring these and tracing their origins.
Indeed ancient copyists tended to introduce more mistakes and innovations, while with
copy and paste, the possibility of copying with just two clicks, the changes should in general
be fewer.
It is no coincidence that there are many famous examples of plagiarism from Wikipedia, which
need to be mentioned to understand this: The Guardian found that the press office of the
Vatican copied about twenty biographies of cardinals and published them in an official
document17. For instance there were expressions that made no sense in the context, such as "Catholic
cardinal", since every cardinal in the Vatican is obviously Catholic.
The justification was that the time between their election and their presentation
was too short...
There was also a notorious Spanish encyclopedia that copied several entries from the most famous
collaborative encyclopedia, including the biography of Boccaccio18; we can
also think of the leaflet for the celebration of the 150th anniversary of Italian unity,
published by the Ministry of Education, again copied from Wikipedia19.
16 It could easily be predicted that, in the long term, philological awareness will progressively decrease, because
of the technical possibility of modifying and reproducing literary texts in a digital format at no cost. Everything could
be copied, modified, adapted to the readers' preferences and culture. The contents will be shared on the web with others,
in an endless process of circulation for which everybody and nobody will be responsible. In that sort of reflection of
the world, the internet, most of the billions of texts are anonymous. From this point of view the future could be a
return to the Middle Ages... However even philology and its roots, made up of the authorial principle, are not
natural and they might disappear from our culture and society. But these are epochal changes, which might become
concrete in a distant future.
P. Stoppelli 2009 pg. 176

In conclusion, here is another case of plagiarism, even if not a famous one, to introduce one of
the several software tools we could use to find this kind of fraud or relations between texts.
In the picture above we can see an article analysed by Copyscape, a free online tool.
The picture shows, underlined, the parts taken from the original post on Wikipedia (and reproduced
without any quotation marks).
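The kind of comparison such a tool performs can be sketched in a few lines. What follows is only a minimal illustration and not Copyscape's actual method, which is not described here: it assumes that two texts are compared through the sequences of n consecutive words they share. The function names, the value of n and the sample sentences are all invented for the example.

```python
import re

def shingles(text, n=5):
    """Return the set of sequences of n consecutive words found in the text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_fraction(suspect, source, n=5):
    """Fraction of the suspect's word sequences that also occur in the source."""
    suspect_shingles = shingles(suspect, n)
    if not suspect_shingles:
        return 0.0
    common = suspect_shingles & shingles(source, n)
    return len(common) / len(suspect_shingles)

# Invented sample sentences, used only to show how the functions are called.
source = "the leaflet was printed for the anniversary and distributed in every school"
suspect = "as we know the leaflet was printed for the anniversary and sent to schools"
print(shared_fraction(suspect, source, n=4))
```

A value close to 1 would suggest that most of the suspect text repeats the source word for word, while a value close to 0 would suggest independent wording; where to place the threshold remains a matter of judgement for the critic.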
1.2 The indicative errors of Italian local authorities
In my opinion we are overrun by contents, and in particular by news, always fresh but too
often lacking proper information, which should be the foundation of our society:
many necessary considerations and notions are frequently lost among the millions of
things we are offered.
Luckily there are also examples of journalists and politicians who regularly check whether their
colleagues keep their promises to report true information, whether they contradict themselves
and so on... Some members of a political party found a copy-and-paste in an official
document of the Italian authorities regarding the state of emergency recently declared in
Rome for the waste disposal crisis. In this document several sites were proposed,
on one of which a new landfill was to be built. Parts of the document were
copied from a previous project, rejected under normal circumstances but recycled in an extreme
situation (the crisis we have just mentioned): in this case even the wrong distances on the
maps were useful for finding the original text20: the project of a notorious businessman.


CHAPTER 2: From Lachmann to Google, through the notebooks

2.1 The collations of the news, the bon manuscrit and the good source: the
most reliable
Let us begin this chapter with another example of how the mindset acquired through
studying the evolution of textual criticism could be applied to the media.
A crime news article will be used for a collatio of pieces about a fugitive returning
to Fiumicino Airport in Rome from the Netherlands. It is perfect for the purpose of this
publication, since there are many news items as well as different versions of the same event
and, of course, numerous relationships between different articles: only by comparing the
differences is it possible to reconstruct the original quality21.
An article in the web magazine Nuovo Paese Sera22 deals with an Albanian citizen,
35 years old, with several criminal records. He was wanted for drug trafficking and for a
murder alleged to have happened in a town near Rome, Aprilia.
A first way to select articles is to filter contents: to a certain extent,
similar events that happened in different periods should be ignored by Google, and the words
used for this advanced search could be Albanian wanted arrested at Fiumicino.
Numerous articles will come out23, together with the first differences as well as the first
traces of indicative errors, innovations and transversal relations: first of all the age of the
fugitive, probably confused with the victim's, and the town where the crime happened,
presumably mistaken for a nearby town where the police department had arrested him is

If we cannot contact the police press office or find its records, we should adopt the
21P. Stoppelli 2009 pg. 65
23 Usually, when using Google, the first page does not report the actual number of results.

solution of the bon article24 and, of course, quote it... A moral, and often also an
economic, duty that most of the time, especially in newspapers with a small circulation, is not
respected. Indeed for this article it was not possible to find any text with enough words in
common, or any plagiarism, through the free software mentioned in the previous chapter.
However eclecticism and a basic knowledge of IT and search engines are again the main path
to follow...
From the style and the details of the article that seems most trustworthy, it seems clear that the
source is likely to be a press release of the Italian police. This kind of text is usually
sent to editorial offices or published on the police website. Common sense suggests
typing into Google a specific query for an advanced search restricted to a
particular website, that is: homicide Aprilia Fiumicino Albanian plus the special part of the
query site:www.policeofstate.it25. The first result was the press release with all the correct
details we met in the previous witnesses (as the different sources are called for the
reconstruction of the stemma).
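The composition of such a query can be sketched as follows. This is only an illustrative example: the helper names build_query and search_url are invented here, the keywords and the site are the translated ones given in the text above, and the only assumption about Google is the ordinary search address with its q parameter.

```python
from urllib.parse import urlencode

def build_query(keywords, site=None):
    """Join the keywords and, if requested, restrict the search to one website."""
    parts = list(keywords)
    if site:
        # The site: operator limits the results to pages of the given website.
        parts.append(f"site:{site}")
    return " ".join(parts)

def search_url(query):
    """Encode the query as the q parameter of a Google search address."""
    return "https://www.google.com/search?" + urlencode({"q": query})

query = build_query(
    ["homicide", "Aprilia", "Fiumicino", "Albanian"],
    site="www.policeofstate.it",
)
print(query)
print(search_url(query))
```

Keeping the keywords separate from the site restriction makes it easy to repeat the same search on the website of each possible source, in the same way as a philologist would check the same reading in each witness.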
2.2 Digital variants and the recensio of lost pages
There are various ways to retrieve contents from the World Wide Web26, the technology that
allows us to use a global web of hypertexts and other resources. For instance, if we are looking
for the translation of a song into a certain language, we might search a website with a list
of translations, or we could type into Google the title of the song followed by the word
translation. If instead we need to find a text written before Boccaccio's death, in
vernacular Italian rather than Latin, we will definitely use the OVI database27. In addition, if
we are looking for I Canti by Leopardi we need only write his last name on Google
Books to find the edition digitized by Google, held by the University of Michigan and
published in 1860 in Florence by Le Monnier.
But how should we cope with other texts, especially recent ones, that have disappeared from the
web because the server hosting them was destroyed, for example?
A first tool is the cached copy28: indeed webmasters often insert a specific piece of
24 Bédier suggested to identify the best manuscript (bon manuscrit, the one carrying the largest number of readings
considered to be correct) and to reproduce it, performing only the obvious, or necessary corrections. Towards Textual
Drift Modelling in Computational Philology, M. C. Passarotti, Linguistica Computazionale, 2006.
Bédier launched the first attack on Lachmann's method, followed by Quentin shortly afterwards. His criticism
of the subjectivity inherent in Lachmannian procedure, undermined once and for all the primacy of the German school,
not only as regards restitution of the text, but also, and most importantly, on a hermeneutic level. The choice to publish
the bon manuscrit implied a reduction in the critical control exercised by the editor, shifting the focus onto the author
and onto the manuscript as a historical document. The Text as Product and Process.
History, Genesis, Experiments,
Domenico Fiormonte (Università Roma Tre, Dipartimento di Italianistica, Italy), Cinzia Pusceddu (University of
Edinburgh, Division of European Languages and Cultures, UK)
25 The original Italian query is: omicidio Aprilia Fiumicino Albanese. The link of the press
27 The database of the Opera del Vocabolario Italiano contains 1780 texts in the vernacular from before 1375, the year of
Boccaccio's death, by fundamental authors of Italian literature such as Dante and Petrarca, and by many other less
famous poets, historians and merchants.
28 Google takes a snapshot of each page it examines and caches (stores) that version as a back-up. The cached version
is what Google uses to judge if a page is a good match for your query.
Practically every search result includes a Cached link. Clicking on that link takes you to the Google cached

html code in the page in order to delete the cached copy (or even to keep the page from appearing on
web search engines). However such removals occur more often on Google, probably
because most people are not interested in removing contents from other web search
engines: for example a cached copy of a web page might be deleted from Google but still
be stored by Yahoo. And even if none of the search platforms has stored it, we can still
try other tools, like the Wayback Machine: it is an online service that scans
part of the web, similarly to a web search engine, and stores snapshots of it. One
particular feature, and a difference from cached copies, is that users are shown not just one
snapshot but the different variants of the page at different times,
recording several editing stages and changes.
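Once two captures of the same page have been saved as plain text, their variants can be listed automatically. The sketch below is only an illustration under that assumption: it does not query the Wayback Machine, the function name variants_between and the two short page texts are invented, and the comparison relies on the difflib module of the Python standard library.

```python
import difflib

def variants_between(older, newer):
    """List the lines removed from and added to a page between two captures."""
    removed, added = [], []
    for line in difflib.ndiff(older.splitlines(), newer.splitlines()):
        if line.startswith("- "):
            removed.append(line[2:])   # reading present only in the older capture
        elif line.startswith("+ "):
            added.append(line[2:])     # reading present only in the newer capture
    return removed, added

# Invented page texts standing in for two captures of the same address.
older_capture = "The project\nThe plan covers three districts.\nContacts"
newer_capture = "The project\nThe plan covers two districts.\nContacts"

removed, added = variants_between(older_capture, newer_capture)
print("removed:", removed)
print("added:", added)
```

The two lists play the role of a small apparatus of variants: each pair of removed and added lines records one editing stage of the page.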
During my internship as an aspiring journalist this free online tool was really
useful, for instance when I was researching the Millennium Project for hosting the Olympic
Games in the capital of Italy. In the end Rome was not chosen for the event, and most
of the contents on the websites that presented the project were deleted. However many of
the removed contents were still useful for understanding the urban plan of the city. The
homepage was not deleted, though, and the same message (in the picture below) appears
when trying to open other pages or when typing cache: before the web addresses in the Google search bar.

There were no cached copies on other web search platforms, but luckily (or because
somebody expressly requested it from the website) the Wayback Machine holds two
captures of the web page I was interested in.

version of that web page, instead of the current version of the page. This is useful if the original page is unavailable
because of:
Internet congestion
A down, overloaded, or just slow website
The owner's recently removing the page from the Web
Sometimes you can access the cached version from a site that otherwise requires registration or a subscription.
Note: Since Google's servers are typically faster than many web servers, you can often access a page's
cached version faster than the page itself.

Another interesting tool, which allows us to record a page at a particular moment (that is, without
the dynamic changes of a usual web page), is named Freeze Page; the difference
from the previous one is that it can be used directly to freeze any URL. Moreover, speaking
of the variants of a text, there is a literary project that deserves to be mentioned: Digital
Variants29, an online archive of contemporary writers that aims to show the several editorial
stages of every writing.

2.3 Blunders and cognitive-linguistics analyses: the case of Dante's last name

Using a web search engine can be a way to research the linguistic habits of specific
writers or periods. For instance, the picture below shows a short, although
significant, onomastic search on Dante's last name.

29 Digital Variants (=DV) is a contemporary authors' digital archive founded in 1996 by Domenico Fiormonte and
Jonathan Usher at the Department of Italian of the University of Edinburgh. (...)
The aim of the project is to make available on the Internet texts of living authors at different stages of writing.
Well-known writers of different literatures, at the height of their activity, agreed to open the "kitchen" of the text,
showing us the complex writing phenomena lying under the final version of a work.
As the manuscript writing space fades away replaced by electronic processes, we face the inevitable
disappearance of variants (along with that of traditional philological methodologies and concepts). Everyday fewer
writers save the different versions of their texts, and the new writing technology implies a loss for the knowledge of the
genèse du texte. DV provides useful tools for exploring the literary writing process through the digitalization of
writers' drafts, pre-texts, and brouillons d'écriture in both text and image formats.

The query used this time on Google is: 'dante aligheri'30 plus -alighieri31
From the first results it can clearly be seen that the mistake usually comes from non-Italian
speakers, and so it is not really a matter of concern. Surprisingly, though, a lot of
Italians, including some schools named after one of the most famous writers ever, also
made this unforgivable mistake, and only a few of them have corrected it... However, for
many of them, the results are still available on Google...
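The logic of this query (an exact sequence of words that must be present, and a word that must be absent) can be reproduced on any collection of texts we already hold. The sketch below is a simplified illustration and does not claim to reflect how Google itself matches pages; the function name matches and the sample page texts are invented.

```python
import re

def matches(page_text, exact_phrase, excluded_word):
    """True if the page contains the exact phrase and never the excluded word."""
    text = page_text.lower()
    phrase = r"\b" + re.escape(exact_phrase.lower()) + r"\b"
    excluded = r"\b" + re.escape(excluded_word.lower()) + r"\b"
    has_phrase = re.search(phrase, text) is not None
    has_excluded = re.search(excluded, text) is not None
    return has_phrase and not has_excluded

# Invented page texts: only the first one uses the wrong form alone.
pages = [
    "Scuola Media Dante Aligheri, class timetable",
    "Dante Aligheri (sic, for Alighieri) was born in Florence",
    "Dante Alighieri was born in Florence",
]
for page in pages:
    print(matches(page, "dante aligheri", "alighieri"), "|", page)
```

As with the hyphen in the query, a page that contains both the wrong and the correct form is discarded, so only the witnesses that carry the error alone remain.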
2.4Googling and results in different posts or parts of web-pages
We have seen different ways to consult Google. Notwithstanding I believe that the average
of the users are not aware of these functions or search operators, and they might have
barely tried these only using the slower advanced search...
This confirms the philosophy of not using often the special functions because they might
ignore important results and limit the possibility of finding new information. On the other
hand it is a matter of fact, especially when looking for something of specific, that a user
could be disoriented by the massive amount of results.
The next example illustrates a typical situation: two different words, or concepts
expressed by sequences of words, are in the same web page, but the page is not what we were
searching for, probably because they are not in the same article, or because they are
only partially related, if at all. It is also true that sometimes we might look for different
words or expressions without a strict relation among them, or in different parts of the
Again a criminal case, and the texts related to it, provide an example that suits this
publication: according to an ongoing investigation and news reports, a criminal group of
immoral masons from the small town of Benevento was trying to organize and finance a
military coup in Africa. Somebody might think that the most important politician
of the area could also be connected, even without any criminal liability, with some of these
illegal masons. So Mastella (the name of the politician) plus masons Benevento, or
masons coup d'etat Africa Mastella, could be some of the queries used to search for
possible connections.

30Using the inverted commas we get exactly the same sequence of words, in this case the first name and the last name
(with the particular mistake) of the most important Italian poet.
31Typing the hyphen we exclude from the results all the pages where only the correct form of the last name, or both the
correct and the wrong one, were used. In the correct form there is an I after GH. In addition, the automatic correction
(which works similarly to the mechanism of synonym chains) that suggests searching for the right surname will be

The first result, obtained using a basic search without any kind of customization,
was the one in the picture, where both the masons and Mastella are mentioned. In fact they
are mentioned in the same page but not in the same article (Mastella is mentioned in a
tag of a different post, but in theory it could also be a word from an advertisement or
another element of the page): indeed, as far as the press is concerned, there is no relation
at all with the powerful and controversial politician, involved in different scandals.
A first approach to this problem, results whose terms are in the same page but are only
loosely related or not related at all, would be to improve the formula so that it clearly recognizes
when the different words are unlikely to be related, especially if they appear in different posts.
This is probably a weak point that Google's competitors could exploit to find new
ways of challenging the giant based in Mountain View, California. It would
be really hard to impose on webmasters special codes to separate the different
posts and parts of a page. It is also true that the calculations of web search engines can
already categorize and separate the different parts of a page, such as the title, the tags, videos
or other elements.
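
The difference between terms sharing a page and terms sharing a post can be sketched in a few lines of Python. This is a minimal illustration, not a description of how any search engine works: it assumes that each post is wrapped in an HTML article element (which, as noted, cannot be taken for granted), the names PostSplitter and cooccurrence are hypothetical, and the sample page is invented for the example.

```python
from html.parser import HTMLParser


class PostSplitter(HTMLParser):
    """Collect the text of each <article> element separately (assumed post marker)."""

    def __init__(self):
        super().__init__()
        self.posts = []     # text of each <article>
        self.outside = []   # text found outside any <article>
        self._depth = 0
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            if self._depth == 0:
                self._current = []
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.posts.append(" ".join(self._current))

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._depth > 0:
            self._current.append(text)
        else:
            self.outside.append(text)


def cooccurrence(html, terms):
    """Report whether all terms share the page and whether they share one post."""
    splitter = PostSplitter()
    splitter.feed(html)
    splitter.close()
    wanted = [t.lower() for t in terms]
    posts = [p.lower() for p in splitter.posts]
    page = " ".join(posts + [t.lower() for t in splitter.outside])
    # Plain substring matching: crude, but enough for the illustration.
    return {
        "same_page": all(t in page for t in wanted),
        "same_post": any(all(t in p for t in wanted) for p in posts),
    }


# Invented sample page: two posts, the second one only carries a tag.
SAMPLE = """
<html><body>
<article><h2>Masons investigated in Benevento</h2>
<p>The group allegedly planned a coup in Africa.</p></article>
<article><h2>Regional politics</h2>
<p>Tags: Mastella</p></article>
</body></html>
"""

result = cooccurrence(SAMPLE, ["masons", "Mastella"])
print(result)  # {'same_page': True, 'same_post': False}
```

A result with same_page true and same_post false corresponds to the situation described in this section: the words are found together only because two unrelated posts share one page.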
There was also a special function, no longer supported by Google: a snapshot
of the page shown as a preview, which gave an idea of which part of the page was occupied
by the words we searched, which were underlined, and of the relation between them...
Currently we can see only the symbol ... that does not tell us the actual positions of the
words (although the distance between the words is considered in the calculation), and we have to
open the page and use the specific search tool (usually pressing CTRL+F or G) and look for
In conclusion, a passage by Domenico Fiormonte is perfect for this publication, because it
remains in the literary universe while looking to the future of IT: humanists have something
more than their colleagues with a scientific degree: the world of information (meaning
researching, reading and analysing signs) is their world. Nowadays we are in a situation
similar to that of Edison and Siemens when they created two important electric power
companies. Both of them had to select the first electrical technicians from among
telegraphists, the only ones who had the right skills. An example closer to the humanities
comes from Walter Benjamin and concerns the 'recycling' of miniature painters. This kind of
figure started to evolve into that of the photographer in about 1840, at first as a hobby and
then as a full-time job, thanks to the competences acquired through their previous
occupation: not for their artistic background but for their craft skills. Both these
examples suggest that technological innovation creates hybrid professions.
Furthermore, we understand that at a specific moment in history new possibilities for
individual experimentation and creativity open up. After this phase the system becomes
more stable, and step by step these possibilities disappear: only monopolistic
equilibria, groups of companies and educational institutions establish the professional paths.
At this moment the problem is raising humanists' awareness of their potential,
and the short time we have. Humanists have to raise their voice to make clear to what
extent new technologies mark the boundaries of future professional figures. A
future where the bias of textuality is not banned at all.
In conclusion, I believe I can say that, thanks to digitization, the humanities are ending
their childhood. A golden era, spent in an elitist garden: that of literacy.
This is a problem that a book could not solve by itself. We surely cannot blame (or thank)
the structures, but it is legitimate to hope that computers will accelerate writing and reading
processes, to go further. The web is already a tool of communication and the architecture
of a more open and complex knowledge of the world of the press. Written documents
took ages to reach (and create, only in a few countries) a mass audience (we need only
think of laws and constitutions). TV did it in the span of a generation. The Internet will
do it in a couple of decades. The humanists of IT, manipulators of signs, should be
pleased with this32.

32 D. Fiormonte, 2003, pp. 248-249

