In Praise of Humanities Data
“In Praise of Humanities Data” is my song in praise of humanities data, of primary sources and their digital surrogates. “Humanities” and “data” are two terms that sit uneasily beside one another, because in the humanities we deliberately and on purpose do not practice the scientific method (unless we do). What is “the humanities method”? There isn’t one. We reserve the right to change our method with our mood, as befits a human studying humans in a human way.
1
If there is a humanities method, it could conceivably consist of this: a person, alone, reading. Not conducting experiments or studies: just reading. And then, writing. Philosophy, history, and the study of any language’s literature all see this as their archetypal method, I think, though not archaeology (which, yes, was classified as belonging to the humanities by no less than the National Endowment for the Humanities in 1965, the year of its founding). That is why it is a truism and cliché to say that the library is the humanities laboratory. And what we read and how and how much and how quickly is changing, changing utterly.

We live in an age where answers are as easy to come by as parking spaces. It wasn’t always so.
2
On Saturday, November 17, 1860, the masthead of the semi-scholarly London periodical Notes and Queries described the journal, as it had for the last eleven years, as “A Medium of Inter-Communication for Literary Men, Artists, Antiquaries, Genealogists, Etc.”
3
Below the masthead were advertisements of recently published or soon-to-be-published books, such as Carthage and Its Remains.
4
An ad for the London Library boasted of 80,000 volumes, a reading room “furnished with the principal Periodicals, English, French, German,” and a catalogue that could be purchased for only nine shillings and sixpence. “This EXTENSIVE LENDING LIBRARY, the only one of its kind in London,” was open from 10 to 6.
5
As usual, the journal was an active bulletin board of questions, answers, and miscellaneous contributions to knowledge, such as the correction to Forster’s Lives of Eminent Statesmen concerning the mistaken identity of one Lord Wentworth.
6
As for queries, “A Constant Reader and Subscriber” asked for “an authentic account of Sawney Bene, the Scotch cannibal,”
7
while X. Y. wondered if anyone could tell him who wrote the 1830 tragedy called “Wismar,” and Saxon asked, “Can you, or one of your correspondents, inform me by whom the term ‘God's Acre,’ as applied to a churchyard, was first used in English literature? It appears in the writings of Longfellow, who seems to have adopted it from the German; but I have some doubts whether it had not been previously used by one of our early writers — George Herbert for instance.”
8
Most of these queries are easily answered today, by means other than this clever one of asking people. There’s a Wikipedia page for Sawney Bean, of course, which avers that he probably never existed (although the legend “is part of the Edinburgh tourism industry”).
9
The Oxford English Dictionary suggests that George Herbert never used the term “God’s Acre,” though it had appeared a few times at least in the 17th century,
10
and a Google Book Search generally confirms this, although the term also turns up in the 1828 Harvard Register -- a possible source for New Englander Longfellow -- as well as in a few other interesting sources.
11
The question of who wrote the 1830 drama “Wismar: A Tragedy” is a bit harder, however, and it may forever remain unanswered.
12
Notes and Queries is still published today, by Oxford Journals, but it has changed, as you can see by the table of contents.
13
14
15
16
17
18
19
20
Notes, notes, notes, notes . . . and, finally, a reader’s query.
21
The notes in themselves have changed somewhat, as well: they read more stiffly to me, they sound more professional, more academic, less personal, even considering the more formal Victorian diction in which an 1860 author explained that a mistake he had made took place in “a time of great domestic anxiety.” I sometimes wonder whether Victorian humanists are staring longingly down at us from heaven, longing, just longing, to get their hands on our research tools. Of course, then, as now, not all humanities questions – perhaps not even most – could be answered with data, information, facts, research, and then, as now, there are scrupulous researchers and not-so-scrupulous researchers, which makes a big difference no matter what tools you have at your disposal.
22
The Victorian translator, critic, poet, librarian, and honorary M.A. Edmund Gosse, for instance, was a notoriously bad researcher. In the fall of 1886, in what Gosse’s biographer Ann Thwaite calls “the central episode of Edmund Gosse’s literary career,” the critic John Churton Collins attacked Gosse’s literary history From Shakespeare to Pope for its unscholarliness (277). “We have even refrained from discussing matters of opinion,” wrote Collins in a widely-read Quarterly Review piece. “We have confined ourselves entirely to matters of fact–to gross and palpable blunders, to unfounded and reckless assertions, to such absurdities in criticism and such vices of style as will in the eyes of discerning readers carry with them their own condemnation” (qtd. in Thwaite 282). Gosse was just about to take up a faculty position as Clark Lecturer at Cambridge when the denunciation appeared.

In the comments of Gosse’s biographer on the episode, we get a portrait of another kind of Victorian researcher. Thwaite writes, “There is no question that Collins was a fanatic and a pedant. Later in life he would search the registers of forty-two Norwich churches, trying to pin down the elusive birth-date of Robert Greene for an edition he was editing. But, as far as Gosse’s book was concerned, Collins happened to be right… From Shakespeare to Pope is full of extraordinary mistakes” (278). Gosse’s career did survive the blow–he took up his position as scheduled–but his reputation as a scholar was never the same. During the scandal, Henry James remarked in a letter that Gosse “has [emphasis original] a genius for inaccuracy which makes it difficult to dress his wounds” (qtd. in Thwaite 339).
23
Gosse’s research ineptitude or carelessness, however, probably contributed to the existence of some great poetry, for instance Dylan Thomas’s “Do not go gentle into that good night.” How, you ask?
24
“Do not go gentle into that good night” is a villanelle, a 19-line 6-stanza alternating-refrain poetic form with only two rhymes that, with a lot of help from Gosse, had for over a century the reputation of being an ancient French poetic form. In 1877, Gosse published an article in the Cornhill Magazine titled “A Plea for Certain Exotic Forms of Verse,” in which he explained the rules of six ancient (or “ancient”) French forms and gave examples, writing them himself when necessary. In the article, Gosse reprinted a 16th-century poem titled “J’ay perdu ma Tourterelle” by the French poet and professor of Latin Eloquence Jean Passerat, little realizing that his “example” was in fact the only early poem in that form. Gosse did write (in a slightly puzzled tone), “I do not find that much has been recorded of [the villanelle's] history, but it dates back at least as far as the fifteenth century” (64). Admittedly, at this point Gosse had done little worse than rely on a mistaken source, Théodore de Banville’s Petit traité de poésie of 1872, but he was later to repeat his error, with less excuse.
25
In 1879, two years after “A Plea for Certain Exotic Forms of Verse,” a Parisian bibliophile and poet named Joseph Boulmier published a book of villanelles in French all modeled after Passerat’s poem. It is likely that Boulmier owned a copy of the 1606 work in which Passerat’s “J’ay perdu ma Tourterelle” first appeared; he certainly seems to have been the first nineteenth-century admirer of the villanelle to consult it. But Boulmier the book collector did more than consult that single volume, a volume Gosse couldn’t have gotten hold of: he searched through everything he had, and came to the correct conclusion:
26
“One fine day, after having spoken successively of the rondeau, of the triolet, of the ballade, of the lai, of the virelai, of the chant royal, the author of I no longer know which treatise on versification, bungled to hell like they almost always are, finally tackled the villanelle, having the idea, or perhaps the luck, to cite as a model of this last genre–and after all he wasn’t wrong–a certain naïve masterpiece escaped, God knows how, from the pen of the scholar Passerat…. The turtledove of Passerat once launched into circulation, what happened to it? All the treatises on versification that succeeded one another and copied one another in single file, accompanying this or that grammar, this or that rhyming dictionary, did not fail to drag it back on the scene, and especially to present it as a type from which it was absolutely forbidden to depart…. Well, I say it without fear: you can, as I have done myself, page through all the essays on versification from the fifteenth and sixteenth century, one after another; you will not find there the least trace of Passerat’s turtledove, which is to say nothing that resembles this lovely form.”
27
In his entry on the villanelle for the monumental 1911 Encyclopaedia Britannica, Gosse (by then supposedly a wise elder) hardly retreated from the assertions he had made over thirty years earlier. Citing Boulmier, Gosse conceded that there were no schematic double-refrain villanelles before Passerat, yet (like Boulmier himself) he did not conclude that it was he and his contemporaries who were responsible for defining the modern form of the villanelle in the nineteenth century:

“VILLANELLE, a form of verse, originally loose in construction, but since the 16th century bound in exact limits of an arbitrary kind. . . . It appears, indeed, to have been by an accident that the special and rigorously defined form of the villanelle was invented. In the posthumous poems of Jean Passerat (1534-1602), which were printed in 1606, several villanelles were discovered, in different forms. One of these became, and has remained, so deservedly popular, that it has given its exact character to the subsequent history of the villanelle.”

Gosse’s plea, you see, had been successful, and because of his influence there had been a small villanelle vogue in England among the Parnassians at the end of the nineteenth century. James Joyce, eighteen years old in 1900, played along, and later reprinted a piece of his poetic juvenilia in 1914’s Portrait of the Artist as a Young Man. From there, and helped along by poetry handbooks quoting one another in single file, the villanelle became entrenched in English poetry with a reputation as an ancient French form, leading not only to “Do not go gentle into that good night” but also to Elizabeth Bishop’s “One Art” (recited in the Cameron Diaz flick In Her Shoes – “the art of losing isn’t hard to master”).
28
When I first conducted the research on the villanelle that led to this tale of good (but ignored) and bungled (but influential) research, I took it upon myself to find the text or texts that caused Banville (Gosse’s 1872 source) to believe that the villanelle was an ancient French form. Banville had begun writing villanelles himself in 1845, so I began to search for any and all French poetry handbooks and anthologies published between 1606 and 1845, with special attention to early nineteenth-century works that Banville would probably have had to hand. I worried that I might have to search for works in other languages, as well, but it was surely best to begin with works in French. My chief resource in compiling the list of titles was WorldCat, which I regularly plied with various "poe*" strings.
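For readers who want to try that kind of catalogue query on a list of titles of their own, here is a minimal Python sketch. It assumes a hand-made list of (title, year) records rather than WorldCat itself, whose interface is not shown here; the helper names `fold` and `matches` and the third record are hypothetical. It applies a "poe*" wildcard to each word of a title, ignoring accents, and keeps only works dated 1606 to 1845.

```python
import unicodedata
from fnmatch import fnmatch


def fold(text):
    """Lowercase and strip accents so that 'poésie' can match 'poe*'."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(title, pattern):
    """True if any word in the title matches the wildcard pattern."""
    return any(fnmatch(word, pattern) for word in fold(title).split())


# Illustrative records only. The first two titles and years are as given
# in the talk; the third is a made-up placeholder, not a real book.
records = [
    ("Petit traité de poésie", 1872),
    ("Prosodie", 1844),
    ("Hypothetical manuel de poétique", 1820),
]

# Keep works in the 1606-1845 window whose titles contain a "poe*" word.
hits = [
    (title, year)
    for title, year in records
    if 1606 <= year <= 1845 and matches(title, "poe*")
]
print(hits)
```

In this toy list a one-word title such as Prosodie is not caught by a "poe*" pattern at all, which is one reason to try several different strings.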
From my carrel in the stacks of Alderman Library at the University of Virginia, I began to make forays into the stacks from which I would return with armfuls of books that I would then page through, just like Boulmier (how much had changed since 1879?), looking for mentions of the villanelle form or of Passerat or of "J'ay perdu ma Tourterelle," and looking for other poetry books to gather or to order from Interlibrary Loan. Whenever I visited the shelves, I would also scan the proximate volumes and, more often than not, scoop them up to take back to my carrel -- often, I'm sorry to say, without checking them out.

I remember that it was a week or so into this process that I discovered a 1986 Slatkine reprint of an 1844 work by an author named Wilhelm Ténint. Standing at the shelf, I paged through until I found an entry that both cited Passerat and claimed that the villanelle was an old fixed form. Siegel also mentioned that Banville himself had made marginal notes on the manuscript of Ténint’s Prosodie.

Remember, now, the year was 2003. I had not only the well-stocked stacks of an excellent research library at my disposal, but also Google, and also the WorldCat database. Google -- the regular search engine, mind you, not Google Books -- gave me a few particularly good leads at other points in my research. After the Ténint discovery, I continued to look for other mentions of the villanelle form in early 19th-century French texts, but I found very little, almost nothing.

Flash forward five years, only five years, and imagine me now, if you will, engaged in co-writing a new entry on the villanelle for the forthcoming revised edition of the Princeton Encyclopedia of Poetry and Poetics, edited by Stephen Cushman, my dissertation advisor. This, obviously, was our big chance to correct the record about the villanelle in the gold standard of poetry handbooks. And so I revisited my search for mentions of the villanelle and of "J'ay perdu ma Tourterelle" between 1606 and 1845, and this time I used Google Books.
29
30
Using Google Book Search, I found 38 texts that might have influenced Ténint, 38 more sources in that trail of textual transmission, more evidence of what was known and thought about the villanelle in that fragile time when a mistake that would engrave itself in the record for more than a century was just beginning to flap its delicate butterfly wings. I didn't find anything that directly contradicted my claim that the Ténint work can be considered the chief entry point of the villanelle error, but what I did find were numerous texts that smoothed its way. To satisfy my conscience, I included in the PEPP entry two of the more popular dictionaries and encyclopedias that Google Book Search turned up for me.

I’m not sure I can convey properly through this somewhat procedural narrative the thrill I felt at finding the Ténint source by sifting through dozens of books with my bare hands, and the dismay I felt at finding (just a little too late) thirty-eight additional sources by sifting through millions of books with Google Book Search.
31
That “data dismay” is something researchers have always felt, of course. Witness Virginia Woolf’s description of a trip to the Library of the British Museum, feeling as though she would “need claws of steel and beak of brass even to penetrate the husk” of all her data, as though she were some kind of steampunk clockwork woodpecker.
32
One of the chief aims of digital humanists since the 90s has always been simply to get more stuff online, preferably in a scholarly way. We’re just now, especially but not exclusively with text, beginning to say, Okay, we’ve put a lot of stuff online. Our primary sources, our data, are now digital. Google has put a lot online, Twitter has put a lot online, humanity has put a lot online. Now what do we do with it? In 2009, the National Endowment for the Humanities’s Office of Digital Humanities put that question to researchers, but almost as a dare. What can you do with all that data, they asked. Show us. The Digging into the Enlightenment project, for instance, will look at 53,000 18th-century letters.
33
The Digging into Data project is only part of a larger trend, sometimes called “distant reading,” in a term taken from Franco Moretti’s Graphs, Maps, Trees, shown here on the social reading site GoodReads.
34
Examples of distant reading include some of the work done with text mining, analysis, and visualization tools such as the MONK project, described in the 2008 article “How Not to Read a Million Books.”
35
Tanya Clement’s work with Gertrude Stein’s The Making of Americans is interesting not only for its conclusion, which is that the text has a decided structure and pattern that is not apparent to a human reader, but for its stated premise: that the work is unreadable by humans. (A neutral observation, not an aesthetic judgment.)
36
But “distant reading” need not, perhaps, entail statistics and machines. Speaking with a journalist at the New York Times about his book How to Talk About Books You Haven’t Read, Pierre Bayard described some very human and qualitative and incomplete and yet still valuable modes of distant reading:
37
“We are taught only one way of reading,” he said. “Students are told to read the book, then to fill out a form detailing everything they have read. It’s a linear approach that serves to enshrine books. People now come up to me to describe the cultural wounds they suffered at school. ‘You have to read all of Proust.’ They were traumatized.”

“They see culture as a huge wall, as a terrifying specter of ‘knowledge,’” he went on. “But we intellectuals, who are avid readers, know there are many ways of reading a book. You can skim it, you can start and not finish it, you can look at the index. You learn to live with a book.”
38
I think perhaps that large sets of humanities data, like books, can be read in the way Bayard describes: not comprehensively, but by living with them. Their sheer size suggests but need not entail statistical analysis and visual display. We can browse very large humanities datasets, skim them, live with them, instead of reading them in a linear fashion with computers. After all, how likely is it that the database itself is comprehensive? Isn’t it very likely itself simply a sample? The Reading Experience Database, for instance, itself a compelling example of a very large humanities dataset, admits quite charmingly that it can never be comprehensive:
39
“While RED may never be the comprehensive database that would allow us to make rigorously statistical arguments for reading habits in given places or time periods, it can function as a source of compelling examples. The more entries that go in it, the more it can approach the ideal, but it can never hope to be a comprehensive database of every archive, every annotated page, every diary manuscript, in the British World, 1450-1945, much and all as we may want it to!”
40
Like our datasets, our methods need not be comprehensive. Lately I’ve been interested in the possibilities of what I think of as “datum love”: the selection (random, serendipitous, affectionate) of compelling examples. In the Reading Experience Database, for instance, some idling through the byways turns up the interesting fact that Dickens was once at least read by “a revolutionary Russian rag merchant.” Isn’t that tidbit a spur to further inquiry?
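The "random" part of that selection is easy to sketch. The Python below is a minimal illustration, assuming the dataset is already a list held in memory; `pick_examples`, the placeholder records, and the seed value are all hypothetical, and nothing here reflects how the Reading Experience Database itself is queried.

```python
import random


def pick_examples(entries, k=3, seed=None):
    """Return up to k entries chosen at random, each position used once."""
    rng = random.Random(seed)  # passing a seed makes the picks repeatable
    return rng.sample(entries, min(k, len(entries)))


# Placeholder records standing in for a real dataset (not real data).
entries = [f"record {n}" for n in range(1, 1001)]

picked = pick_examples(entries, k=3, seed=1860)
for entry in picked:
    print(entry)
```

Leaving `seed` as `None` gives a different handful each time, which suits idle browsing; fixing it lets someone else retrace the same handful later.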
In my experience, faced with the “Fordist, functionalist” imperative to write that Kathleen mentioned yesterday, humanities scholars of any rank generally begin with a text, a topic, a theory, or a text and a topic and a theory, and we proceed on the assumption that we must produce an original interpretation or argument. What I wonder is whether instead we can begin with the data, or with a datum, and simply watch for what it may tell us, even if what it tells us is simply a story. What I hope is that all our data will bring forth a new age of humanistic induction, induction that can but need not necessarily rely on statistics and visualizations.
41
And what I hope, too, is that more compilers of databases will recognize that they are at least as well-fitted as anyone to tell us what the data can tell us. Archivists and librarians, especially, know the data, because they feed and groom it. Tim Sherratt, for instance, is an archivist, historian, and programmer in Australia who has recently begun a project called Invisible Australians. When he worked for the National Archives of Australia, Sherratt noticed that there were a great many print records that could be converted into structured data.
42
One such type of print record is the “Certificate Exempting from Dictation Test,” or CEDT. The CEDT was a bureaucratic outgrowth of the White Australia Policy, which restricted non-white immigration to Australia from 1901 to 1973: it was a form that enabled existing non-white residents of Australia to leave and re-enter the country without being mistaken for immigrants. Over 50,000 paper CEDTs reside in the National Archives of Australia, and these forms have a great deal to tell history about some of the people on the margins of history. Obviously, they could tell more if their data were digital: enter the Invisible Australians project.
43
Choosing a CEDT subject surely almost at random, Sherratt narrates some of the life of Charlie Allen, a half-Chinese man:

“Charlie was born in Sydney in 1896. His mother was Frances Allen (sometime sweet shop owner and brothel keeper), his father Charlie Gum (a buyer for Wing On company). Charlie was raised by his mother, but in 1909, at the age of thirteen, he was taken to China by his father. His father returned to Sydney, leaving Charlie in China. He lived with relatives in the town of Shekki (inland from Hong Kong) for six years. Charlie was homesick, but had no means of getting back to Australia. His mother attempted to enlist government help but to no avail. Charlie finally returned in 1915. The following year he enlisted in First AIF (well, actually he enlisted three times, and was discharged as medically unfit each time). Charlie married in Sydney in 1917 and had two daughters soon after. He returned to China in 1922 for seven months. Charlie Allen died in 1938 as the result of an industrial accident. He was forty-one.”
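To make "structured data" concrete, the life events in the passage above can be written down as records. This is only a sketch: the `Event` class and its field names are hypothetical and are not the Invisible Australians schema; the years and descriptions are taken from the quoted narrative. Because only the birth year is given, the code reports the two possible ages for each event.

```python
from dataclasses import dataclass


@dataclass
class Event:
    year: int
    what: str


BIRTH_YEAR = 1896  # stated in the quoted narrative

# Hypothetical structure; the facts are those in the quoted passage.
events = [
    Event(1909, "taken to China by his father"),
    Event(1915, "returned to Australia"),
    Event(1916, "enlisted in the First AIF"),
    Event(1917, "married in Sydney"),
    Event(1922, "returned to China for seven months"),
    Event(1938, "died as the result of an industrial accident"),
]


def age_range(year, birth_year=BIRTH_YEAR):
    """Possible ages in a given year when only the birth year is known."""
    return (year - birth_year - 1, year - birth_year)


# Print the events in chronological order, as a small timeline.
for event in sorted(events, key=lambda e: e.year):
    low, high = age_range(event.year)
    print(f"{event.year}: {event.what} (aged {low} or {high})")
```

For 1909 this gives 12 or 13, and for 1938 it gives 41 or 42, both consistent with the ages stated in the passage. The output is still a story, only one that can now be sorted, filtered, and set beside other lives.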
To my mind, Sherratt is nearly the ideal digital humanist, not only because he is a builder of databases, but because his instinct, once he has built a database, is to use it to tell stories. Few or no graphs, maps, and trees for him.
44
If you would become a digital humanist, then, what I would encourage you to do is to go and find compelling examples. Browse. Play. Observe. Induce. Roll around in the data. Then tell me which pieces stuck to you. I for one will be fascinated.
45