
Klein and D'Ignazio

The process of converting qualitative experience into data can be empowering, and even has
the potential to be healing

When thoughtfully collected, quantitative data can be empowering too. HOW?

● Questioning classification systems
● Rethinking binaries in data visualization
● Refusing data, recovering data
● Counting as healing and as accountability

Acts of counting and classification, especially as they relate to minoritized groups, must always
balance harms and benefits. When data are collected about real people and their lives, risks
ranging from exposure to violence are always present. But when deliberately considered, and
when consent is obtained, counting can contribute to efforts to increase valuable and desired
visibility.

1800s - disparities introduced into datasets had to do with much larger and much more profound
asymmetries of power.

The asymmetries are often directly reflected in the power dynamics between who is doing the
counting and who is being counted.

But when a community is counting for itself, about itself, there is the potential that data collection
can be not only empowering but also healing.

Rethink Binaries and Hierarchies

- Counting and classification can be powerful parts of the process of creating knowledge.
But they’re also tools of power in themselves.
- An intersectional feminist approach to counting insists that we examine and, if
necessary, rethink the assumptions and beliefs behind our classification infrastructure,
as well as consistently probe who is doing the counting and whose interests are served.
- Counting and measuring do not always have to be tools of oppression. We can also use
them to hold power accountable, to reclaim overlooked histories, and to build collectivity
and solidarity.

The Ancient World in Nineteenth-Century Fiction; or, Correlating Theme, Geography, and
Sentiment in the Nineteenth Century Literary Imagination

Macroanalysis

First was the problem of identification. How could we use computers to accurately identify
place-names in novels?

To tackle the first problem of identification, we scrapped the gazetteer in favor of Named Entity
Recognition (NER). NER is a Natural Language Processing (NLP) tool that identifies places
using a trained statistical model that is sensitive to semantic and syntactic information in the
text.

Second was the problem of ambiguity; place names are ambiguous. Charlotte, for example, is
used only as a first name in our corpus and never as a city.
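To see why a plain gazetteer lookup runs into both problems, here is a minimal sketch. The tiny gazetteer, the sample sentence, and the function name are all made up for illustration; this is not Jockers's code. A list lookup has no sense of context, so it flags "Charlotte" as a place even when the name belongs to a character.

```python
import re

# Hypothetical mini-gazetteer (assumption); a real one lists many thousands of names.
GAZETTEER = {"London", "Paris", "Charlotte", "Dublin"}

def gazetteer_matches(text):
    """Return every capitalized word that also appears in the gazetteer."""
    tokens = re.findall(r"[A-Z][a-z]+", text)
    return [t for t in tokens if t in GAZETTEER]

# Invented sentence in which Charlotte is a person, not a city.
sentence = "Charlotte left London for Paris in the spring."
matches = gazetteer_matches(sentence)
print(matches)  # "Charlotte" is reported alongside the two real place names
```

A trained NER model looks at the surrounding words and syntax instead of matching against a fixed list, which is why it can do better on cases like this one.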

- Jockers devised a way of muting the problem of place ambiguity using a technique called topic
modeling.

The topic modeling process has a way of identifying and measuring these textual features as
they exist within their individual plates and within the “buffet” as a whole.

books = plates that you fill up at a buffet full of topics
topics = places

It turns out that a similar approach can be used to effectively identify and disambiguate the
place names in the corpus. This topic modeling method works because when a writer sets a
book in a particular place that writer tends to mention other places that are geographically
related.
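The intuition that geographically related places get mentioned together can be sketched with simple co-occurrence counts. This is a simplified stand-in, not topic modeling itself, and the four toy "books" and the function names are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy "books" (invented data), each reduced to the candidate place names it mentions.
books = [
    ["Dublin", "Cork", "Galway"],
    ["Dublin", "Cork", "Limerick"],
    ["London", "Bath", "York"],
    ["London", "York", "Cork"],
]

def cooccurrence_counts(docs):
    """Count how many books mention each pair of names together."""
    counts = Counter()
    for names in docs:
        for a, b in combinations(sorted(set(names)), 2):
            counts[(a, b)] += 1
    return counts

def companions(name, counts):
    """Names that share a book with `name`, most frequent first."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == name:
            scores[b] += n
        elif b == name:
            scores[a] += n
    return scores.most_common()

counts = cooccurrence_counts(books)
print(companions("Cork", counts))
```

In this toy data "Cork" appears most often alongside "Dublin", which hints that the Irish city is meant. A topic model picks up the same kind of signal across the whole corpus at once.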
● How does Jockers use correlation?
Jockers uses correlation to create clusters of collocated place names.
- He shows that topic modeling can be used as an effective way to identify collocated place
names and thereby aid in place-name disambiguation by giving us a general sense of
the regions being talked about in the corpus.
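One way to picture correlation grouping place names into clusters is the sketch below. The notes do not say exactly how Jockers computes it, so treat this as an assumption: it takes Pearson correlation over per-book mention counts, and both the counts and the function name are invented.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Invented mention counts for three place names across five toy books.
mentions = {
    "Dublin": [9, 7, 0, 1, 0],
    "Cork": [6, 5, 0, 0, 1],
    "London": [0, 1, 8, 9, 7],
}

r_irish = pearson(mentions["Dublin"], mentions["Cork"])
r_mixed = pearson(mentions["Dublin"], mentions["London"])
print(round(r_irish, 2), round(r_mixed, 2))
```

Names whose counts rise and fall together across books (Dublin and Cork here) score close to 1 and would land in the same cluster, while names that rarely share a book (Dublin and London here) come out negative.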

● What data does Jockers use?
a corpus of 3500 novels
