Final Paper ZB

The Future of Archives. Is it Worth Saving?
Zach Baker
LIS 60652 Foundation of Recordkeeping
Final Paper
12/11/20
Introduction
Do not worry. I do not want to get rid of archives—quite the contrary. As will be
discussed, massive growth of archives will be needed in the decades and centuries to come. I do
not expect the growth that is needed will be matched, though. If recent history and the current
state of archives are any indication of their future, then we are in for troubled times. I do not
want to summarize the past, present, or future of archives too tightly. Doing so would be
ill-advised and most assuredly be wrong. What I can do is discuss what has happened and what is
happening of importance and, with those thoughts and ideas, attempt to hypothesize on the
possible future of archives in the US and even globally.
The title asks a straightforward question that is probably asked hundreds of times a day.
Is it worth saving? In the era of ‘big data,’ it seems every morsel of information one creates,
intentionally or otherwise, falls into the yes category. From what I have posted on social
media since I first began using it to what I purchased two years ago, I can find it
all. I can still visit the social media pages of people who died ten or more years ago. Rarely does anything
ever completely disappear from the internet. Just ask our current President, who wishes he had
that kind of power. But it goes even further than a simple reminder of what I posted or bought
years ago. When I use my Kroger card, it records everything I purchased and uses that
information in future advertising specific to me. What else they are doing with this massive
amount of data is buried in the user agreement and little-read shopping card contracts.
While many see this mass collection of data as problematic, some see the potential gain.
Millar states that “many archivists and digital experts suggest that with the unlimited capacity of
cloud computing systems and the tremendous potential for research into ‘big data,’ keeping
more- in theory, keeping all- is easier and potentially more fruitful than ever before” (Millar,
179). The other side, with which Millar sides, asks the most critical question: “How can
we know which bit of information is core evidence and which is just dross that clutters our hard
drives and our minds?” (Millar, 179). While Millar’s question seems like it should have an easy
answer, it takes time for that answer to surface, and time is something archivists already lack.
The argument above is never simple, though. We witness the use of big data in research
worldwide and in all subjects, and it makes meaningful impacts on that research, but this
enormous mass of data is under the tutelage of the corporations that collect it. Some close off
access while others open portions of their collection to research and study. This brings up many
possible ethical dilemmas along with privacy issues. Let us not dive into that in this short paper.
The question is eventually asked: what is an archivist to do with such a mass of data
about a single individual? Each company retains the data that each of its users creates, but everyone
is still creating much more than before, even though it may be taking up less space. Archives are
commonly seen by the populace as collections of tangible materials or records. This is changing,
but it is a slow and meticulous change that relies heavily on physical records. Online records
continue to grow every day, and with them, so do reach and access for users, but what happens
when that “save everything” attitude ends up on the doorstep of archivists? Is it all worth saving
then?
The Fonds
Millar states that keeping a collection whole is “not realistic” (Millar, 211). This is
understandable when one looks at the sheer volume of records that an individual creates. If we
factor in the wishes of the owners of the materials and the length of history of those records, it is
very difficult to keep a single collection together. I agree with Millar’s short point, but I do
understand the importance of the fonds in certain situations and foresee a future where an entire
life can be represented and presented in the records/data that was collected and created by/on an
individual throughout their life. Imagine an individual who grew up with social media and
actively participated in it throughout their whole life up to death. If everything that a person
created were to be collected and presented, then people hundreds of years from now would be
able to follow the daily rituals of many past people of the world. This is described and discussed
by Anne Gilliland, who sees this as an “opening up of new avenues for exploring how
individuals think about themselves as creators of records and manager of their memories and
documentary traces” (Gilliland, 202). Currently, everyone is their own archivist without training.
They are collecting and keeping without regard for whether it is useful or not.
Who knows where this self-reflection will take an individual? I believe it will lead to
even more creation and even more extensive collections in the future, collections that archivists
will have to sift through to find the “core evidence.” I am currently reading the second book in
a series that started with Ready Player One. In this series, an important character has passed, but
he amassed records of his life and interests that could fill the largest cloud repository. I bring this
up because the other characters in the story believed every piece to be of value. It was mentioned
that the collection was eight zettabytes. This amount is unimaginable for a collection of a single
person, but I digress. As the future unfolds, it seems that we are constantly creating more and
more records on ourselves, whether we know it or not. The idea of the fonds will slowly shrink
away as we continue to create and piece out collections to places where they can be of use. I do
not think the fonds will entirely die out. I think it will have a strong resurgence once there is
ample technology to fully archive whole digitally created collections.

The Digital Beyond
The problems with digital preservation are many, and I am sure answers will be
developed quickly. One problem that has arisen is differing stories or multiple truths. Johns
Hopkins was, by all accounts, an abolitionist until the recent discovery of an 1850 census record showing
that he was a slave owner (Schuessler, 2020). This interesting archival find does not
specifically mean that Johns Hopkins was a racist. There is some evidence to the contrary, but
the story the university backed did not include this piece of information. Secrets can be hidden
no matter how much we know of a person. Any set of records always comes with an asterisk of
authenticity. The authenticity inherent in a collection is only authentic to the person who created
it. It is their view based on their own experience. It is rarely the whole story.
This is not a new truth. However, we must be careful in our duty to preserve. The
processes that we use today are not perfect, and as the number of records grows, backlogs
grow larger, and processing becomes more difficult. Greene and Meissner highlight this fact and
offer a couple of pushes in a new direction, stressing that “we have to start doing
things differently if we hope to begin reducing backlogs and serving our patrons, resource
allocators, and donors better than we have done” (Greene & Meissner, 255). This sentiment is
understandable, as Greene and Meissner state that “our profession has been struggling with backlogs for at least
sixty years” and that by taking a “larger view” in processing collections, “we can anticipate real
progress in reducing our backlogs” (Greene & Meissner, 255). The actions truly needed are a bit
more drastic than those described by Greene & Meissner, though their recommendations
are excellent starting points. Creation is constantly growing, and inherent within this growth will
be the difficulty of processing collections when that point comes.

The question must be asked, though: what happens if action is not taken and the problem
of backlogs is compounded by ever-larger collections being
donated? Will there be a time when records simply cannot be accepted, and droves of collections
are destined to rot in family homes? The potential outcomes of the future will become a reality as
time goes on. Archival processing may need to take the step back described by Greene &
Meissner (2005) and look at the forest and not just the trees. This approach will create faster
processing but will not be as in-depth as the processing once was. It needs to be paired with open
access to researchers who can process collections at a more micro level.
This presents its own challenges. If processing is opened to researchers, there will be
no cohesive singular process. Each researcher will process records differently unless a
standardization is adopted. Of course, this adoption would be difficult to achieve worldwide,
leaving each potential organization, country, and even institution with their own standards.
Therein lies the silver lining of the new digital—a potential for global standardization along with
in-depth connectivity of all archival institutions. A hive-mind dynamic is potentially what this
problem needs for it to be solved. We have not yet attempted a universal standardization,
but as collections and digitization continue to be revolutionized and updated by new
technologies, there may be a time in the not too distant future where we see an attempt to
standardize. We may also be lucky enough to have even more help from technology than we
know now.
Artificial Intelligence
Artificial Intelligence through machine learning is at the forefront of almost every
profession. The hype involved is extremely high, but the realities are a bit lacking. However,
there is hope, and in the past few years, there have been motions in attempting to tie AI into the
archival profession. Rolan et al. (2019) sum it up succinctly, stating that “AI has arrived in our
field and it will produce profound changes in our working environments in the years to come.”
This is backed by four separate case studies of initiatives in Australia. Email management,
classification of records, retention and disposal decisions, and records management are
among the areas in which AI is being implemented to streamline processes that would typically take
many worker hours and resources to complete (Rolan et al., 2019).
These implementations were not seamless, nor were they easily done. For AI to be
applicable, it needed to be rethought with records and archives in mind. Tailoring it to this was a
big endeavor, but early results are positive. Of course, this is early work and
requires more testing and even more comprehensive data. Other areas have also been opened to
AI, including data collection and annotation. This collaboration is an attempt to curb
the problems in fairness, accountability, transparency, and ethics in AI. It is proposed by Jo &
Gebru that these problems can be weakened if not solved by the inclusion of archival data
collection methodologies. This interweaving of subjects hopes to strengthen areas where AI is
weak, such as inclusivity and consent, which are seen as strengths in the archival and library
fields (Jo & Gebru, 2019).
Another area that shows promise is that of AI and archival classification. Shang et al.
have attempted, with some success, to apply advanced machine learning algorithms to systems of
archival classification. They showed that it could be done with improvements to the accuracy, speed,
and efficiency of the classification process (Shang et al., 2019). With continued work and
application, those figures should only improve, strengthening the case for AI in
classification and saving precious worker hours to be used elsewhere.

Conclusion
The future can hold anything, but from the above examples and problems faced, serious
development needs to be applied to the whole of archives. Dipping a toe in the pool to test the
water will only prolong the struggles of understaffed and backlogged archives. The potential life
preservers that are being created in advanced technology should be welcomed and rigorously
applied. Only then will the archive make headway on the constantly growing backlogs and other
problems faced.
References
Gilliland, Anne. (2014). Conceptualizing 21st Century Archives. ALA Editions.
Greene, M., & Meissner, D. (2005). More Product, Less Process: Revamping Traditional Archival
Processing. The American Archivist, 68(2), 208–263.
https://meridian.allenpress.com/american-archivist/article/68/2/208/24011/More-Product-Less-Process-Revamping-Traditional
Jo, E. S., & Gebru, T. (2019). Lessons from Archives: Strategies for Collecting Sociocultural
Data in Machine Learning.
https://doi-org.proxy.library.kent.edu/10.1145/3351095.3372829
Millar, Laura. (2017). Archives: Principles and Practices. ALA Neal-Schuman.
Rolan, G., Humphries, G., Jeffrey, L., Samaras, E., Antsoupova, T., & Stuart, K. (2019). More
human than human? Artificial intelligence in the archive. Archives & Manuscripts, 47(2),
179–203. https://doi-org.proxy.library.kent.edu/10.1080/01576895.2018.1502088
Schuessler, Jennifer. (2020, December 9). Johns Hopkins Reveals That Its Founder Owned
Slaves. The New York Times.
https://www.nytimes.com/2020/12/09/arts/johns-hopkins-slavery-abolitionist.html
Shang, E., Liu, X., Wang, H., Rong, Y., & Liu, Y. (2019). Research on the Application of
Artificial Intelligence and Distributed Parallel Computing in Archives Classification.
2019 IEEE 4th Advanced Information Technology, Electronic and Automation Control
Conference (IAEAC), 1, 1267–1271.
https://doi-org.proxy.library.kent.edu/10.1109/IAEAC47372.2019.8997992
