Imr 659-Assignment 1
ASSIGNMENT 1:
ARTICLE SUMMARY
PREPARED BY:
UMUL AISYAH BINTI NORDIN (2019458404)
PREPARED FOR:
PUAN NIK AZLIZA BINTI NIK ARIFFIN
SUBMISSION DATE:
28th MAY 2023
ABSTRACT
The Bovary Project, conducted by Stéphane Nicolas, Thierry Paquet, and Laurent Heutte,
focuses on the digitization of cultural heritage manuscripts, specifically the Bovary manuscripts.
This article presents the methodologies and technologies employed in the project, addressing
image acquisition, text recognition, annotation, and digital dissemination. The project aims to
contribute to the field of digital humanities by providing practical approaches to effectively
digitize and preserve cultural heritage manuscripts. Through the Bovary Project, the article
showcases the potential of digital technologies in enhancing access, analysis, and exploration of
cultural heritage. The findings and experiences shared in this article serve as valuable guidance
for future endeavors in digitizing cultural heritage manuscripts, ultimately enriching our
understanding of literary works and fostering cultural appreciation.
Keywords
Digital libraries, Genetic edition, Hypermedia, Indexation, Document image analysis.
INTRODUCTION
Libraries and museums hold collections of significant interest, but because of their
value and state of preservation, these cannot be displayed to a large audience. With the
advancement of digital technology, it is now possible to exhibit this cultural heritage by using
high-quality digital replicas in place of the original documents. This allows for knowledge to
be shared while the originals are protected. Many libraries have begun the process of digitising
their collections.
The pursuit of knowledge becomes crucial to the evaluation of this cultural heritage. Such
work is expensive and demanding, so it requires prior research into the technical tools to be
used. Many digitization programmes have not addressed modern manuscripts because of their
complexity and the absence of specialised equipment, even though these manuscripts are of
keen interest for the study and interpretation of literary works. This summary discusses the
Bovary Project, a digitization initiative for modern manuscripts, particularly those of
Flaubert, and covers the guiding principles, challenges, and technological considerations of
such a project.
OBJECTIVES
1. To define the concept of a modern manuscript digitization project.
2. To identify the technical problems faced in developing the project.
DISCUSSION OF THE ARTICLE
The municipal library of Rouen has begun a project to digitise the material it holds. The goal is
to create a digitization system that can efficiently digitise acquired documents and display
them in high resolution. One objective of this programme is to digitise the manuscript folder
containing over 5,000 original manuscripts of "Madame Bovary," the well-known novel by French
author Gustave Flaubert. This group of manuscripts represents the genesis of the text: successive
drafts that show the author's writing and revising process. The eventual goal of this programme
is to offer a hypertextual edition that makes this content freely accessible and interactive on
the web. This electronic version will interest researchers, students, and anyone who wishes to
see the Flaubert manuscripts, because no critical edition of the author's complete works is
available online. This multidisciplinary initiative, known as the "Bovary Project," involves a
wide range of professionals, including librarians, literary scholars, and computer scientists.
Flaubert's drafts have an intricate structure. They include various editorial marks and a number
of non-linear text sections. Producing these drafts in an electronic format is difficult because
they are so hard to interpret.
Even with the advancement of multimedia technology and the opportunities offered by structured
languages and hypertext, few genetic editions have been published electronically so far. The
article first reviews the existing genetic editions available in electronic format (CD-ROM or
internet) and discusses their features and limitations. It then covers the most recent efforts
in manuscript digitization.
Among the various digitised literary works accessible in text or image mode on Gallica, the
website of the French National Library, are two thematic editions on the genesis of Marcel
Proust's "Le temps retrouvé" and Emile Zola's "Le rêve". These electronic publications let
readers view images of the authors' handwritten notes (black-and-white TIFF or Adobe PDF files)
together with the corresponding HTML text transcriptions. These editions offer no tools for
working on the manuscripts, but they do contain many notes explaining the history of these
works. They are pedagogical editions rather than genetic ones.
Because tools suited to working on manuscripts and to producing electronic editions from
handwritten sources are lacking, some projects in recent years have attempted to define the
needs of users of virtual libraries and to propose environments devoted to the study of
handwritten material. Most of them resulted in a workstation prototype. The article briefly
outlines projects comparable to the Bovary Project and discusses the technical solutions they
propose.
CONCLUSION
As we have seen, document analysis can aid the transcription of modern manuscripts by
enabling text-image coupling between the structured textual representation needed for
computer processing and the image representation describing the graphical aspect of the
manuscript. However, no document analysis tool currently available can handle documents that
are this complex and poorly structured. Therefore, the authors' future work will focus on
creating a reliable solution that uses machine learning techniques and takes into account the
special characteristics of such documents.
BIBLIOGRAPHY
Nicolas, S., Paquet, T., & Heutte, L. (2003). Digitizing cultural heritage manuscripts.
https://doi.org/10.1145/958220.958231
APPENDIX