Translations in library catalogs: how to find them?
Translation-related fields in catalogs based on MARC21
Although the foundational data source of our project is not a library catalog, looking at catalogs is a logical direction for this kind of research, and we hope to share findings based on these explorations.
Library catalogs contain a wealth of information, a resource that is not yet exploited enough in research. The information contained in a catalog is encoded according to certain metadata schemas, the most common of which is MARC (MAchine-Readable Cataloging), more recently MARC21. That’s what this post is about.
To the best of my knowledge, scholars who do quantitative research in translation history, or more generally in book history, rarely work directly with “raw” library records in the MARC21, UNIMARC, or PICA metadata schemas. Most of them use derivative formats made available by the library itself or by another data service provider. Some researchers have used MARC21, but they concentrated on a subset of the whole picture, and rarely applied the same critique to catalog records as they would apply to other historical sources.
Library catalogs are great sources of historical facts; however, no catalog is complete. Nor are they uniform. Even if a catalog claims to follow a particular metadata schema (such as MARC21), the schema itself changes over time (in the case of MARC21, there are schema updates every 6 months), as do cataloging customs and practices. Moreover, no catalog is an island: libraries actively exchange records, and the incoming records inevitably bear the traces of a different cataloging tradition. Sometimes the available data has been exported from another metadata schema, and because of differences between the schemas, or because of imperfect data transformation, some data elements might be missing entirely or be meaningless.
The MARC21 standard provides an explicit way to record whether a given book is a translation (the first indicator of field 041, written 041/ind1), but because of the above-mentioned problems this data element is often missing, incomplete, or incorrect. So in order to detect translations we should investigate several data elements that each reflect one aspect or another of the translation (e.g. is there a code for the original language? is the translator named? are the bibliographic data of the original work recorded?).
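To make the idea of combining several signals concrete, here is a minimal sketch in Python that reads one record in MARCXML and reports which translation-related elements are present. The mapping of the questions above to concrete fields is my assumption for illustration (041$h for the original language, relator code trl in 100$4 or 700$4 for the translator, field 765 for the original work), the sample record is invented, and the function names are hypothetical. It is not the definitive field list of this research.

```python
import xml.etree.ElementTree as ET

# Namespace of the MARCXML ("MARC21 slim") serialization.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record: 041/ind1 is blank, but other signals are present.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">fin</subfield>
    <subfield code="h">eng</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
    <subfield code="a">Hypothetical, Translator</subfield>
    <subfield code="4">trl</subfield>
  </datafield>
</record>"""


def subfields(record, tag, code):
    """Return the values of subfield `code` in all data fields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [sf.text or "" for sf in record.findall(path, NS)]


def translation_signals(record):
    """Collect independent hints that a record describes a translation.

    The choice of fields is an assumption made for this sketch.
    """
    ind1_values = [
        df.get("ind1") for df in record.findall("marc:datafield[@tag='041']", NS)
    ]
    relators = subfields(record, "100", "4") + subfields(record, "700", "4")
    return {
        # Explicit flag: first indicator of 041 set to 1.
        "041_ind1_is_1": "1" in ind1_values,
        # A language code of the original is recorded.
        "041_h_original_language": bool(subfields(record, "041", "h")),
        # A contributor carries the relator code for translator.
        "translator_relator_trl": "trl" in relators,
        # A linking entry to the original-language work exists.
        "765_original_entry": record.find("marc:datafield[@tag='765']", NS)
        is not None,
    }


signals = translation_signals(ET.fromstring(SAMPLE))
print(signals)
```

In the invented record the explicit flag is not set, yet two other signals point to a translation, which is exactly the situation that makes a single-field check unreliable. How the individual signals should be weighted is left open here.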
The following list contains the initial set of fields we should check in order to decide whether a MARC21 bibliographic record describes a translated work. This is not the full story: during the research this list will be extended as we learn more about how translated works are recorded.
Literature
The following texts give you a short introduction to the MARC21 schema and processing catalogue records in practice:
Library of Congress. 2006. [MARC 21 Bibliographic] Introduction https://www.loc.gov/marc/bibliographic/bdintro.html
Karen Coyle. 2011. MARC21 as Data: A Start. Code4Lib Journal, Issue 14, 2011-07-25. https://journal.code4lib.org/articles/5468
Jason Thomale. 2010. Interpreting MARC: Where’s the Bibliographic Data? Code4Lib Journal, Issue 11, 2010-09-21 https://journal.code4lib.org/articles/3832
As this is a novel research area, I haven’t found any papers that describe how one can extract translation information from raw MARC data. Research that uses library data usually follows some imprecise workaround: it uses either the catalog’s user interface or the library’s data extraction service. I contacted colleagues at the DNB (German National Library) for this research, and they said that the CSV export (used by Teichmann 2022) is subject to constant change, and therefore they cannot provide a mapping between MARC21 fields and the column names of the CSV. The most important papers that give a short introduction to the problems are:
Zhou, Xiaoyan, and Sanjun Sun. 2017. “Bibliography-Based Quantitative Translation History.” Perspectives 25 (1): 98–119. https://doi.org/10.1080/0907676X.2016.1177100.
Ivaska, Laura. 2020. “Identifying (Indirect) Translations and Their Source Languages in the Finnish National Bibliography Fennica: Problems and Solutions.” Mikael: Kääntämisen Ja Tulkkauksen Tutkimuksen Aikakauslehti 13 (April):75–88. https://doi.org/10.61200/mikael.129304.
Teichmann, Lisa Maria. 2022. “Mapping German fiction in translation in the German National Library catalogue (1980-2020).” Doctoral thesis, Montreal: McGill University. https://escholarship.mcgill.ca/concern/theses/0p096d03z. Chapter 2.0, “Extracting Translations from the DNB”.