Digital editions survival kit

2021-10-26, Magdalena Turska

Reconstructing an edition

Computer systems are not meant to last. On the contrary: they require regular maintenance, and we must also plan for an unavoidable cycle of major refurbishments. This paper, just presented at the virtual TEI conference, aims to demonstrate how critically important aspects of an edition can be reconstructed from a rather minimal data set, and how such a survival kit is useful not only for disaster recovery but also as a sustainable approach to maintaining scholarly publications.

The key components of the survival kit are:

  • Source documents
  • Document encoding and transformation scheme
  • Layout templates
  • Interoperable metadata mapping specification

TEI is the perfect archival format for text-centric data: human readable and easy to process. As long as we have the capacity to read text files, we can recover the information from a repository of TEI texts. TEI encoded documents with an associated ODD schema and documentation already form a solid basis for reconstruction, even if the ODD says nothing about the final form of the publication as intended by the editors.
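To make the "as long as we can read text files" point concrete, here is a minimal sketch in Python that recovers a title and the running text from a TEI file using only the standard library. The sample document, the function name `recover_basics` and the choice of elements are assumptions made for illustration; a real edition would take its element inventory from its own ODD.

```python
import xml.etree.ElementTree as ET

# The TEI namespace is the only piece of outside knowledge needed here.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical minimal TEI document, invented for this example.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Letter to a friend</title>
        <author>Jane Doe</author>
      </titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>First paragraph.</p>
      <p>Second paragraph with a <hi>highlight</hi>.</p>
    </body>
  </text>
</TEI>"""


def recover_basics(xml_text: str) -> dict:
    """Pull the title and the plain text of each paragraph out of a TEI string."""
    root = ET.fromstring(xml_text)
    title = root.findtext(
        ".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS
    )
    # itertext() flattens inline markup such as <hi> into plain text.
    paragraphs = [
        "".join(p.itertext()).strip()
        for p in root.iterfind(".//tei:body//tei:p", TEI_NS)
    ]
    return {"title": title, "paragraphs": paragraphs}


if __name__ == "__main__":
    print(recover_basics(SAMPLE_TEI))
```

Nothing here depends on a particular database or publishing platform, which is the property that makes plain TEI files a reasonable last line of defence.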

TEI source and rendition via the Processing Model

The TEI Processing Model covers part of this territory, describing how a source document should be transformed for publication. Nevertheless, in the virtual realm, the document is always accompanied by a certain context on the page: controls to zoom in or out, a facing facsimile image, or a switch between normalized and original spellings, to name just a few options. To explicitly define such a context and specify how a publication page should look and behave, we can rely on HTML5 layout templates. A modern approach to website design based on web components gives us a beautifully simple and expressive method of assembling web pages from the virtual equivalent of Lego blocks.
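The idea of describing a transformation declaratively, rather than hard-coding it, can be illustrated with a toy example in Python. This is not the TEI Processing Model itself nor TEI Publisher's implementation: the `MODEL` table, the `BEHAVIOURS` functions and the sample fragment are assumptions for illustration, loosely borrowing behaviour names such as heading, paragraph and inline.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical behaviours: each turns already-rendered content into HTML.
BEHAVIOURS = {
    "heading": lambda content: f"<h1>{content}</h1>",
    "paragraph": lambda content: f"<p>{content}</p>",
    "inline": lambda content: f"<span>{content}</span>",
}

# Hypothetical model: which behaviour applies to which TEI element.
MODEL = {"head": "heading", "p": "paragraph", "hi": "inline"}

# Invented sample fragment, kept on one logical line to avoid stray whitespace.
SAMPLE_DIV = (
    '<div xmlns="http://www.tei-c.org/ns/1.0">'
    "<head>Title</head><p>Some <hi>bold</hi> text.</p>"
    "</div>"
)


def render(node: ET.Element) -> str:
    """Render one element by looking up its behaviour in MODEL."""
    name = node.tag.split("}")[-1]  # drop the namespace part of the tag
    parts = [escape(node.text or "")]
    for child in node:
        parts.append(render(child))
        parts.append(escape(child.tail or ""))
    content = "".join(parts)
    behaviour = MODEL.get(name)
    if behaviour is None:
        return content  # no rule: pass the content through unchanged
    return BEHAVIOURS[behaviour](content)


def transform(xml_text: str) -> str:
    return render(ET.fromstring(xml_text))


if __name__ == "__main__":
    print(transform(SAMPLE_DIV))
```

The point of the design is that the editorial decisions live in the `MODEL` table, which is data and can be archived, while the rendering code can be rewritten for whatever platform comes next.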

HTML5 page layout using web components

The last missing piece is to document how abstract concepts, e.g. author or date of creation, are realized in the encoding, so that we can recover them and use them for queries within the publication as well as for data interchange with other systems. Given the richness of TEI, it is impossible to prescribe what metadata needs to be gathered and how exactly it should be encoded in any given project. On the other hand, it is rather simple to express the mapping in XML, e.g. with an index configuration syntax.

Sample index configuration with fields and facets for a TEI document
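The mapping from abstract concepts to their realization in the encoding can be sketched outside any particular database as well. The following Python example is an assumption-laden illustration, not an actual index configuration: the `FIELD_MAP` entries, the function `extract_fields` and the sample document are invented, and a real project would express the same mapping in its index configuration.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical mapping: abstract concept -> where this project encodes it.
FIELD_MAP = {
    "title": ".//tei:titleStmt/tei:title",
    "author": ".//tei:titleStmt/tei:author",
    "date": ".//tei:creation/tei:date",
}

# Invented sample document for the example.
SAMPLE_DOC = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Letter to a friend</title>
        <author>Jane Doe</author>
      </titleStmt>
    </fileDesc>
    <profileDesc>
      <creation><date>1989-11-09</date></creation>
    </profileDesc>
  </teiHeader>
  <text><body><p>Text.</p></body></text>
</TEI>"""


def extract_fields(xml_text: str, field_map: dict = FIELD_MAP) -> dict:
    """Return every value found for each mapped field."""
    root = ET.fromstring(xml_text)
    record = {}
    for field, path in field_map.items():
        record[field] = [
            "".join(node.itertext()).strip()
            for node in root.iterfind(path, NS)
        ]
    return record


if __name__ == "__main__":
    print(extract_fields(SAMPLE_DOC))
```

A different project that records its date elsewhere would change only the corresponding path in `FIELD_MAP`; queries and exports built on the field names stay the same.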

Such a set of specifications preserves all the information necessary to rebuild the edition from scratch, focusing on the intentions and decisions of the editor while filtering out the ephemeral or secondary presentation aspects. Good to put in the vault and send into space but equally useful when the time to migrate to a new infrastructure comes.

Original Dodis document view

How does it work in practice? You might want to have a closer look at one of TEI Publisher’s demo apps, When the Wall came Down, which we recreated on the basis of the TEI sources and accompanying ODD released by Dodis on the 30th anniversary of the fall of the Berlin Wall. We produced a draft version with only 165 lines of custom code, during a single day of the pre-conference workshop. Our task would have been simpler still had the web page template and index configuration also been available.

Recreated document view

Given that assembling the survival kit takes barely any extra effort, preparing it is a clear win. After all, we already have the sources and the ODD! Enriching it with a processing model is not particularly difficult, especially if we use it to generate our transformations. Similarly, in most database systems we will need to prepare the index configurations anyway. At this point we probably don’t need to mention that TEI Publisher has implemented this approach for quite a few versions (ODD with the processing model from inception, web components for the user interface since version 4, and fields and facets since version 5).

Just think about it: if you pack your edition nicely, it becomes a present which archives and libraries would very much like to keep safe in their vaults and running on their servers forever…

Annotation editor released with new TEI Publisher 7.1.0

Answering the secret dream of many TEI users, the new TEI Publisher version 7.1.0 incorporates a beautifully simple yet powerful way to enrich existing TEI documents. Just select a text passage, click a button and within seconds, without a pointy bracket in sight, mark it as one of many supported annotation types. A place or person? Sure, and with built-in connectors for external authority files, too. Critical apparatus entries? We’ve got you! Dates, corrections, regularizations and even quick fixes for typos in your transcription are covered as well.

As usual, everything is customizable and extendable, so if you want a particular kind of annotation we do not support out of the box, it’s not difficult to add your own or tinker with existing ones. Read more in the documentation.

The good news doesn’t end there: you can now use the TEI formula element with TeX notation for math. See the component’s demo page, which presents some elaborate formulae, or visit Publisher’s Demo collection, which now sports shiny new examples: Euler’s Algebra for a wee help with your quadratic equations, or The Italienische Madrigal by Alfred (not Albert!) Einstein, with musical scores encoded in MEI. The score is nicely rendered with the Verovio library through a dedicated pb-mei component, and you can even listen to the piece to cheer yourself up. You can now also set Publisher’s interface to simplified or traditional Chinese.

TEI Publisher 7.1.0 is available as an application package on top of the eXist XML Database. Install it into a recent eXist (5.0.0 or newer) by going to the dashboard and selecting TEI Publisher from the package manager.

For more information refer to the documentation or visit the homepage to play around with it.

It’s not the first time that our special thanks go to the Office of the Historian of the United States Department of State, this time for funding the major portion of the annotation editor. The math support has been kindly funded by the Bernoulli-Euler Zentrum in Basel.

Newsletter 2021/1

2021-02-02, Andreas

We’re happy to send our first newsletter covering the past year of activities and developments within e-editiones and would like to encourage institutions or individuals who are not yet members to join forces.

e-editiones as a Society

The e-editiones society was founded on 4 May 2020 by more than 20 international partners. Our institutional members are academies, archives and libraries, edition projects and their sponsors, as well as companies. Among individual members we are proud to count archivists, librarians, editors, researchers from the (digital) humanities, and software developers. After our first attempt to obtain non-profit association status from the St. Gallen tax office failed, e-editiones was recognized as a non-profit association with tax exemption, following an amendment to the statutes (5th November 2020). A huge thanks to the Karl Barth Foundation for generously sponsoring the necessary legal advice. Building on these foundations, and on the developments and events listed below, our still young society is off to a great start. We hope to broaden the community and increase our membership in 2021.

More Information

2020 Events

e-editiones organized various meetings and workshops in 2020 for members and the wider community to get to know each other and exchange ideas.

Meetings

  • 04.05.2020: Virtual foundation of e-editiones and first community event for members
  • 18.05.2020: First Public Community Event: Presentation of e-editiones by Andreas Kränzle with a short introduction to TEI Publisher for digital editions by Wolfgang Meier, discussion moderated by Joe Wicentowski Video
  • 20.10.2020: TEI «Vanilla» by Magdalena Turska Report

Workshops

  • Three-part online Course: Learn TEI Publisher by Wolfgang Meier
    • Learn TEI Publisher 1: «Stay Home Learn TEI Publisher From Scratch» , 8th Jun 2020 Video
    • Learn TEI Publisher 2, 15th Jun 2020 Video
    • Learn TEI Publisher 3, 22nd Jun 2020 Video
  • Music is in the air – MEI and TEI Publisher by Giuliano Di Bacco and Dennis Ried, 8th Jul 2020 Video
  • Common eXist-db/TEI Publisher Deployment Scenarios by Olaf Schreck and Lars Windauer, 9th Sep 2020 Video
  • Open Source Advent: Presentation of a number of open source developments, 11th – 31st Dec 2020 Link

More Information

Communication

The association immediately set up communication facilities for its members and the community. For technical questions about TEI Publisher, the central communication channel is Slack. Recurring issues and questions are incorporated into the new FAQ page. For announcements and important messages, we use Twitter. Follow us!

More Information

TEI Publisher Developments: Versions 6 and 7

In 2020 two major versions of TEI Publisher were released: versions 6 and 7. The plan for these versions was born during a meeting in Basel in November 2019, when several Swiss edition projects using TEI Publisher came together to discuss how the software could be developed into an even more sustainable platform. A number of them later became founding members of e-editiones. Their primary goal was to enable institutions to manage and maintain a larger number of editions, either on the same or several servers. To reduce maintenance costs, keeping editions up to date with newer TEI Publisher versions needed to be as simple as possible. Editions relying on different versions needed to be able to run alongside each other.

Two areas had to be addressed to achieve these goals:

  1. On the client side TEI Publisher was already a collection of small, modular web components, which together formed the user interface. These components had to be extracted into a separate library, so projects could benefit from bug fixes and new components without having to update the rest of the edition’s application.
  2. On the server side TEI Publisher needed a well defined API. Such an API would provide a standardized communication channel, through which the web components on the client could talk to the server. To accommodate future updates and support backwards compatibility, the API would need to be versioned, ensuring that both sides could speak the same language.
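The role of a versioned API can be sketched with a small dispatcher in Python. This is a simplified illustration under stated assumptions, not TEI Publisher's actual API: the class `VersionedApi`, the endpoint name `document` and the response shapes are all hypothetical.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[str], dict]


class VersionedApi:
    """Toy server-side registry keyed by (major API version, endpoint name)."""

    def __init__(self) -> None:
        self._routes: Dict[Tuple[int, str], Handler] = {}

    def register(self, major: int, name: str, handler: Handler) -> None:
        self._routes[(major, name)] = handler

    def call(self, client_major: int, name: str, arg: str) -> dict:
        # The client states which API version it speaks; the server answers
        # in that version or refuses clearly instead of guessing.
        handler = self._routes.get((client_major, name))
        if handler is None:
            raise LookupError(f"no handler for {name!r} in API v{client_major}")
        return handler(arg)


def build_demo_api() -> VersionedApi:
    """Hypothetical setup: the same endpoint in two API versions."""
    api = VersionedApi()
    api.register(1, "document", lambda doc_id: {"id": doc_id, "content": "<p>text</p>"})
    api.register(
        2,
        "document",
        lambda doc_id: {"id": doc_id, "content": "<p>text</p>", "footnotes": []},
    )
    return api


if __name__ == "__main__":
    api = build_demo_api()
    print(api.call(1, "document", "letter1"))  # an older component keeps working
    print(api.call(2, "document", "letter1"))  # a newer component gets more data
```

Keeping the old handler registered next to the new one is what lets editions built against different component versions run alongside each other on the same server.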

Fortunately the Swiss Nationale Infrastruktur für Editionen – Infrastructure nationale pour les éditions agreed to support the outlined ideas, funding a substantial part of the work.

With the release of TEI Publisher 7, both areas mentioned above have been addressed and the goal has been fully reached.

Contributions also came from member institutions: the Collection of Swiss Law Sources Online and the St. Galler Missiven were built on an early version of TEI Publisher 4. Both funded a major update of their code to version 7. A number of regressions and new bugs were encountered during the migration, and appropriate fixes were contributed back into the TEI Publisher code base.

The Karl Barth Gesamtausgabe likewise served as a test bed during TEI Publisher 7 development. Furthermore, the Karl Barth project financed the Microsoft Word to TEI transformation, which is now part of TEI Publisher.

The source code of TEI Publisher and its related repositories have been moved under the roof of e-editiones and are hosted within its GitHub organization.

Other Software Packages

Beyond TEI Publisher itself, e-editiones released a number of other software packages:

  • The Open API routing library at the heart of TEI Publisher 7, called Roaster, proved to be useful even outside the humanities and has already been adopted by a project in the medical realm.
  • Born out of the Karl Barth Gesamtausgabe, e-editiones published an extension for the Visual Studio Code editor to help editors work on TEI. This includes an entity explorer to help with the markup of entities in a text by querying external authority databases from within the XML editor.
  • The Cross Search package provides a blueprint for a portal incorporating multiple editions, supporting cross-edition search facilities. This will become the basis for future hosting services.

Community Contributions

Last but not least, we would like to point out the many contributions we received with respect to translations: TEI Publisher has now been translated into more than 20 languages!

We hope to see similar interest in other areas, e.g., the newly established FAQ website. We would like to encourage everyone to add entries and turn this into a knowledge base for all users. It already hosts questions and answers concerning encoding, workflows, or best practice recommendations.

The Future

Continuing our cooperative development model, the following new components and features are scheduled to appear in the forthcoming minor version of TEI Publisher:

  • A component for displaying mathematical formulae, sponsored by the Bernoulli project (Uni Basel).
  • A feature to automatically register persistent DOI identifiers with every resource uploaded to TEI Publisher (financed by the DIPF).

e-editiones is actively applying for funding: a first proposal has been sent to a Swiss foundation, and we’re waiting for feedback. The features we hope to obtain sponsorship for include:

  • Better navigation within an edition by allowing arbitrary identification schemes for resources.
  • Speaking and bookmarkable URLs.
  • Improved accessibility.
  • A timeline component.
  • Components to improve the editorial workflow:
    • Git integration into the user interface for synchronization.
    • An annotation feature allowing editors to perform semantic markup tasks directly on the rendered text.

If you have other features on your list, please do not hesitate to propose them. We can achieve more at lower costs if we all work together. If you need a certain feature and plan to apply for funding, why not ask other projects which may have similar requirements? e-editiones is a welcoming umbrella under which your project can come together with other projects from different academic disciplines, coordinate needs, and potentially convince funding agencies to look beyond the perspective of a single project.
