Newsletter 2021/1

2021-02-02, Andreas

We’re happy to send our first newsletter covering the past year of activities and developments within e-editiones and would like to encourage institutions or individuals who are not yet members to join forces.

e-editiones as a Society

The e-editiones society was founded on 4 May 2020 by more than 20 international partners. Our institutional members are academies, archives and libraries, edition projects and their sponsors, as well as companies. Among individual members we are proud to count archivists, librarians, editors, researchers from the (digital) humanities, and software developers. After our first attempt to obtain non-profit association status from the St. Gallen tax office failed, e-editiones was recognized as a non-profit association with tax exemption following an amendment to the statutes (5th November 2020). A huge thanks to the Karl Barth Foundation for generously sponsoring the necessary legal advice. Building on these foundations, and on the developments and events listed below, our still young society is off to a great start. We hope to broaden the community and increase our membership in 2021.

More Information

2020 Events

e-editiones organized various meetings and workshops in 2020 for members and the wider community to get to know each other and exchange ideas.

Meetings

  • 04.05.2020: Virtual foundation of e-editiones and first community event for members
  • 18.05.2020: First Public Community Event: Presentation of e-editiones by Andreas Kränzle with a short introduction to TEI Publisher for digital editions by Wolfgang Meier, discussion moderated by Joe Wicentowski Video
  • 20.10.2020: TEI «Vanilla» by Magdalena Turska Report

Workshops

  • Three-part online Course: Learn TEI Publisher by Wolfgang Meier
    • Learn TEI Publisher 1: «Stay Home Learn TEI Publisher From Scratch», 8th Jun 2020 Video
    • Learn TEI Publisher 2, 15th Jun 2020 Video
    • Learn TEI Publisher 3, 22nd Jun 2020 Video
  • Music is in the air – MEI and TEI Publisher by Giuliano Di Bacco and Dennis Ried, 8th Jul 2020 Video
  • Common eXist-db/TEI Publisher Deployment Scenarios by Olaf Schreck and Lars Windauer, covering different ways to deploy eXist-db and TEI Publisher, 9th Sep 2020 Video
  • Open Source Advent: Presentation of a number of open source developments, 11th – 31st Dec 2020 Link

More Information

Communication

The association immediately set up communication facilities for its members and the community. For technical questions about TEI Publisher, the central communication channel is Slack. Recurring issues and questions are incorporated into the new FAQ page. For announcements and important messages, we use Twitter. Follow us!

More Information

TEI Publisher Developments: Versions 6 and 7

In 2020 two major versions of TEI Publisher were released: versions 6 and 7. The plan for these versions was born during a meeting in Basel in November 2019, when several Swiss edition projects using TEI Publisher came together to discuss how the software could be developed into an even more sustainable platform. A number of them later became founding members of e-editiones. Their primary goal was to enable institutions to manage and maintain a larger number of editions, either on the same or several servers. To reduce maintenance costs, keeping editions up to date with newer TEI Publisher versions needed to be as simple as possible. Editions relying on different versions needed to be able to run alongside each other.

Two areas had to be addressed to achieve these goals:

  1. On the client side TEI Publisher was already a collection of small, modular web components, which together formed the user interface. These components had to be extracted into a separate library, so projects could benefit from bug fixes and new components without having to update the rest of the edition’s application.
  2. On the server side TEI Publisher needed a well defined API. Such an API would provide a standardized communication channel, through which the web components on the client could talk to the server. To accommodate future updates and support backwards compatibility, the API would need to be versioned, ensuring that both sides could speak the same language.

Fortunately the Swiss Nationale Infrastruktur für Editionen – Infrastructure nationale pour les éditions agreed to support the outlined ideas, funding a substantial part of the work.

With the release of TEI Publisher 7, both areas mentioned above have been addressed and the goal has been fully reached.

Contributions also came from member institutions: the Collection of Swiss Law Sources Online and the St. Galler Missiven were built on an early version of TEI Publisher 4. Both funded a major update of their code to version 7. A number of regressions and new bugs were encountered during the migration, and appropriate fixes were contributed back into the TEI Publisher code base.

The Karl Barth Gesamtausgabe likewise served as a test bed during TEI Publisher 7 development. Furthermore, the Karl Barth project financed the Microsoft Word to TEI transformation, which is now part of TEI Publisher.

The source code of TEI Publisher and its related repositories have been moved under the roof of e-editiones and are hosted within its GitHub organization.

Other Software Packages

Beyond TEI Publisher itself, e-editiones released a number of other software packages:

  • The Open API routing library at the heart of TEI Publisher 7, called Roaster, proved to be useful even outside the humanities and has already been adopted by a project in the medical realm.
  • Born out of the Karl Barth Gesamtausgabe, e-editiones published an extension for the Visual Studio Code editor to help editors work on TEI. This includes an entity explorer to help with the markup of entities in a text by querying external authority databases from within the XML editor.
  • The Cross Search package provides a blueprint for a portal incorporating multiple editions, supporting cross-edition search facilities. This will become the basis for future hosting services.

Community Contributions

Last but not least, we would like to point out the many contributions we received with respect to translations: TEI Publisher has now been translated into more than 20 languages!

We hope to see similar interest in other areas, e.g., the newly established FAQ website. We would like to encourage everyone to add entries and turn this into a knowledge base for all users. It already hosts questions and answers concerning encoding, workflows, or best practice recommendations.

The Future

Continuing our cooperative development model, the following new components and features are scheduled to appear in the forthcoming minor version of TEI Publisher:

  • A component for displaying mathematical formulae, sponsored by the Bernoulli project (Uni Basel).
  • A feature to automatically register persistent DOI identifiers with every resource uploaded to TEI Publisher (financed by the DIPF).

e-editiones is actively applying for funding: a first proposal has been sent to a Swiss foundation, and we’re waiting for feedback. The features we hope to obtain sponsorship for include:

  • Better navigation within an edition by allowing arbitrary identification schemes for resources.
  • Human-readable and bookmarkable URLs.
  • Improved accessibility.
  • A timeline component.
  • Components to improve the editorial workflow:
    • Git integration into the user interface for synchronization.
    • An annotation feature allowing editors to perform semantic markup tasks directly on the rendered text.

If you have other features on your list, please do not hesitate to propose them. We can achieve more at lower costs if we all work together. If you need a certain feature and plan to apply for funding, why not ask other projects which may have similar requirements? e-editiones is a welcoming umbrella under which your project can come together with other projects from different academic disciplines, coordinate needs, and potentially convince funding agencies to look beyond the perspective of a single project.

Roaster: an Open API Router for eXist

We’re happy to announce that the request routing library, which was originally developed for TEI Publisher 7, has been released as a separate package with extended functionality. The library, called roaster, is generic and can be used for any eXist-based project. It implements the Open API 3.0 standard to support well-documented, versioned and formally specified APIs.

Background

In previous versions of TEI Publisher, clients (i.e. your web browser) would communicate with the server by directly calling a variety of XQuery scripts. The server-side API, if you can even call it one, was thus scattered over many different files. Finding your way through the code, figuring out which parameters are expected or what the response should look like was rather difficult. Overriding the default behaviour – e.g. to replace the generated table of contents – required substantial coding skills. The scripts also changed between TEI Publisher versions. So, if you were working on a standalone edition generated from TEI Publisher, updates could be tricky.

With TEI Publisher 7, the entire server-side API can be viewed on a single documentation page. It clearly describes the URL paths you can use, as well as any parameter you can pass in and the type it should conform to. One can also see the different possible responses and what kind of content they would return.

The new API

Looking at the first route of the documents section in the API (see screenshot below), it is easy to construct a URL which returns the source XML: the path template to use is /api/document/{id} and {id} should contain the path to a document – relative to the data root of TEI Publisher.

Documents API screenshot

So to retrieve the TEI/XML for Graves’ letter, located in the file path test/graves6.xml, we can use the following URL:

https://teipublisher.com/exist/apps/tei-publisher/api/document/test%2Fgraves6.xml

Note that the / in the path needs to be URL-encoded as %2F, so that the whole document path is passed as a single {id} segment. This is a requirement of the Open API specification.

If instead of the TEI/XML we would like to see the letter rendered to HTML, we can use the third route in the list and simply add /html to the end of the URL:

https://teipublisher.com/exist/apps/tei-publisher/api/document/test%2Fgraves6.xml/html

or if we prefer a PDF:

https://teipublisher.com/exist/apps/tei-publisher/api/document/test%2Fgraves6.xml/pdf
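
The three URLs above follow one pattern, which a script can build for any document. The following is a minimal Python sketch that assumes only the base URL and the routes shown above; the function name `document_url` is made up for this illustration.

```python
from urllib.parse import quote

# Base URL of the public TEI Publisher instance used in the examples above.
BASE = "https://teipublisher.com/exist/apps/tei-publisher"


def document_url(path, output=None):
    """Build the API URL for a document path relative to the data root.

    `output` is None for the source XML, or "html" / "pdf" for a rendering.
    """
    # safe="" makes quote() encode "/" as %2F as well, so the whole
    # document path travels as a single {id} segment.
    url = f"{BASE}/api/document/{quote(path, safe='')}"
    return url if output is None else f"{url}/{output}"


print(document_url("test/graves6.xml"))
print(document_url("test/graves6.xml", "html"))
print(document_url("test/graves6.xml", "pdf"))
```

Passing `safe=""` matters here: by default `quote()` leaves `/` untouched, which would produce a path the router cannot match against the template.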

Of course, as an ordinary user, you don’t need to know any of this: when you use the web interface of TEI Publisher, the web components on the page take care of constructing and calling the above URLs for you. But if you are a developer, having a well-defined API is a game changer. Just imagine that you want to support your co-workers with a script which allows them to preview a local TEI document as HTML on the fly: sending the content of the document with an HTTP POST request to /api/preview is all you need! Our Visual Studio Code plugin does exactly this.
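
Such a preview script could look roughly like the following Python sketch. Only the POST request to /api/preview comes from the description above. The local base URL, the `application/xml` content type, the assumption that the response body is the rendered HTML, and the function names are guesses made for illustration, so please check the API documentation page of your own instance before relying on them.

```python
from urllib.request import Request, urlopen

# Assumed address of a local TEI Publisher installation; adjust as needed.
BASE = "http://localhost:8080/exist/apps/tei-publisher"


def build_preview_request(tei_source, base=BASE):
    """Prepare (but do not send) a POST request carrying a TEI document."""
    return Request(
        f"{base}/api/preview",
        data=tei_source.encode("utf-8"),
        # Assumed content type; not confirmed by the text above.
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def preview(tei_source, base=BASE):
    """Send the document and return the response body as text (assumed HTML)."""
    with urlopen(build_preview_request(tei_source, base)) as response:
        return response.read().decode("utf-8")
```

With a running instance, `preview(open("letter.xml", encoding="utf-8").read())` would then return the rendering of the hypothetical local file `letter.xml`.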

You can use any scripting or programming language you like, say Bash, Python, Perl – you name it. And because Open API is a widely used standard, there are plenty of tools for documentation, testing or code generation. The API documentation page in TEI Publisher is generated by such a tool (Swagger UI).

Implementation

roaster is essentially an implementation of the Open API standard in pure XQuery. It reads the formal API specification (in JSON format) and determines which route to take for each incoming HTTP request. It also checks whether the parameters, headers or request bodies passed in comply with the rules given in the definition of the route. An error is generated if the request does not comply with the definition, e.g. because a required parameter is missing or has the wrong type. It can also fill in default values for parameters, enforce correct content types for the response, etc.

From a developer’s perspective, this means you don’t have to worry about parameters. Your handler function will receive a single parameter containing all the necessary information about the request, and you can safely assume that it complies with the spec you provided.
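
To make the idea more tangible, here is a toy Python sketch of what a spec-driven router does conceptually: match the path against the templates, check and convert parameters, and fill in defaults. It is not roaster's code (roaster is written in XQuery) and does not mirror its API. The spec fragment is heavily simplified, and the `page` parameter as well as the operation names are hypothetical.

```python
import re
from urllib.parse import unquote

# Hypothetical, heavily simplified fragment in the spirit of an Open API spec.
SPEC = {
    "/api/document/{id}": {
        "get": {"operationId": "get_document", "parameters": []},
    },
    "/api/document/{id}/html": {
        "get": {
            "operationId": "get_html",
            "parameters": [
                {"name": "page", "required": False,
                 "schema": {"type": "integer", "default": 1}},
            ],
        },
    },
}

CASTS = {"string": str, "integer": int}


def template_to_regex(template):
    # Each {name} placeholder matches exactly one path segment, which is
    # why a "/" inside the document path has to arrive encoded as %2F.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{pattern}$")


def resolve(method, path, query):
    """Return the operation and validated parameters for a request."""
    for template, operations in SPEC.items():
        match = template_to_regex(template).match(path)
        operation = operations.get(method.lower())
        if match is None or operation is None:
            continue
        # Path parameters are decoded before they reach the handler.
        params = {k: unquote(v) for k, v in match.groupdict().items()}
        for definition in operation["parameters"]:
            name, schema = definition["name"], definition["schema"]
            if name in query:
                try:
                    params[name] = CASTS[schema["type"]](query[name])
                except ValueError:
                    raise ValueError(f"{name!r} must be of type {schema['type']}")
            elif definition.get("required"):
                raise ValueError(f"missing required parameter {name!r}")
            elif "default" in schema:
                params[name] = schema["default"]
        return {"operationId": operation["operationId"], "parameters": params}
    raise LookupError(f"no route for {method} {path}")


print(resolve("GET", "/api/document/test%2Fgraves6.xml/html", {}))
```

The dictionary returned by `resolve` plays the role of the single, already validated parameter a handler function receives: the handler never sees a request that violates the spec.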

If you are interested in the details, please refer to the README. For a TEI Publisher-related example, check out the FAQ article, which describes how to replace the default table of contents with a custom one.

Roaster 1.0.0 can be installed into your local eXist via the package manager in the dashboard. TEI Publisher 7 shipped with a slightly older version, 0.5.1, but you can run both versions side by side.

Cross search

With a growing number of editions realized with TEI Publisher, a logical next step is a search service which can run queries across multiple corpora at the same time.

The first problem to solve is the great diversity of encoding across projects, even if they all use TEI as their vocabulary of choice. Even commonly represented information, like the language of the source document, can be stored in various locations in a TEI document. Lucene-based fields and facets, introduced in eXist-db 5.0, provide a mechanism to abstract away these encoding differences: we can just define, say, a language facet, and each collection’s index configuration specifies where exactly the data is taken from.
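
The principle can be illustrated outside eXist-db with a few lines of Python. This is only a conceptual sketch, not how the Lucene index is configured: the two sample documents, the corpus names and the rule table are invented, and the per-corpus rules merely stand in for the role the index configuration plays.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Two invented documents recording the same fact in different places.
DOC_A = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><profileDesc><langUsage>
    <language ident="de"/>
  </langUsage></profileDesc></teiHeader>
  <text><body><p>...</p></body></text>
</TEI>"""

DOC_B = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader/>
  <text xml:lang="la"><body><p>...</p></body></text>
</TEI>"""

# One rule per corpus says where the language lives in that corpus.
LANGUAGE_RULES = {
    "corpus-a": lambda root: root.find(".//tei:langUsage/tei:language", NS).get("ident"),
    "corpus-b": lambda root: root.find("tei:text", NS).get(XML_LANG),
}


def language_facet(corpus, xml_source):
    """Return the value of the shared 'language' facet for one document."""
    return LANGUAGE_RULES[corpus](ET.fromstring(xml_source))


print(language_facet("corpus-a", DOC_A))
print(language_facet("corpus-b", DOC_B))
```

A search across both corpora only ever asks for the `language` facet and never needs to know which rule produced the value.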

The next potential issue is actually running the queries across corpora, particularly in a decentralized infrastructure where editions are hosted on different servers. The answer here is to define an API which individual editions need to expose, so that the aggregate search engine can just poll all its registered ‘members’, regardless of their location or how they implement the search internally.

Cross-search results page

The cross-search prototype is exactly such a search engine. A simple configuration registers all ‘member’ editions. The only requirement for the editions themselves is that they expose the api/search/document API endpoint, which is a simple customization for TEI Publisher 7 applications, as these support Open API specifications out of the box. The api/search/document endpoint must accept a number of parameters defined in the specification. For this prototype, the title, author and lang(uage) fields as well as the genre, language and corpus facets were assumed.
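
The polling step can be sketched in Python as follows. This is not the prototype's code: the member URLs are placeholders, the query parameter names `query` and `field` are assumptions rather than the names from the actual specification, and the shape of a hit (a dictionary per result) is invented for the example. Only the api/search/document endpoint is taken from the text above.

```python
from urllib.parse import urlencode

# Hypothetical registry of member editions; the URLs are placeholders.
MEMBERS = {
    "edition-a": "https://edition-a.example.org/exist/apps/edition-a",
    "edition-b": "https://edition-b.example.org/exist/apps/edition-b",
}


def search_url(base_url, query, field="title"):
    # "query" and "field" are assumed parameter names.
    params = urlencode({"query": query, "field": field})
    return f"{base_url}/api/search/document?{params}"


def cross_search(query, fetch, members=MEMBERS, field="title"):
    """Poll every member and tag each hit with the edition it came from.

    `fetch` is any callable that takes a URL and returns a list of hit
    dictionaries; injecting it keeps the sketch independent of a
    particular HTTP client and response format.
    """
    results, failed = [], []
    for name, base_url in members.items():
        try:
            hits = fetch(search_url(base_url, query, field))
        except OSError:
            # An unreachable member should not break the whole search.
            failed.append(name)
            continue
        results.extend({"edition": name, **hit} for hit in hits)
    return results, failed
```

Because every member exposes the same endpoint, the aggregator needs nothing but a base URL per edition; how each edition answers the query internally stays hidden behind the API.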

We are very happy to report that our prototype works well as a proof of concept with the eclectic collection of documents from the TEI Publisher demo apps, which originate from vastly different projects with diverse encoding styles. Next, we intend to extend this idea into a general portal for archives and libraries, and we would welcome collaboration from such institutions.

Our sincere thanks go to the Bibliothek für Bildungsgeschichtliche Forschung des DIPF / Research Library for the History of Education at DIPF for supporting this project.