Cross search

With a growing number of editions realized with TEI Publisher, the logical next step is a search service that can run queries across multiple corpora at the same time.

Usually the problem to solve would be the great diversity of encoding across projects, even if they all use TEI as their vocabulary of choice. Even commonly represented information, like the language of the source document, can be stored in various locations in a TEI document. Lucene-based fields and facets, introduced in eXist-db 5.0, provide a mechanism to abstract away these encoding differences: we can just define, say, a language facet, and it is the role of the collection index configuration to specify where exactly to grab the data from.
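To make the idea concrete, here is a minimal Python sketch of the same abstraction. It is only an illustration: in eXist-db the mapping is declared in the collection index configuration, not written in Python, and the corpus names, rules and sample documents below are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical per-corpus rules: (path to an element, attribute holding the language).
# Each corpus stores the language in a different place, but both feed one facet.
LANGUAGE_RULES = {
    "corpus-a": (".//tei:langUsage/tei:language", "ident"),
    "corpus-b": (".//tei:text", XML_LANG),
}


def language_facet(corpus, tei_xml):
    """Return the language value for a document, using the rule of its corpus."""
    path, attribute = LANGUAGE_RULES[corpus]
    root = ET.fromstring(tei_xml)
    element = root.find(path, TEI)
    return element.get(attribute) if element is not None else None


# Two made-up documents that encode the language differently.
DOC_A = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><profileDesc>'
    '<langUsage><language ident="de"/></langUsage>'
    "</profileDesc></teiHeader><text/></TEI>"
)
DOC_B = '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader/><text xml:lang="fr"/></TEI>'

print(language_facet("corpus-a", DOC_A))
print(language_facet("corpus-b", DOC_B))
```

A query against the language facet never needs to know which rule produced the value, which is the point of moving the encoding differences into configuration.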

The next potential issue would be actually running the queries across corpora, particularly with a decentralized infrastructure in which editions are hosted on different servers. The answer here is to define an API which individual editions need to expose, so that the aggregate search engine can just poll all its registered ‘members’, regardless of their location or how they implement the search internally.

Cross-search results page

The cross-search prototype is exactly such a search engine. With a simple configuration one can register all ‘member’ editions. The only requirement for the editions themselves is that they expose the api/search/document API endpoint, which is a matter of simple customization for all TEI Publisher 7 applications, since these support Open API specifications out of the box. The api/search/document endpoint must accept a number of parameters defined in the specification. For this prototype the title, author and lang (language) fields as well as the genre, language and corpus facets were assumed.
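The polling approach can be sketched in a few lines of Python. This is not the prototype's actual code: the member registry, the base URLs, the query parameter names and the assumption that members answer with JSON are all hypothetical. Only the api/search/document path comes from the description above.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical registry of member editions; the base URLs are placeholders.
MEMBERS = {
    "edition-a": "https://edition-a.example.org/exist/apps/edition-a",
    "edition-b": "https://edition-b.example.org/exist/apps/edition-b",
}


def build_url(base_url, params):
    """Build the search URL for one member edition."""
    return f"{base_url.rstrip('/')}/api/search/document?{urlencode(params)}"


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the body as JSON (assumed response format)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def cross_search(members, params, fetch=fetch_json):
    """Send the same query to every member and collect results per member."""

    def poll(item):
        name, base_url = item
        try:
            return name, fetch(build_url(base_url, params))
        except Exception as error:  # one unreachable member must not break the search
            return name, {"error": str(error)}

    # Members are independent servers, so they can be polled concurrently.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return dict(pool.map(poll, members.items()))
```

A call such as `cross_search(MEMBERS, {"title": "faust", "lang": "de"})` would then return one entry per registered edition. Passing the `fetch` function as a parameter keeps the aggregator independent of how a member is reached, and makes it easy to test without network access.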

We are very happy to report that our prototype works really well as a proof of concept with the eclectic collection of documents from TEI Publisher demo apps, all originating with vastly different projects with diverse encoding styles. Next, we intend to extend this idea into a general portal for archives and libraries and we would welcome collaboration from such institutions.

Our sincere thanks go to the Bibliothek für Bildungsgeschichtliche Forschung des DIPF / Research Library for the History of Education at DIPF for supporting this project.

TEI Publisher 7: an example of collaborative development

2020-12-18, Wolfgang Meier

We’re happy to announce the final release of TEI Publisher 7. This is the second release coordinated by e-editiones and another important step to improve interoperability, sustainability and long-term maintenance of editions created with TEI Publisher. While release 6 came with a major redesign of the client-side parts of TEI Publisher, version 7 focusses on the server, featuring a clean and extensible API based on the Open API standard.

The plan for TEI Publisher 7 (and 6 before) was born during a meeting in Basel in November last year, when some of the Swiss edition projects using TEI Publisher came together to discuss how the software could be developed into an even more sustainable platform. Many of them became founding members of e-editiones a few months later.

Fortunately the Swiss Nationale Infrastruktur für Editionen – Infrastructure nationale pour les éditions agreed to support the outlined ideas, funding a substantial part of the work. Other member projects contributed time and money, and served as testbeds for the new versions.

TEI Publisher 7 is thus more than just another technological milestone – it is the product of a cooperative development model in which users, researchers, institutions and developers work hand in hand towards a shared goal.

We’re looking forward to continuing this success story next year and would again like to invite everybody to become part of it. Support the common cause by becoming a member, convince your institutions to take part, and contribute in one of the many possible ways.

Read the full release notes on the TEI Publisher website.
