Automating the validation of TEI metadata processing
Encoded by Vanessa Hannesschläger
Encoded by Daniel Schopper
The Creative Commons Attribution 4.0 International (CC BY 4.0) License applies to this text.
This proposal presents a procedure for ingesting TEI metadata and validating it automatically, and describes its technical implementation as a feature of the data ingest process of the LAUDATIO repository.
LAUDATIO is an open access research data repository, built on the Fedora software framework, for the persistent storage of historical texts and their annotations. Requirements concerning the metadata structure of historical corpora are identified together with the data depositors. We therefore concentrate on the stage of the ingest process at which depositor and repository manager have already agreed on the use of a standardized metadata schema. In LAUDATIO, we use TEI P5, an established and widely used metadata format in the discipline.
The initial validation of metadata submitted by depositors is a necessary step: it ensures a minimum degree of consistency and is the starting point for all later processing steps. In most repositories, the data ingest process is form-based. We present a case in which depositors already have metadata structured in TEI, and for which we provide a graphical user interface to upload the TEI metadata and automatically validate it against standardized schemas. Validation against the schema is performed with the libxml2 library. Our proposed module replaces a form-based approach to creating structured repository metadata. Validated metadata is not necessarily flawless, however.
Automated validation relocates and strengthens the necessary cooperation between the repository manager and the depositor: the depositor receives direct feedback and can correct the metadata immediately and independently.
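
The upload-and-feedback step can be illustrated with a minimal sketch. It is not the LAUDATIO implementation, which validates against the schema with libxml2. Python's standard library offers no schema validation, so this stand-in only checks that the upload is well-formed XML and that it contains the header elements TEI P5 makes mandatory (fileDesc with titleStmt/title, publicationStmt, and sourceDesc). The function name precheck_tei_header, the assumption that the submitted file has a TEI root element, and the wording of the messages are all hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Elements that TEI P5 requires in every header, as paths relative to the
# root element. Assumption: the upload is a full document with a TEI root.
REQUIRED_PATHS = [
    "tei:teiHeader",
    "tei:teiHeader/tei:fileDesc",
    "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title",
    "tei:teiHeader/tei:fileDesc/tei:publicationStmt",
    "tei:teiHeader/tei:fileDesc/tei:sourceDesc",
]


def precheck_tei_header(xml_text):
    """Return a list of problems to show the depositor (hypothetical helper).

    An empty list means this pre-check passed. It does NOT mean the file is
    schema-valid: full validation needs a schema-aware library.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Not well-formed: nothing else can be checked.
        return [f"not well-formed XML: {err}"]

    if root.tag != f"{{{TEI_NS}}}TEI":
        return [f"root element is {root.tag!r}, expected TEI in namespace {TEI_NS}"]

    # Report every missing mandatory element, not just the first one,
    # so the depositor can fix the file in a single pass.
    return [
        f"missing required element: {path}"
        for path in REQUIRED_PATHS
        if root.find(path, NS) is None
    ]
```

Calling precheck_tei_header with the text of an uploaded file returns a list of messages that can be shown to the depositor directly; returning all problems at once, instead of stopping at the first, is what lets the depositor correct the file independently.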