Abstract

The TEI community has been discussing how ontologies can be encoded with the TEI vocabulary, and there is even a SIG on the subject. With the move to P5, the TEI added several elements dedicated to the description of "things" (msDesc, person, place, event, etc.) and thereby moved beyond text encoding in the narrow sense. Over the last decade, another technological development dealing with digital representations of "things" has gained momentum: the semantic web. This paper contributes to the reflection on the relationship between the TEI and the semantic technologies proposed by the W3C. Its argument starts from the narrow sense of "text" encoding, i.e. the concept of text as a structure of linguistic symbols. The relationship between text and "things" can then be considered one of reference. The TEI offers many methods for establishing such references, e.g. the attributes of the att.canonical class, which make it possible to refer to "things" that are named individuals. References to classes or analytic concepts can be expressed with the ana, type and inst attributes, which are globally available.

The power of the W3C Semantic Web standards lies in their highly flexible basic definitions, which can nevertheless be processed effectively: abstract IRIs that symbolize concepts and "things", and assertions about the "real world" expressed as simple statements, formalized as "subject-predicate-object" triples. With RDFa, the W3C has proposed a method for integrating formal semantic statements into HTML text.
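As an illustration of the triple model just described, the following minimal Python sketch represents one statement as a subject-predicate-object triple of IRIs and serializes it as an N-Triples line. All IRIs under example.org, as well as the names Triple and to_ntriples, are hypothetical and chosen only for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement; every term is an IRI here."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(triple: Triple) -> str:
    """Serialize a triple whose three terms are all IRIs as one N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


# Hypothetical IRIs: a person, a property and a place.
statement = Triple(
    "http://example.org/person/p1",
    "http://example.org/vocab/bornIn",
    "http://example.org/place/pl1",
)
line = to_ntriples(statement)
print(line)
```

The sketch only covers IRI terms; literals and blank nodes, which RDF also allows in certain positions, are left out for brevity.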

Projects that relate the textual information in a digital edition to abstract data have shown that easy extraction of highly structured data from texts is a benefit. Theoretically, this approach can draw on Manfred Thaller's basic insights into source-oriented data processing, developed since the 1980s, and on the information theory of Börje Langefors.

The question the TEI community should therefore discuss is how TEI markup can facilitate the conversion of encoded text into semantic web statements. Is the collection of key, ref, ana, inst, link / linkGrp and arc mechanisms more effective than RDFa markup?
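To make the kind of conversion in question more concrete, here is a small sketch, not a proposed standard mapping, of how ref attributes in a TEI fragment might be turned into triples. It assumes that the enclosing element carries an xml:id that can serve as the subject; the base IRI, the "mentions" predicate, the sample fragment and the function name extract_mentions are all hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical base IRI and predicate, introduced only for this sketch.
BASE = "http://example.org/edition#"
MENTIONS = "http://example.org/vocab/mentions"

# Hypothetical TEI fragment with two references to named individuals.
SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0" xml:id="p1">
  <persName ref="http://example.org/person/p1">Anna</persName> travelled to
  <placeName ref="http://example.org/place/pl1">Graz</placeName>.
</p>"""


def extract_mentions(xml_text):
    """Return (subject, predicate, object) triples for every TEI element with @ref."""
    root = ET.fromstring(xml_text)
    subject = BASE + root.get(XML_ID)
    triples = []
    for element in root.iter():
        ref = element.get("ref")
        # Only consider elements in the TEI namespace that carry a ref attribute.
        if ref is not None and element.tag.startswith("{" + TEI_NS + "}"):
            triples.append((subject, MENTIONS, ref))
    return triples


triples = extract_mentions(SAMPLE)
for triple in triples:
    print(triple)
```

In this sketch the predicate has to be supplied by the conversion code, because a bare ref attribute only names the object of the statement. RDFa, by contrast, lets the markup itself name the property.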

At the current stage, the question is open. In my paper, I will present some features of RDFa that seem worth considering in the further development of the TEI standards for handling the relationship between text and things.

References

  • Langefors, Börje/Dahlbom, Bo: Essays on Infology. Summing up and planning for the future, Stockholm, Lund, 1995.
  • Thaller, Manfred: Data Bases vs. Critical Editions in: HSR / Historische Sozialforschung, 13/3, 129-139, 1988.
  • Thaller, Manfred: Datenbasen als Editionsformen in: Historische Edition und Computer, ed. by: Schwob, Anton/Kranich-Hofbauer, Karin, Graz, 215-241, 1989.
  • Thaller, Manfred: Gibt es eine fachspezifische Datenverarbeitung in den historischen Wissenschaften? Quellenbanktechniken in der Geschichtswissenschaft in: Geschichtswissenschaft und elektronische Datenverarbeitung, ed. by: Kaufhold, Karl Heinrich/Schneider, Jürgen, Wiesbaden, 45-83, 1988.
  • Thaller, Manfred: What is a text within the Digital Humanities, or some of them, at least? in: dh2012 – Book of Abstracts, Hamburg, 2012. http://www.dh2012.unihamburg.de/conference/programme/abstracts/beyond-embedded-markup