Messy data, messy results

Assigning people to their affiliated organisations seems like an easy task, especially when the conference contributors provided the data on their affiliations themselves. Unfortunately, ConfTool, the software that was used to manage conference contributions and registrations, does not provide a controlled vocabulary (or automatic authority-data linking) of organisations and institutions, and the individual institutions do not enforce the use of normalized names. As a result, we found, for example, four different names for the ACDH-OeAW when trying to compile a list of contributors' affiliations.

A first consolidation of this clutter of names happened when the printed Book of Abstracts and its index were prepared (see the PDF). We used these data to prepare the institution list, which was then processed automatically (using Python) to link the contributors to their organisations. This worked quite well, with a success rate of about 75%. The remaining data and links were added manually. Still, we cannot claim that the current result is free of errors. Therefore, everyone is invited to look into the data and correct any errors they might find, ideally by editing the files data/indices/listorg.xml and data/indices/listperson.xml and making a pull request to this application's code repo on GitHub.
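
To illustrate the kind of automatic linking described above, here is a minimal sketch in Python. It is not the script that was actually used for this application. It assumes that each organisation has an identifier and a list of known name variants, and that a contributor is linked when their self-reported affiliation matches one of these variants after simple normalization. All names, identifiers and data in it (normalize, build_index, link_contributors, org_a, Person A and so on) are hypothetical.

```python
import unicodedata
from typing import Dict, List, Tuple


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    cleaned = "".join(
        ch if ch.isalnum() else " " for ch in without_accents.lower()
    )
    return " ".join(cleaned.split())


def build_index(organisations: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every normalized name variant to its organisation identifier."""
    index = {}
    for org_id, variants in organisations.items():
        for variant in variants:
            index[normalize(variant)] = org_id
    return index


def link_contributors(
    affiliations: Dict[str, str], index: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Return (linked contributors, contributors left for manual review)."""
    linked, unresolved = {}, []
    for person, raw_affiliation in affiliations.items():
        org_id = index.get(normalize(raw_affiliation))
        if org_id is None:
            unresolved.append(person)
        else:
            linked[person] = org_id
    return linked, unresolved


# Hypothetical example data, invented for this sketch.
organisations = {"org_a": ["Example University", "Univ. Example"]}
affiliations = {
    "Person A": "example  university",
    "Person B": "Univ. Example",
    "Person C": "Unknown Institute",
}

linked, unresolved = link_contributors(affiliations, build_index(organisations))
```

In this sketch, contributors whose affiliation matches no known variant end up in the unresolved list, which corresponds to the part of the data that had to be linked manually.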