Evaluating Existing Lemmatisers on Unedited Byzantine Greek Poetry

Publication type
U
Publication status
Published
Authors
Swaelens, C., De Vos, I., & Lefever, E.
Series
Proceedings of the Ancient Language Processing Workshop
Pagination
111-116
Publisher
Recent Advances in Natural Language Processing (Varna, Bulgaria)

Abstract

This paper reports on the results of a comparative evaluation of four existing lemmatisers, all pre-trained on Ancient Greek texts, on a novel corpus of unedited Byzantine Greek texts. The aim of this study is to gain insight into the pitfalls of existing lemmatisation approaches as well as the specific challenges of our Byzantine Greek corpus, in order to develop a new lemmatiser that can cope with its peculiarities. The results of the experiment show an accuracy drop of 20% on our corpus, which is further investigated in a qualitative error analysis.