Detecting grammatical errors in machine translation output using dependency parsing and treebank querying

Publication type
A2
Publication status
Published
Authors
Tezcan, A., Hoste, V., & Macken, L.
Journal
BALTIC JOURNAL OF MODERN COMPUTING
Volume
4
Issue
2
Pagination
203-217

Abstract

Despite recent advances in the field of machine translation (MT), MT systems cannot guarantee that the sentences they produce will be fluent and coherent in both syntax and semantics. Detecting and highlighting errors in machine-translated sentences can help post-editors focus on the erroneous fragments that need to be corrected. This paper presents two methods for detecting grammatical errors in Dutch machine-translated text, using dependency parsing and treebank querying. We test our approach on the output of a statistical and a rule-based MT system for English-Dutch and evaluate its performance at the sentence level and the word level. The results show that our method can detect grammatical errors with high accuracy at the sentence level in both types of MT output.