Introduction to Digital Text Analysis

Els Lefever and Veronique Hoste

Target group
Bachelor in Applied Linguistics

About Introduction to Digital Text Analysis

This course provides an overview of the most important digital techniques used across the research fields of the humanities, with a focus on digital text analysis.

The first part introduces the broader research domain of digital humanities, in which digital text analysis occupies a central place. It gives a theoretical overview of the main digital techniques and methods for automatic text analysis, starting with the collection, digitization and enrichment of resources and with automatic search in large corpora or text collections. This is followed by an overview of methods for natural language understanding, ranging from the shallow lexical level to more complex semantic levels. We also discuss problems inherent to processing natural language, such as ambiguity and the reliance on world knowledge. In addition to this theoretical foundation, considerable attention is paid to critical reflection on the selection and use of resources for data-based research, the building and application of technology, the impact of machine learning systems and AI applications on society, and the importance of understanding the underlying algorithms in order to apply these tools correctly and interpret their output.

In the second part of the course, various use cases from the broader research domain of the digital humanities are discussed, again with a focus on digital text analysis. For a number of specific research domains (e.g., automatic text analysis for classical languages, the use of computational methods for the exploration and analysis of large text collections or contemporary digital media), an overview of state-of-the-art research and current trends is given and one or more representative research projects are presented.

During the guided self-study, students read scientific articles to prepare specific course components and complete modules of a digital learning path.