Text based user comments as a signal for automatic language identification of online videos

Publication type
C1
Publication status
Published
Authors
Doğruöz, A.S., Ponomareva, N., Girgin, S., Jain, R., & Oehler, C.
Series
ICMI'17 : Proceedings of the 19th ACM International Conference on Multimodal Interaction
Pagination
374-378
Publisher
Association for Computing Machinery (ACM)
Conference
ICMI '17: International Conference on Multimodal Interaction (Glasgow, UK)

Abstract

Identifying the audio language of online videos is crucial for industrial multimedia applications. Automatic speech recognition systems can potentially detect the language of the audio. However, such systems are not available for all languages. Moreover, background noise, music, and multi-party conversations make audio language identification hard. Instead, we utilize text-based user comments as a new signal to identify the audio language of YouTube videos. First, we detect the language of the text-based comments. Augmenting this information with video metadata features, we predict the language of the videos with an accuracy of 97% on a set of publicly available videos. The subject matter discussed in this research is patent pending.
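
The two-step idea in the abstract (detect the language of each comment, then combine that with video metadata) can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not specify the classifier or the metadata features, so the function names, the single `metadata_lang` hint, and the `metadata_weight` value below are all hypothetical, and the per-comment language labels are assumed to come from some upstream language detector.

```python
from collections import Counter


def comment_language_distribution(comment_langs):
    """Convert per-comment language labels into a share-per-language dict.

    `comment_langs` is assumed to be the output of an upstream text
    language detector (not shown here), e.g. ["tr", "tr", "en"].
    """
    counts = Counter(comment_langs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lang: n / total for lang, n in counts.items()}


def predict_video_language(comment_langs, metadata_lang=None, metadata_weight=0.3):
    """Pick a video language from comment shares plus one metadata hint.

    Hypothetical scoring rule: each language scores its share of the
    comments, and the language suggested by the video metadata (if any)
    receives a fixed bonus. The source describes a learned predictor;
    this additive vote is only a stand-in for illustration.
    """
    scores = comment_language_distribution(comment_langs)
    if metadata_lang is not None:
        scores[metadata_lang] = scores.get(metadata_lang, 0.0) + metadata_weight
    if not scores:
        return None  # no comments and no metadata: nothing to predict from
    return max(scores, key=scores.get)
```

For example, `predict_video_language(["tr", "en"], metadata_lang="en")` breaks the tie between the two comment languages in favor of the metadata hint. In a real system the fixed weight would be replaced by a trained classifier over the comment-language shares and the metadata features.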