LT3 at SemEval-2020 Task 9 : cross-lingual embeddings for sentiment analysis of Hinglish social media text

Publication type
C1
Publication status
Published
Authors
Singh, P., & Lefever, E.
Series
Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval 2020)
Pagination
1288-1293
Publisher
Association for Computational Linguistics (Barcelona, Spain)
Conference
the Fourteenth Workshop on Semantic Evaluation (SemEval 2020) (Barcelona, Spain)

Abstract

This paper describes our contribution to the SemEval-2020 Task 9 on Sentiment Analysis for
Code-mixed Social Media Text. We investigated two approaches to solve the task of Hinglish
sentiment analysis. The first approach uses cross-lingual embeddings resulting from projecting
Hinglish and pre-trained English FastText word embeddings into the same space. The second
approach incorporates pre-trained English embeddings that are incrementally retrained with a set
of Hinglish tweets. The results show that the second approach performs best, with an F1-score of
70.52% on the held-out test data.
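
The abstract does not say how the two embedding sets are projected into a shared space. As a minimal sketch of the general idea, the example below assumes an orthogonal Procrustes alignment, a common choice for cross-lingual embeddings but not necessarily the method used in the paper. The function name `learn_orthogonal_map` and the toy vectors are hypothetical; real input would be Hinglish and English FastText vectors for a seed dictionary of word pairs.

```python
import numpy as np


def learn_orthogonal_map(src, tgt):
    """Return the orthogonal matrix W minimising ||src @ W - tgt||_F.

    Assumption: orthogonal Procrustes alignment; the paper's actual
    projection method is not stated in the abstract.
    """
    # The SVD of the cross-covariance matrix gives the closed-form solution.
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


# Toy stand-ins for embeddings of seed-dictionary word pairs (hypothetical data).
rng = np.random.default_rng(0)
dim = 4
english = rng.normal(size=(6, dim))

# Simulate a Hinglish space that is a rotated copy of the English space.
true_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
hinglish = english @ true_rotation.T

# Learn the map from the Hinglish space to the English space and apply it.
W = learn_orthogonal_map(hinglish, english)
projected = hinglish @ W

print("max alignment error:", np.abs(projected - english).max())
```

Once both vocabularies share one space, a sentiment classifier trained on the aligned vectors can in principle use English and Hinglish tokens interchangeably. Constraining the map to be orthogonal preserves distances within each original space, which is one reason this kind of alignment is popular.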