Invited talk by Simone Ponzetto on "Get Richer or Die Tryin': Knowledge-Rich Natural Language Processing for the Semantic Web"

Posted by Véronique Hoste on Oct. 4, 2012

You are cordially invited to attend the following LT3 seminar on "Get Richer or Die Tryin': Knowledge-Rich Natural Language Processing for the Semantic Web" by Simone Ponzetto, University of Rome La Sapienza.

Date: October 23, 2012
Time: 10:30-12:00
Location: Faculty of Applied Language Studies, University College Ghent, Abdisstraat 1, 9000 Ghent, Room A004

Participation is free, but registration is recommended. Please confirm your registration by mail to

Abstract

The Web contains vast amounts of textual content that needs to be automatically semantified (i.e., fully structured and annotated with semantic information) in order to conform to the vision of a web of semantic data and enable next-generation applications like semantic search. Semantic information, furthermore, is highly intertwined with knowledge: knowledge-rich methods have been shown to achieve state-of-the-art performance on tasks that are essential for generating semantic structure, such as word sense and entity disambiguation, and, conversely, semantified data can be used to further extend existing repositories of machine-readable knowledge. In this talk I will elaborate on this vision of a synergistic approach to structured knowledge and semantic information by presenting recent work on mining knowledge from Wikipedia, integrating it with lexical resources like WordNet, and translating its concepts to produce BabelNet, a wide-coverage multilingual ontology. I will next show how knowledge from different languages can be effectively used in a joint fashion to achieve state-of-the-art performance on different lexical disambiguation tasks, in both monolingual and multilingual settings. Our results confirm the notion that Natural Language Processing applications can benefit substantially from large amounts of knowledge to achieve human-level performance on complex language processing tasks.
Nevertheless, we argue that much still remains to be done in terms of more sophisticated modeling and depth of representation for both conceptual knowledge and textual content. This talk is based on joint work with Roberto Navigli.