Request access

About EventDNA

On this page you can get access to the EventDNA corpus: a Dutch-language corpus comprising 1,773 news documents in which news events, entities, IPTC Media Topic codes and coreference links have been manually annotated following these guidelines.

The corpus can be used to reproduce the results reported in Colruyt, C., De Clercq, O., Desot, T. and Hoste, V. (forthcoming). EventDNA: a dataset for Dutch news event extraction as a basis for news diversification. To appear in Language Resources and Evaluation. The data for the inter-annotator agreement (IAA) study has also been made available.

The code for both the event extraction experiments and the IAA study can be found on GitHub: https://github.com/NewsDNA-LT3/.github.

You can access both datasets by filling in your credentials at the top of this page. Please note that by downloading the data you agree to the following terms and conditions:

  • The authors and their affiliated institutions make no warranties regarding the datasets provided. They cannot be held liable for providing access to the datasets or for any use of the datasets.
  • The datasets may be used only for scientific or research purposes. Any other use is explicitly prohibited.
  • The datasets must not be redistributed or shared, in part or in full, with any third party. Please refer interested parties to this page instead.
  • If you use any of the datasets, you agree to cite the associated paper.