The NewsReader MEANTIME corpus

The MEANTIME Corpus (the NewsReader Multilingual Event ANd TIME Corpus) consists of 480 news articles: 120 English Wikinews (http://en.wikinews.org/) articles on four topics (Airbus and Boeing; Apple Inc.; Stock market; and General Motors, Chrysler and Ford) and their translations into Spanish, Italian, and Dutch.
It has been annotated manually at multiple levels, including entities, events, temporal information, semantic roles, and intra-document and cross-document event and entity coreference.

For a more detailed description, see the video (on YouTube) or the slides.

The NewsReader MEANTIME corpus is licensed under a Creative Commons Attribution 4.0 International License.

If you use this corpus, please cite the following paper:

Anne-Lyse Minard, Manuela Speranza, Ruben Urizar, Begona Altuna, Marieke van Erp, Anneleen Schoen, and Chantal van Son. 2016. MEANTIME, the NewsReader Multilingual Event and Time Corpus. In Proceedings of LREC 2016. TO APPEAR.

Download data

Manually annotated data (version 1.0)

Raw texts (NAF format)

Shared tasks

SemEval 2015

The English section has been used as trial and evaluation data for the Task “TimeLine: Cross-Document Event Ordering” at SemEval 2015.
In this context, timelines have been created from the annotated articles.
For more information please visit the task’s website: http://alt.qcri.org/semeval2015/task4/.

CLIN26

The Dutch section of the MEANTIME corpus has been used for the CLIN26 Shared Task, the first co-located Shared Task for Dutch.
For more information please visit the task’s website: http://wordpress.let.vupr.nl/clin26/shared-task/.

Annotation Guidelines

Publications

  • Anne-Lyse Minard, Manuela Speranza, Ruben Urizar, Begona Altuna, Marieke van Erp, Anneleen Schoen, and Chantal van Son. 2016. MEANTIME, the NewsReader Multilingual Event and Time Corpus. In Proceedings of LREC 2016. TO APPEAR.
  • Manuela Speranza and Anne-Lyse Minard. 2015. Cross-language projection of multilayer semantic annotation in the NewsReader Wikinews Italian corpus (WItaC). In Proceedings of the Second Italian Conference on Computational Linguistics (CLiC-it 2015).
  • Anne-Lyse Minard, Manuela Speranza, Eneko Agirre, Itziar Aldabe, Marieke van Erp, Bernardo Magnini, German Rigau and Ruben Urizar. 2015. SemEval-2015 Task 4: TimeLine: Cross-Document Event Ordering. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). http://www.aclweb.org/anthology/S15-2132

Technical Report