Video release explaining NewsReader’s Reading Machine: https://youtu.be/rYLaVN3oqLI

NewsReader releases the Storyteller demo

 

 


NewsReader has released a new way of representing events on timelines to approximate news stories. Try out the demo here!

 

 

 

NewsReader workshop/hackathon announcement on VU Faculty of Humanities website (Dutch)

See the announcement on the VU Faculty of Humanities website (Dutch).

NewsReader at European Data Forum

Come check out the NewsReader stand at European Data Forum today and tomorrow.

At the stand you can see our demos, pick up a brochure, find out more about our upcoming events and grab a bag of our limited edition NewsReader wine gums!


Workshop and Hackathon November 2015: Car Wars – Industrial Heroes Going Down Fighting

On 24 and 25 November 2015, we will showcase the NewsReader project and invite you to come explore our technology and its results yourself during our NewsReader Workshop and Hackathon. 

Our dataset covers 12 years of news charting the struggle of automotive players to rule the global market and to satisfy the expectations of their shareholders, and their suffering from the financial crisis and new economies: industrial heroes going down!

The Workshop

Tuesday 24 November 2015, 14:00 – 18:00 Amsterdam Public Library 

In this workshop, we will bring together start-ups, companies, researchers and developers to present and discuss the NewsReader project, the technological domains it draws from and future applications for these technologies.

This afternoon will feature invited talks, demos, a panel discussion and a networking reception.

Confirmed Speakers:

  • Prof. dr. Frank van Harmelen, Vrije Universiteit Amsterdam. Frank van Harmelen is a professor in Knowledge Representation & Reasoning in the AI department (Faculty of Science) at the Vrije Universiteit Amsterdam. After studying mathematics and computer science in Amsterdam, he moved to the Department of AI in Edinburgh, where he was awarded a PhD in 1989 for his research on meta-level reasoning.
  • Bernardo Magnini, FBK. Bernardo Magnini is a senior researcher at FBK, where he is the scientific coordinator of the Cognitive Computing research line. His interests are in the field of Natural Language Processing, particularly lexical semantics, question answering and textual entailment. He launched EVALITA, the evaluation campaign for both NLP and speech tools for Italian, and co-chaired CLIC-it 2014, the first Italian conference on Computational Linguistics. He currently serves as President of the Italian Association for Computational Linguistics.
  • Sybren Kooistra, De Volkskrant and Yournalism. Sybren Kooistra is a data journalist at De Volkskrant and co-founder and editor-in-chief of Yournalism, a platform for investigative journalism. In 2008, he went to the United States as an aide to Obama’s presidential campaign. In 2013 he won a prize at an international press innovation contest for the news website of the future. He studied sociology, political science and social geography at Radboud University Nijmegen.

The Hackathon 

Wednesday 25 November 2015, 10:00 – 18:00 Amsterdam Public Library 

In June 2014 and January 2015 we ran several hackathons in both London and Amsterdam, in which NewsReader enabled the attendees to pull out networks of interactions between entrepreneurs, politicians and companies, and to thoroughly test-drive our technology. This November, we’re releasing a new version of our processing pipeline and scaling up to 10 million processed news articles from sources about the automotive industry to obtain a searchable database of the news. At the hackathon, you can play with this dataset and explore the processing pipeline.

The global automotive industry is worth on the order of $1 trillion annually. The industry comprises a massive network of suppliers, manufacturers, advertisers, marketeers and journalists. Each of these players has their own story, often with unexpected origins or endings; one day you may be CEO of a big car company, the next you are out and making pizzas. With NewsReader, you can uncover these stories to reconstruct the past.

This event may be of interest to you if:

  • You’re interested in natural language processing and/or semantic web technology;
  • You’re a data journalist on an automotive desk;
  • You’re an analyst sifting daily news looking for information on your company or on competitors;
  • You’re a data analyst looking to understand how your customers operate their supply chain;
  • You’re an analyst trying to find secondary events that could influence an investment decision;
  • You’re interested in visualising big data.

Attendance is free but please register by 22 November 17:00 CET. 

NewsReader at ISWC

With Semantic Web technology being a huge part of NewsReader, it is no wonder that we will showcase some of our technology at the 14th International Semantic Web Conference (ISWC) next week in Bethlehem, Pennsylvania.

Here’s a roundup of the sessions in which NewsReader is involved.

Sunday 11 October

The third NLP&DBpedia workshop (location: RBC 91): This workshop combines the two main themes in NewsReader, namely Natural Language Processing and the Semantic Web. NewsReader team member Marieke van Erp is a co-organiser of this workshop and will present the position paper “Missing Mr. Brown and buying an Abraham Lincoln – Dark Entities and DBpedia” (Marieke van Erp, Filip Ilievski, Marco Rospocher and Piek Vossen).

Monday 12 October

Filip Ilievski will present the paper “LOTUS: Linked Open Text UnleaShed” (Filip Ilievski, Wouter Beek, Marieke van Erp, Laurens Rietveld and Stefan Schlobach) at the Sixth Consuming Linked Data Workshop. 12:00 – 12:20 in room RBC 91.

In the afternoon, Marieke van Erp will be co-organising the Linked Science workshop, which is on the use of linked data for publishing, sharing and interlinking scientific resources, data and complete experiments. 14:00 – 17:30 in room RBC 241.

Tuesday 13 October

Marco Rospocher will present two demos at the Poster and Demos session (18:30 – 21:00, location: Vision Bar and Bucks).

 

 

KnowledgeStore Demonstration Video (2015 Version)

A new demonstration video of the KnowledgeStore “in action”, with voice comments, has been released.

You can access it directly here or from the Demos section.

London Hackathon

Which cars crash most? Which automobile companies had to recall their cars over the last ten years? And how does current news relate to news from the financial and automobile industry domain over the same period?

Hacking at RICS in London

On Friday 30 January 2015, three teams tried to answer these questions during the NewsReader Hack Day in London. At the foot of Big Ben (to be precise, at the Royal Institution of Chartered Surveyors), participants explored NewsReader’s analyses of 1.3 million articles about the automobile industry from the last decade. Despite the complexity of the data, the crash investigation team managed to analyse which cars may be the most dangerous around.* The recall team revealed how we can make our data even more useful. Finally, the NewsReader word cloud team built a neat little visualisation that lets users select input from current news and produces a word cloud of the most related terms in the NewsReader data.

Crash team’s winning car

Overall, the day produced a nice variety of applications for the NewsReader data and gave the NewsReader team new insights into how our data may be used and improved. The outcome of both hackathons is nicely illustrated by the words of Pim Stouten, Strategy Director for LexisNexis: “I’m positively excited to see two years of research and development coming to life, with NewsReader being used for real cases, and with real data.”

*We will keep the outcome to ourselves to avoid frightening you (and to avoid lawsuits).

Amsterdam Hackathon Recap

On January 21st, the second NewsReader hackathon and the first part of our Y2 user evaluation took place at the Amsterdam Public Library. Around 30 participants from research groups, companies and public institutions, and even some students, came to the library’s 6th floor for the hackathon. The participants formed teams of varying sizes, resulting in 8 presentations at the end of the day. The NewsReader team was super happy to see so many different ideas come out of this, such as in-depth analyses of the age of CEOs when they are hired or fired, integration with annotation tools to provide enrichments from the NewsReader dataset, and recommender systems. These are ideas we hadn’t thought of ourselves, and we think they could lead to interesting applications for the project.

We know that the NewsReader dataset is quite complex, due to its size and the many different layers of information embedded in it. Fortunately, the hackathon participants were up for the challenge, dug in and got some cool visualisations and analyses out. Along the way, the NewsReader team got lots of feedback on how to improve the API, and several bugs were reported (it’s still research). We are now analysing our query log (100,000 queries were fired during the day, resulting in a 371MB log) and prepping for our second hackathon of this year, which takes place in London this Friday.

Check out some content from hackathon participants:

NewsReader Hackathon – blog post by Jaap Blom (Netherlands Institute for Sound and Vision)

NewsReader Hackathon – blog post by Paul Groth (Elsevier)

Selfdriving cars and sentiment analysis – presentation by Anca Dumitrache and her team members at the hackathon

 


NewsReader: the developer’s story

Our role at ScraperWiki is in providing mechanisms to enable developers to exploit the NewsReader technology, and to feed news into the system. As part of this work we have developed a simple REST API which gives access to the KnowledgeStore, the system which underpins NewsReader. The native query language of the KnowledgeStore is SPARQL – the query language of the semantic web. The Simple API provides a set of predefined queries which are easier for end users to work with than raw SPARQL, and help us as service managers by providing a predictable set of optimised queries. If you want to know more technical detail then we’ve written a paper about it (here).
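To make the idea of predefined queries concrete, here is a minimal Python sketch of how a named query might be expanded into SPARQL. The query name, its parameters and the graph pattern are all assumptions invented for illustration; they are not the actual Simple API definitions or the KnowledgeStore schema.

```python
from string import Template

# Hypothetical predefined query: the name, parameters and graph pattern are
# illustrative assumptions, not the real Simple API or KnowledgeStore schema.
PREDEFINED_QUERIES = {
    "events_mentioning_label": Template(
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?event WHERE {\n"
        "  ?event ?role ?actor .\n"
        '  ?actor rdfs:label "$label" .\n'
        "} LIMIT $limit"
    ),
}


def build_sparql(query_name: str, label: str, limit: int = 10) -> str:
    """Fill a predefined SPARQL template with checked user parameters."""
    template = PREDEFINED_QUERIES[query_name]
    # Reject characters that could break out of the quoted SPARQL literal.
    if any(ch in label for ch in '"\\\n\r'):
        raise ValueError("label contains characters not allowed in this sketch")
    return template.substitute(label=label, limit=int(limit))


query = build_sparql("events_mentioning_label", "Sepp Blatter", limit=5)
print(query)
```

Because users only choose a query name and fill in a few parameters, the service only ever has to run query shapes it already knows, which is what makes the workload predictable and easier to optimise.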

The Simple API has seen live action at a Hack Day on World Cup news, which we held in London in the summer. Attendees were able to develop a range of applications which probed violence, money and corruption in the realm of the World Cup. I blogged about our previous Hack Day here and here. The Simple API and the Hack Day helped us shake out some bugs and add features which will make it even better next time.

“Next time” is another Hack Day, to be held in Amsterdam on 21st January 2015 and in London on 30th January 2015. This time we have processed 6,000,000 articles relating to the car industry over the period 2005-2014. The motor industry is a trillion-dollar-a-year business, so we can anticipate finding lots of valuable information in this hoard.

From our previous experience the three things that NewsReader excels at are:

  1. Finding networks of interactions, identifying important players. For the World Cup Hack Day we at ScraperWiki were handicapped slightly by having no interest in football! But the NewsReader technology enabled us to quickly identify that “Sepp Blatter”, “Jack Warner” and “Mohammed bin Hammam” were important in world football. This is illustrated in this slightly cryptic visualisation made using Gephi (image: beckham_and_blatter);
  2. Finding events of a particular type. The NewsReader technology carries out semantic role labeling: taking sentences and identifying what type of event is described in each sentence and what roles the participants took. This information is then aggregated and exposed using semantic web technology. In the World Cup Hack Day, participants used this functionality to identify events involving violence, bribery, gambling, and other financial transactions;
  3. Establishing timelines. In the World Cup data we could track the events involving “Mohammed bin Hammam” through time, along with the types of event he was involved in. This enabled us to quickly navigate to pertinent news articles (image: Timeline).
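The three capabilities listed above can be illustrated with a small Python sketch over toy data. The records, actor names and event types below are invented placeholders, and the functions are a simplification written for this example; they do not reflect NewsReader's actual data model or processing.

```python
from collections import Counter
from datetime import date
from itertools import combinations

# Invented toy records standing in for extracted events:
# (date, event type, participants). None of this is real NewsReader output.
EVENTS = [
    (date(2010, 5, 1), "payment", ["Actor A", "Actor B"]),
    (date(2010, 7, 9), "meeting", ["Actor A", "Actor C"]),
    (date(2011, 2, 3), "payment", ["Actor A", "Actor B", "Actor C"]),
    (date(2011, 6, 20), "dismissal", ["Actor D"]),
]


def interaction_counts(events):
    """Count co-participations per actor as a crude importance score."""
    degree = Counter()
    for _, _, actors in events:
        for a, b in combinations(sorted(set(actors)), 2):
            degree[a] += 1
            degree[b] += 1
    return degree


def events_of_type(events, event_type):
    """Keep only the events labelled with the given type."""
    return [event for event in events if event[1] == event_type]


def timeline(events, actor):
    """Return (date, event type) pairs involving the actor, oldest first."""
    return sorted((day, kind) for day, kind, actors in events if actor in actors)


print(interaction_counts(EVENTS).most_common(1))
print(len(events_of_type(EVENTS, "payment")))
print(timeline(EVENTS, "Actor B"))
```

Counting co-participations is only one simple way to rank players; a tool such as Gephi offers richer network measures over the same kind of edge list.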

You can see fragments of code used to extract these data using the Simple API in these GitHub Gists (here and here), and dynamic visualisations illustrating these three features here and here.

The Simple API is already up and running; you can find it (here). It is self-documenting: simply visit the root URL and you’ll see query examples with optional and compulsory parameters. Be aware, though, that the Simple API is under active development and the underlying data in the KnowledgeStore is being optimised for the Hack Days, so it may not be available when you visit.
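As a rough illustration of what calling a query with compulsory and optional parameters might look like from client code, here is a Python sketch that builds a request URL and checks the parameters first. The base URL, the query name and the parameter names are placeholders assumed for this example; the real ones are listed at the API's root URL.

```python
from urllib.parse import urlencode

# Placeholder endpoint and query specification, assumed for illustration only.
BASE_URL = "https://simple-api.example.org"
QUERY_SPECS = {
    "actor_timeline": {"required": {"actor"}, "optional": {"limit", "offset"}},
}


def build_url(query_name: str, **params) -> str:
    """Check parameters against the query spec and build the request URL."""
    spec = QUERY_SPECS[query_name]
    given = set(params)
    missing = spec["required"] - given
    unknown = given - spec["required"] - spec["optional"]
    if missing:
        raise ValueError(f"missing compulsory parameters: {sorted(missing)}")
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # Sort the parameters so the same call always yields the same URL.
    return f"{BASE_URL}/{query_name}?{urlencode(sorted(params.items()))}"


url = build_url("actor_timeline", actor="Mohammed bin Hammam", limit=5)
print(url)
```

Checking parameters on the client side is optional, but it gives a clearer error than a failed HTTP request while the API is still changing.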

If you want to join our automotive Hack Day then you can sign up for the Amsterdam event (here) and the London event (here).