Consortium

VU University Amsterdam (VUA)

The Faculty of Arts at VU University Amsterdam specializes in the development of complex semantic networks and ontologies and their application to the analysis of text. The faculty’s main focus is the relation between conceptual planning, linguistic form and effective communication. VUA has extensive experience in evaluating effective communication. The Computational Lexicology & Terminology Lab (CLTL) specializes in WordNets, semantic networks and lexicons, especially in relation to ontologies and semantic natural language processing applications. The group is the home of the Global WordNet Association and directs the research using WordNet-based technologies.

Key staff:

 

Universidad Del Pais Vasco (EHU)

The IXA Research Group from the University of the Basque Country (UPV in Spanish and EHU in Basque) was created in 1988. Today, it is composed of 30 computer scientists and 14 linguists. The IXA Group was created with the aim of promoting the modernization of Basque by developing advanced computational resources and systems for the language.

Key staff:

Fondazione Bruno Kessler (FBK)

Via S.Croce 77 – 38122 Trento – ITALY

Fondazione Bruno Kessler was founded on March 1st 2007. It is a non-profit body with a public-interest mission and private-law status, and it inherits the history of Istituto Trentino di Cultura (ITC, founded in 1962 by the Autonomous Province of Trento). Scientific excellence and innovation, as well as technology transfer to companies and public services, are Fondazione Bruno Kessler’s main objectives. In its areas of competence, FBK collaborates with the main actors in global research and works in accordance with European Union Programs.

Key staff:

LexisNexis (LN)

Radarweg 29 – 1043 NX Amsterdam – THE NETHERLANDS

LexisNexis, part of Reed Elsevier, an Anglo-Dutch FTSE-100 company, is a world-leading provider of professional information and online workflow solutions in the Science, Medical, Legal, Risk Information and Analytics, and Business sectors.

Key staff:

ScraperWiki (SCW)

Liverpool Science Park – 146 Brownlow Hill – Liverpool L3 5RF – UNITED KINGDOM

ScraperWiki is a platform for using code to make data do things across the web. It is a high-tech startup based in Liverpool, in the North West of England.

Key staff:

 

SynerScope (SYN)

Kastanjelaan 14 – 5268 CA Helvoirt – THE NETHERLANDS

SynerScope provides advanced BI/“Big Data” analysis software that lets domain experts and analysts make sense of their “Big Data” directly.

Key staff:

 

 

Affiliated Partners:

Global WordNet Association (GWA)
The Global WordNet Association is a free, public and non-commercial organization that provides a platform for discussing, sharing and connecting wordnets for all languages in the world.

World Wide Web Consortium (W3C)
The World Wide Web Consortium (W3C) is an international community where Member organizations, a full-time staff, and the public work together to develop Web standards.

The CLEF Initiative (CLEF)
The CLEF Initiative (Conference and Labs of the Evaluation Forum, formerly known as Cross-Language Evaluation Forum) is a self-organized body whose main mission is to promote research, innovation, and development of information access systems with an emphasis on multilingual and multimodal information with various levels of structure.

Related Projects:

  • KYOTO project: KYOTO (EC FP7 ICT-211423) makes knowledge sharable between communities of people, culture, languages and computers, by assigning meaning to text and giving text to meaning. The goal of KYOTO is a system that allows people in communities to define the meaning of their words and terms in a shared Wiki platform so that it becomes anchored across languages and cultures but also so that a computer can use this knowledge to detect knowledge and facts in text. Whereas the current Wikipedia uses free text to share knowledge, KYOTO will represent this knowledge so that a computer can understand it. For example, the notion of environmental footprint will become defined in the same way in all these languages but also in such a way that the computer knows what information is necessary to calculate a footprint. With these definitions it will be possible to find information on footprints in documents, websites and reports so that users can directly ask the computer for actual information in their environment, e.g. what is the footprint of their town, their region or their company. NewsReader will use KYOTO as a starting point.
  • Live Memories: Live Memories project (PAT): In the digital age, our records of past and present are growing at an unprecedented pace. Huge efforts are under way to digitize data currently held on analogue media; at the same time, low-cost devices for creating records in the form of e.g. images, videos, and text, such as digital cameras or mobile phones, are now widespread. This wealth of data, when combined with new technologies for sharing data through platforms such as Flickr, Facebook, or blogs, opens up completely new, huge opportunities for access to memory and for communal participation in its experience. The KnowledgeStore developed in the LiveMemories project will be the starting point for the KnowledgeStore in the NewsReader project.
  • Poseidon: The Poseidon project rises to the challenge of discovering new ways to build advanced systems of systems, and therefore to allow for flexibility, adaptability and evolvability in systems of systems while ensuring reliability. Reliability is a crucial requirement, not only in the domain of maritime safety systems that provides Poseidon’s exemplary application and the industrial laboratory needed for its success.
  • PESCaDO: PESCaDO aims to meet the need for environmental service orchestration. It will offer an interconnected, multipurpose, user-oriented environmental service for a federated community of citizens, public services (such as tourist offices and environmental institutions), public administrations, and entrepreneurs active in sectors sensitive to environmental conditions.
  • BiographyNet: The Biography Portal of the Netherlands links a wide variety of Dutch online reference works and data sets, written in different times and from different perspectives, through a limited number of metadata. This project will build a semantic layer on top of the current Biography portal in order to enrich our sources and analytical tools for history writing.
  • “Can we handle the news” – winning project of the EYR contest: this project will use big data for the analysis of large quantities of news releases in the project ‘Recording history in large news streams’. Participants competed for two years of access to data storage, computing facilities, and visualisation infrastructure provided by SURFsara, for advanced network connections provided by SURFnet, and for support in the mapping of research solutions onto these e-infrastructure services by the Netherlands eScience Center (NLeSC). The winners also received a cash prize of EUR 20,000.
  • CaseMap (Wikipedia): At its core, CaseMap is a database connecting two spheres of information: facts and objects. The facts spreadsheet is used to generate a timeline of undisputed facts for a case. The objects spreadsheet functions as a parent directory for multiple child spreadsheets; each child spreadsheet indexes a sphere of data relevant to a case (e.g. persons, places, pleadings, physical evidence).
  • Cross-lingual Knowledge Extraction: the goal of the XLike project is to develop technology to monitor and aggregate knowledge that is currently spread across mainstream and social media, and to enable cross-lingual services for publishers, media monitoring and business intelligence.
  • OpeNER: OpeNER’s main goal is to provide a set of ready-to-use tools for natural language processing tasks that are free and easy for SMEs to adapt and integrate into their workflows. More precisely, OpeNER aims to detect and disambiguate entity mentions and to perform sentiment analysis and opinion detection on texts, for example to extract customers’ sentiments and opinions about certain resources (e.g. hotels and accommodation) from Web reviews.
  • Boilerpipe: boilerpipe provides algorithms to detect and remove the surplus “clutter” (boilerplate, templates) around the main textual content of a web page.
  • CONCISUS Corpus: The CONCISUS Corpus is an annotated dataset of comparable Spanish and English event summaries in four application domains.
  • Chronolines: Our project aims to generate innovative interfaces for viewing information according to temporal criteria. The objects manipulated, called “Event-based Chronologies”, are prepared by semi-automatically locating events and temporal expressions that indicate dates in texts that are essentially of the “breaking news” type (written in French and in English). They will be associated with multimedia visualisation widgets that display the events linked to a “media event” in chronological order. The media event acts as the “trigger” for the information search, so that it is presented relative to a context formed by the collection of events that may be associated with it.
  • Clusteredition: Top news updated every 10 minutes, 24 hours per day.
  • Eventedition: Event detection updated every 10 minutes, 24 hours per day.
  • Tarsqi Toolkit: The Tarsqi Toolkit (TTK) provides one-stop shopping for all your temporal needs (or, hopefully, at least some of your needs). It integrates extraction of events and time expressions with creation of temporal links, using a set of mostly independent modules, while ensuring consistency with a constraint propagation component. The glue that keeps all the modules together is the TimeML language.
  • NewsMills: NewsMills is an attempt to make sense of news text. It targets financial news and tries to dig out the key pieces of information that financial analysts and stock consultants are interested in (NewsMills beta).
  • Rumor Mill 2.0: the aim of this project is to assess the truthfulness of information that goes viral on social media.
  • SKATeR: this project will develop content-enabling systems that will provide deep semantic capabilities to process large quantities of multilingual data. SKATeR will process documents in English, Spanish, Catalan, Basque and Galician.