Workshop on Natural Language Processing and Linked Open Data (NLP&LOD)

Collocated with RANLP 2013

12 September 2013

Hissar, Bulgaria

The proceedings of the workshop are available here.

Workshop Programme

Deadline extension: Submission deadline is 15 July 2013.

Description

In the last decade, mainstream research in Natural Language Processing (NLP), in both academic and industrial contexts, has focused primarily on statistical approaches, which have proved very competitive as textual data has become vast in quantity, web-based in availability, highly semantic in representation, and dynamic in nature.

A somewhat less mainstream but still quite visible trend has focused on knowledge-rich approaches for NLP; this trend has typically complemented statistical approaches. Examples include using domain knowledge to enhance learning for high-quality automatic NLP in a given domain, adaptation of statistical modules to knowledge-rich structures, and hybrid mechanisms for language analysis and generation.

Linked Open Data (LOD) is understood here as published structured data that is interlinked and that builds upon standard Web technologies, such as HTTP and URIs, as well as on RDF datasets of world facts in various domains. It has become a necessary component of all modern NLP-related tasks and applications, since it provides large quantities of useful knowledge about people, facts, organizations, events, etc.

In the long term, we might expect richer world knowledge to become available even beyond the current Linked Open Data (LOD), in the form of larger structured and interconnected data. On the one hand, this would include semantics that are richer in world facts and dynamic conceptual knowledge. On the other hand, trends in NLP tool development show a strong movement from knowledge-poor towards knowledge-rich and hybrid language processing, using deep grammars and deep language resources and handling big knowledge bases, such as DBPedia, FreeBase, GeoNames, FOAF, etc.

In this workshop, we build on the complementarity of the two pillars of Natural Language Processing, symbolic and probabilistic, further reinforced by recent advances in the area of Linked Open Data (LOD). Many contemporary applications rely on mapping large amounts of text to world fact databases and ontologies. They also rely on making explicit the important relations among entities and events, depending on the specific task and domain, for research and industrial usage. Last, but not least, there exist semantic repositories and management systems, such as OWLIM (http://www.ontotext.com/owlim), that are highly scalable and support inference within big data.

The workshop aims to bring together NLP researchers and developers interested in hybrid NLP methods and in strengthening their connections to LOD.

Topics of interest

NLP processing for LOD
enhancing NLP applications with LOD
information extraction from LOD using NLP techniques
manipulating LOD (cleaning, adding information, deleting information, reconstructing facts) with NLP techniques
LOD as a corpus
mapping LOD to common sense ontologies and language data
storing LOD in RDF bases
methodological and theoretical approaches to LOD
case studies and/or real applications, based on LOD in NLP
other issues involving NLP and LOD

Important dates

Submission deadline: 15 July 2013
Notification of acceptance: 7 August 2013
Camera-ready copies due: 22 August 2013
Workshop date: 12 September 2013

Submission

Multiple submission policy: We welcome papers that are under review for other venues. In the event of multiple acceptances, however, authors are requested to notify us as soon as possible and to choose the meeting at which the work will be presented and published; we cannot accept for publication or presentation work that will be (or has been) published elsewhere.

Reviewing: Reviewing will be blind. No information identifying the authors should appear in the paper: this includes not only the authors’ names and affiliations, but also self-references that reveal the authors’ identities; for example, “We have previously shown (Smith 1999)” should be changed to “Smith (1999) has previously shown”.

Paper length and presentation: We invite long (8 pages) and short (4 pages) papers. Accepted short papers will be presented either as short oral presentations or as posters.

Submission format: Authors should follow the RANLP’2013 submission format and paper size. Submissions should be uploaded via the START system: Submission NLP&LOD

Invited speakers

Christian Chiarcos, Goethe-Universität Frankfurt am Main, Germany

Borislav Popov, Ontotext AD, Bulgaria

Organizers

Petya Osenova, Sofia University, Bulgaria
Kiril Simov, Bulgarian Academy of Sciences, Bulgaria
Georgi Georgiev, OntoText Lab, Bulgaria
Preslav Nakov, Qatar Computing Research Institute, Qatar Foundation, Qatar

Programme committee

Eneko Agirre, University of the Basque Country, Spain
Isabelle Augenstein, Sheffield University, UK
Guido Boella, Università di Torino, Italy
Kalina Boncheva, Sheffield University, UK
António Branco, University of Lisbon, Portugal
Nicoletta Calzolari, Istituto di Linguistica Computazionale, Italy
Thierry Declerck, DFKI, Germany
Georgi Dimitroff, Germany
Kuzman Ganchev, Google, USA
Valia Kordoni, Humboldt University in Berlin, Germany
Jarred McGinnis, King’s College London, UK
Pavel Mihajlov, Ontotext AD, Bulgaria
Maciej Piasecki, Wroclaw University of Technology, Poland
Laura Tolosi, Ontotext AD, Bulgaria
Gertjan van Noord, University of Groningen, the Netherlands
Piek Vossen, Vrije Universiteit Amsterdam, the Netherlands