Workshop on Deep Language Processing for Quality Machine Translation (DeepLP4QMT)

Varna, Bulgaria, 10 September 2016

This one day workshop will be held in conjunction with
The 17th International Conference
on
Artificial Intelligence: Methodology, Systems, Applications – AIMSA 2016

Varna, Bulgaria, 7-9 September 2016

Workshop Programme

Workshop Proceedings

Motivation

Over the last decade, research on language technology applications such as machine translation (MT) and (cross-lingual) information retrieval and extraction has benefited from significant advances obtained through increasingly sophisticated statistical approaches. To a large extent, this progress has also been achieved by incorporating a host of subsidiary and increasingly fine-grained linguistic distinctions at the syntactic and semantic levels.

The NLP mainstream has thus moved towards modeling multilayered linguistic knowledge. To make a leap forward in output quality, machine translation and other technologies are taking advantage of the enhanced capacities for deeper analysis of natural language and the massive open online world knowledge that are now becoming available. The following initiatives can be mentioned as best practices, among others:

  • The LOGON Norwegian-to-English MT system, which uses Minimal Recursion Semantics (MRS) and DELPH-IN deep HPSG grammar expertise for language transfer;
  • Systems based on Abstract Meaning Representation (AMR);
  • The ParGram parallel deep grammars and parsebanks covering several language families in the LFG formalism;
  • The development of sophisticated syntactic and semantic models, sensitive to lexical semantics and semantic roles;
  • Creation of high-quality parallel treebanks via model transfer (such as the Prague Czech-English Dependency Treebank);
  • Creation of deep resources, such as English DeepBank, released in 2013;
  • Creation of common tagsets, thus ‘universalizing’ linguistic resources, as in the Universal Dependencies initiative, etc.

In the long run, richer world knowledge will become available, going beyond the current Linked Open Data in terms of larger datasets, semantics enhanced with world facts, and more dynamic conceptual knowledge representation. Concomitantly, the evolving trend in Natural Language Processing shows a strong integration of knowledge-poor language processing with knowledge-rich processing, supported by deep grammars and deep language resources.

The workshop invites papers on the use of deep natural language processing and resources providing deep analyses for a range of applications including, but not limited to, machine translation.

Topics of interest

  • Deep MT transfer models
  • Deep processing of source language
  • Deep generation using world knowledge models and/or deep grammars
  • Deep learning for quality machine translation
  • MT and IR supported by Linked Open Data
  • Language resources for quality machine translation
  • Modeling deep linguistic knowledge for quality applications
  • Statistical models for quality MT and other NLP-related tasks
  • Development and exploitation of monolingual and parallel deep language resources: deep grammars, parsebanks, propbanks, valency lexicons and other deep lexical resources, ontologies etc.
  • Adaptation of deep language resources to MT and other NLP-related tasks
  • Knowledge-based metrics for evaluation

Invited speakers

  • Francis Bond, Nanyang Technological University
    Linking and Enriching Lexicons: the Open Multilingual Wordnet and
    Using Deep NLP in Language Tutoring
  • Josef van Genabith, DFKI
    From Statistical to Neural Machine Translation

Important dates

  • Submission deadline: 27 June 2016
  • Notification of acceptance: 25 July 2016
  • Camera-ready papers: 22 August 2016
  • Workshop date: 10 September 2016

Paper format

Selected papers will be published in the journal "Cybernetics and Information Technologies". We expect submissions of extended abstracts between 4 and 6 pages, formatted according to the style of the journal. The final versions of the papers have to be between 10 and 15 pages. Please send your abstract to Kiril Simov at: kivs at bultreebank.org

Workshop organizers

  • Kiril Simov, Institute of Information and Communication Technologies at Bulgarian Academy of Sciences
  • Petya Osenova, Institute of Information and Communication Technologies at Bulgarian Academy of Sciences
  • Jan Hajič, Charles University in Prague
  • Hans Uszkoreit, DFKI
  • António Branco, University of Lisbon

Program Committee

  • Eneko Agirre, University of the Basque Country
  • Ondřej Bojar, Charles University
  • Gosse Bouma, University of Groningen
  • Aljoscha Burchardt, DFKI
  • Mauro Cettolo, FBK
  • Koenraad De Smedt, University of Bergen
  • Ondřej Dušek, Charles University
  • Markus Egg, Humboldt University of Berlin
  • Barbora Hladka, Charles University
  • Philipp Koehn, University of Edinburgh
  • Sandra Kübler, Indiana University
  • Gorka Labaka, University of the Basque Country
  • David Mareček, Charles University
  • Preslav Nakov, Qatar Computing Research Institute, Qatar Foundation
  • Stephan Oepen, University of Oslo
  • Martin Popel, Charles University
  • Rudolf Rosa, Charles University
  • Victoria Rosén, University of Bergen
  • João Silva, University of Lisbon
  • Inguna Skadiņa, Tilde and University of Latvia
  • Pavel Straňák, Charles University
  • Jörg Tiedemann, Uppsala University
  • Antonio Toral, Dublin City University
  • Gertjan van Noord, University of Groningen
  • Cristina Vertan, University of Hamburg
  • Dekai Wu, Hong Kong University of Science & Technology
  • Nianwen Xue, Brandeis University

Sponsor

The workshop is organized with the support of the QTLeap FP7 project.

Contact Information

For information on this workshop, please contact Kiril Simov at kivs@bultreebank.org.