20th and 21st September 2002, Sozopol, Bulgaria
The Second Workshop on Treebanks and Linguistic Theories (TLT2003), 14-15 November 2003, Växjö, Sweden
The Third Workshop on Treebanks and Linguistic Theories (TLT2004), 10-11 December 2004, Tübingen, Germany
Workshop motivation and aims
Treebanks are a language resource that provides annotations of natural languages at various levels of structure: at the word level, the phrase level, the sentence level, and sometimes also at the level of function-argument structure. Treebanks have become crucially important for the development of data-driven approaches to natural language processing, human language technologies, grammar extraction and linguistic research in general. There are a number of ongoing projects on the compilation of representative treebanks for languages that still lack them (Spanish, Bulgarian, Portuguese, Turkish) and a number of ongoing projects on the compilation of treebanks for specific purposes for languages that already have them (English).
Experience in building syntactically processed corpora has shown that the more detailed the description of the data, the more theory-dependent it becomes (Prague Dependency Treebank and other dependency-based treebanks such as the Italian treebank (TUT) or the Turkish treebank (METU); Verbmobil HPSG Treebanks, Polish HPSG Treebank, Bulgarian HPSG-based Treebank etc.). Therefore the development of treebanks and the development of formal linguistic theories need to be more tightly connected in order to ensure the necessary information flow between them.
The workshop aims at being a forum for researchers and advanced students working in one or both of these areas. It will be held in conjunction with the summer school “Empirical Linguistics and Natural Language Processing”, Flagman hotel, Sozopol, Bulgaria.
Frantisek Cermak, Charles University Prague, Czech Republic
Today’s Corpus Linguistics: Some Open Questions (a preliminary title).
Hans Uszkoreit, DFKI, Saarbruecken, Germany
(Title to be announced)
Elisaveta Balabanova and Krassimira Ivanova.
Creating a machine-readable version of Bulgarian valence dictionary: (A case study of CLaRK system application).
Philippe Blanche and Marie-Laure Guénot.
Flexible Corpus Annotation with Property Grammar.
Sabine Brants, Stefanie Dipper, Silvia Hansen, Wolfgang Lezius, George Smith.
Aoife Cahill, Mairéad McCarthy, Josef Van Genabith and Andy Way.
Evaluating Automatic F-Structure Annotation for the Penn-II Treebank.
Montserrat Civit, Mª Antònia Martí and Núria Bufí.
Design Principles for a Spanish Treebank.
Erhard Hinrichs and Julia Trushkina.
Forging Agreement: Morphological Disambiguation of Noun Phrases.
Krassimira Ivanova, Dimitar Doikoff.
Cascaded Regular Grammars and Constraints over Morphologically Annotated Data for Ambiguity Resolution.
Jiri Mirovsky, Roman Ondruska, and Daniel Prusa.
Searching through Prague Dependency Treebank: Conception and Architecture.
What kinds of trees grow in Swedish soil? A Comparison of Four Annotation Schemes for Swedish.
Stephan Oepen, Dan Flickinger, Kristina Toutanova, Christopher D. Manning.
LinGO Redwoods: A Rich and Dynamic Treebank for HPSG.
Bulgarian Nominal Chunks and Mapping Strategies for Deeper Syntactic Analyses.
Petya Osenova and Sia Kolkovska.
Combining the named-entity recognition task and NP chunking strategy for robust pre-processing.
Kiril Simov, Alexander Simov, Milen Kouylekov, Krassimira Ivanova.
CLaRK System: Construction of Treebanks.
Segmentation Layers in the Group of the Predicate: a Case Study of Bulgarian within the BulTreeBank Framework
Treebank Development with Deductive and Abductive Explanation-based Learning: Exploratory Experiments.
Yovka Tisheva and Marina Dzhonova.
Information Structure Level in TreeBanks.
Kristina Toutanova, Christopher D. Manning, Stephan Oepen.
Parse Ranking for a Rich HPSG Grammar.
Bilingual corpora as a platform for cross-linguistic treebank development
Two round-table discussions will be organized on the following topics:
- the relationship between the syntactic properties of a given language and the choice of linguistic theory for annotation purposes
- the utility of treebanks for linguistic theorizing
Erhard Hinrichs, Germany (co-chair)
Tilman Berger, Germany
Marek Swidzinski, Poland
Adam Przepiórkowski, Poland
Kiril Simov, Bulgaria (co-chair)
Vladimir Petkevic, Czech Republic
Anatolij N. Baranov, Russia
Sandra Kuebler, Germany
Kemal Oflazer, Turkey
Michael Barlow, USA
Tomaz Erjavec, Slovenia
Robert Engels, Norway
Andreas Wagner, Germany
Frank Richter, Germany
Manfred Sailer, Germany
Walter Daelemans, Belgium
Karel Oliva, Austria
Laurent Romary, France
The registration fee for the workshop is:
For participants from Central and Eastern Europe the fee is reduced to 25 Euro.
The fee should preferably be paid in cash at the workshop venue.
The fees cover the following services: a copy of the proceedings of the attended workshop, coffee-breaks and refreshments.
People interested in attending the workshop have to send a letter of interest.
Deadline for participants’ applications (registration): 20 August
Notification of acceptance: 25 August
Participation in the workshop is limited by the venue. Requests for participation will be processed on a first-come, first-served basis.
The workshop will take place in the town of Sozopol, Bulgaria. Sozopol is one of the best summer resorts on the Black Sea coast, famous for its unique mixture of ancient Greek culture, Bulgarian traditional atmosphere (18th century), excellent climate and entertainment facilities. In addition, it is the favourite place of Bulgarian artists, both for performances and for relaxation. It is situated to the south of Bourgas (to be reached by plane – Bourgas Airport, trains – Bourgas Railway Station, or intercity buses). It takes 40 minutes to reach Sozopol from Bourgas with the buses and minibuses that run regularly.
The workshop will take place at the hotel “Flagman”. For accommodation, participants can choose between the hotel “Flagman”, which is relatively expensive but luxurious, and a number of other options available in Sozopol: cheaper (but still good) hotels and rooms in private houses. The organizers can arrange reservations for the hotel “Flagman” and provide assistance with the other options.
(1.95 BGL = 1 Euro)
Prices for hotel Flagman:
Bed in a double room:
For foreigners: 43 BGL per bed
For Bulgarians: 29 BGL per bed
Single room:
For foreigners: 65 BGL
For Bulgarians: 44 BGL
Approx. prices for other hotels/private rooms: 10-30 BGL per night.
Approx. expenses for meals, etc.: 8-20 BGL per day.
In the application, specify one of the following options:
- “Flagman, single room” – for foreigners this will cost 65 BGL/night
- “Flagman, bed in a double room” – for foreigners this will cost 43 BGL/night
- “cheaper hotel, single room” – about 25-30 BGL/night
- “cheaper hotel, double room” – about 30-35 BGL/night
- “cheaper hotel, bed in double room” – about 15-20 BGL/night
- “bed in a private house” – about 10-15 BGL/night, to be arranged at the moment of arrival
Linguistic Modelling Laboratory, CLPP,
Bulgarian Academy of Sciences
Acad. G.Bonchev St. 25A
1113 Sofia, Bulgaria
Tel: (+359 2) 979 2825
Fax: (+359 2) 70 72 73