Petya Osenova* and Kiril Simov,
Linguistic Modelling Laboratory,
Central Laboratory for Parallel Information Processing,
Bulgarian Academy of Sciences
* The first reviewer is also affiliated with Sofia University „St. Kliment Ohridski“
Minnen, Guido (2001)
Efficient Processing with Constraint-Logic Grammars Using Grammar Compilation.
CSLI Publications, paperback ISBN 1-57586-306-5, viii+255pp,
Stanford Monographs in Linguistics.
OVERVIEW
This is a monograph based on the author’s dissertation. It is intended for computational linguists mainly, but its interdisciplinary approach makes it useful for formal linguists and computer scientists as well.
The book focuses on the practical and theoretical problems that arise when processing with Constraint-logic grammars. The author proposes compilation techniques as a repairing tool for reducing the declarative under-determination. His approach is considered as an alternative to approaches requiring manual encoding of control information for processing declarative grammars. The author distinguishes two kinds of ‘declarative under-determination’ (p. 2): (1) ‘syntactic under- determination’ – declarative under-determination resulting from the fact that information necessary for syntactic processing is not available; (2) ‘lexical under-determination’ – declarative under- determination that results from the absence of restricting information when processing the lexicon. Minnen’s approach is tested on an HPSG formalism as an example of the constraint-based grammars and relies on the interaction between bottom-up and top-down control strategies.
The contents structure of the monograph follows the division between the two kinds of under-determination. After the two introductory chapters (Introduction and Processing with Constraint-logic Grammars), the contents of the book is divided into two main parts: Syntactic Under-determination and Lexical Under-determination with the relevant chapters. At the end of the book a Conclusion chapter is added. Five appendices, a list of references and an Index follow the text body. Each chapter follows a strict schema: it starts with a description of the problem considered in it, a motivation (both linguistic and processing), then an outline of the proposed decision of the problem is given and the chapter ends with a summary of the related works.
The introduction explains the author’s motivation for practically oriented dealing with the topic and discusses the key words in his work: constraint-logic grammars, declarative under-determination and grammar compilation. The importance of the author’s approach is highlighted with respect to the modularity between grammar and control, and grammar reversibility. There is a summary of the monograph contents as well.
Chapter 2 is entitled „Processing with Constraint-logic Grammars“. In this chapter the formal preliminaries for the rest of the book are defined and explained. The chapter discusses two topics: definitions of constraint-logic grammars and processing with constraint-logic grammars. In the first part the notion of constraint-logic grammars is formally defined. Two subclasses of them are introduced and interrelated: logic grammars based on first-order logic and typed feature grammars based on typed feature logic. The two kinds of grammars are defined in a uniform manner in order to stress the relations between them. The author pays a special attention to the typed feature grammars because of their novelty in comparison to the first-order logic based grammars. He relies on several techniques and assumptions: Hoehfeld and Smolka’s (1988) investigations of the requirements for extension of a constraint language with relations; closed world interpretation (Gerdemann and King 1994) and Prolog notations. Some basic concepts on generation and parsing with these grammars are introduced. Then the author concentrates on the implementation of the Head-driven Phrase Structure formalism using typed feature grammars. Also a concrete system, ConTroll, is shortly discussed along the lines of the typed feature grammars. The second part of the chapter presents a typology of the control strategies with respect to their interrelation: top-down, bottom up and head-driven approaches. The author aims at techniques, which are applicable to both processes – generation and parsing, using one and the same control strategy. In pursuing his aim, he discusses ‘reversible grammars’. According to him, the reversibility is accomplished via two distinct definitions for generating a grammar and parsing a grammar, both of which are derived from the grammar by means of a transformation.
Part I „Syntactic Under-determination“ consists of three chapters (3, 4 and 5 respectively). Each of them starts with a motivation section and concludes with a related work section and a summary section.
Chapter 3 is entitled „Top-down Control and Syntactic Under- determination“. In it, the problem of syntactic under-determination is explored via the ratio between some linguistic phenomena (e. g. topicalisation and empty heads), the order of the processing steps and the head-driven control. On the basis of a small example, the author shows how the efficiency of the processing and even the termination of logic grammars depend on the order in which the literals of the grammar are processed. To solve this problem, the author proposes a grammar transformation technique, called ‘literal rearrangement’ and describes its implementation in Prolog. The procedure takes as input a grammar and a description of the instantiation of the grammar input. The result of the procedure is a new grammar semantically equivalent to the original one that demonstrates better performance on the input of the specified type. Literal rearrangement procedure uses as an indicator of the relative processing efficiency the notion of ‘degree of non- determinism’, which is applied to a literal. This is a measurement of the different instantiations of the literal with respect to its evaluation context – the grammar and the instantiated parts of the literal. The procedure aims at minimization of the degree of the literal non-determinism at each step of the grammar processing.
Chapter 4, „Top-down Control and Recursion on Structure“ focuses on the termination properties of top-down control in the light of syntactic under-determination. As the structural recursion is often used in lexicalist-based formalisms like HPSG, the author seeks adequate repairing of the problematic cases. First of all, two main types of recursion are discussed: the so-called ‘peeling recursion’ and ‘building recursion’. The latter comprises the left recursion in parsing and the head recursion in generation. Then the notion of ‘argument sequence’ is proposed. In this way it is possible to predetermine whether a recursive procedure is guaranteed to terminate. Four types of argument sequences are distinguished according to the type of recursion: peeling, building, neutral and none of them. As the problematic cases go mainly into building recursion, a transformation scheme, called ‘Building recursion reversal’ (BRR) is suggested. This mechanism is intended to transform a non-terminate grammar into its terminate semantically equivalent grammar, more precisely – into a peeling argument sequence, and then the problem with the recursion is solved.
Chapter 5 is entitled „Bottom-up Control using Magic“. It preserves the priorities about unifying control strategies for both NLP tasks – parsing and generation, stated in the previous chapter. The author proposes a solution to the syntactic-under-determination (such as spurious ambiguity) from the bottom-up perspective, using magic (templates) transformation. This transformation becomes necessary as a restrictive filtering component to the guaranteed termination. Two dynamic control strategies are presented and compared: semi-naive magic control and Earley control. Then the author proposes a ‘selective magic HPSG parser’ which combines the bottom-up processing and top-down control. His suggestions are as follows: (1) the so called ‘parse types’, i.e. words and phrases are better parsed bottom-up in lexicalist-oriented formalisms; (2) non-parse types are processed using advanced top-down control strategy (Goetz and Meurers 1997). The author gives details concerning the implementation of the ‘selective magic HPSG parser’. It was implemented in Prolog by the author himself for the ConTroll system and tested with a linearization grammar of German (Hinrichs et. al. 1997). The reported results are somewhat controversial: the module is faster in some cases but slower in others (such as: delayed goals). The problem sources are detected and pointed out along with the areas that need further exploration.
Part II is entitled „Lexical Under-determination“. It consists of two chapters (6 and 7). The structure follows the section types of Part I: each chapter starts with a motivation section and concludes with a related work section and a summary section.
Chapter 6, „Lexical Rules as Systematic Covariation in Lexical Entries“, proposes a new computational treatment of lexical rules in HPSG. It is a well-known fact that via lexical rules the horizontal generalizations in the lexicon are captured. Hence their processing becomes a question of great importance, especially in lexicalized linguistic theories as HPSG. From the two possibilities of formalizing the lexical rules, meta-level approach (Calcagno and Pollard 1995) and description-level approach (in which lexical rules become a part of the theory), the author adopts the latter. He proposes an encoding of lexical rules as systematic covariation in lexical entries. The underlying motivation is the following: such an approach takes into account the peculiarities of the lexical rules as well as the exceptions to them. Last but not least, it is applied to finite-length lists and thus the problem with the potentially infinite lexicons is reduced.
After these theoretical considerations, the author describes the compiler which exemplifies the proposed idea. Four compilation steps are discussed: (1) Trivial translation of lexical rules into definite clauses and explication of their frame (i.e. the features that have been left unspecified by the linguist); (2) Determination of the possible lexical rules interaction via finite-state automata (this step eliminates the unsuccessful applications); (3) Finite-state automata are further fine-tuned according to the word classes in the lexicon; (4) Finite-state automata are translated into typed feature logic definite clauses and the interaction with the lexicon is guaranteed by indexing. A new idea about the lexicon is suggested. The covariation lexicon is meant to include extended lexicon (lexical entries specifying words that can or cannot undergo lexical rule application) and lexical rule predicates. The author is aware of the limitation areas such as spurious ambiguity and non-determinism.
Chapter 7 is entitled „Optimized Covariation Lexica“ and it is obviously inspired by the problem sources, stated in the previous chapter. The author comes to the conclusion that lexical lookup needs to be constrained. Hence two grammar transformations are proposed in order to improve processing constraint propagation and unfold transformation. As to the implementation of the tool in question, it is implemented in Prolog by Detmar Meurers, the author and in cooperation with Dieter Martini. The tool has been tested on lexical rules that refer to syntactic and morphological information. Its advantage is its speed.
CRITICAL EVALUATION
The book presents an up-to-date view on automatic transformation of declarative logic grammars with respect to speeding-up their performance and in some cases insuring the termination. It highlights the role of the linguistic specifications for the efficient Natural Language Processing.
The book is written in a clear and reader-friendly manner. It has a consistent logical structure of the content, elaborate explanations and references within the text body, which facilitates the process of following up and understanding the author’s ideas.
One drawback of the book concerns typography: there is missing content of illustrative figures (for example, fig. 4.17, p. 90), there are repetitions of whole passages (for example, p.30-31) and some other minor typos.
REFERENCES
Calcagno, Mike and Carl Pollard (1995) Lexical Rules in HPSG: What are they? Manuscript, Ohio State University, Columbus, Ohio, USA.
Earley, Jay (1970) An efficient Context-free Parsing Algorithm. Communications of the Association for Computing Machinery 13(2), 55-61.
Gerdemann, Dale and Paul King (1994) The Correct and Efficient Implementation of Appropriateness Specifications for Typed Feature Structures. Proceedings of the 15th Conference on Computational Linguistics, Kyoto, Japan, pp. 956- 960.
Goetz, Thilo and Detmar Meurers (1997) The ConTroll System as Large Grammar Development Platform. Proceedings of the Association of Computational Linguistics post-conference workshop on Computational Environments for Grammar and Linguistic Engineering, Madrid, Spain.
Hinrichs, Erhard, Detmar Meurers, Frank Richter, Manfred Sailer and Heike Winhart (1997) Ein HPSG-fragment des Deutschen, Teil 1: Theorie. technical report SFB 340 95. University of Tuebingen, Germany.
Hoehfeld, Markus and Gert Smolka (1988) Definite relations over Constraint Languages. Tecnical Report 53. IBM, Germany.
ABOUT THE REVIEWERS
Petya Osenova and Kiril Simov, are currently involved in the BulTreeBank Project (HPSG-based Syntactic Tree Bank of Bulgarian): a joint project between Linguistic Modelling Department, Bulgarian Academy of Sciences and Seminar für Sprachwissenschaft (SfS), Eberhard-Karls-Universität, Tübingen, Germany.
Petya Osenova is researcher in Linguistics. She is assistant professor in syntax at the Sofia University as well. Her scientific interests are mainly in formal and corpus linguistics.
Kiril Simov is researcher in Computer Science. His scientific interests are in computational Linguistics, constraint-based formalisms and logics, knowledge representation and Ontologies.