Multi Query Tool Demo

The goal of this demo is to show how the MultiQuery tool of the CLaRK system can be used to implement a task step by step.

This tool is designed to call other tools; it does not work directly with the XML data. It uses a list of XML tool queries, which are executed one by one in the order in which they appear. The entries can themselves be multi-queries, and there is no limit on the depth of such inclusions. If a cyclic inclusion is detected (a multi-query includes itself, or one of its sub-queries includes it), the system reports an error. The result of each single tool application is the input for the next one, and the result of the last application is the result of the MultiQuery Tool.
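The chaining and the cyclic-inclusion check described above can be sketched in a few lines of Python. This is only an illustrative model, not CLaRK's implementation: the registry, the query names and the string-based "documents" are all hypothetical, and the sketch finds a cycle while running, whereas the real system may check it earlier.

```python
from typing import Callable, Dict, List, Union

# A single tool query is modelled as a function from document to document.
Tool = Callable[[str], str]
# Hypothetical registry: a name maps either to a single tool
# or to a list of query names (that is, a multi-query).
Registry = Dict[str, Union[Tool, List[str]]]


def run_query(name: str, doc: str, registry: Registry, active: tuple = ()) -> str:
    """Apply the query `name` to `doc`, expanding nested multi-queries."""
    if name in active:
        # The query is already being expanded further up: cyclic inclusion.
        chain = " -> ".join(active + (name,))
        raise ValueError(f"cyclic inclusion: {chain}")
    entry = registry[name]
    if callable(entry):
        return entry(doc)
    for sub in entry:
        # The output of each application is the input of the next one.
        doc = run_query(sub, doc, registry, active + (name,))
    return doc


registry: Registry = {
    "upper": str.upper,
    "exclaim": lambda d: d + "!",
    "inner": ["upper", "exclaim"],   # a multi-query
    "outer": ["inner", "exclaim"],   # a multi-query that includes another one
    "loop": ["outer", "loop"],       # includes itself: an error
}

result = run_query("outer", "text", registry)
```

Here `run_query("outer", ...)` returns the output of the last sub-query, while `run_query("loop", ...)` raises an error because `loop` includes itself.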

The user can change the order in which the queries are applied and can make the application of certain operations conditional. This is done with Conditional Control Operators (Controls for short), which are specific to this tool.

By default, the entries are applied in order, from the first to the last. With Control operators, some queries are applied only if certain conditions hold. A condition can be: the true or false value of the result of an XPath evaluation; whether the preceding single tool application has or has not modified the working document; or unconditional (always succeeding). When the condition of a Control is true, the next entry to be applied (a query or another Control) is the one named in the Control itself. Otherwise, processing continues with the next entry in the table (a query or another Control). Control operators address their targets (the queries or Controls to be applied on success) by label. Each entry (row) in the table of the MultiQuery Tool can have a label (a unique identifier) that Control operators can refer to. It is an error if a Control operator uses a target label that does not exist.

Each Control operator may consist of three parts:

By combining Controls and labels, the user can model the familiar IF-THEN-ELSE and WHILE-CONDITION-DO structures and so make the processing more flexible. Composing different Controls allows the user to create varied 'programs' or 'scripts' for specific jobs. It is up to the user to make these processing procedures efficient and reliable.
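To make the Control mechanism more concrete, here is a minimal Python sketch of a table of labelled rows with conditional jumps. It is a model under stated assumptions, not CLaRK's actual code: the `Query` and `Control` classes, the `run` function and the step limit are hypothetical, documents are plain strings, and an XPath condition is stood in for by an ordinary predicate on the document.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Union

Doc = str  # assumption: a document is modelled as a plain string


@dataclass
class Query:
    tool: Callable[[Doc], Doc]
    label: Optional[str] = None


@dataclass
class Control:
    # Receives the document and whether the preceding query modified it.
    condition: Callable[[Doc, bool], bool]
    target: str
    label: Optional[str] = None


def run(rows: List[Union[Query, Control]], doc: Doc, max_steps: int = 10_000) -> Doc:
    labels = {row.label: i for i, row in enumerate(rows) if row.label}
    for row in rows:
        if isinstance(row, Control) and row.target not in labels:
            raise ValueError(f"unknown target label: {row.target}")
    i, modified, steps = 0, False, 0
    while i < len(rows):
        steps += 1
        if steps > max_steps:  # guard added for the sketch only
            raise RuntimeError("step limit exceeded")
        row = rows[i]
        if isinstance(row, Query):
            new_doc = row.tool(doc)
            modified = new_doc != doc
            doc = new_doc
            i += 1
        elif row.condition(doc, modified):
            i = labels[row.target]   # condition holds: jump to the target row
        else:
            i += 1                   # otherwise continue with the next row
    return doc


# WHILE: repeat the labelled query for as long as it modifies the document.
while_rows = [
    Query(lambda d: d.replace("  ", " ", 1), label="collapse"),
    Control(lambda d, modified: modified, target="collapse"),
]

# IF-THEN-ELSE: jump to "then" if the condition holds, else fall through.
if_rows = [
    Control(lambda d, modified: "  " in d, target="then"),
    Query(lambda d: d + " [unchanged]"),                   # ELSE branch
    Control(lambda d, modified: True, target="end"),       # unconditional jump
    Query(lambda d: d.replace("  ", " "), label="then"),   # THEN branch
    Query(lambda d: d, label="end"),                       # no-op landing row
]
```

In this model, `run(while_rows, "a    b")` keeps collapsing double spaces until the query no longer changes the document, and `run(if_rows, ...)` takes one of the two branches depending on whether the input contains a double space.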

Demo: Preparing documents with pure text for disambiguation

Our goal here is to prepare the text for disambiguation. To this end, the text is tagged and additional information is added for each word, namely all of its possible morphological analyses.

Grammars are used to mark the tokens and to add the morphological information. For more information about grammars and their queries, see Grammar Tool Demo.

Some unnecessary data is removed from the file using remove queries. For more information about these queries, see Remove Tool Demo.

A disambiguation constraint (disambiguate attribute) is applied to the tokens. For more information about the constraint and its query, see Constraints Tool Demo.

The document used for this demo is Standart20030524.tag.

To run the demo, perform the following steps:

  1. Check whether the document Standart20030524.tag is loaded in the system. If it is not, load it as described in Import XML Tool Demo. If it is, open it.
  2. Check whether the grammar queries tokenize.gram.que and Eng_Dict_Attribute.que are saved in the system. If they are not, load and compile the grammars tokenize.gram and dict_attr.grm using the Grammars/ Grammar Manager - File I/O/ Load grammar from file button. Then import the queries into Root : SYSTEM : Queries : Grammar according to grammarquery.dtd, as described in Multi Import Tool Demo.
  3. Check whether the remove queries whitespace.rem.que and tok.rem.que are saved in the system. If they are not, import them into Root : SYSTEM : Queries : Remove according to removequery.dtd, as described in Multi Import Tool Demo.
  4. Check whether the constraint query attribute.const.que is saved in the system. If it is not, load the constraint disambiguateAt.cnst using the Constraints/ Value Constraints/ Edit Value Constraints - Load From File button. Then import the query into Root : SYSTEM : Queries : Constraints according to constraintsquery.dtd, as described in Multi Import Tool Demo.
  5. Open the query dialog of the MultiQuery tool from the Tools menu.
  6. Add the tokenize.gram.que and Eng_Dict_Attribute.que queries to the table of queries. To do this, press the Add Query button, set Root : SYSTEM : Queries : Grammar in the Current Group combo box and select the two queries from the list of all grammar queries. Then press OK. The order of the grammars is important because Eng_Dict_Attribute.que works on the result of tokenize.gram.que. Rows can be reordered by simply dragging them up or down in the table.
  7. Add the whitespace.rem.que and tok.rem.que queries to the table of queries. To do this, press the Add Query button, set Root : SYSTEM : Queries : Remove in the Current Group combo box and select the two queries from the list of all remove queries. Then press OK.
  8. Add the attribute.const.que query to the table of queries. To do this, press the Add Query button, set Root : SYSTEM : Queries : Constraints in the Current Group combo box and select the constraint query from the list of all constraint queries. Then press OK.
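  9. The dialog of the tool should now list the five queries in the order in which they were added: tokenize.gram.que, Eng_Dict_Attribute.que, whitespace.rem.que, tok.rem.que and attribute.const.que.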
  10. Run the query. The result is the same document in which recognized tokens are marked with w elements, with all possible morphological information for each token in its ana attribute, unrecognized tokens are marked with tok elements, and punctuation is marked with pt elements. A dialog then appears in which the user can resolve the ambiguities manually.
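If the result is saved and inspected outside CLaRK, a quick sanity check of the markup could look like the Python sketch below. Only the element names w, tok and pt and the attribute ana come from the description above. The sample fragment, its text and s wrapper elements, and the ';'-separated ana values are invented for illustration; the real format depends on the grammars used.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment of a processed document (not taken from the demo file).
sample = """<text>
  <s><w ana="N;V">walks</w><tok>xyzzy</tok><pt>.</pt></s>
</text>"""

root = ET.fromstring(sample)

# Count the three kinds of token markup produced by the multi-query.
counts = Counter(el.tag for el in root.iter() if el.tag in {"w", "tok", "pt"})

# Words that still carry more than one analysis (assuming ';' as separator).
ambiguous = [w.text for w in root.iter("w")
             if len(w.get("ana", "").split(";")) > 1]
```

A check of this kind shows how many tokens were recognized and which words still need manual disambiguation.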

The above query is saved in the document xmlEnc.multy.que in the demo directory.