Tool Application Modes - Processing Current Document vs. Multiple Apply
The processing of XML data in the CLaRK System can be done in two ways. Each of them has its advantages and disadvantages.
As the system is designed to work with corpora, processing usually involves large amounts of data. The size of the data can be decisive for the processing time and the system resources needed for a given task. Therefore CLaRK supports two techniques for processing XML documents: processing the current document, which is open in the editor, and Multiple Apply, which processes a selection of documents without opening them.
The advantages of the first type of processing are that the user can see the data and can adjust the tool settings according to the specific task. The user can also check which data a given tool will be applied to without actually applying it. One disadvantage is that the visualization of large documents requires system resources, which can make the processing extremely slow. Another disadvantage is that the tool can be applied to only one document (the current one).
To solve these problems, the CLaRK System offers the second approach (Multiple Apply). Here the user can select one or more documents to which the tool will be applied. The documents are processed in the order in which they were selected. During processing the selected input documents are not opened in the editor, which takes considerably fewer system resources and makes the procedure faster. Once the application has started, no user input is expected. During runtime, status messages are printed on the screen showing the current state of the process: the document currently being processed, the result message after application to a single document, the result document, etc.
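The Multiple Apply procedure described above can be pictured as a simple batch loop. The following Python sketch is only an illustration and not the CLaRK implementation: the function multiple_apply, the tool callable and the report callback are hypothetical names, and continuing with the next document after an error is an assumption.

```python
from typing import Callable, Iterable, List, Tuple


def multiple_apply(
    documents: Iterable[str],
    tool: Callable[[str], str],
    report: Callable[[str], None] = print,
) -> List[Tuple[str, str]]:
    """Apply `tool` to each document name, in the order of selection.

    `tool` is a hypothetical stand-in for any tool: it receives a document
    name and returns a result message. Nothing is opened in an editor.
    """
    results = []
    for name in documents:  # the selection order is preserved
        report(f"Processing: {name}")
        try:
            message = tool(name)
        except Exception as exc:
            # Assumption: an error is reported and the batch continues.
            message = f"error: {exc}"
        report(f"Result for {name}: {message}")
        results.append((name, message))
    return results
```

Calling multiple_apply(["a.xml", "b.xml"], some_tool) would report two status lines per document and return the collected result messages, without any user input in between.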
All tools which support these two modes of application have similar dialogs in their graphical interface. The mode is controlled by a checkbox "Multiple Apply" (fig. 1) situated on the main tool dialog. If the checkbox is unselected, the tool will be applied to the current document. Otherwise, an auxiliary panel is shown under the checkbox (fig. 2).
Fig. 1 Tool application to the current document
Fig. 2 Tool application to multiple Internal Documents (Multiple Apply)
Multiple Apply Auxiliary Panel
The panel is essentially a table. Column INPUT lists the selected documents to which the tool will be applied. Column RESULTS contains the names of the documents in which the results of the application will be stored. If one result document is produced for each input document, the input name and the result name appear on the same row. If only one result document is produced for all selected input documents (this depends on the tool and/or its options), its name appears on the first row of column RESULTS. Unless the Overwrite option is set (see Options below), the second column of the table can be edited.
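The two layouts of the table can be sketched as follows. This Python fragment is a minimal illustration under assumptions: build_rows is a hypothetical helper, and an empty RESULTS cell is represented by None.

```python
from typing import List, Optional, Tuple

Row = Tuple[str, Optional[str]]  # (INPUT, RESULTS)


def build_rows(inputs: List[str], results: List[str]) -> List[Row]:
    """Pair input names with result names as in the auxiliary panel table."""
    if len(results) == len(inputs):
        # One result per input: both names share a row.
        return list(zip(inputs, results))
    if len(results) == 1 and inputs:
        # A single result for all inputs: its name goes on the first row.
        return [(inputs[0], results[0])] + [(name, None) for name in inputs[1:]]
    raise ValueError("expected one result per input or a single result")
```

For three inputs and a single result named "all", the rows would be ("a", "all"), ("b", None), ("c", None).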
The buttons for document selection and options are situated on the right side of the panel.
Multiple Apply Options
When a tool is applied in Multiple Apply mode, the user has to specify a location where the result(s) should be stored (except in the cases where the result overwrites the input data). A result location group can be specified at the bottom of the panel. By default, each tool has its own specific result group ( [Corpus_name] SYSTEM : Results : <tool-name>). The user can point to any group in the Internal Documents database with one restriction: results cannot be stored in the system group [Corpus_name] SYSTEM : Queries or in any of its descendant groups. This restriction comes from the fact that these groups must contain XML documents of a special type (tool queries), which must be valid according to certain DTDs. Whether a result meets these requirements cannot be checked in advance. Having pressed the Change button, the user gets the following result group (folder) chooser dialog:
A new result group can be set either by pointing to a group in the tree and pressing the Choose button, or by performing a double left-mouse-click on the target group. If the selected group is not appropriate for a result group, a warning message appears and the control is returned to the group chooser dialog. Two additional operations are available here: adding a new group (button New Group) and removing an existing group (button Remove Group).
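The restriction on result groups amounts to a prefix check on the group path. The sketch below is an assumption-laden illustration, not CLaRK code: is_allowed_result_group is a hypothetical function, and a group is modeled as a list of names from the root, following the " : " separated notation used above.

```python
from typing import Sequence


def is_allowed_result_group(path: Sequence[str], corpus_name: str) -> bool:
    """Return False for "[Corpus_name] SYSTEM : Queries" and its descendants.

    `path` lists group names from the root, for example
    ["[Demo] SYSTEM", "Results", "XPath Remove"].
    """
    forbidden_prefix = (f"[{corpus_name}] SYSTEM", "Queries")
    # Any path that starts with the forbidden prefix is rejected.
    return tuple(path[:2]) != forbidden_prefix
```

A chooser dialog built on such a check would show the warning and return control to the user whenever the function yields False.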
For some of the tools there is one more option available: specifying a DTD for the result document(s). This option is available only for tools which produce new result documents. The user can select any DTD compiled in the system or preserve the DTD of the input document(s) (option <Original DTD>).
While the tool is being applied in Multiple Apply mode, the user is shown an information dialog which indicates the overall status of the process. The system shows which document is currently being processed, the result message after each single application, and where the result is stored. In case of errors, corresponding messages are shown. An example status window is in the following picture:
During runtime the user can cancel the tool application by using the Stop button. The application is not interrupted immediately but only after the current operation has been completed (opening a document, applying a single operation or saving a result).
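This kind of deferred cancellation can be sketched with a flag that is checked only between operations. The Python code below is an illustration under assumptions: run_steps and the step callables are hypothetical, and threading.Event stands in for whatever the Stop button sets.

```python
import threading
from typing import Callable, Sequence


def run_steps(steps: Sequence[Callable[[], None]], stop: threading.Event) -> int:
    """Run steps in order; return how many were completed.

    The stop flag is checked only before a step starts, so a step that is
    already running (open, apply, save) always finishes.
    """
    done = 0
    for step in steps:
        if stop.is_set():
            break
        step()
        done += 1
    return done
```

If the flag is set while the second of three steps is running, that step still completes and the function returns 2; the third step is never started.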
XML Tools Queries
The user can save different configurations of the tools in order to execute them many times. Besides the specific tool settings, the user can specify which input documents the tool will be applied to and how the result will be formed and saved. All these settings are represented as XML documents in the Internal Documents database. Furthermore, they can be processed with all the facilities of the system. XML documents of this kind are called XML Tool Queries or just queries. Each tool in the CLaRK System has its own type of queries with its own DTD. The queries are located in a special place in the Internal Documents database (system groups [Corpus_name] SYSTEM : Queries : <tool_name> and all descendant sub-groups). Each XML query is valid according to its DTD.
Fig. 3 Queries Panel for XPath Remove Tool
A management panel similar to the one in fig. 3 appears (with very small variations depending on the specific tool). After the Select button is pressed, the corresponding Query Manager is shown and the user can load a query into the current tool. If the Reset button is clicked, the tool settings are reset to their initial values. In this case the Update button changes to Save. The settings in the current tool dialog window can be saved by using the Save/Update button. If the user creates a new query, then after pressing the Save button s/he must supply a query name. If changes to an existing query are to be saved, the Update button asks the user to confirm the overwriting. All queries which are saved or updated are stored in the Internal Documents database.
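The behaviour of the Select, Reset and Save/Update buttons can be summarised as a small state model. The Python sketch below is hypothetical throughout: QueryPanel and its members are invented for illustration, settings are modeled as a plain dictionary, the initial values are assumed to be empty, and showing Update right after a new query is saved is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class QueryPanel:
    stored: Dict[str, dict] = field(default_factory=dict)  # saved queries
    loaded_name: Optional[str] = None                      # query in the tool
    settings: dict = field(default_factory=dict)           # current settings

    @property
    def button_label(self) -> str:
        # "Update" when a stored query is loaded, otherwise "Save".
        return "Update" if self.loaded_name else "Save"

    def select(self, name: str) -> None:
        self.settings = dict(self.stored[name])
        self.loaded_name = name

    def reset(self) -> None:
        # Back to initial values; the button changes to Save.
        self.settings = {}
        self.loaded_name = None

    def save_or_update(self, new_name: Optional[str] = None,
                       confirm: bool = False) -> bool:
        if self.loaded_name is None:
            if not new_name:  # a new query needs a name
                return False
            self.stored[new_name] = dict(self.settings)
            self.loaded_name = new_name
            return True
        if not confirm:       # overwriting needs confirmation
            return False
        self.stored[self.loaded_name] = dict(self.settings)
        return True
```

In this model a refused save (no name given, or no confirmation) leaves the stored queries untouched.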