CLaRK System Online Manual

Menu DTD

Compile DTD

Default shortcut: Ctrl+L

By choosing this item, the user compiles a DTD (Document Type Definition). First a standard file chooser appears, in which the user selects the file where the DTD is stored. The system supports four character encodings: ASCII, Unicode UTF-8, Unicode UTF-16BE and Unicode UTF-16LE.
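As a rough illustration of what the four supported encodings mean for the bytes of a DTD file, the following Python sketch decodes DTD text with an explicitly chosen encoding. The function name and the mapping from the labels above to Python codec names are assumptions made for this example; they are not part of the CLaRK System.

```python
# Hypothetical mapping from the encoding labels used in this manual
# to Python codec names (an assumption made for this sketch).
ENCODINGS = {
    "ASCII": "ascii",
    "Unicode UTF-8": "utf-8",
    "Unicode UTF-16BE": "utf-16-be",
    "Unicode UTF-16LE": "utf-16-le",
}


def read_dtd_text(data: bytes, encoding_label: str) -> str:
    """Decode the raw bytes of a DTD file using one of the four encodings."""
    if encoding_label not in ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding_label}")
    return data.decode(ENCODINGS[encoding_label])


declaration = "<!ELEMENT title (#PCDATA)>"
raw = declaration.encode("utf-16-le")  # bytes as they would be stored on disk
print(read_dtd_text(raw, "Unicode UTF-16LE"))
```

Decoding with the wrong label either fails or yields unreadable text, which is why the encoding has to match the one the file was saved with.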

When an input file has been chosen, the DTD compilation begins. If the DTD contains no DOCTYPE declaration, the user has to choose the root element from all elements defined in the DTD.

If an error occurs during the DTD compilation, an error message appears.

If everything is correct, a message confirming the successful compilation appears. The DTD is added to the list of all DTDs known to the system. If a compiled DTD with the same name already exists in the system, an additional index is appended to the name of the newly added DTD.
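The naming rule can be pictured with a small Python sketch. The exact form of the index is not specified in this manual; the sketch assumes, purely for illustration, that the smallest free positive integer is appended directly to the name. The function name is hypothetical.

```python
def unique_dtd_name(name, known_names):
    """Return `name`, or `name` plus an index if `name` is already taken.

    Assumption for this sketch: the index is the smallest positive integer
    that makes the name unique, appended without a separator.
    """
    if name not in known_names:
        return name
    index = 1
    while f"{name}{index}" in known_names:
        index += 1
    return f"{name}{index}"


known = {"books", "books1"}
print(unique_dtd_name("books", known))    # name already taken twice
print(unique_dtd_name("authors", known))  # name is free
```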


Renew DTD

The Renew operation replaces the content definitions of a selected DTD in the system with a DTD stored in an external file. During the operation all other settings related to the DTD in the system are preserved unchanged, for example the text and tree layouts and the element features. The Renew DTD operation is therefore useful when the user constructs a DTD outside the system and needs to update it in the system. For this operation the user has to choose a DTD from the system by using a standard DTD chooser.

No document that is open in the system editor should use the selected DTD during the Renew operation. If there are such documents, the following confirmation warning appears:

If No is clicked, then the Renew DTD operation is canceled.

If Yes is clicked, then all documents in the system which are connected to this DTD may become invalid. When some of these documents are opened in the system editor they will be validated again.

Then a standard file chooser appears, in which the user selects the file where the DTD is stored. The system supports four character encodings: ASCII, Unicode UTF-8, Unicode UTF-16BE and Unicode UTF-16LE.

When an input file has been chosen, the parsing begins. If an error occurs during the DTD compilation, an error message appears.

If everything is correct, the newly compiled version of the DTD replaces the old one.

Remove DTD

By choosing this item, the user can remove a DTD from the list of all DTDs known to the system.

[Figure: DTDremove.gif]

If there are no saved documents referring to this DTD in the system, a message confirming the successful removal appears. Otherwise, an error message warns the user that the removal cannot be performed.

By pressing the Details button, the user can see the documents that rely on the selected DTD. These documents prevent the removal of the DTD. There are two possible solutions to this problem: either the DTD of all these documents is changed, or the documents themselves are removed.

[Figure: DTDreference.gif]

In this way no documents will refer to the DTD in question and hence, the removal will be successful.
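The removal rule amounts to a reference check. The Python sketch below models it with plain dictionaries and sets; the function names and data structures are hypothetical and only mirror the behaviour described above, not the system's internals.

```python
def documents_blocking_removal(dtd_name, documents):
    """Return the documents that still refer to `dtd_name`.

    `documents` maps a document name to the name of the DTD it refers to.
    """
    return sorted(doc for doc, dtd in documents.items() if dtd == dtd_name)


def remove_dtd(dtd_name, dtds, documents):
    """Remove the DTD only if no document refers to it.

    Returns the list of blocking documents (empty on success), which plays
    the role of the Details list.
    """
    blocking = documents_blocking_removal(dtd_name, documents)
    if not blocking:
        dtds.discard(dtd_name)
    return blocking


dtds = {"books", "letters"}
documents = {"catalogue.xml": "books", "order.xml": "books"}

print(remove_dtd("books", dtds, documents))  # removal refused, documents listed
documents["catalogue.xml"] = "letters"       # change the DTD of one document
del documents["order.xml"]                   # remove the other document
print(remove_dtd("books", dtds, documents))  # nothing blocks the removal now
```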


View DTD

This is an information dialog showing the content of a DTD (Document Type Definition) already compiled in the system. (For more information about DTDs, see http://www.w3.org/TR/1998/REC-xml-19980210#dt-valid).

The information is divided into four sections representing different parts of the DTD (structure data, attributes data, entities data and processing-instructions data). These four parts are contained in a tabbed pane, and the user can switch between them by clicking on the corresponding tab.

The viewer is demonstrated by the following simple example of a DTD:

<!DOCTYPE books [
<!ELEMENT books (book)*>
<!ELEMENT authors (author)+>
<!ELEMENT book (#PCDATA, title, authors, publisher, (pages)?, isbn, (price)+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT pages (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST price
<?CLaRK member(Gram,AllAn,[;]) ?>

Structure data section:

The structure defined by the DTD is represented as a table (see above). Each table row contains structural data for one element. The name of the element is in the first cell. The second cell contains the definition of the element as a regular expression. The rows are sorted in lexicographical order of the element names.

Note: #PCDATA means a plain text string, excluding symbols like '<','>'.

In the picture above, the definition of the author element says that it must contain only text data (no other elements). The structure of the book element must be: text data, title element, authors element, publisher element, pages element (optional), isbn element, price element (one or more) and date element (optional). The ordering matters.

When the declaration of an element is long or complex, it is sometimes helpful to see it separately from the other element declarations. This is possible by clicking with the right mouse button on the row of the desired element and then pressing the View button when it appears.
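Reading an element definition as a regular expression can be illustrated with a short Python sketch that checks a sequence of child names against a content model. This is only an approximation written for this manual page, not the algorithm used by the CLaRK System: a text node is represented by the token #PCDATA, and the keywords ANY and EMPTY are not handled. The function names are hypothetical.

```python
import re

# An element name or the #PCDATA keyword inside a content model.
NAME = re.compile(r"#?[A-Za-z_][\w.\-]*")


def model_to_regex(model):
    """Translate a content model into a regex over 'name ' tokens."""
    compact = re.sub(r"\s+", "", model)
    # Every name becomes a group matching that name followed by one space.
    body = NAME.sub(lambda m: "(?:" + re.escape(m.group()) + " )", compact)
    # A comma means sequence, which is plain concatenation in a regex;
    # the operators | ? * + ( ) already have the required meaning.
    return re.compile(body.replace(",", ""))


def matches(model, children):
    """Check whether the ordered list of child names satisfies the model."""
    return model_to_regex(model).fullmatch("".join(c + " " for c in children)) is not None


BOOK = "(#PCDATA, title, authors, publisher, (pages)?, isbn, (price)+)"

with_pages = ["#PCDATA", "title", "authors", "publisher", "pages", "isbn", "price"]
no_isbn = ["#PCDATA", "title", "authors", "publisher", "price"]
print(matches(BOOK, with_pages))  # all required children present, in order
print(matches(BOOK, no_isbn))     # the required isbn element is missing
```

Because the comma denotes a strict sequence, swapping two children (for example title and authors) makes the check fail, which is what "the ordering matters" means in practice.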

Element attributes data section:

In order to see the attributes of a given element, the user has to choose the element from the drop-down menu at the top of the window. This menu contains all elements declared in the DTD. If after choosing an element nothing appears in the table, it means that there are no attributes declared for this element. Each time an element is chosen, the content of the table is updated. In the picture above, the element price has been chosen and the table contains its attributes according to the DTD.

Each row represents a single attribute of the chosen element. The name of the attribute is in the first column. The type of the attribute's value is in the second one. The third one contains meta-information about the attribute (required, implied, fixed, ...).

Entities data section:

This section gives information about the entities defined in the DTD. Entities can be used as escape alternatives for symbols that are not allowed in the text. In the picture above, there is an entity lt, which stands for the symbol '<' in the text. When the XML parser meets a bare '<' in the text, it assumes that a new tag is starting; if that is not the case, further processing fails. Therefore the symbol '<' has to be written as &lt;, which is interpreted as the symbol '<' and not as the start of a tag. The format of an entity reference is &xxx;, where xxx is the name of the entity. All entity names appear in the first column of the table above. Opposite each entity name stands its corresponding text string (the text that the entity stands for).
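The effect of the lt entity can be reproduced with the Python standard library. The sketch below uses Python's own XML parser rather than the CLaRK System, so it only demonstrates the general XML behaviour described above; the helper name parses is introduced for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape


def parses(xml_text):
    """Return True if the text is accepted by the XML parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw_text = "price < 10"
escaped = escape(raw_text)  # replaces &, < and > by entity references

print(parses("<title>" + raw_text + "</title>"))  # the bare '<' looks like a tag start
print(parses("<title>" + escaped + "</title>"))   # the entity reference is accepted

# After parsing, the entity reference is read back as the original symbol.
print(ET.fromstring("<title>" + escaped + "</title>").text)
```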

Processing-instructions section:

Processing instructions carry information intended for the application that processes the text rather than information about the text content itself. It is therefore up to the processing application to know how to interpret them.


Export DTD

An existing DTD can be exported from the system to a file. The user selects the DTD to be exported from the DTD chooser dialog and then chooses the directory, the file name and the encoding with which the selected DTD is to be stored.


Create New DTD

This dialog allows the user to create a new DTD document. The dialog is similar to the View DTD dialog, with additional Find, Save and Cancel buttons. The content of a new DTD document is empty: the tables for elements and attributes contain no rows. There are five default entities in the Entity table (amp, apos, gt, lt, quot), which cannot be edited or duplicated.

When a cell in the table is being edited, a popup menu can be opened by clicking with the right mouse button. The popup menu has the following items: Cut, Copy, Paste, Delete, Select all, Edit and Insert symbol. Insert symbol allows the user to enter symbols from the Unicode table. There are also several default strings from the DTD standard which can be added while editing the cell content.

  • ANY, EMPTY, #PCDATA , | ? * + ( ) for Element Description column.
  • #REQUIRED, #IMPLIED, #FIXED for Attribute Default Value column.

When the right mouse button is clicked over a selected row or rows, a popup menu appears with the items Insert row, Delete row, Copy and Paste. They are used for inserting a new row, deleting the selected row(s), copying the selected row(s) and pasting the row(s) from the buffer at the selected row position.

Thus there are two different menus: one for manipulating the lines in the tables and one for editing the content of a cell in the table.

To save the new DTD, the user clicks the Save button and enters a name for it. If there are no errors, the new DTD is compiled and ready to be used by the CLaRK System. Possible errors include an empty cell in one of the tables, an invalid element, attribute or entity name, a wrong regular expression in an element definition, a wrong attribute type, attribute default value or entity value, and others.
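Two of the checks listed above (empty cells and invalid element names) can be sketched in Python as follows. The name pattern is a simplified, ASCII-only version of the XML name rule and the function is hypothetical; the checks actually performed by the system are more extensive.

```python
import re

# Simplified ASCII-only approximation of an XML name (an assumption;
# the XML standard allows many more characters).
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:\-]*\Z")


def check_element_rows(rows):
    """Return (row index, problem) pairs for a table of (name, definition) rows."""
    problems = []
    for index, (name, definition) in enumerate(rows):
        if not name.strip() or not definition.strip():
            problems.append((index, "empty cell"))
        elif not NAME_RE.match(name):
            problems.append((index, "invalid element name"))
    return problems


rows = [
    ("book", "(title, (price)+)"),
    ("1book", "(#PCDATA)"),  # a name must not start with a digit
    ("title", ""),           # the definition cell is empty
]
print(check_element_rows(rows))
```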

When the Find button is clicked, the Find dialog appears. It allows the user to search for information in the current DTD. First, the user enters the string to be looked for. By choosing elements, attributes or entities, he/she determines the table in which the search will be performed. For each table there are items for the relevant columns. When the search query is ready, the user clicks the Search button. The results can be navigated with the Next and Previous buttons.
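The search described above can be thought of as scanning the chosen columns of one table. The Python sketch below assumes case-sensitive substring matching, which this manual does not specify; the function name and the column keys are hypothetical.

```python
def find_in_table(rows, query, columns):
    """Return (row index, column) pairs whose cell text contains `query`.

    Assumption for this sketch: plain, case-sensitive substring matching.
    """
    hits = []
    for row_index, row in enumerate(rows):
        for column in columns:
            if query in row[column]:
                hits.append((row_index, column))
    return hits


elements = [
    {"name": "book", "definition": "(title, authors, (price)+)"},
    {"name": "price", "definition": "(#PCDATA)"},
    {"name": "title", "definition": "(#PCDATA)"},
]

hits = find_in_table(elements, "price", ["name", "definition"])
print(hits)
```

Navigating with Next and Previous then corresponds to moving an index forwards and backwards through the list of hits.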


Edit DTD

This dialog allows the user to edit existing DTDs in the system. It is similar to the Create New DTD dialog, with additional Save as and Update buttons. The editing operations are the same as in the Create New DTD dialog: Cut, Copy, Paste, Delete, Select all, Edit and Insert symbol for cell editing; Insert row, Delete row, Copy and Paste for row editing.

With the Save as button the user can save the edited DTD under a different name.

If a document related to the DTD to be edited is open in the system, the following message dialog appears.

If Yes is clicked, all documents in the system which are connected to this DTD may become invalid. When the documents are opened in the system, they can be validated again with respect to the new DTD.

With the Update button the changes are saved to the edited DTD.


Edit Text Layout

By editing the DTD layout, the user can change the way in which a document loaded in the system appears on the screen. This facility includes the following: moving to a new line before/after an opening/closing tag, and hiding some tags and/or their content.

After choosing a DTD, the following table appears:

The first column contains all tag names in the DTD. Each row represents the layout information for one tag. Here follows a description of the meaning of each column in the table:

  • Tags - all the tag names in the selected DTD. A special name Unknown* is added at the end of the table for elements whose tag is not defined in the DTD;
  • Open tag start - whether the opening tag of the corresponding element appears on a new line;
  • Open tag end - whether the first child node appears on a new line (a new line after the opening tag);
  • Close tag start - whether the closing tag of the corresponding element appears on a new line;
  • Close tag end - whether a new line is inserted after the closing tag;
  • Is tag visible - whether the tag is visible or hidden on the screen;
  • Are children visible - whether the content of the tag is visible or hidden.
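The columns above can be read as six on/off settings per tag. The following Python sketch models one table row as a small record and renders a document accordingly. It is a simplified illustration under stated assumptions, not the renderer of the CLaRK System: the class and function names are hypothetical, attributes are not printed, and each enabled setting simply emits one line break.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class TagLayout:
    """One row of the layout table (field names are assumptions)."""
    open_tag_start: bool = False    # new line before the opening tag
    open_tag_end: bool = False      # new line after the opening tag
    close_tag_start: bool = False   # new line before the closing tag
    close_tag_end: bool = False     # new line after the closing tag
    tag_visible: bool = True        # show or hide the tag itself
    children_visible: bool = True   # show or hide the content of the tag


def render(element, layouts):
    """Render an element as text according to the per-tag layout rows."""
    # Tags without their own row fall back to the special Unknown* row.
    lay = layouts.get(element.tag, layouts.get("Unknown*", TagLayout()))
    out = []
    if lay.tag_visible:
        out.append("\n" if lay.open_tag_start else "")
        out.append(f"<{element.tag}>")
        out.append("\n" if lay.open_tag_end else "")
    if lay.children_visible:
        out.append(element.text or "")
        for child in element:
            out.append(render(child, layouts))
            out.append(child.tail or "")
    if lay.tag_visible:
        out.append("\n" if lay.close_tag_start else "")
        out.append(f"</{element.tag}>")
        out.append("\n" if lay.close_tag_end else "")
    return "".join(out)


layouts = {
    "book": TagLayout(open_tag_end=True, close_tag_end=True),
    "title": TagLayout(close_tag_end=True),
    "price": TagLayout(tag_visible=False),  # the tags are hidden, the text stays
}
book = ET.fromstring("<book><title>XML</title><price>10</price></book>")
print(render(book, layouts))
```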

The check box "Use line offsets" at the bottom of the window supports a more readable visualization. When it is selected, additional white space is inserted in front of each tag that appears on a new line; the length of the white space depends on the depth of the node in the DOM tree.
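The idea of a depth-dependent offset can be shown with a few lines of Python. The sketch assumes that every tag is placed on its own line and that the offset is two spaces per tree level; both choices are made only for this illustration, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def render_with_offsets(element, depth=0, unit="  "):
    """Return the lines of the element, each indented by its tree depth."""
    pad = unit * depth
    lines = [f"{pad}<{element.tag}>"]
    if element.text and element.text.strip():
        lines.append(f"{unit * (depth + 1)}{element.text.strip()}")
    for child in element:
        lines.extend(render_with_offsets(child, depth + 1, unit))
    lines.append(f"{pad}</{element.tag}>")
    return lines


doc = ET.fromstring("<books><book><title>XML</title></book></books>")
print("\n".join(render_with_offsets(doc)))
```

The root element gets no offset, and each deeper level is shifted further to the right, so the nesting of the document is visible at a glance.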

The field Color Scheme specifies which Color Scheme will be used for the documents using this layout. If the selection is the first item (<disabled>) then no Color Scheme is used. For details see menu option Color Schemes.

The two buttons Export Layout and Load Layout are used for saving/loading the current layout settings to/from an external file in XML format.

When a document is loaded, it obeys the layout defined for its DTD. Note that later on this layout can be changed only for the current view.


Edit Tree Layout

The Edit Tree Layout option concerns the way the document tree is displayed on the screen. Without an activated Tree Layout, element nodes are represented by their tag names and text nodes by their text content. The nodes in the tree are colored blue if the corresponding (DOM) nodes are valid according to the DTD, and red otherwise. When a Tree Layout is activated, the user can define the way each element node is represented in the tree. The definition is a text pattern with three types of identifiers which are interpreted not just as text:

  1. A reference to an attribute value of the current element. The syntax is '{@attribute_name}', where attribute_name is the name of the attribute whose value is needed. If the attribute is not found, an empty string is returned.
  2. A reference to an XPath key which is evaluated with the current element as a context. The syntax is '{%key_name}', where key_name is a name of an XPath key defined in the system. It is an error if such a key does not exist.
  3. An XPath expression which is evaluated with the current element as a context. The syntax is '{xpath_expression}', where xpath_expression is a valid XPath expression. The result of the evaluation can be a node-set, a string, a number or a boolean. If a node-set with more than one element is returned, the different values are separated by a single space character. One restriction is that the xpath_expression must not contain the character '}'. If that character is needed, the XPath expression has to be defined as an XPath key, and the user can then refer to the key.

The user has to be careful when s/he uses the symbol '{' in a text pattern in the cases when it must not be considered as a beginning of an identifier. In these cases the user may use the sequence '^{' instead of '{'. Example: '^{@name}' will appear as: '{@name}', but not as the value of an attribute name.
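The three kinds of identifiers and the '^{' escape can be illustrated with a short Python sketch. It is not the CLaRK implementation: the function name is hypothetical, XPath keys are modelled as a plain dictionary from key name to expression, and expressions are evaluated with the limited XPath subset of Python's ElementTree, so only simple paths that select elements are supported.

```python
import re
import xml.etree.ElementTree as ET

# Either the escape sequence ^{ or an identifier {...} without '}' inside.
TOKEN = re.compile(r"\^\{|\{([^}]*)\}")


def expand_pattern(pattern, element, keys=None):
    """Expand a tree-layout text pattern for one element node."""
    keys = keys or {}

    def evaluate(expression):
        # Values of a node-set are joined by a single space character.
        return " ".join((node.text or "") for node in element.findall(expression))

    def replace(match):
        if match.group(0) == "^{":
            return "{"                         # escaped brace, kept as plain text
        body = match.group(1)
        if body.startswith("@"):
            return element.get(body[1:], "")   # missing attribute: empty string
        if body.startswith("%"):
            name = body[1:]
            if name not in keys:
                raise KeyError(f"undefined XPath key: {name}")
            return evaluate(keys[name])
        return evaluate(body)

    return TOKEN.sub(replace, pattern)


book = ET.fromstring(
    '<book id="b1"><title>XML</title><author>Ann</author><author>Bob</author></book>'
)
print(expand_pattern("{@id}: {title} by {author}", book))
print(expand_pattern("^{@id}", book))
print(expand_pattern("{%authors}", book, keys={"authors": "author"}))
```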

Also for each element node a different coloring can be defined. There are 12 colors available.

Here is an example preview of the Tree Layout editor:

Each layout can be enabled or disabled. When a layout is disabled, the tree is shown as if no layout were defined, but the layout remains in memory and can be enabled later. The Instructions button gives a short description of the text pattern syntax. The View Macros button shows a list of all XPath macros currently available in the system.
When the checkbox Preserve ToolTip is selected, the original tag names are shown in tooltip balloons on the tree panel for convenience. If the checkbox is not selected, the tooltip repeats the content of the corresponding tree node.

The field Graphical Layout determines which graphical layout will be used for documents whose DTD owns this layout. The Graphical Tree Layouts are defined in menu Definitions / Graphical Tree Layout and used in menu View / Graphical Tree View.

In the figures below you can see how the tree layout above has changed the tree appearance:

Tree Layout enabled

Tree Layout disabled