Because of the variety of graphical characters (letters) which the Unicode tables
allow, it is necessary for the user to have a means for keyboard input. Unfortunately, in
most cases either the keys on the keyboard are not enough or the already defined keyboards
are not suitable.
In these cases the CLaRK System offers the following solution. The user can define
his/her own keyboard maps, i.e. a different character can be attached to each key on the
keyboard. There are 94 keys available for mapping. Each key is identified by its ASCII
character (the ASCII range coincides with the beginning of the Unicode table), which is the
default for the specific machine architecture. A keyboard map is represented as a set of
pairs, one pair per key. Each pair has two elements: the default character and the code of
the newly attached character from the Unicode table. When a user-defined keyboard is active
and a key is pressed, its character is looked up in the set of char-code pairs. If such a
pair is found, its second element is used to retrieve the new character from the Unicode
table, and that character is displayed on the screen. If there is no such pair, the
original character appears on the screen.
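The lookup described above can be sketched in a few lines of Python. This is only an illustration and not the system's actual implementation: the names KEYBOARD_MAP, translate_key and translate_text are hypothetical, and the map contains a single pair (the key d mapped to code 1076, the Cyrillic small letter de).

```python
# Hypothetical keyboard map: default (ASCII) character -> Unicode code of the
# character attached to that key. A real map may hold up to 94 such pairs.
KEYBOARD_MAP = {"d": 1076}


def translate_key(default_char, keyboard_map):
    """Return the character that should appear when a key is pressed."""
    code = keyboard_map.get(default_char)
    if code is None:
        # No pair for this key: the same character appears on the screen.
        return default_char
    # A pair was found: retrieve the character with this code from Unicode.
    return chr(code)


def translate_text(typed, keyboard_map):
    """Apply the keyboard map to a whole sequence of key presses."""
    return "".join(translate_key(ch, keyboard_map) for ch in typed)
```

With this map, pressing d produces the Cyrillic letter, while any key without a pair (for example x) is passed through unchanged.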
The system has two default keyboards: English (the hardware system keyboard) and
Bulgarian Phonetic (auxiliary). Both are fixed and cannot be modified.
While the system is running, two keyboards are always active. The user can switch
between them with the key combination <Ctrl>+<Left Shift>.
An indicator on the toolbar shows which keyboard is currently in use. If the indicator
is red and labeled Aux, the auxiliary keyboard is in use. Otherwise it is green and
labeled Lat. A switch can also be performed by clicking on the
When this item is selected, the Keyboard Manager window appears. The initial view of the
manager is presented below:
The manager dialog window contains three parts: Keyboard Preview, Unicode Table
Preview and Control Panel. A keyboard for editing can be selected from Current Keyboard
at the top right of the dialog.
Keyboard Preview
This is the table on the left side of the window (with the white background). It shows
the current state of the auxiliary keyboard. Each row in it represents a pair from the
keyboard map. The first column contains the characters of the hardware default keyboard.
It is not editable. The second column contains the codes of the newly attached characters.
The third column is a character preview which shows the new character for the selected
key, corresponding to the current code. In the picture above the selection is set to the row
with the character d. The character code attached to it is 1076, which means that when the
user presses d, the screen shows not d but the character corresponding to this code.
The user can define a key by entering the code of the desired character and confirming
it with <Enter>.
Unicode Table Preview
Now the question is how the user can find the code of the desired character. The answer
comes from the second component - Unicode Table Preview. This is the table with the
blue background in the picture. It contains the characters of the Unicode table available
for the current font. This font is identical to the font of the text area in the system. If
the desired character is not in the table, the font of the text area must be changed.
The first row and the first column contain numbers which are used for calculating the code
of each character. The calculation is very simple. Once the character is found in the table,
the number in the first cell of its row is added to the number in the first cell of its
column. The result is the character code. The easiest way to assign a character to a key is
to select a row in the Keyboard Preview table, find the right character in the Unicode Table
Preview and double-click on it. The new character code is calculated and copied to the
selected row. If no row is selected, nothing is done.
Example: How do we get the number 1076 for the character in the Char preview?
First, we find the location of the character in the Unicode Table Preview. In the
picture above it is in the last row and in the eighth column. The number in the
same row of the first column is 1070. The number in the same column of the first row is 6.
So the final sum is 1076.
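The same arithmetic can be written as a short Python sketch. The function names are hypothetical, and the row width of 10 used in locate is only an assumption inferred from the example (row number 1070, column number 6); the real table may be laid out differently.

```python
def char_code(row_number, column_number):
    """Code of a character = number of its row + number of its column."""
    return row_number + column_number


def locate(code, row_width=10):
    """Inverse lookup: (row number, column number) for a given code.

    row_width=10 is an assumption inferred from the example in the text.
    """
    column_number = code % row_width
    return code - column_number, column_number


code = char_code(1070, 6)   # the example from the text
character = chr(code)       # the character shown in the Char preview
```

Here code is 1076 and character is the Cyrillic small letter de, which matches the Keyboard Preview example.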
Navigation in the Unicode table can be done with the two buttons Page Up and Page Down,
situated on the right side of the table. Another possibility is to enter a number into the
Go to field under the table and jump either to the row containing that code (Code button)
or to the corresponding page of the Unicode table (Code Page button). A code page
contains 256 characters.
Note that the small rectangle in some of the table cells means that for this code the current font does not
Control Panel
It is situated at the bottom of the dialog and it contains 5 components:
This dialog window provides a tool for changing the fonts of several key components of
the system. The tool concerns only the graphical interface. It is needed because the CLaRK
System uses Unicode character encoding, which allows a wide range of characters from
different alphabets to be used. Unfortunately, not every font supports the whole character
table. In general, fonts are defined for a specific use and support 2 or 3 different
alphabets. This manager allows the fonts of the components to be changed independently.
The components for which the font can be changed are:
- Text Window - this is the text area on the right side of the system main panel.
This is the place where the text of the document appears.
- Tree Window - this is the component on the left side of the system main panel
where the tree of the document structure appears.
- Attribute Table - a table, situated just below the Tree Window. It gives
information about the attributes of the currently selected element.
- Error Messages - this is the component at the bottom of the main system panel,
where the error messages appear.
- Tables - this sets the font of all tables in the system (Grammar editor,
Tokenizer editor, ...).
- Fields - this sets the font of all text fields in the system.
The dialog window:
The dialog contains 5 sections as follows:
- Font Chooser - the panel on the left, showing all fonts available on the
current machine. The font of a given component is changed by choosing a new font entry
from here.
- Component Chooser - it is situated in the upper right corner of the dialog
window. Here the user chooses the component whose font is to be changed.
- Font Style Modificator - changes the style of the font (Regular, Bold, Italics
- Font Size Chooser - changes the size of the currently selected font. The font
size can vary in the range from 5 to 50. If the user enters a number outside this range,
the value is automatically corrected to the nearest limit (5 or 50). If the input is not a
number, the old value is restored. After entering a new font size, the user must press the
Preview button in order to refresh the preview component.
- Font Previewer - makes a preview of the currently chosen font with a specified
Note: if the text in the font preview does not change when a new style is
chosen, it means that the font does not support this style.
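The behaviour of the Font Size Chooser can be sketched in Python as follows. The function name normalize_font_size is hypothetical, and treating only whole numbers as valid input is an assumption; the text says only that non-numeric input restores the old value.

```python
MIN_FONT_SIZE = 5
MAX_FONT_SIZE = 50


def normalize_font_size(user_input, old_size):
    """Return the font size to use after the user has typed `user_input`."""
    try:
        # Assumption: only whole numbers count as valid input.
        size = int(user_input.strip())
    except ValueError:
        # Not a number: the old value is restored.
        return old_size
    # Out-of-range numbers are corrected to the nearest limit.
    return max(MIN_FONT_SIZE, min(MAX_FONT_SIZE, size))
```

For example, with an old size of 12, the inputs "3", "72" and "abc" give 5, 50 and 12 respectively.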
This option is used for changing the colors of the different components (tags,
text, attributes, comments and background) in the text area(s) and the background of the
tree area(s). The available colors are all the colors supported by the specific hardware
and software environment in which the system is used. Colors are selected with a standard
color chooser (platform dependent).
Here is the dialog which appears after choosing the "Visuals" option:
The dialog window contains two sections:
- Colors Info
This section is responsible for the color selection for
the different components. The colors of the buttons on the right side indicate the
corresponding components' colors. Pressing a button opens a color chooser. If a
new color is chosen, the background of the corresponding button changes to the new
selection after the chooser is closed. Otherwise it remains the same. The components whose
colors can be changed are:
- Tags (Tag Color)
- Text (Text Color)
- Attribute Values (Attribute Color)
- Comments (Comment Color)
- Text Panels Background (Text Background).
- Tree Panels Background (Tree Background).
Here is a preview of the color settings above:
- Control Buttons:
- OK Button - Applies the new color settings.
- Reset Button - Resets the color settings as follows:
- tag color - pure blue;
- text color - pure black;
- attribute value color - pure green;
- comment color - dark gray;
- text background color - light gray;
- tree background color - white.
- Cancel Button - Cancels the current color settings.
- Color Schemes Button - Opens a Color Schemes editor dialog, described below.
This tool makes it possible to define in what color specific elements in the
text area (tags, comments, text) will appear. This is a more advanced function because the
color of an element does not depend on its type but on the results of the evaluation
of arbitrary XPath expressions. This allows different elements to appear in different colors
depending on the context in which they occur. When an element is displayed on the screen, a set
of XPath expressions is evaluated with that element as the context node, and if one of the
results is a non-empty list, a positive (non-zero) number, a non-empty string or a true boolean
value, then the element is painted in the specified color.
Here is what the Color Scheme Editor looks like:
The basic unit defining the color layout is called a Color Scheme. Each Color Scheme is
responsible for the visualisation of one or more documents. A Color Scheme is identified by a
name and contains a set of pairs. Each pair specifies an XPath expression and a color. If the
evaluation of the XPath expression gives a positive result, the context node is painted in the
color given as the second component of the pair. If more than one pair defines a color for a
certain node, the first one is used.
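The rule "first pair with a positive result wins" can be illustrated with a small Python sketch. It is not the system's implementation: the names is_positive and pick_color and the sample scheme are hypothetical, and the standard library module xml.etree.ElementTree is used, which supports only a limited XPath subset and always returns a list of nodes.

```python
import xml.etree.ElementTree as ET


def is_positive(result):
    """A result is positive if it is a non-empty list, a positive number,
    a non-empty string or a true boolean value."""
    if isinstance(result, bool):          # check bool before int
        return result
    if isinstance(result, (int, float)):
        return result > 0
    if isinstance(result, (str, list)):
        return len(result) > 0
    return False


def pick_color(element, scheme, default=None):
    """Return the color of the first pair whose XPath gives a positive
    result when evaluated with `element` as the context node."""
    for xpath, color in scheme:
        if is_positive(element.findall(xpath)):
            return color
    return default


# Hypothetical scheme: an ordered list of (XPath expression, color) pairs.
SCHEME = [("w[@pos='N']", "red"), ("w[@pos='V']", "blue"), ("w", "green")]
sentence = ET.fromstring('<s><w pos="V">run</w></s>')
color = pick_color(sentence, SCHEME)
```

For this sentence element the first expression finds nothing, while both the second and the third match; since the second comes first, the chosen color is "blue".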
The structure of the editor window is the following:
- Color Scheme Selector - this component is situated at the top of the window and it
contains a list of all Color Schemes defined in the system.
- Scheme Preview - contains a list of all entries (pairs) of the scheme selected in the
Color Scheme Selector. Each entry of this list is an XPath expression painted
in its specific color. The order of the entries determines the sequence in which the
XPath expressions are evaluated. The first XPath expression which returns a positive result
is the one that takes effect. The operations which can be applied to the XPath-color pairs are
determined by the three buttons on the right side of the panel:
- Add Line - adds a new entry to the end of the list. The user is asked to
enter an XPath expression and to select a color. Each XPath expression is evaluated
relative to each node in the corresponding document.
- Edit Color - allows an existing XPath-color pair to be modified.
- Remove Line - removes an entry from the list.
The last two operations work on the selected entry in the list. If no entry is
selected, nothing is performed.
- Control Panel - a set of buttons used for Color Scheme management:
- New button - creates a new Color Scheme. The user is asked to specify
a scheme name.
- Remove button - removes the currently selected Color Scheme. This removal
is preceded by a warning message.
- OK button - closes the editor window and updates all modified Color Schemes.
- Cancel button - closes the editor window and discards any modifications
of the Color Schemes.
The Color Schemes can be used from the Edit DTD Layout or Edit Current Text Layout
menu options - field Color Scheme.
This option changes the style (Look & Feel) of the graphical user interface
of the system. It does not change the structure of the dialogs but only the way they are painted on the screen.
Here follows an example of what one dialog window looks like in different styles:
The number of supported styles may vary between computers depending on the computer
architecture, operating system and the Java Virtual Machine. The example above was taken on an Intel x86 machine
running Windows with JDK 1.4.2. On other machines the picture might look slightly different: more
or fewer available styles, different colors, different icons, etc. The main purpose of this option is to make the user
environment friendlier and more convenient to use.
This option is relevant when the user works with files which rely on 8-bit character
encodings (extended ASCII). It is used for correct mapping between such an encoding and the
Unicode character encoding. Because of the limited size of the ASCII format and the need for
different symbols, there are many character-sets which use one and the same code range.
The problem is how to determine which character-set should be used for a certain
ASCII file. Unfortunately, such information is very often not available and the system can
make a wrong decision when reading a file. For example, the user expects to read a file
containing a Hebrew text but the system decides that it is a Cyrillic text and converts
it to Unicode incorrectly. So the user must specify which character-set the system should
use. This is where the Char Encoding Corrector can be used.
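The ambiguity can be demonstrated with a short Python sketch (an illustration only, using Python's built-in codecs rather than anything from the CLaRK System). The same three bytes are decoded once as Windows-1251 (Cyrillic) and once as Windows-1255 (Hebrew); the variable and function names are hypothetical.

```python
# Three bytes from the upper half (128-255) of an 8-bit encoding.
RAW = bytes([0xE0, 0xE1, 0xE2])


def decode_with(raw, charset):
    """Interpret the same bytes according to the given character-set."""
    return raw.decode(charset)


as_cyrillic = decode_with(RAW, "cp1251")   # Windows-1251
as_hebrew = decode_with(RAW, "cp1255")     # Windows-1255
```

The bytes are identical, but as_cyrillic holds the first three lowercase Cyrillic letters while as_hebrew holds the first three Hebrew letters. Nothing in the file itself says which reading is intended, which is why the user has to choose the character-set.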
Here is a screen-shot of the dialog window:
The choice list at the top of the window contains all the character-sets supported by
the CLaRK System. For the moment the system supports 34 standard character-sets:
- Arabic (Windows-1256)
- Baltic (Windows-1257)
- Cyrillic (Windows-1251)
- Greek (Windows-1253)
- Hebrew (Windows-1255)
- Latin 2 (Windows-1250)
- Latin 1 (Windows-1252)
- Latin 5 (Windows-1254)
- Thai (Windows-874)
- Viet Nam (Windows-1258)
- Arabic (ISO 8859-6)
- Baltic (ISO 8859-4)
- Cyrillic (ISO 8859-5)
- Greek (ISO 8859-7)
- Hebrew (ISO 8859-8)
- Latin 1 (ISO 8859-1)
- Latin 2 (ISO 8859-2)
- Latin 3 (ISO 8859-3)
- Latin 9 (ISO 8859-15)
- Turkish (ISO 8859-9)
- Arabic (OEM-720)
- Baltic (OEM-775)
- Cyrillic DOS (OEM-855)
- Greek (OEM-737)
- Hebrew (OEM-862)
- Latin 2 (OEM-852)
- Multilingual Latin 1 (OEM-850)
- Multilingual Latin 1 + euro (OEM-858)
- Russian (Cyrillic 2) (OEM-866)
- Turkish (OEM-857)
- US Codepage (OEM-437)
- Cyrillic Russian (KOI8-R)
- Cyrillic Ukrainian (KOI8-U)
- Cyrillic Ancient (KOI8-C)
The table in the center shows a preview of the currently selected
character-set. The table contains the symbols with codes in the range from 128 to 255.
Changing the selected character-set refreshes the content of the table. If the user is
not sure which character-set must be used, s/he can choose the first option from the list:
(System Default). This makes the system use the default character-set of
the specific computer architecture and operating system.
The newly selected character-set can be applied with the Apply button or
rejected with the Cancel button. If a new character-set is applied, it is
taken into consideration each time an ASCII file is opened, e.g. when importing/exporting
documents, compiling DTDs, etc.
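A preview like the one in the table (codes 128 to 255 of a selected character-set) can be reproduced with the following Python sketch. The function name charset_preview is hypothetical, and the codec names are those of Python's standard library, not identifiers used by the CLaRK System. Codes which are undefined in a character-set are shown as the Unicode replacement character.

```python
def charset_preview(charset):
    """Map every code from 128 to 255 to its character in `charset`."""
    return {
        code: bytes([code]).decode(charset, errors="replace")
        for code in range(128, 256)
    }


windows_cyrillic = charset_preview("cp1251")   # Cyrillic (Windows-1251)
koi8_russian = charset_preview("koi8_r")       # Cyrillic Russian (KOI8-R)
```

Both previews contain the Cyrillic small letter a, but at different codes: 224 in Windows-1251 and 193 in KOI8-R.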
For each element in an XML document, a set of default attributes can be defined in the
DTD. These are attributes which are not explicitly present in the elements, but are
assumed to be there with the default value set in the DTD. Each time a document is opened,
every default attribute that is absent from an element is explicitly added to it with its
default value.
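The effect of this option can be sketched in Python as follows. The sketch does not read a DTD: the dictionary DEFAULT_ATTRIBUTES is a hypothetical stand-in for the defaults that a DTD declaration such as <!ATTLIST w lang CDATA "bg"> would provide, and the function name is also hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults: element name -> {attribute name: default value}.
DEFAULT_ATTRIBUTES = {"w": {"lang": "bg"}}


def add_default_attributes(root, defaults):
    """Add every absent default attribute to the elements of the document."""
    for element in root.iter():
        for name, value in defaults.get(element.tag, {}).items():
            if name not in element.attrib:
                element.set(name, value)
    return root


document = ET.fromstring('<s><w>a</w><w lang="en">b</w></s>')
add_default_attributes(document, DEFAULT_ATTRIBUTES)
```

After the call the first w element has lang="bg" added, while the second keeps its explicit lang="en".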
An icon on the toolbar
Shows and hides the tags in the text area. If the tags are hidden in the area, they are
replaced by square brackets: [ - for the opening tags and ] - for the closing tags.
If the Show Attributes In Area option is activated and the tags are hidden, then the
attributes are not visible either.
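The replacement of hidden tags by square brackets can be illustrated with a short Python sketch. It only mimics the textual result described above and makes no claim about how the text area is actually implemented; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def render_with_hidden_tags(element):
    """Render an element with [ for each opening tag and ] for each
    closing tag; attributes are not shown."""
    parts = ["[", element.text or ""]
    for child in element:
        parts.append(render_with_hidden_tags(child))
        parts.append(child.tail or "")   # text following the child element
    parts.append("]")
    return "".join(parts)


sentence = ET.fromstring('<s>The <w pos="N">cat</w> sleeps</s>')
hidden_view = render_with_hidden_tags(sentence)
```

For this sentence the result is [The [cat] sleeps], with no trace of the pos attribute.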
An icon on the toolbar
Enables/disables the appearance of the attributes in the text areas. If the attributes are
shown in the area, they cannot be removed or added, but they can be modified. Attribute
management is available through a right mouse click on the table below the tree panel of the
Enables/disables validation of the document according to the DTD and the active All Children
Constraints (if any). If Validation is enabled, all errors in the current document (if any) are
shown in the Error Message Area.
Double-clicking on an error message selects the node containing the corresponding
error in the Tree Panel and in the Text Area.
When this option is selected, the system checks the compiled DTDs database each time it
is started. The system tries to load each compiled DTD and, in case of failure, removes the record
for this DTD, i.e. it is no longer a known DTD for the system. For all documents which refer to a DTD
removed as a result of a loading failure, the system asks the user to specify another DTD. In normal
use of CLaRK this never happens. Damage to the DTDs database may occur when the system data files are
modified externally, by the user or by another application. Another
cause for the system being unable to read the DTDs database could be that it is used with data files which
were produced not by it but by another (incompatible) version of the system. To prevent this, each
new version of the system must be installed in a separate directory.
Unselecting this option may reduce the start-up time of the system. This might be useful when the system is
running on a slower machine and its DTDs database contains many large DTDs. In all other cases
it is recommended to keep this option selected.
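The start-up check can be outlined with the following Python sketch. Everything in it is hypothetical: the database is modelled as a plain dictionary, the loader is passed in as a function which raises an exception on failure, and the documents are modelled as a mapping from document name to DTD name. It only shows the order of the steps described above.

```python
def check_dtd_database(records, load):
    """Try to load every compiled DTD; remove the records that fail.

    records: dict mapping a DTD name to its compiled data (hypothetical).
    load:    function which raises an exception if the data is unreadable.
    Returns the names of the removed DTDs.
    """
    removed = []
    for name in list(records):       # copy the keys: we delete while looping
        try:
            load(records[name])
        except Exception:
            del records[name]        # no longer a known DTD for the system
            removed.append(name)
    return removed


def documents_needing_dtd(documents, removed):
    """Documents for which the user must be asked to specify another DTD."""
    return [doc for doc, dtd in documents.items() if dtd in removed]
```

A document appears in the second list only if the DTD it refers to was removed by the check, so in normal use both lists stay empty.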
The system contains a set of system DTDs, mainly concerning the application of tools which support
XML Tools Queries. These DTDs define
the valid XML structures which can serve as tool queries. Another group of system DTDs defines the structure
of the XML representation of different tool definitions (the rules of a grammar, a tokenizer definition, constraint
definitions, etc.). All these DTDs are placed in the system resources directory. When a tool needs a certain system
DTD, it is automatically compiled into the system database (if this has not been done before). Thus some DTDs
are not compiled unless they are needed in the processing.
For convenience, the system can find all system DTDs which are not yet compiled in the system
database and compile them one by one. This check and pre-compilation (if needed) is performed at start-up
if this option is selected.