Index: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A standard for character encoding; usually one symbol is encoded in one byte (8 bits).
An XPath expression which uses the root element of the document as a context node.
This is a kind of Value Constraint. It restricts the possible child nodes of a node, according to the context in which it appears. For details see Value Constraints in the system manual.
An XPath axis: the ancestor axis contains the ancestors of the context node; the ancestors of the context node consist of the parent of the context node and the parent's parent and so on; thus, the ancestor axis will always include the root node, unless the context node is the root node.
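The ancestor axis can be sketched in Python as follows. This is only an illustration built on the standard xml.etree.ElementTree module, not the CLaRK implementation; the function name ancestors and the sample document are assumptions. ElementTree has no separate document (root) node, so the sketch treats the root element as the top of the tree.

```python
import xml.etree.ElementTree as ET

def ancestors(root, node):
    """Return the ancestors of node, nearest first, ending with the root element."""
    # ElementTree keeps no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    chain = []
    while node in parent_of:
        node = parent_of[node]
        chain.append(node)
    return chain

doc = ET.fromstring("<text><p><w>word</w></p></text>")
w = doc.find("./p/w")
print([n.tag for n in ancestors(doc, w)])  # ['p', 'text']
print(ancestors(doc, doc))                 # [] (the root has no ancestors)
```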
An XPath axis: the ancestor-or-self axis contains the context node and the ancestors of the context node; thus, the ancestor-or-self axis will always include the root node.
This is a grammar match mode. In this mode the grammar finds the longest possible subword of the input word that satisfies the context expressions.
This is a grammar match mode. In this mode the grammar finds the shortest possible subword of the input word that satisfies the context expressions.
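The difference between the longest and the shortest match modes can be illustrated by a rough analogy with greedy and lazy quantifiers in Python's re module. The analogy is an assumption made for illustration: re works on characters, whereas the CLaRK grammars work on tokens and mark-up, and the sample string is invented.

```python
import re

text = "<a>x</a><a>y</a>"

# Greedy quantifier: extends the match as far as possible (longest match).
longest = re.search(r"<a>.*</a>", text).group()

# Lazy quantifier: stops at the first point where the pattern is satisfied
# (shortest match from the same starting position).
shortest = re.search(r"<a>.*?</a>", text).group()

print(longest)   # <a>x</a><a>y</a>
print(shortest)  # <a>x</a>
```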
An XPath axis: the attribute axis contains the attributes of the context node; the axis will be empty unless the context node is an element.
Additional information for each DTD attribute (tokenizer, type of value).
A component of the CLaRK System Editor. It contains the attributes of the currently selected element node in the editor.
It is a part of the DTD which can be used in the definition of attributes' value type. It is an identifier for text content without any internal structure.
This component is responsible for the mapping between the upper half of the ASCII table and Unicode character encodings. The mapping is needed because of the different capacities of the two standards.
A mapping from symbols (letter, digit, etc.) into numbers represented in one or two bytes.
An XPath axis: the child axis contains the children of the context node.
An operation which removes the current active document from the system editor.
A mechanism for creating user-defined colorization of the XML document elements in the text area of CLaRK. Each scheme is a set of rules which in turn define a color for nodes which meet certain XPath expression requirements (types, names, contexts, etc.).
A tokenizer which is defined on the basis of other complex or primitive tokenizers, i.e. the tokenizer uses in its definitions token categories defined in other tokenizers.
A system tool, which can be used for searching for tokens and mark-up in a specific context. For details see Concordance in the system manual.
A component which is responsible for the management and application of all kinds of Value Constraints. For details see Value Constraints in the system manual.
A construction in the MultiQuery Tool which makes the application of certain operations depend on certain conditions. When the condition of such an operator is true, a certain operation is performed (if-then-else structure). The control operators are: IF (XPath), IF NOT (XPath), IF CHANGED, IF NOT CHANGED, GOTO. For a detailed description see the system manual.
A system function which copies the selected nodes, including their subtrees, to a copy buffer. After applying the Copy operation the buffer contains a set of trees - one for each selected node.
A document which is currently opened in the main editor of the CLaRK System and is activated (currently being edited).
A system function which copies the selected nodes (including their subtrees) to a copy buffer and then deletes the nodes from the tree. After applying the Cut operation, the copy buffer contains a set of trees - one for each selected node.
A specific attribute defined in a DTD for which the following holds: if an element node for which the attribute is defined does not contain the given attribute, it is assumed that the attribute is present with a (default) value stated in the DTD.
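The lookup rule for such an attribute can be sketched as below. The dictionary DTD_DEFAULTS is a hypothetical stand-in for the defaults declared in a DTD, and the element and attribute names are invented for the example; this is not how CLaRK stores the information.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the defaults declared in a DTD:
# (element name, attribute name) -> default value.
DTD_DEFAULTS = {("w", "lang"): "en"}

def effective_attribute(element, name):
    """Return the attribute value, falling back to the declared default."""
    if name in element.attrib:
        return element.attrib[name]
    return DTD_DEFAULTS.get((element.tag, name))

doc = ET.fromstring('<s><w lang="bg">kotka</w><w>cat</w></s>')
print([effective_attribute(w, "lang") for w in doc])  # ['bg', 'en']
```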
An XPath axis: the descendant axis contains the descendants of the context node; a descendant is a child or a child of a child and so on; the descendant axis never contains attribute nodes.
An XPath axis: the descendant-or-self axis contains the context node and the descendants of the context node.
A property of XML documents. The DOCTYPE specifies characteristics related to the structural DTD validation of documents: the main (topmost) element of a document, the name of the corresponding DTD, and a set of DTD definitions. For details see: http://www.w3.org/TR/2000/REC-xml-20001006
The XML document type declaration contains or points to markup declarations that provide a grammar for a class of documents. This grammar is known as a document type definition, or DTD. For details see: http://www.w3.org/TR/2000/REC-xml-20001006.
A component which contains all DTDs, compiled in the system and the information attached to them.
An operation which removes a document (or documents) from the Internal Document Database of CLaRK System.
A system function which allows the user to delete the selected nodes. The children of each selected node are inserted as children of its parent. The system will warn the user if, after the deletion, the structure of the document is not valid.
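A minimal sketch of this kind of deletion, using Python's xml.etree.ElementTree. The function name delete_keep_children and the sample sentence are assumptions; the sketch moves only element children and does not handle text stored directly in the deleted node, nor does it perform any DTD validation.

```python
import xml.etree.ElementTree as ET

def delete_keep_children(parent, node):
    """Remove node from parent and put node's children in its place."""
    index = list(parent).index(node)
    parent.remove(node)
    # Re-insert the children at the position the deleted node occupied.
    for offset, child in enumerate(list(node)):
        parent.insert(index + offset, child)

s = ET.fromstring("<s><np><w>the</w><w>cat</w></np><w>sleeps</w></s>")
delete_keep_children(s, s.find("./np"))
print(ET.tostring(s, encoding="unicode"))
# <s><w>the</w><w>cat</w><w>sleeps</w></s>
```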
A system function which allows the user to delete the selected node(s) in the tree with the entire subtree(s) below it (them). The system will warn the user if, after the deletion, the structure of the document is non-valid.
An abstract grouping of documents. It is useful when a certain operation must be applied to a set of documents. The user does not have to point each time to the documents one by one, but s/he can point to a whole group instead.
A component which is responsible for the management of internal documents. It offers operations such as opening a document, removing a document, and grouping documents in abstract unions. For details see Document Manager in the system manual.
A component which allows for selecting a set of documents which will be sent to a certain tool in the CLaRK System. For details see Document Selector in the system manual.
A part of an XML document comprising an open tag, a close tag and content; an empty element is represented by a single tag. Additionally, an element can have attributes. Because an XML document can be viewed as a tree in which the nodes represent the elements of the document, we also use node as a synonym for element.
Additional information for each DTD element. It includes: tokenizer, type of value, element value (an XPath expression). For details see Element Features in the system manual.
A document which contains only one (root) tag. The tagname is taken from the DOCTYPE of the document's DTD. http://www.w3.org/TR/2000/REC-xml-20001006#dt-markupdecl.
This option allows the user to set the correct mapping from ASCII to Unicode character encoding.
A tool which converts symbols into entities and vice versa in the current document. For details see Entity Converter in the system manual.
A component of the CLaRK System situated at the bottom of the main editor window. It contains all messages for the errors which occurred during validation of the current document.
A system function which expands a folded part of tree representation of the document in the Tree Panel.
When a document is in the Internal Document Database of the CLaRK System it can not be accessed by external programs. In order to use it outside the system, it must be loaded in the main editor and then exported to a file.
A system component that is used for filtering tokens and mark-up when applying different tools (grammars, concordance, constraints). For details see Filters in the system manual.
An XPath axis: the following axis contains all nodes in the same document as the context node and that are after the context node in document order, excluding any descendants and attribute nodes.
An XPath axis: the following-sibling axis contains all the following siblings of the context node; if the context node is an attribute node, the following-sibling axis is empty.
A system environment engine which works in the background of all system activities. It takes care of discarding from the computer memory all resources which are no longer needed (the garbage). Running the garbage collection before or after heavy processing procedures plays a crucial role in the computer's performance.
A group of grammars that are intended to be applied together. For details see Grammar Groups in the system manual.
A basic structure for constructing definitions of Element Values of certain nodes used in the grammar tool. Each Grammar Key contains an XPath expression which points to the substantial information for certain nodes, plus tokenization information (tokenizer, normalization). Each value definition for a certain type of element consists of a sequence of Grammar Keys.
It is a part of the DTD which defines the default value of the attributes. This identifier stands for an attribute which can be present for each tag it is defined for, but not obligatory.
Retrieves a well-formed XML document from a specified URL (if a network connection is available). If the retrieved data cannot be parsed as XML, it can be stored on the hard drive as text data.
An operation which reads an RTF file and represents it as an XML document. The resulting XML document structure follows the TEI.2 DTD. The system reads the following types of information from the source file: heading information (creator, date, title, keywords, etc.), content information (paragraphs, sections, lines) and layout information (bold, italics, underlined). The collected data is inserted in the resulting XML document in the corresponding places provided by the DTD.
An operation which reads a text file without interpreting its content, but representing it as one text node.
An operation which reads an XML document from an external file, parses it and loads it in the editor.
A system function which allows insertion of a first Comment child to each selected node. For each selected node the system will insert an empty comment child.
A system function which allows insertion of a first Comment sibling to each selected node. For each selected node the system will insert an empty comment sibling.
A system function which allows insertion of a first Element child to each selected node. The system will give the user a choice between valid tags (tags which can be inserted at the specified position according to the DTD), all tags (all tags defined in the DTD) and any other user defined tag.
A system function which allows insertion of a following sibling Element node to each selected node. The system gives the user a choice between valid tags (tags which can be inserted at the specified position according to the DTD), all tags (all tags defined in the DTD) and any other user defined tag.
A system function which allows the user to insert a parent of the selected node(s). When there is a multiple selection of nodes and the user chooses this item, the selection is changed so that only the ancestors of the selected nodes (or the nodes themselves) that have a common parent are selected.
A system function which allows insertion of a first Text child to each selected node. For each selected node the system will insert a text child containing a single space character.
A system function which allows insertion of a first Text sibling to each selected node. For each selected node the system will insert a text sibling containing a single space character.
This is a database containing records for all documents imported and saved in the system. The internal documents are NOT visible from outside the system. Each internal document has a DTD attached to it.
A system function which applies a local XSLT transformation over the selected node(s) of the current document. The user must select a document containing an XSLT transformation. For details see XSLT Transformations in the system manual.
A definition concerning the graphical representation of XML documents in CLaRK. A layout can be used either in the tree representation (Tree Layout) or in the text representation (Text Layout) of an opened XML document in the system.
A regular expression for the left context of a grammar rule. For details see Edit Grammar in the system manual.
A notion used in the Text Layout for the documents opened in CLaRK. It defines the number of space characters which are shown in front of the XML tags in order to beautify the text representation in the editor or in the output file.
The longest sequence of tokens and mark-up that matches a regular expression. For details see Edit Grammar in the system manual.
The LookAndFeel defines the style in which the graphical user interface components are painted on the screen. Each operating system (Windows, Unix, Mac, ...) has its specific style (look and feel).
This component is the main functional menu of the CLaRK System. It is situated at the top of the editor window.
A sequence of tokens and mark-up recognized by a regular expression. The supported matching modes are: Longest Match, Shortest Match, Any Up and Any Down. For details see Grammar Manager in the system manual.
A system indicator which appears on the menu bar of CLaRK and shows the amount of memory the system currently uses. Clicking on it activates a Garbage Collection procedure.
An operation which exports a set of documents from the Internal Document Database of the CLaRK System into a specified directory. Related item: Export XML.
An operation which imports a set of documents into the Internal Document Database of the CLaRK System, stored in files from one directory. Related item: Import XML.
A tool which runs different other tools to process the working data. The tool uses Control Operators to perform conditional tool applications.
This option allows one document to be shown in more than one window. For details see New view in the system manual.
A representation of an XML element in the tree format of an XML document. A node could have attributes associated with it. The content of the element is represented as a subtree under the element node.
A procedure which changes tokens before they are used in tools such as regular grammars and sorting. Normalization is defined at the symbol level within a tokenizer. Example: "Love" is normalized to "love" for sorting. Normalization does not change the content of the document.
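The "Love"/"love" example can be sketched in Python with a sort key. Lower-casing is only one possible normalization and is assumed here for illustration; the point of the sketch is that the normalized form is used for comparison while the stored tokens stay unchanged.

```python
tokens = ["love", "Zebra", "Love", "apple"]

def normalize(token):
    """Assumed normalization for the example: map every symbol to lower case."""
    return token.lower()

# The normalized form is used only as the comparison key.
ordered = sorted(tokens, key=normalize)

print(ordered)  # ['apple', 'love', 'Love', 'Zebra']
print(tokens)   # ['love', 'Zebra', 'Love', 'apple'] (content unchanged)
```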
This tool restricts the number of occurrences of certain elements within one document. It is expressed by an XPath expression and two natural numbers defining the minimum and maximum number of nodes selected by the XPath expression. For details see Number Constraints in the system manual.
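A number constraint of this form can be sketched as a count check. The function name and the sample document are assumptions, and xml.etree.ElementTree supports only a subset of XPath, so this illustrates the idea rather than the CLaRK tool itself.

```python
import xml.etree.ElementTree as ET

def check_number_constraint(root, path, minimum, maximum):
    """True if the number of nodes selected by path lies in [minimum, maximum]."""
    count = len(root.findall(path))
    return minimum <= count <= maximum

doc = ET.fromstring("<text><title>T</title><p/><p/><p/></text>")
print(check_number_constraint(doc, "./title", 1, 1))  # True: exactly one title
print(check_number_constraint(doc, ".//p", 1, 2))     # False: three p elements
```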
An operation which opens a document from the internal document database in the system editor.
An XPath axis: the parent axis contains the parent of the context node, if there is one.
This is a kind of Value Constraint in the CLaRK System. It restricts the possible parent nodes of a node or a sequence of nodes, according to the context in which they appear. For details see Value Constraints in the system manual.
A message given by the CLaRK System when the parser detects an error which occurred while checking the well-formedness of a document, when reading it from a file (importing).
A system function which pastes the content of the copy buffer as a first child of the selected node(s). If the copy buffer contains more than one tree, the trees are inserted as neighbor children.
A system function which pastes the content of the copy buffer as a following sibling of the selected node(s). If the copy buffer contains more than one tree, the trees are inserted as neighbor siblings of the selected node(s).
A key word used in the element definition in a DTD. It stands for textual data without tags.
An XPath axis: the preceding axis contains all nodes in the same document as the context node and that are before the context node in document order, excluding any ancestors and attribute nodes.
An XPath axis: the preceding-sibling axis contains all the preceding siblings of the context node; if the context node is an attribute node, the preceding-sibling axis is empty.
A tokenizer which assigns categories only to symbols. The categories of a primitive tokenizer are defined by a list of Unicode characters. For details see Tokenizers in the system manual.
A standard way for encoding meta information in an XML document. This information usually is targeted to an XML processor. The CLaRK System does not use processing-instructions and ignores them when parsing a document.
This is short for Tool Query, which is an XML document describing a set of options of a certain tool. On the basis of such a description a tool can be applied without any additional information and without opening the corresponding tool dialog window. The MultiQuery Tool relies heavily on these descriptions. The tools supporting the query technique are:
- XPath Insert Sibling
- XPath Insert Child
- XPath Insert Attribute
- XPath Insert Parent
- Concordance
- Constraints (Value)
- Extract
- Grammar
- Grammar Group
- XPath Remove
- XPath Rename
- Sort
- Statistics
- Text Replace
- XSLT
- MultiQuery Tool
It is a part of the DTD which defines the default value of the attributes. This identifier stands for an attribute which MUST be present for each tag it is defined for.
The CLaRK System can import only well-formed XML documents. When the user tries to import a non-well-formed document, an error occurs. The error has to be corrected outside the system. This function allows for a new attempt for importing the same file without dialogs for choosing the file.
A system function which applies a Regular Expression Constraint over the content of the selected node(s).
A tool which allows for checking the validity of the content of certain document elements according to constraints defined by regular expressions. It is useful for textual content or when the content of some elements is defined by a disjunction in the DTD. Application of these constraints is context sensitive. For details see Regular Expression Constraints in the system manual.
An XPath expression which can use an arbitrary node as a context node, in contrast with an Absolute XPath Expression, which always uses the root node as the context node.
A system function which allows the user to rename the selected element nodes in the tree.
An XML mark-up related to a grammar rule. Each word recognized by the grammar rule is substituted by this XML mark-up. If the recognized word has to be included in the result of the application of the rule, it can be cited by the variable \w. For details see Edit Grammar in the system manual.
A regular expression for the right context of a grammar rule. For details see Edit Grammar in the system manual.
The RTF is a format for text and graphics interchange that can be used with different output devices, operating environments, and operating systems. RTF uses the ANSI, PC-8, Macintosh, or IBM PC character set to control the representation and formatting of a document. The system supports importing of RTF documents and representing them in XML according to TEI.2 DTD. For details see Import RTF.
An operation which saves the current active document into the Internal Document Database.
An XPath axis: the self axis contains just the context node itself.
A mechanism for assigning certain key combinations to trigger certain actions: applying different types of operations on the current document in the editor. For details see Definitions/Shortcut in the system manual.
The shortest sequence of tokens and mark-up that matches a regular expression. For details see Grammar Manager in the system manual.
A kind of Value Constraint in the CLaRK System. It restricts the possible attributes and their values of a node, according to the context in which it appears. For details see Value Constraints in the system manual.
A kind of Value Constraint in the CLaRK System. It restricts the possible child nodes of a node, according to the context in which it appears. For details see Value Constraints in the system manual.
A system function which sorts some nodes in a document according to their sort keys defined by XPath expressions. For details see Sort Tools in the system manual.
A definition of values used for comparing elements during sort operation. It is defined by an XPath expression. The value for each element is the list of nodes returned after the evaluation of the XPath expression. The list is additionally processed according to some options for normalization of text, reverse of text and similar.
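A sort key of this kind can be sketched with Python's xml.etree.ElementTree. The path ./lemma, the lower-casing used as text normalization and the sample lexicon are assumptions made for the example, not CLaRK settings.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<lexicon>"
    "<entry><lemma>walk</lemma></entry>"
    "<entry><lemma>Run</lemma></entry>"
    "<entry><lemma>jump</lemma></entry>"
    "</lexicon>"
)

def sort_key(entry):
    """List of normalized values of the nodes selected by the key expression."""
    return [(node.text or "").lower() for node in entry.findall("./lemma")]

# Reorder the children of the root element according to their keys.
doc[:] = sorted(doc, key=sort_key)
print([entry.findtext("lemma") for entry in doc])  # ['jump', 'Run', 'walk']
```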
A component of the CLaRK System Editor. It is a field at the bottom of the main editor window. It is used to display system messages.
Each concordance document has the structure of a table. This function allows the user to see a concordance document as a table and to work with it. The table view of the concordance document is connected with the XML representation of the document. For details see Table View in the system manual.
A component of the CLaRK System Editor. The area shows the current document in a textual XML format.
This component determines the way an XML document will be shown in the text area. The options are: showing/hiding tags with/without their contents, drawing some tags on new lines, inserting leading offsets, using Color Schemes. For more details see Edit Text Layout in the system manual.
A sequence of characters in a text grouped on the basis of their appearance.
A name for a class of tokens.
An expression describing tokens. A token description can be a token category or a sequence of Unicode symbols surrounded by double quotes. A token description can contain wildcard symbols: # for any number of symbols, % for exactly one symbol and @ for one or zero symbols. The wildcard symbols and the double quote symbol can be used literally in a token description when preceded by the symbol ^. The symbol ^ itself is represented by the sequence ^^. Example: "love#" is a description of love, loves, loved, lover and many others.
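The wildcard rules can be sketched by translating the quoted part of a token description into a Python regular expression. This is an illustration of the rules as stated in this entry, not the CLaRK matcher; the function names are assumptions, and the treatment of a trailing ^ is a choice made for the sketch.

```python
import re

# Wildcards as listed in the glossary entry, mapped to regex fragments.
WILDCARDS = {"#": ".*", "%": ".", "@": ".?"}

def description_to_regex(description):
    """Compile the text between the double quotes of a token description."""
    parts = []
    chars = iter(description)
    for ch in chars:
        if ch == "^":
            # ^ makes the next symbol literal; a trailing ^ is kept as itself.
            parts.append(re.escape(next(chars, "^")))
        elif ch in WILDCARDS:
            parts.append(WILDCARDS[ch])
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def matches(description, token):
    """True if the whole token is described by the description."""
    return description_to_regex(description).fullmatch(token) is not None

print([t for t in ["love", "loves", "lover", "glove"] if matches("love#", t)])
# ['love', 'loves', 'lover']
```

With the same helper, "colo@r" describes both color and colour, and "100^%" describes only the literal token 100%.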
A system tool which is used for segmenting raw texts into tokens. For details see Tokenizers in the system manual.
An XML document describing a set of options of a certain tool. On the basis of such a description a tool can be applied without any additional information and without opening the corresponding tool dialog window. The MultiQuery Tool relies heavily on these descriptions. When the user applies a tool from the tool dialog, s/he can save the current settings as a tool query in the corresponding tool query group.
This component determines the way an XML document will be shown in the tree area. For each tag name from a DTD the user can define a string pattern describing the desired appearance of the corresponding elements in the tree. The patterns are based on retrieving related data by XPath expressions. Furthermore, each pattern defines a color for the tags.
A component of the CLaRK System Editor. The panel shows the tree structure of the current document.
A standard for character encoding. Each symbol is represented in one, two, three or four bytes. For details see: http://www.unicode.org/.
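The varying number of bytes per symbol can be observed in Python. The sketch assumes the UTF-8 encoding form, which this entry does not name, and the sample characters are chosen only for illustration.

```python
# Sample characters, written as escapes so the source stays pure ASCII.
samples = {
    "A": "\u0041",         # LATIN CAPITAL LETTER A
    "ya": "\u044f",        # CYRILLIC SMALL LETTER YA
    "euro": "\u20ac",      # EURO SIGN
    "grin": "\U0001f600",  # GRINNING FACE
}

# Number of bytes each character occupies when encoded as UTF-8.
utf8_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
print(utf8_lengths)  # {'A': 1, 'ya': 2, 'euro': 3, 'grin': 4}

# ASCII cannot represent characters outside its 7-bit range.
try:
    samples["ya"].encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False
```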
An address of a web page or other data resource (pictures, sounds, etc.) on the world wide web.
This tool allows the user to define his/her own keyboard binding. For details how to use keyboard bindings, see Keyboard in the system manual.
An XML document is valid if it has an associated document type declaration (DTD) and if the document complies with the constraints expressed in it.
A message reported by the CLaRK System when an error occurs while checking the validity of a document according to its DTD.
A tool which restricts the content of certain elements according to the context in which they appear. The types of Value Constraints are: Parent, All Children, Some Children, and Some Attributes. For details see Value Constraints in the system manual.
A collection of Value Constraints which is used when a set of constraints must be applied in a specific context. For details see Value Constraints in the system manual.
An XML document, which meets all the well-formedness constraints given in the following specification: http://www.w3.org/TR/2000/REC-xml-20001006.
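A well-formedness check can be sketched with the parser in Python's standard library, which raises ParseError on input that is not well-formed. The function name is an assumption; the check says nothing about validity against a DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """True if text parses as XML, False if the parser reports an error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

print(is_well_formed("<a><b/></a>"))  # True
print(is_well_formed("<a><b></a>"))   # False: <b> is never closed
```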
A syntactic element from the Token description language, which allows describing a set of tokens by leaving some of the characters in them underspecified. The symbols are as follows: '#' - corresponds to a sequence (possibly empty) of characters; '%' - corresponds to exactly one character; '@' - zero or one character.
A tool which recognizes an XML document from any source in general. In the CLaRK System, this source is usually an external file.
An XML query language which uses the tree representation of XML documents (DOM tree). For more details, see the XPath language specification: http://www.w3.org/TR/1999/REC-xpath-19991116.
An element from the XPath syntax which is used for selecting nodes from a certain context node in the DOM tree.
The following axes are available:
self
child
descendant
descendant-or-self
parent
ancestor
ancestor-or-self
following-sibling
following
preceding-sibling
preceding
attribute
The functions in XPath are divided into four categories, depending on the data structures they work with. The four data structures are: boolean, number, string, node set. Below, object stands for any of the four data structures.
last() : number
position() : number
count(node-set) : number
name(node-set?) : string
string(object?) : string
concat(string, string, string*) : string
starts-with(string, string) : boolean
contains(string, string) : boolean
substring-before(string, string) : string
substring-after(string, string) : string
substring(string, number, number?) : string
string-length(string?) : number
normalize-space(string?) : string
translate(string, string, string) : string
boolean(object) : boolean
not(boolean) : boolean
true() : boolean
false() : boolean
number(object?) : number
sum(node-set) : number
floor(number) : number
ceiling(number) : number
round(number) : number
Extended XPath functions:
A tool for automatic attribute insertion based on XPath selection. The supplied attribute-value specification is inserted, one by one, into the Element nodes selected by the XPath expression.
A tool for automatic Element/Text node insertion based on XPath selection. The new nodes are inserted as child nodes of the Element nodes selected by the XPath evaluation. The user supplies the new node definition (tag name / text content) and the position in which the insertions should be performed.
A tool for automatic parent Element node insertion based on XPath selection. The user specifies the target nodes with an XPath expression and the tag name of the parent nodes to be inserted.
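The operation can be sketched with Python's xml.etree.ElementTree. The function name insert_parent, the tag tok and the sample document are assumptions, and the sketch gives every selected node its own new parent, which may differ from how the CLaRK tool treats adjacent selected nodes.

```python
import xml.etree.ElementTree as ET

def insert_parent(root, path, tag):
    """Wrap every element selected by path in a new parent element named tag."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for node in root.findall(path):
        parent = parent_of.get(node)
        if parent is None:  # the root element has no parent to attach to
            continue
        index = list(parent).index(node)
        wrapper = ET.Element(tag)
        # Text following the node stays where it was, now after the wrapper.
        wrapper.tail, node.tail = node.tail, None
        parent.remove(node)
        wrapper.append(node)
        parent.insert(index, wrapper)
        parent_of[node] = wrapper
        parent_of[wrapper] = parent

doc = ET.fromstring("<s><w>New</w><w>York</w></s>")
insert_parent(doc, "./w", "tok")
print(ET.tostring(doc, encoding="unicode"))
# <s><tok><w>New</w></tok><tok><w>York</w></tok></s>
```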
A tool for automatic Element/Text node insertion based on XPath selection. The new nodes are inserted as sibling nodes (preceding or following) of the nodes selected by the XPath evaluation. The user supplies the new node definition (tag name / text content) and the position (before or after the selected nodes) in which the insertions should be performed.
A general named entity which contains an XPath expression and other specific information. It is designed to be used in the application of other tools. Major representatives of XPath Keys are: Grammar Keys, Sort Keys, Table Sort Keys.
A means of giving names to XPath expressions and using them later in other expressions, referring to them only by name. The XPath of a macro must be a valid expression, which can use other XPath Macros.
The node tests are divided into two categories: node type tests and node name tests. Here are the node type tests implemented in the system:
text()
text(<text>)
text(<mode>,<text>), mode = 1, 2, 3 or 4.
text(<mode>,<y|n>,<"(" regular expression ")">), mode = 1, 2, 3 or 4; normalization = y for "yes" and n for "no"; regular expression = pattern of the searched string. Example: text(3,y,("play#"|"replay#"))
node(). Short form: *
element()
attribute()
attribute(<attributeName>)
attribute(<attributeName> = "<attributeValue>")
The name node tests are used to filter the initial node-set for element nodes with a given name. Example: child::para
The predicates in XPath are boolean expressions which are evaluated according to a specific context node. They are enclosed by square brackets ('[', ']') and can contain combinations of: location paths; function calls; union operators ('|'); additive operators ('+', '-'); multiplicative operators ('*', 'div', 'mod'); logical operators ('and', 'or'); equality ('=').
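A few simple predicates can be tried with Python's xml.etree.ElementTree. That module implements only a small subset of XPath (attribute tests, positions and last(), but not the 'and', 'or' or arithmetic operators listed here), so the sketch illustrates the bracket syntax rather than the full language supported by CLaRK. The sample sentence and attribute names are invented.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<s><w pos="N">cat</w><w pos="V">sleeps</w><w pos="N">dog</w></s>'
)

# Equality predicate on an attribute value.
nouns = [w.text for w in doc.findall("./w[@pos='N']")]

# Positional predicate: the second w child (positions start at 1).
second = doc.find("./w[2]").text

# Function call inside a predicate: the last w child.
final = doc.find("./w[last()]").text

print(nouns)   # ['cat', 'dog']
print(second)  # sleeps
print(final)   # dog
```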
A tool for automatic XML data removal based on XPath selection. The nodes returned by the XPath expression evaluation can be of any type (Element, Text, Attribute, etc.).
A tool for automatic Element nodes renaming based on XPath selection. The nodes to be renamed must be of type Element. All other types of nodes are discarded.
A tool which performs transformations over XML documents using XPath for navigation. For details see XPath Transformations in the system manual.