The goal of this demo is to show how the Node Info tool of the CLaRK system can be used to add meta-information about an XML document to the document itself.
The information is stored in the format recommended by the Text Encoding Initiative (TEI) standard.
The Node Info tool adds information about the number of token occurrences, grouped by token type, and/or the number of element occurrences within the document. The type of a token is its Token Category according to a tokenizer. The elements to be counted are specified by an XPath expression.
If the XML document is valid against the TEI.2 DTD, the result is stored in the header element; otherwise the result is inserted as the first child of the root element.
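The placement rule above can be sketched in a few lines of Python. This is only an illustration, not CLaRK's implementation: the function name `store_result` is hypothetical, the validity check is passed in as a flag (the standard `xml.etree.ElementTree` module does not validate against a DTD), and appending to the end of `teiHeader` is an assumption about where in the header the result goes.

```python
import xml.etree.ElementTree as ET


def store_result(root, result, is_valid_tei):
    """Attach `result` to the document following the rule described above.

    `is_valid_tei` is supplied by the caller (assumption: validity against
    the TEI.2 DTD has been checked elsewhere).
    """
    header = root.find("teiHeader") if is_valid_tei else None
    if header is not None:
        # Valid TEI document: keep the result inside the header.
        header.append(result)
    else:
        # Otherwise: insert the result as the first child of the root.
        root.insert(0, result)
    return root
```

For example, calling `store_result(root, ET.Element("extent"), False)` on a non-TEI document leaves the new element as the first child of the root.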
Our goal here is to add information about the tokens in the text nodes of the document - how many tokens of each Token Category are in the document.
The document used for this demo is Standart20030524.tag.
The tokenizer is MixedWord, which is defined within the system. If you would like to see the token types defined within this tokenizer, select the Tokenizers item from menu Definitions.
In order to run the demo you have to perform the following steps:
1. Make sure that Standart20030524.tag is saved in the system. If it is not, load it as described in Import XML Tool Demo and save it in the system.
2. Select the Node Info item from menu Tools.
3. In Enter XPath for node: write the following XPath expression: //text/descendant-or-self::text() which selects all text nodes inside the text element(s) of the document(s).
4. Select the Word Info option.
5. In Choose Tokenizer choose MixedWord.
6. Select Add Documents.
The information gives the number of occurrences for each Token Category. It can be added to the <extent> element in the teiHeader of the document and/or saved in another document.
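To make the counting step concrete, here is a minimal Python sketch of what "number of occurrences for each Token Category" means for the text nodes selected by //text/descendant-or-self::text(). It is not the CLaRK implementation: the categories WORD, NUMBER, SPACE and OTHER are hypothetical stand-ins (the real MixedWord tokenizer defines its own categories), and the function names are illustrative only.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical token categories; MixedWord's real categories may differ.
TOKEN_PATTERN = re.compile(
    r"(?P<WORD>[^\W\d_]+)|(?P<NUMBER>\d+)|(?P<SPACE>\s+)|(?P<OTHER>.)",
    re.DOTALL,
)


def text_chunks(elem, inside=False):
    """Yield every text node located inside a <text> element."""
    inside = inside or elem.tag == "text"
    if inside and elem.text:
        yield elem.text
    for child in elem:
        yield from text_chunks(child, inside)
        # A child's tail belongs to `elem`, so it counts only if `elem`
        # itself is inside a <text> element.
        if inside and child.tail:
            yield child.tail


def count_token_categories(root):
    """Count tokens per category over the selected text nodes."""
    counts = Counter()
    for chunk in text_chunks(root):
        for match in TOKEN_PATTERN.finditer(chunk):
            counts[match.lastgroup] += 1
    return counts
```

Text in the header is ignored because it is not inside a text element, which mirrors the XPath expression used in step 3.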
Our goal here is to add information about the elements in the document - how many elements are in the document.
The document used for this demo is Standart20030524.tag.
In order to run the demo you have to perform the following steps:
1. Make sure that Standart20030524.tag is saved in the system. If it is not, load it as described in Import XML Tool Demo and save it in the system.
2. Select the Node Info item from menu Tools.
3. In Enter XPath for node: write the following XPath expression: //body (all the descendant nodes of the selected node will be counted).
4. Select the Tag Info option.
5. Select Add Documents.
The information gives the number of occurrences for each element. It can be added to the <encodingDesc> element in the teiHeader of the document and/or saved in another document.
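The element count can be illustrated with a similar Python sketch. It is an approximation under stated assumptions, not the tool's own logic: the function name `count_elements` is hypothetical, the selected body element itself is assumed not to be counted (only its descendants), and body elements are assumed not to be nested inside one another.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(root, tag="body"):
    """Count descendant elements, by tag name, of every <tag> element."""
    counts = Counter()
    for node in root.iter(tag):
        for descendant in node.iter():
            # node.iter() also yields the node itself; skip it
            # (assumption: only descendants are counted).
            if descendant is not node:
                counts[descendant.tag] += 1
    return counts
```

Iterating over `count_elements(root).items()` then gives one (element name, number of occurrences) pair per element type, which is the kind of information the Tag Info option reports.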