This is a short description of the DTD that defines the XML documents used to serialize Regular Expression Constraints in the CLaRK System. Each document is a collection of one or more serialized Regular Expression Constraints.

<!DOCTYPE CLaRK_reg_constraints [

<!ELEMENT CLaRK_reg_constraints (comment?,regConstraint+)>

    The comment element is optional and is not processed by the system.

<!ELEMENT comment (#PCDATA)>

<!ELEMENT regConstraint (name, regExpr, tokenizer?, filter?, defaultXPath?)>

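    Element regConstraint represents a single Regular Expression Constraint in the system. Its sub-elements name and regExpr are required; tokenizer, filter and defaultXPath are optional. The sub-elements must appear in the order given in the declaration above, and each contains plain text only. Their declarations follow: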

<!ELEMENT name (#PCDATA)>

<!ELEMENT regExpr (#PCDATA)>

<!ELEMENT filter (#PCDATA)>

<!ELEMENT tokenizer (#PCDATA)>

<!ELEMENT defaultXPath (#PCDATA)>
]>
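
The content models declared in the DTD can also be checked without a validating parser. The following is a minimal sketch in Python, assuming only the standard library; the function name check_constraints and the error messages are hypothetical and not part of the CLaRK System. It compares the sequence of child element names against regular expressions that mirror the two content models.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the DTD, written as regexes over child tag names,
# where every tag name is followed by a single space.
ROOT_MODEL = re.compile(r"(comment )?(regConstraint )+")
CONSTRAINT_MODEL = re.compile(
    r"name regExpr (tokenizer )?(filter )?(defaultXPath )?"
)


def child_tags(element):
    """Return the child tag names as one string, each followed by a space."""
    return "".join(child.tag + " " for child in element)


def check_constraints(xml_text):
    """Return a list of problems; an empty list means the structure matches."""
    root = ET.fromstring(xml_text)
    if root.tag != "CLaRK_reg_constraints":
        return ["unexpected root element: " + root.tag]
    problems = []
    if not ROOT_MODEL.fullmatch(child_tags(root)):
        problems.append("root must contain (comment?, regConstraint+)")
    # Only direct children of the root are inspected.
    for index, constraint in enumerate(root.findall("regConstraint"), start=1):
        if not CONSTRAINT_MODEL.fullmatch(child_tags(constraint)):
            problems.append(
                "regConstraint %d has unexpected sub-elements or order" % index
            )
    return problems
```

This sketch checks only element names and their order. It does not verify that the leaf elements contain text only, so it is not a substitute for validation against the DTD itself.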

An example with two serialized Regular Expression Constraints:

<CLaRK_reg_constraints>
    <regConstraint>
        <name>reg constraint name 1</name>
        <regExpr>&lt;a&gt;,&lt;b&gt;?,&lt;c&gt;+</regExpr>
        <tokenizer>MixedWord</tokenizer>
        <defaultXPath>/child::para</defaultXPath>
    </regConstraint>
    <regConstraint>
        <name>reg constraint name 2</name>
        <regExpr>$NUMBER+,$SPACE*</regExpr>
        <tokenizer>Default</tokenizer>
        <defaultXPath>//pages</defaultXPath>
    </regConstraint>
</CLaRK_reg_constraints>
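
A document of this form can be read with any XML parser. The following is a minimal sketch in Python, assuming only the standard library; the function name load_constraints and the dictionary layout are illustrative choices, not part of the CLaRK System. Optional sub-elements that are absent are returned as None, and the parser unescapes entities such as &lt; and &gt; in the regular expression.

```python
import xml.etree.ElementTree as ET

# Sub-elements of regConstraint, in the order declared by the DTD.
FIELDS = ("name", "regExpr", "tokenizer", "filter", "defaultXPath")


def load_constraints(xml_text):
    """Return one dict per regConstraint; missing optional fields are None."""
    root = ET.fromstring(xml_text)
    return [
        {field: constraint.findtext(field) for field in FIELDS}
        for constraint in root.findall("regConstraint")
    ]


# A shortened copy of the first constraint from the example document.
EXAMPLE = """<CLaRK_reg_constraints>
    <regConstraint>
        <name>reg constraint name 1</name>
        <regExpr>&lt;a&gt;,&lt;b&gt;?,&lt;c&gt;+</regExpr>
        <tokenizer>MixedWord</tokenizer>
        <defaultXPath>/child::para</defaultXPath>
    </regConstraint>
</CLaRK_reg_constraints>"""

for item in load_constraints(EXAMPLE):
    # prints: reg constraint name 1 -> <a>,<b>?,<c>+
    print(item["name"], "->", item["regExpr"])
```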
