Thunderstone Search Appliance Manual

Entity XML Elements

The XML schema for entities is shown below. Optional elements are enclosed in square brackets; elements that may repeat are followed by an ellipsis. The syntax is largely compatible with the Google Search Appliance's Entity Recognition XML format, though there are some differences.

<?xml version="1.0"?>
<instances>
  <instance>
    <name>Counties</name>
    [<case_sensitive>N</case_sensitive>]
    [<apply_case>as_is</apply_case>]
    [<store_term_or_name>term</store_term_or_name>]
    [<store_regex_or_name>regex</store_regex_or_name>]
    [<pattern>(?:[[:upper:]]\w+\s+)+County</pattern>
     ...]
    [<term>Adams County</term>
     ...]
  </instance>
  ...
</instances>

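The root element is <instances>, which contains one or more entities, each defined in an <instance> element. Each <instance> has a required <name> child and may also contain the optional children shown in the schema above: <case_sensitive>, <apply_case>, <store_term_or_name>, <store_regex_or_name>, and any number of <pattern> and <term> elements.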

Note that more than one entity may be defined in a file, since the <instance> element defining an entity may occur repeatedly.
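
As an illustration, a complete file defining two entities might look like the sketch below, with the brackets and ellipses of the schema notation removed. The first entity reuses the values from the schema above; the second entity's name and terms (States, Ohio, Texas) are hypothetical, and the element order is assumed to follow the schema listing.

```xml
<?xml version="1.0"?>
<instances>
  <!-- First entity: matched by a pattern and by a literal term -->
  <instance>
    <name>Counties</name>
    <case_sensitive>N</case_sensitive>
    <pattern>(?:[[:upper:]]\w+\s+)+County</pattern>
    <term>Adams County</term>
  </instance>
  <!-- Second entity: hypothetical name and terms, for illustration only -->
  <instance>
    <name>States</name>
    <term>Ohio</term>
    <term>Texas</term>
  </instance>
</instances>
```

Here the optional elements that are omitted (such as <apply_case> in both entities) are simply left out, and <term> is repeated within a single <instance>.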


Copyright © Thunderstone Software     Last updated: Jul 28 2017
