System performance can vary widely with the type of information processed; a system optimized for terrorist tracking is likely to perform poorly on financial earnings reports. Acquiring knowledge for a new domain relies heavily on pattern-based rules and lexical forms. For this reason, Janya uses a variety of methods to adapt quickly to unfamiliar domains.
Automatic Lexicon Creation
When introducing the Semantex™ system to a new domain, a user may provide a set of exemplars specific to the content. Domain-independent processing of these exemplars produces a new lexicon, which can then be fed back into the engine as knowledge for that domain.
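The text above does not say how Semantex derives the lexicon, so the following is only an illustrative sketch of the general idea, not Janya's method. It assumes a simple frequency heuristic: terms that recur in the exemplars but are absent from a general-purpose vocabulary become candidate lexicon entries. The function name build_domain_lexicon, the general_vocab set, and the min_count threshold are all hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Set


def build_domain_lexicon(
    exemplars: Iterable[str],
    general_vocab: Set[str],
    min_count: int = 2,
) -> Dict[str, int]:
    """Return candidate domain terms with their counts.

    Assumption (not from the source): a term is a candidate when it occurs
    at least `min_count` times across the exemplars and is not already in
    the general vocabulary.
    """
    counts: Counter = Counter()
    for doc in exemplars:
        # Lowercase and keep alphabetic runs only; a real system would use
        # a proper tokenizer here.
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    return {
        term: n
        for term, n in counts.items()
        if n >= min_count and term not in general_vocab
    }


# Hypothetical exemplars from a financial-earnings domain.
exemplars = [
    "Quarterly EBITDA rose; EBITDA margin improved.",
    "The company reported EBITDA and revenue.",
]
general_vocab = {
    "the", "and", "company", "reported", "rose",
    "improved", "quarterly", "margin", "revenue",
}
lexicon = build_domain_lexicon(exemplars, general_vocab)
```

In this sketch the resulting dictionary is what would be "fed back" as domain knowledge; how the real engine consumes a lexicon is not specified in the text.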
Lexicon-Grammar Development Environment (LGDE)
Customized grammars let users specify new events and relationships or perform advanced entity tagging for a domain. Grammar writing is aided by a development environment, the LGDE, designed specifically for this purpose. Lexicon grammars are descriptions of the lexical and syntactic patterns that indicate the occurrence of an information object of interest. Grammars are run as finite state transducers over the tokenlist structure as part of the NLP engine.
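To make the idea of running a grammar as a finite state transducer over a token list more concrete, here is a minimal sketch. It is not the LGDE grammar formalism or the Semantex tokenlist format, neither of which is described here; the Token type, the tag names, and the ACQUISITION pattern are hypothetical. Each transition reads one token and emits one output label, and a match is reported when the final state is reached.

```python
from typing import Callable, List, NamedTuple, Tuple


class Token(NamedTuple):
    text: str
    tag: str  # hypothetical tag set, e.g. "ORG", "VERB", "O"


# A transition is (source state, test on the token, output label, target state).
Transition = Tuple[int, Callable[[Token], bool], str, int]

# Hypothetical grammar for an acquisition event: ORG, trigger verb, ORG.
ACQUISITION: List[Transition] = [
    (0, lambda t: t.tag == "ORG", "ACQUIRER", 1),
    (1, lambda t: t.text.lower() in {"acquired", "bought"}, "TRIGGER", 2),
    (2, lambda t: t.tag == "ORG", "ACQUIRED", 3),
]
FINAL_STATE = 3


def run_transducer(
    tokens: List[Token],
    transitions: List[Transition],
    final_state: int,
) -> List[List[Tuple[str, str]]]:
    """Scan the token list and return one (label, text) list per match."""
    matches = []
    start = 0
    while start < len(tokens):
        state, i, output = 0, start, []
        while i < len(tokens) and state != final_state:
            # Take the first transition from the current state that accepts
            # the current token, if there is one.
            step = next(
                (
                    (label, target)
                    for src, test, label, target in transitions
                    if src == state and test(tokens[i])
                ),
                None,
            )
            if step is None:
                break
            output.append((step[0], tokens[i].text))
            state = step[1]
            i += 1
        if state == final_state:
            matches.append(output)
            start = i  # resume scanning after the matched span
        else:
            start += 1  # no match here; try the next start position
    return matches


tokens = [
    Token("Acme", "ORG"),
    Token("acquired", "VERB"),
    Token("Globex", "ORG"),
    Token("yesterday", "O"),
]
events = run_transducer(tokens, ACQUISITION, FINAL_STATE)
```

Here events holds a single match that labels "Acme" as ACQUIRER, "acquired" as TRIGGER, and "Globex" as ACQUIRED. A production grammar would also need optional and repeated elements, which this linear pattern leaves out.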
For more information, see Semantex Workbench™.