Domain Customization


A system's performance can vary widely with the type of information it processes; a system optimized for terrorist tracking will perform poorly on financial earnings reports. Acquiring new knowledge relies heavily on pattern-based rules and lexical forms. In recognition of this, Janya uses a variety of methods to adapt quickly to unfamiliar domains, providing a broad range of customization options that let you fine-tune the Semantex™ system for a variety of domains and tasks.


Automatic Lexicon Creation

When introducing the Semantex system to a new domain, a user may provide a set of sample documents specific to that domain. Domain-independent processing of these documents yields a new set of terms, which form a lexicon that can subsequently be fed back into the engine as knowledge for the new domain.
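
As a rough illustration of the idea, and not a description of how Semantex itself builds lexicons, the sketch below collects terms that are noticeably more frequent in a set of sample domain documents than in a general background set. The function names, the thresholds, and the frequency-ratio heuristic are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_lexicon(domain_docs, background_docs, min_count=2, ratio=2.0):
    """Return terms that are over-represented in the domain documents.

    Hypothetical heuristic: keep a term if it occurs at least `min_count`
    times in the domain sample and its relative frequency there is at
    least `ratio` times its (smoothed) background frequency.
    """
    domain = Counter(t for doc in domain_docs for t in tokenize(doc))
    background = Counter(t for doc in background_docs for t in tokenize(doc))
    domain_total = sum(domain.values()) or 1
    background_total = sum(background.values()) or 1

    lexicon = []
    for term, count in domain.items():
        if count < min_count:
            continue
        domain_rate = count / domain_total
        # Add one to the count so unseen background terms do not divide by zero.
        background_rate = (background[term] + 1) / (background_total + 1)
        if domain_rate / background_rate >= ratio:
            lexicon.append(term)
    return sorted(lexicon)


domain_docs = [
    "Quarterly earnings rose. Earnings per share beat estimates.",
    "The company reported earnings and revenue growth; revenue rose.",
]
background_docs = [
    "The weather was nice and the company picnic was fun.",
    "The dog rose early and the cat slept.",
]
lexicon = build_lexicon(domain_docs, background_docs)
```

The resulting term list could then be reviewed and supplied to the engine as domain knowledge; how that feedback step works in practice depends on the product and is not shown here.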


Semantex Workbench

Customized grammars are the means by which users can define new entity types, specify new events and relationships, or perform advanced entity tagging for a domain. Grammar writing is aided by a development environment, the Semantex Workbench™, designed specifically for this purpose. Lexicon Grammars are descriptions of the lexical and syntactic patterns that indicate the occurrence of an information object of interest. Grammars are run as finite-state transducers over the tokenlist structure as part of the NLP engine.
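
To make the finite-state idea concrete, here is a minimal sketch of a pattern run over a list of tokens that emits tagged spans. It does not use Semantex grammar syntax, which is not shown on this page; the token classes, the state table, and the PERSON pattern (a title followed by one or more capitalized words) are hypothetical choices for illustration only.

```python
import re

TITLES = {"Mr.", "Ms.", "Dr."}


def classify(token):
    """Map a token to a coarse class used on the transitions."""
    if token in TITLES:
        return "TITLE"
    if re.fullmatch(r"[A-Z][a-z]+", token):
        return "CAP"
    return "OTHER"


# state -> {token class -> next state}
TRANSITIONS = {
    "START": {"TITLE": "AFTER_TITLE"},
    "AFTER_TITLE": {"CAP": "IN_NAME"},
    "IN_NAME": {"CAP": "IN_NAME"},
}
ACCEPTING = {"IN_NAME"}


def tag_persons(tokens):
    """Return (start, end, tag) spans for the longest match at each position."""
    spans = []
    i = 0
    while i < len(tokens):
        state, j, last_accept = "START", i, None
        while j < len(tokens):
            next_state = TRANSITIONS.get(state, {}).get(classify(tokens[j]))
            if next_state is None:
                break
            state = next_state
            j += 1
            if state in ACCEPTING:
                last_accept = j  # remember the longest accepted prefix
        if last_accept is not None:
            spans.append((i, last_accept, "PERSON"))
            i = last_accept  # resume scanning after the matched span
        else:
            i += 1
    return spans


tokens = ["Yesterday", "Dr.", "Jane", "Smith", "visited", "Boston"]
spans = tag_persons(tokens)
```

Each span gives the start index, the end index (exclusive), and the emitted tag, so the matched tokens here are `tokens[1:4]`. A production grammar would cover far more patterns and draw on lexical and syntactic features beyond capitalization.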


For more information, see Semantex Workbench.
