Pre-Processing Tools


Janya has developed stand-alone tools to facilitate handling of a wide range of unstructured text formats (intelligence message traffic, open source news, newswires, etc.). Semantex™ includes interfaces uniquely adapted to handle the results of these systems.


CASE RESTORATION

Information extraction, like a human reader, performs much better when the document is in normal mixed case. Many sources provide documents either all in lower case or all in upper case. Case Restoration uses machine learning methods to automatically convert documents lacking case information to standard mixed-case text, which allows more accurate entity identification and parsing.
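As a rough illustration of the idea, the sketch below learns the most frequent surface form of each word from a small mixed-case sample and applies it to text that has lost its case. This is a simple frequency-based stand-in, not Janya's actual method; the function names `train_case_model` and `restore_case` and the sample sentences are hypothetical.

```python
import re
from collections import Counter, defaultdict

TOKEN = re.compile(r"\w+|\W+")   # alternating runs of word / non-word characters
WORD = re.compile(r"\w+")
SENTENCE_END = re.compile(r"[.!?]")


def train_case_model(mixed_case_text):
    """Map each lowercased word to its most frequent mixed-case form.

    Sentence-initial words are skipped because their capital letter
    reflects position, not the word's usual form.
    """
    counts = defaultdict(Counter)
    at_sentence_start = True
    for token in TOKEN.findall(mixed_case_text):
        if WORD.fullmatch(token):
            if not at_sentence_start:
                counts[token.lower()][token] += 1
            at_sentence_start = False
        elif SENTENCE_END.search(token):
            at_sentence_start = True
    return {key: forms.most_common(1)[0][0] for key, forms in counts.items()}


def restore_case(text, model):
    """Rewrite all-upper or all-lower text using the learned forms."""
    out = []
    at_sentence_start = True
    for token in TOKEN.findall(text):
        if WORD.fullmatch(token):
            # Unknown words fall back to lower case.
            restored = model.get(token.lower(), token.lower())
            if at_sentence_start:
                restored = restored[0].upper() + restored[1:]
            out.append(restored)
            at_sentence_start = False
        else:
            out.append(token)
            if SENTENCE_END.search(token):
                at_sentence_start = True
    return "".join(out)


sample = "Analysts at Janya met in Phoenix. The team from Janya flew to Arizona."
model = train_case_model(sample)
print(restore_case("THE TEAM MET AT JANYA IN PHOENIX. JANYA FLEW TO ARIZONA.", model))
```

A production system would typically use context (neighbouring words, position, part of speech) rather than a single most-frequent form per word, which is where machine learning methods come in.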


RETRIEVERS

Semantex includes a number of retrievers that understand how to talk to commercial content providers such as Lexis-Nexis, Factiva and Dialog. These retrievers simplify the task of scheduling searches and downloads of documents for processing by Semantex. Define your queries once, specify login credentials and let Semantex do the rest.


SMART FILTERING

Semantex provides filtering capabilities that allow you to specify portions of a document to skip, such as standard boilerplate text in press releases or news reports. By ignoring repetitive content that has no value to your goals, Semantex helps you focus processing on the things that matter and provides more relevant results for the task at hand.
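To make the idea concrete, here is a minimal sketch of rule-based filtering, assuming skip rules are expressed as regular expressions. It does not reflect the Semantex configuration format; the pattern lists, the function name `filter_document` and the sample release are all hypothetical.

```python
import re

# Hypothetical rules: lines matching these patterns are dropped individually.
SKIP_LINE_PATTERNS = [
    re.compile(r"^for immediate release\b", re.IGNORECASE),
    re.compile(r"^contact:", re.IGNORECASE),
]

# Hypothetical rules: a matching line and everything after it is dropped.
SKIP_REST_PATTERNS = [
    re.compile(r"^about the company\b", re.IGNORECASE),
]


def filter_document(text):
    """Return the text with boilerplate lines and trailing sections removed."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if any(p.search(stripped) for p in SKIP_REST_PATTERNS):
            break  # the remainder of the document is boilerplate
        if any(p.search(stripped) for p in SKIP_LINE_PATTERNS):
            continue  # skip this single line only
        kept.append(line)
    return "\n".join(kept).strip()


release = (
    "FOR IMMEDIATE RELEASE\n"
    "Janya joins partners to create Savanna Solution.\n"
    "Contact: press office\n"
    "About the Company\n"
    "Standard company description repeated in every release."
)
print(filter_document(release))
```

Only the sentence carrying new information reaches downstream processing; the header, contact line and closing company description are skipped.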

News & Events

Meet with Janya representatives at DoDIIS 2010 in Phoenix, Arizona!



Janya Webinar on Customizable Text Analytics Solutions.



Janya joins partners to create Savanna Solution.



Mark Logic's Open Enrichment Framework features Semantex