Customizable Text Extraction

Semantex is designed from the ground up to support highly modular, configurable language processing for dynamic, domain- and problem-specific information needs.

 

Configurations in Semantex allow custom dictionaries and patterns to be added quickly. These capture domain-specific concepts, entity types, events and relationships, augmenting the baseline that Semantex provides by default.

For example, a configuration to monitor web content for key news about Fortune 500 companies may be built using a dictionary of the company names, their products and services, and so on. The configuration would also contain patterns that leverage Semantex linguistic features to extract specific events such as management changes or contract awards.
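The idea of pairing a dictionary with an event pattern can be illustrated with a minimal Python sketch. This is not Semantex configuration syntax, which is not shown here; the company names, the `COMPANIES` list, the regular expression and the `extract_management_changes` function are all hypothetical, and a plain regular expression stands in for the richer linguistic features a Semantex pattern would use.

```python
import re

# Hypothetical dictionary of company names (placeholders, not real data).
COMPANIES = ["Acme Corp", "Globex"]

# Build an alternation from the dictionary so the pattern only fires
# on names the configuration knows about.
_company_alt = "|".join(re.escape(name) for name in COMPANIES)

# Hypothetical "management change" pattern: <company> named/appointed <person> as <role>.
MANAGEMENT_CHANGE = re.compile(
    rf"(?P<company>{_company_alt}) (?:named|appointed) "
    r"(?P<person>[A-Z][a-z]+ [A-Z][a-z]+) as (?P<role>CEO|CFO|chairman)"
)


def extract_management_changes(text):
    """Return one dict per management-change mention found in the text."""
    return [m.groupdict() for m in MANAGEMENT_CHANGE.finditer(text)]


print(extract_management_changes("Globex appointed Jane Smith as CFO on Monday."))
```

Extending the dictionary widens coverage without touching the pattern, which is the separation of concerns the configuration approach relies on.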

For sophisticated patterns and tailored statistical modules, Janya's professional services team, composed of the same linguists and engineers who develop the core functionality of Semantex™, is available.

Configurations for Different File Types

In addition to domain-specific dictionaries and patterns, configurations can also use metadata from XML or text file types. In particular, Semantex patterns can use time and date information in the document metadata to resolve relative time and date mentions in the document. File types with metadata relevant to text analytics include support request memos, emails, etc.

An example of a relative date mention is “tomorrow” in the sentence “tomorrow, the support team will be visiting the customer location to resolve the issue.” When the document date is available, “tomorrow” is resolved to a specific date in the standard TIMEX2 format.
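The resolution step described above can be sketched in a few lines of Python. This is an illustration only, not how Semantex implements it: the `RELATIVE_OFFSETS` table and the `resolve_relative_date` function are hypothetical, only three relative words are covered, and the output is a plain ISO 8601 calendar date of the kind a TIMEX2 value would carry, not a full TIMEX2 annotation.

```python
from datetime import date, timedelta

# Hypothetical offsets, in days, for a few relative date words.
RELATIVE_OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def resolve_relative_date(mention: str, document_date: date) -> str:
    """Resolve a relative mention against the document date (YYYY-MM-DD)."""
    offset = RELATIVE_OFFSETS[mention.lower()]
    return (document_date + timedelta(days=offset)).isoformat()


# A memo dated 14 March 2012 that says "tomorrow" refers to 15 March 2012.
print(resolve_relative_date("tomorrow", date(2012, 3, 14)))  # 2012-03-15
```

Without the document date from the metadata, the mention cannot be anchored, which is why the file type matters for this kind of pattern.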

Date and time information extracted by Semantex patterns provides the “when” for events. This allows events to be filtered by occurrence time or reported time and plotted on a timeline.
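Once each event carries a resolved date, filtering and timeline ordering become simple operations. The following Python sketch assumes a hypothetical event record with `type` and `when` fields; the sample events and the `events_between` function are invented for illustration and do not reflect Semantex's actual output format.

```python
from datetime import date

# Hypothetical extracted events, each with a resolved occurrence date.
EVENTS = [
    {"type": "contract_award", "when": date(2012, 3, 15)},
    {"type": "management_change", "when": date(2012, 1, 9)},
    {"type": "contract_award", "when": date(2011, 11, 30)},
]


def events_between(events, start, end):
    """Keep events with start <= when <= end, in chronological order."""
    selected = [e for e in events if start <= e["when"] <= end]
    return sorted(selected, key=lambda e: e["when"])


# Events from 2012 only, ordered for display on a timeline.
for event in events_between(EVENTS, date(2012, 1, 1), date(2012, 12, 31)):
    print(event["when"].isoformat(), event["type"])
```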

Copyright © 2008-2012 Janya Inc. All rights reserved.