Janya Journal Processor

The Janya Journal Processor™ analyzes, identifies, and extracts key information from the text of articles in PDF format using innovative natural language processing capabilities.

For example, metadata such as title, author, references, journal name, and institution can be mined from scientific journal articles.

Further, because the Janya Journal Processor™ is based on the Semantex™ platform, it can also extract and tag domain-specific categories of information, such as chemicals, products, diseases, symptoms, and interactions, from free-text sections. The Janya Journal Processor™ is designed to be extensible and to enrich content management and advanced search applications.

The Janya Journal Processor™ is a unique tool that leverages Semantex’s customizable machine learning and grammar rule techniques to extract metadata using document style information, while also providing traditional Semantex™ capabilities for text analysis.

 

Make Your Scientific and Technical Content Discoverable

The Janya Journal Processor™ ingests documents in PDF format and provides dynamic metadata and Semantex™ domain-specific concept categories in a configurable XML format that can be easily indexed by a variety of search and content management systems.
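As a rough illustration of what an indexable XML record of this kind might look like, the Python sketch below serializes a few extracted fields and tagged concepts. The element names (`article`, `metadata`, `concepts`, `concept`), the `build_record` helper, and the sample values are all assumptions made for illustration; they are not the product's actual schema or API.

```python
import xml.etree.ElementTree as ET


def build_record(metadata, concepts):
    """Serialize extracted fields into a simple XML record (hypothetical layout)."""
    article = ET.Element("article")

    # One child element per metadata field; list values (e.g. several
    # authors) become repeated elements with the same tag.
    meta = ET.SubElement(article, "metadata")
    for field, value in metadata.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            ET.SubElement(meta, field).text = item

    # Domain-specific concepts carry their category as an attribute.
    tags = ET.SubElement(article, "concepts")
    for category, text in concepts:
        ET.SubElement(tags, "concept", category=category).text = text

    return ET.tostring(article, encoding="unicode")


# Made-up sample data, not output from the real tool.
record = build_record(
    {
        "title": "An Example Article",
        "author": ["A. Author", "B. Author"],
        "journal": "Example Journal",
    },
    [("chemical", "aspirin"), ("disease", "migraine")],
)
print(record)
```

A search or content management system could then map each element or attribute to an index field; which fields appear, and under what names, is exactly the kind of detail a configurable output format would let an integrator change.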

 

Adapt Metadata Extraction to Custom Knowledge Requirements

The Janya Journal Processor™ leverages document style information, such as font sizes and formatting, to identify individual sections and fields within documents. The tool is extensible to a range of PDF article formats and languages to meet custom knowledge requirements. In addition, the full range of the Semantex™ platform’s customizable text analysis capabilities is available to any Janya Journal Processor™ solution for deep domain-specific contextual analysis and extraction.
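To make the idea of style-based field identification concrete, here is a minimal Python sketch of one simple heuristic: treat the most common font size as body text, the largest size as the title, and anything larger than body text (or bold at body size) as a section heading. The `Span` type, the `label_spans` function, and the sample spans are hypothetical; the actual product is described as using customizable machine learning and grammar rules, which this toy heuristic does not attempt to reproduce.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Span:
    """A run of text with its style attributes (hypothetical structure)."""
    text: str
    font_size: float
    bold: bool = False


def label_spans(spans):
    """Label each span as 'title', 'heading', or 'body' from font style alone."""
    if not spans:
        return []

    # Body size: the font size that covers the most characters.
    coverage = Counter()
    for span in spans:
        coverage[span.font_size] += len(span.text)
    body_size = coverage.most_common(1)[0][0]
    largest = max(span.font_size for span in spans)

    labels = []
    title_seen = False
    for span in spans:
        if span.font_size == largest and largest > body_size and not title_seen:
            labels.append(("title", span.text))
            title_seen = True
        elif span.font_size > body_size or (span.bold and span.font_size == body_size):
            labels.append(("heading", span.text))
        else:
            labels.append(("body", span.text))
    return labels


# Made-up spans standing in for text extracted from a PDF page.
sample = [
    Span("An Example Article Title", 18.0, bold=True),
    Span("Abstract", 12.0, bold=True),
    Span("This paper describes a method for extracting metadata from articles.", 10.0),
    Span("1 Introduction", 12.0, bold=True),
    Span("Scientific articles follow layout conventions that vary by publisher.", 10.0),
]
labels = label_spans(sample)
for label, text in labels:
    print(f"{label:8} {text}")
```

Fixed thresholds like these break down quickly across publishers and languages, which is one reason an extensible, trainable approach is attractive for custom article formats.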

 

Copyright © 2008-2012 Janya Inc. All rights reserved.