Our unified Information Access prototype processes unstructured content with the text annotation tools of the developer's choice and integrates the annotated content with structured data. Developers can configure these tools to perform entity and relationship extraction, categorization, sentiment analysis, and more. Customer-supplied or open source services can be plugged into the framework as required. The framework then passes the extracted information through a series of annotation services that developers can configure to filter, alter, and enrich the data, for example by querying external data sources for additional metadata.
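The two-stage flow described above (pluggable extraction tools followed by configurable annotation services) can be illustrated with a minimal sketch. This is not the prototype's actual API: the names `Annotation`, `run_pipeline`, `capitalized_word_extractor`, `drop_short`, and `enrich_from_lookup` are hypothetical, and the in-memory `LOOKUP` table merely stands in for an external data source.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Annotation:
    """One piece of extracted information (hypothetical structure)."""
    text: str
    kind: str
    metadata: Dict[str, str] = field(default_factory=dict)


# An extractor turns raw text into annotations; a service transforms annotations.
Extractor = Callable[[str], List[Annotation]]
Service = Callable[[List[Annotation]], List[Annotation]]


def capitalized_word_extractor(document: str) -> List[Annotation]:
    """Toy stand-in for an entity extractor: tags capitalized words."""
    words = [w.strip(".,;:") for w in document.split()]
    return [Annotation(w, "entity") for w in words if w[:1].isupper()]


def drop_short(annotations: List[Annotation]) -> List[Annotation]:
    """Filtering service: discard annotations of three characters or fewer."""
    return [a for a in annotations if len(a.text) > 3]


# Assumption: a local table standing in for an external metadata source.
LOOKUP = {"DTIC": "Defense Technical Information Center"}


def enrich_from_lookup(annotations: List[Annotation]) -> List[Annotation]:
    """Enrichment service: attach extra metadata when the lookup has it."""
    for a in annotations:
        if a.text in LOOKUP:
            a.metadata["expansion"] = LOOKUP[a.text]
    return annotations


def run_pipeline(document: str,
                 extractors: List[Extractor],
                 services: List[Service]) -> List[Annotation]:
    """Run every extractor, then pass the results through each service in order."""
    annotations: List[Annotation] = []
    for extract in extractors:
        annotations.extend(extract(document))
    for service in services:
        annotations = service(annotations)
    return annotations


result = run_pipeline(
    "The report was filed with DTIC in May.",
    extractors=[capitalized_word_extractor],
    services=[drop_short, enrich_from_lookup],
)
```

Because extractors and services are plain callables in this sketch, swapping in a different tool or reordering the services is a matter of changing the two lists passed to `run_pipeline`.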
Our prototype integrates data provided by the Defense Technical Information Center (DTIC), including unstructured Technical Report documents and structured document metadata (citations) pertaining to the reports. We extracted information from within the documents, transformed the "content into context", and linked it to the citations. Additionally, we linked the citations to form a Unified Information application.
Anzo is a complete software suite for enterprise data management based on W3C Semantic Web technology standards for connecting data. The Anzo platform supports many data formats, including Excel spreadsheets, relational databases, and Java objects (via the Java API); our implementation uses its unstructured data integration component.
Our Document Annotation prototype automatically "interprets" a document and highlights terms of interest to the user. The annotator can adapt to cover broad topics or precise elements as mission requirements dictate.
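As a rough illustration of highlighting terms of interest, the sketch below marks occurrences of a configurable term list in a piece of text. It is a simplified assumption, not the prototype's method: the function `highlight` and the `[[ ]]` markers are hypothetical, and plain case-insensitive matching stands in for whatever interpretation the annotator actually performs.

```python
import re
from typing import Iterable


def highlight(text: str, terms: Iterable[str],
              left: str = "[[", right: str = "]]") -> str:
    """Wrap each whole-word occurrence of a term in marker strings."""
    # Try longer terms first so a phrase wins over a word it contains.
    ordered = sorted({t for t in terms if t}, key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: f"{left}{m.group(0)}{right}", text)


sentence = "The technical report covers radar and sonar."

# A broader term list versus a single precise phrase.
broad = highlight(sentence, ["radar", "sonar"])
precise = highlight(sentence, ["technical report"])
```

Changing the term list is the only adjustment needed to move between broad topics and precise elements in this sketch.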