Conatix is a Berlin- and London-based startup and a spin-off project of the Humboldt-Universität zu Berlin. It is developing an online semi-automated business intelligence (business research) system based on recent advances in machine learning. Conatix applied and tested Stanbol in the business intelligence domain as part of the IKS Early Adopter program.
The primary use-case for the Conatix semi-automated business intelligence system is the department of a large or medium-sized company or financial institution that is engaged in external research (sourcing and structuring new information that comes from outside the organization). This could include the new product development department evaluating new product ideas, the marketing department exploring new market segments and competitors, the strategy or planning department building future macroeconomic scenarios, the legal or compliance department assessing new and existing regulatory requirements, or the investment analysis department analyzing new investment opportunities. In all of these cases, one or more researchers will use the Conatix business intelligence system to organize their research work and to store and share their research results.
The aim of the project was to test the hypothesis that the IKS CMS could be useful for dynamic real-time content management by non-technical business users as they process new and existing documents recommended to them by the Conatix business intelligence system. We found that Stanbol and VIE features are indeed useful for content management at the point of direct user contact with content in our system (the frontend user interface), and that Stanbol can also enhance the discovery of new content in the backend of our system. IKS text enhancement functionalities add value to the business research process by making it possible to visualize important and useful terms within a given text. Furthermore, Apache Stanbol extends the knowledge that the user derives from a text by proposing terms that are related to the initial text. Our system can then feed these new terms into our own machine learning-based integrated web crawler/classifier engine to further improve the relevance of its results.
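As an illustration of the backend use of text enhancement, the following minimal sketch builds the HTTP request that submits a piece of plain text to a Stanbol enhancer. It assumes a Stanbol instance reachable at `http://localhost:8080` with the enhancer exposed under `/enhancer`; the function name `build_enhancer_request` and the sample sentence are hypothetical and are not part of the Conatix system.

```python
import urllib.request


def build_enhancer_request(text, base_url="http://localhost:8080"):
    """Build (but do not send) a POST request for a Stanbol enhancer.

    Assumption: the enhancer is mounted at <base_url>/enhancer and accepts
    plain text, returning the enhancement results as RDF (Turtle here).
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/enhancer",
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": "text/turtle",
        },
        method="POST",
    )


# Hypothetical sample text; any document body could be passed instead.
request = build_enhancer_request(
    "Conatix is a startup based in Berlin and London."
)

# Sending the request requires a running Stanbol instance:
# with urllib.request.urlopen(request) as response:
#     enhancements = response.read().decode("utf-8")
```

The request is built separately from sending it so that the sketch can be inspected without a running server. The terms found in the returned enhancements could then be passed on to a crawler or classifier.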
User-defined highlighting within text: At the beginning of the project, we had the impression that Stanbol tools or widgets provide user-defined highlighting within a block of text, and we wanted to integrate this functionality into our system. When the user selects a block of text, the selected text is highlighted in a different color so that it attracts attention the next time the user views the same document or page. Our development team came up with this idea after seeing the highlighting feature of Annotate.js. In fact, neither VIE.js nor Annotate.js offers user-defined text highlighting. However, this misconception did not prevent us from taking another approach to implementing this functionality in our system.
Named entity recognition with Annotate.js and VIE.js: The Conatix HTML frontend shows results generated by our integrated web crawler/document classifier as a list of tagged documents. To give the result pool more context, we use VIE and Annotate.js to recognize named entities and link them to further information. This helps users speed up their research and improves the experience of using the application by highlighting and visualizing important terms. Because the primary use-case for our business intelligence system is marketing and investment research, an appropriate future extension would be to link this feature to a database of company names.
User-customized notes with Hallo.js: To further enhance the results generated by our integrated web crawler/document classifier, we use Hallo.js to give our customers a tool to add and edit notes associated with relevant documents and to highlight important terms. The features provided by Hallo.js enable the user to format the notes for better readability.
User-defined important term mining: Documents found by our users will be enhanced using Stanbol, which will provide a better overall experience for our customers. The Conatix semi-automated business intelligence system is being prepared to link terms highlighted by the VIE database with relevant documents discovered by our users while using our system. Apache Stanbol can be configured to use a user-defined vocabulary list for text enhancement. One important part of the Conatix system is a text classifier that generates a dictionary of frequent and important terms in user-identified documents. This dictionary can then be turned into a Stanbol-friendly vocabulary list and fed into Apache Stanbol so that its entity recognition chain can use this vocabulary. As a consequence, the text enhancement will become more tailored to the user-identified documents over time.
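The conversion from a term dictionary to a vocabulary list could look roughly like the following sketch. It assumes that the classifier output is a mapping from term to frequency and that the vocabulary is supplied to Stanbol as SKOS-style RDF in N-Triples form; the function names, the `min_count` threshold, and the `http://example.org/vocab/` base URI are hypothetical, and the step of loading the resulting file into Stanbol is not shown.

```python
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS_CONCEPT = "http://www.w3.org/2004/02/skos/core#Concept"
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )


def terms_to_ntriples(term_counts, base_uri, min_count=2, lang="en"):
    """Turn a {term: frequency} dictionary into SKOS concepts (N-Triples).

    Terms seen fewer than min_count times are dropped. Each remaining term
    becomes one skos:Concept with the term itself as its preferred label.
    """
    lines = []
    for term, count in sorted(term_counts.items()):
        if count < min_count:
            continue
        # Derive a URI-safe identifier from the term (hypothetical scheme).
        slug = quote(term.strip().lower().replace(" ", "-"), safe="-")
        subject = base_uri + slug
        lines.append(f"<{subject}> <{RDF_TYPE}> <{SKOS_CONCEPT}> .")
        lines.append(
            f'<{subject}> <{SKOS_PREF_LABEL}> '
            f'"{escape_literal(term)}"@{lang} .'
        )
    return "\n".join(lines)


# Hypothetical classifier output: term -> number of occurrences.
term_counts = {"private equity": 12, "due diligence": 7, "the": 1}
vocabulary = terms_to_ntriples(term_counts, "http://example.org/vocab/")
print(vocabulary)
```

Regenerating the vocabulary whenever the classifier dictionary changes is one way the enhancement could follow the users' documents over time.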