With the variety of content management systems (CMS) available, it has become very easy to publish content on the web, or even to manage a community website, with no programming or web design skills. As a result, the volume of user- and publisher-generated content is increasing rapidly, and managing this data effectively while keeping it usable has become a challenge. Apache Stanbol is an open source technology stack designed to aid this process without requiring any structural changes in the CMS. Semantic enhancement of unstructured text (among many other useful features) is available over RESTful endpoints, which makes it very easy to integrate into existing workflows. The modular design, easy administration, extensibility and vibrant community make it a very powerful framework. The automatic processing, analysis and enrichment of text is not an easy task, so appropriate tools are needed that can be easily integrated into Apache Stanbol.
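To make the RESTful integration a bit more concrete, here is a minimal Python sketch of a request to the Stanbol enhancer. It assumes a local Stanbol launcher at `http://localhost:8080` with the enhancer mounted at `/enhancer`. The helper names `build_enhancer_request` and `enhance` are hypothetical, and the `Accept` value is just one RDF serialization you might ask for, so adjust both the URL and the headers to your installation.

```python
import urllib.request

# Assumption: a local Stanbol launcher exposing the enhancer at /enhancer.
STANBOL_ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhancer_request(text, url=STANBOL_ENHANCER_URL, accept="application/rdf+xml"):
    """Build (but do not send) a POST request carrying the plain text to enhance."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Assumed serialization of the enhancement results.
            "Accept": accept,
        },
        method="POST",
    )


def enhance(text, **kwargs):
    """Send the request and return the serialized enhancement results as a string."""
    request = build_enhancer_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With a Stanbol instance running at the assumed address, calling `enhance("Berlin is the capital of Germany.")` would return the enhancement results for that sentence. Because the CMS only has to send text and read back the response, no structural changes on the CMS side are needed.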
DBpedia Spotlight – Introduction
DBpedia Spotlight is open source software designed to step up to this task. It automatically annotates mentions of DBpedia resources in text and covers the whole analysis life cycle: from entity detection (spotting), to candidate selection (the possible DBpedia resources a mention might refer to), and, last but not least, disambiguation in case there are multiple candidates for a single entity. Pablo Mendes (co-founder of DBpedia Spotlight) and I, Iavor Jelev (CTO at babelmonkeys / GzEvD), were very happy to integrate the functionality of DBpedia Spotlight into Apache Stanbol as part of the early adopters programme. In this blog post we want to give you an overview of the integration, details on the new EnhancementEngines and EnhancementChains, and tips on how to use them. If you are not familiar with Apache Stanbol or the concept of an EnhancementEngine, please refer to the great post by Anuj Kumar, which covers these subjects in detail. Thanks to Anuj for this; I wish his post had been available when we were starting out with developing our EnhancementEngines. We actually intended to write a similar introduction to the development process of an engine, but Anuj has done a great job, so we will build on his post. If you are already familiar with the concepts of EnhancementEngine and EnhancementChain, you should be able to follow this report easily.
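The three stages mentioned above (spotting, candidate selection, disambiguation) can be illustrated with a toy Python sketch. This is not DBpedia Spotlight's actual algorithm or data: the tiny `LEXICON`, the context keywords and the word-overlap scoring are all made-up assumptions, chosen only to show how the stages fit together.

```python
import re

# Hypothetical mini-lexicon: surface form -> candidate resources with context words.
# Entries and scoring are illustrative assumptions, not Spotlight's data or model.
LEXICON = {
    "paris": [
        ("http://dbpedia.org/resource/Paris", {"france", "capital", "city", "seine"}),
        ("http://dbpedia.org/resource/Paris_Hilton", {"hotel", "heiress", "celebrity"}),
    ],
    "berlin": [
        ("http://dbpedia.org/resource/Berlin", {"germany", "capital", "city"}),
    ],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def spot(tokens):
    """Spotting: keep the tokens that match a known surface form."""
    return [token for token in tokens if token in LEXICON]


def select_candidates(mention):
    """Candidate selection: all resources the mention might refer to."""
    return LEXICON[mention]


def disambiguate(mention, tokens):
    """Disambiguation: pick the candidate sharing the most words with the context."""
    context = set(tokens) - {mention}
    best = max(select_candidates(mention), key=lambda cand: len(cand[1] & context))
    return best[0]


def annotate(text):
    """Run the three stages and map each spotted mention to one resource URI."""
    tokens = tokenize(text)
    return {mention: disambiguate(mention, tokens) for mention in spot(tokens)}
```

In this sketch, `annotate("Paris is the capital of France.")` links the mention to the city, while `annotate("The celebrity Paris opened a hotel.")` picks the other candidate, because the surrounding words overlap with a different keyword set.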
