It’s that time of year again! Time to pause and look back at the achievements of IKS in 2011. I have selected a series of blog posts as my guide. To recap, the vision of IKS is to bring semantic technologies as open source components to small and medium-sized CMS providers. But Seth Grimes says it best:
“The IKS project aims to add semantically rooted capabilities to content management: semantic search, content enrichment, support for intelligent user interfaces (“semantic navigation”), even support for reasoning, for automated inference, over managed content. Yet the semantic annotations that affix meaning to managed content are rarely born with the content. It’s not that users are lazy. Rather, they’re writing for an immediate audience and not for wider reuse, and authoring and publishing tools do not make it easy to create annotated content, or even to create a rich set of metadata describing content”.
The major contributions of IKS to the CMS space are Apache Stanbol, VIE and VIE Widgets. The summary of blog posts below documents our work and results in this area in 2011. Stay tuned for the next series of posts, which will cover what the industry is doing with IKS technology, cool applications and demos, early results from our IKS UX contest, and what to expect from IKS in 2012.
“A major contribution of IKS to the CMS space is Apache Stanbol, the open source semantic enhancements engine that is being developed as immediately usable software for existing content management systems. While traditional metadata services are usually covered by CMSes, Apache Stanbol provides semantic lifting of the textual content: the automatic detection of “Named Entities” such as persons, places and locations and their linking to external sources, e.g. to dbpedia descriptions of resources”
“Apache Stanbol now enables you to upload your own custom vocabulary to annotate unstructured text with related web documents indexed to that vocabulary. This particular enhancement engine is called “Keyword Linking Engine”. The engine, together with the “Entity Hub” for managing local terminologies, has been designed and written by Rupert Westenthaler. The enhanced content along with the entities can then be used in more advanced semantic search applications.”
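To make the idea of linking text against a custom vocabulary concrete, here is a minimal sketch. It is not the Keyword Linking Engine's implementation or API: the vocabulary, the example.org URIs and the function name `link_keywords` are all hypothetical, and matching is reduced to a case-insensitive, longest-label-first lookup.

```python
import re

# Hypothetical custom vocabulary: label -> entity URI (illustrative only).
VOCABULARY = {
    "content management system": "http://example.org/vocab/CMS",
    "semantic search": "http://example.org/vocab/SemanticSearch",
    "linked data": "http://example.org/vocab/LinkedData",
}


def link_keywords(text, vocabulary):
    """Return (start, end, surface form, URI) for each vocabulary label found in text."""
    annotations = []
    taken = []  # character spans already claimed by a longer label
    # Try longer labels first so shorter labels cannot overlap them.
    for label in sorted(vocabulary, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(label) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            start, end = match.span()
            if any(start < e and s < end for s, e in taken):
                continue  # skip matches that overlap an earlier annotation
            taken.append((start, end))
            annotations.append((start, end, match.group(0), vocabulary[label]))
    return sorted(annotations)  # ordered by position in the text
```

Each annotation keeps the character offsets of the match, so a caller could highlight the phrase in an editor or index the linked URI alongside the document.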
In today’s digital age, more organizations are publishing and sharing their data as “linked data”. According to the latest statistics, the Linked Open Data cloud includes 30 billion RDF triples. This open data is a valuable information source for content repositories. It contains many hierarchies that can be used to classify the documents in a content repository, or the entities representing actual content. We decided to implement a new feature within the CMS Adapter component of Apache Stanbol to exploit these large sets of RDF data on the web.
Most European small to medium-sized companies manage digital content in multiple languages, so a semantic engine that can understand multiple languages would be an asset. Apache Stanbol provides multilingual features for some European languages. These linguistic capabilities depend on the capabilities of OpenNLP (Apache Incubator), especially the availability of part-of-speech taggers. The following languages are supported by the Keyword Linking Engine of Apache Stanbol: English, German, Danish, Swedish, Dutch, and Portuguese.
We have codenamed this work “index pipeline”. It brings together the components Apache Stanbol Contenthub, Apache Solr, Linked Media Framework (LMF) and LDPath. With the integration of these components, users can now build richer indexes that make use of content semantics such as named entities and concepts from thesauri. This approach extends the keyword-based indexing and search techniques available in most Content Management Systems, without affecting the underlying search infrastructure.
The Apache Stanbol Reasoners module provides RDF-enabled systems with a set of reasoning services to exploit automatic inference engines. The module implements a common API for reasoning services, making it possible to plug in different reasoners and configurations in parallel. The module includes OWL API and Jena based abstract services, with concrete implementations for the Jena RDFS, OWL, OWLMini and HermiT reasoning services.
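The idea of a common API with pluggable reasoners can be sketched in a few lines. This is an illustration only and not Stanbol's Java API: the `Reasoner` interface, the `SubclassReasoner` class and the `run_reasoners` helper are hypothetical names, and the toy reasoner covers just two RDFS rules (subclass transitivity and type inheritance) rather than full RDFS or OWL inference.

```python
from typing import Protocol, Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


class Reasoner(Protocol):
    """Hypothetical common interface that every pluggable reasoner implements."""

    def infer(self, triples: Set[Triple]) -> Set[Triple]:
        ...


class SubclassReasoner:
    """Toy reasoner: subclass transitivity and type inheritance only."""

    def infer(self, triples: Set[Triple]) -> Set[Triple]:
        closure = set(triples)
        changed = True
        while changed:  # repeat until no rule produces a new triple
            changed = False
            subclass_pairs = [(s, o) for s, p, o in closure if p == SUBCLASS_OF]
            new = set()
            for sub, sup in subclass_pairs:
                for s, p, o in closure:
                    if p == SUBCLASS_OF and o == sub:
                        new.add((s, SUBCLASS_OF, sup))  # transitivity
                    if p == RDF_TYPE and o == sub:
                        new.add((s, RDF_TYPE, sup))  # type inheritance
            if not new <= closure:
                closure |= new
                changed = True
        return closure - set(triples)  # only the inferred triples


def run_reasoners(reasoners, triples):
    """Apply each plugged-in reasoner to the same input, keyed by class name."""
    return {type(r).__name__: r.infer(triples) for r in reasoners}
```

Because callers depend only on `infer`, a second reasoner with a different rule set could be added to the list passed to `run_reasoners` without changing any calling code.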
Traditional content management systems are monolithic beasts. Just to make your website editable you need to accept the web framework imposed by the system, the templating engine used by the system, and the editing tools used by the system. Want to have a better user interface? Be prepared to rewrite your whole website, and for the pain of having to migrate content between different storage systems. But none of this should be necessary. When web editing tools were more immature, it made sense for the same people to build the whole stack from database content models to web page generation and editing tools. But that was ten years ago; now we can do better.
VIE can do quite a bit to make applications smarter. If you’re looking for simple in-page editing, check out the VIE-powered Create UI. There are several user interface widgets built on top of VIE. You can easily drop these widgets into your web application to gain the immediate benefit of semantic interaction: Create, an inline content editing tool, and Annotate.js, an automatic content annotation tool.