A Reference Architecture for Semantic CMS

The IKS project is about bringing semantic technologies to CMS vendors. But we are not only producing technology with Apache Stanbol. We are also addressing the broader research themes of how users create, query, consume and interact with knowledge-enriched content. One of our latest research papers, written by Fabian Christ and Benjamin Nagel, presents our findings on a reference architecture for semantic CMS.

Fabian Christ, Benjamin Nagel: A Reference Architecture for Semantic Content Management Systems. In M. Nüttgens, O. Thomas, B. Weber (eds.): Proceedings of the Enterprise Modelling and Information Systems Architectures Workshop 2011 (EMISA’11), Hamburg (Germany). GI, LNI, vol. P-190, pp. 135-148 (2011)

From the abstract:

Content Management Systems (CMS) lack the ability of managing semantic information that is part of the content stored in a CMS. On the other hand, a lot of research has been done in the field of Information Extraction (IE) and Information Retrieval (IR), respectively. Additionally, the vision of the Semantic Web yields to new software components that make semantic technology usable for application developers. In this paper, we combine IE/IR concepts with the technologies of the Semantic Web and propose a new family of CMS, called Semantic CMS (SCMS), with advanced semantic capabilities. We provide a reference architecture for SCMS and prove its value along two implementations. One implementation was created as part of the Interactive Knowledge Stack research project and another one in a one-year student project exploring the design of an SCMS for the software engineering domain.

In this post we would like to present the main figure of the reference architecture. Fig. 1 shows a conceptual architecture for a semantic CMS. It consists of two columns: the content column and the knowledge column. On top of these two columns sits an integrating roof that combines content features with knowledge features and enables a Semantic User Interface.

Fig. 1: Semantic CMS Architecture

The content column contains the architecture of a traditional CMS with a User Interface layer on top. Inside we have layers for Content Administration, Management, Data Model, and Repository encapsulated by a Content Access layer. You will find something similar to this in most CMS today.

The interesting extension is the knowledge column. Its layers provide semantic features that enhance the content from the content column and give you more knowledge about it. On a technical level, you can think of this knowledge as metadata about the content in the content column on the left.

The knowledge column starts with a Semantic User Interaction layer on top and a Knowledge Access layer underneath. One of the most important layers is the Knowledge Extraction Pipelines layer, which provides the natural language processing features that extract metadata from your content. This layer is also referred to as the Semantic Lifting layer. Once you have extracted metadata about your content, you can use the Reasoning layer to derive even more knowledge by combining new and existing knowledge.
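To make the interplay of extraction and reasoning a bit more tangible, here is a minimal Python sketch. It is not taken from the paper or from Apache Stanbol; all names (`extract_places`, `run_pipeline`, `reason`, the `mentions` and `locatedIn` predicates, the tiny place list) are assumptions made up for illustration. Knowledge is represented as simple (subject, predicate, object) triples.

```python
from typing import Callable, Iterable, Set, Tuple

# A piece of knowledge as a (subject, predicate, object) triple.
Triple = Tuple[str, str, str]
# An extraction step turns (content id, text) into a set of triples.
ExtractionStep = Callable[[str, str], Set[Triple]]

# Hypothetical gazetteer; a real pipeline would use NLP components instead.
KNOWN_PLACES = {"Hamburg", "Paderborn"}


def extract_places(content_id: str, text: str) -> Set[Triple]:
    """Toy extraction step: emit a 'mentions' triple per known place name."""
    words = {word.strip(".,;:()") for word in text.split()}
    return {(content_id, "mentions", place) for place in KNOWN_PLACES & words}


def extract_length(content_id: str, text: str) -> Set[Triple]:
    """Toy extraction step: classify the text as 'short' or 'long'."""
    bucket = "long" if len(text.split()) > 200 else "short"
    return {(content_id, "lengthClass", bucket)}


def run_pipeline(content_id: str, text: str,
                 steps: Iterable[ExtractionStep]) -> Set[Triple]:
    """Run every extraction step and merge the resulting metadata."""
    knowledge: Set[Triple] = set()
    for step in steps:
        knowledge |= step(content_id, text)
    return knowledge


def reason(extracted: Set[Triple], existing: Set[Triple]) -> Set[Triple]:
    """Toy rule: X mentions P and P locatedIn C implies X relatedTo C."""
    located_in = {(s, o) for s, p, o in existing if p == "locatedIn"}
    return {
        (subject, "relatedTo", country)
        for subject, predicate, place in extracted
        if predicate == "mentions"
        for known_place, country in located_in
        if known_place == place
    }
```

A possible usage: run `run_pipeline("doc1", "The workshop took place in Hamburg.", [extract_places, extract_length])` to get the extracted metadata, then pass the result together with existing knowledge such as `{("Hamburg", "locatedIn", "Germany")}` to `reason` to obtain the additional triple linking the document to Germany.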

Toward the bottom of the column we have a Knowledge Models layer that defines structures, e.g. ontologies, for your metadata and thereby for representing your knowledge inside the system. The knowledge itself is stored in a Knowledge Repository layer at the bottom of this architecture. Needless to say, you will also need a Knowledge Administration layer.
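As a rough illustration of how a knowledge model can constrain what ends up in the repository, consider the following Python sketch. It is only an assumption about how such layers might look, not the design from the paper: `KnowledgeModel`, `PropertyDef` and `KnowledgeRepository` are hypothetical names, and the "ontology" is reduced to a set of classes plus properties with a domain and a range.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class PropertyDef:
    """A property with the class it applies to and the class it points to."""
    name: str
    domain_cls: str
    range_cls: str


@dataclass
class KnowledgeModel:
    """Very small stand-in for an ontology: classes and typed properties."""
    classes: Set[str]
    properties: Dict[str, PropertyDef]


class KnowledgeRepository:
    """In-memory store that only accepts triples that fit the model."""

    def __init__(self, model: KnowledgeModel) -> None:
        self.model = model
        self.types: Dict[str, str] = {}
        self.triples: Set[Triple] = set()

    def declare(self, entity: str, cls: str) -> None:
        if cls not in self.model.classes:
            raise ValueError(f"unknown class: {cls}")
        self.types[entity] = cls

    def add(self, subject: str, predicate: str, obj: str) -> None:
        prop = self.model.properties.get(predicate)
        if prop is None:
            raise ValueError(f"unknown property: {predicate}")
        if self.types.get(subject) != prop.domain_cls:
            raise ValueError(f"{subject} is not a {prop.domain_cls}")
        if self.types.get(obj) != prop.range_cls:
            raise ValueError(f"{obj} is not a {prop.range_cls}")
        self.triples.add((subject, predicate, obj))

    def query(self, subject: Optional[str] = None,
              predicate: Optional[str] = None,
              obj: Optional[str] = None) -> List[Triple]:
        """Return all stored triples matching the given pattern, sorted."""
        return sorted(
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )
```

With a model that knows the classes `Document` and `Person` and an `author` property from `Document` to `Person`, the repository accepts `("post-1", "author", "fabian")` once both entities are declared, and rejects the reversed triple. A Knowledge Administration layer would be the place to maintain such models.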

Based on this approach you take your CMS, which is basically the content column in this architecture, and extend it with an implementation of the knowledge column. And here we are back to the semantic technology that the IKS project brings to CMS vendors. With the IKS Technology Stack you have an example of an implementation for the knowledge column. Try it out and have a look at the latest IKS Technology Stack releases or go to Apache Stanbol and check out the latest developments for the knowledge column. With further technologies like the VIE component you can then start implementing your own Semantic User Interface.
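The idea of extending an existing CMS with a knowledge column can be sketched in a few lines of Python. Please read this as a conceptual illustration under our own assumptions: `ContentAccess`, `KnowledgeAccess`, `SemanticCMS` and `capitalized_words` are hypothetical names and do not correspond to the API of Apache Stanbol, the IKS Technology Stack or VIE.

```python
from typing import Callable, Dict, List, Set


class ContentAccess:
    """Content column: stands in for the access layer of an existing CMS."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def save(self, content_id: str, text: str) -> None:
        self._items[content_id] = text

    def load(self, content_id: str) -> str:
        return self._items[content_id]


class KnowledgeAccess:
    """Knowledge column: extracts and stores topics for each content item."""

    def __init__(self, extractor: Callable[[str], Set[str]]) -> None:
        self._extractor = extractor
        self._topics: Dict[str, Set[str]] = {}

    def lift(self, content_id: str, text: str) -> None:
        self._topics[content_id] = self._extractor(text)

    def contents_about(self, topic: str) -> List[str]:
        return sorted(cid for cid, topics in self._topics.items()
                      if topic in topics)


class SemanticCMS:
    """Integrating roof: combines content features with knowledge features."""

    def __init__(self, content: ContentAccess,
                 knowledge: KnowledgeAccess) -> None:
        self.content = content
        self.knowledge = knowledge

    def publish(self, content_id: str, text: str) -> None:
        # Store the content as before, then lift knowledge from it.
        self.content.save(content_id, text)
        self.knowledge.lift(content_id, text)

    def find_by_topic(self, topic: str) -> List[str]:
        # Use the knowledge column to locate items in the content column.
        return [self.content.load(cid)
                for cid in self.knowledge.contents_about(topic)]


def capitalized_words(text: str) -> Set[str]:
    """Placeholder extractor: treat capitalized words as topics."""
    return {w.strip(".,") for w in text.split() if w[:1].isupper()}
```

The point of the sketch is the separation: the content column keeps working unchanged, the knowledge column can be swapped for a real implementation, and only the roof knows about both. A Semantic User Interface would then talk to something like `SemanticCMS` rather than to either column directly.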

Author: Fabian

I'm a software engineer and researcher in the field of model-based software development with focus on software architectures and application frameworks at the Software Quality Lab (s-lab) of the University of Paderborn. I'm passionate about open-source software development and an active developer of the Apache Stanbol project.
