Creating Enhancement Engines for Stanbol 0.10.0-incubating using NetBeans 7.1.2

In this tutorial I describe the steps for getting started with writing Enhancement Engines (EE) for Stanbol 0.10.0-incubating. In general, writing an EE for Stanbol is quite easy once you know some important details about Maven, OSGI, and Apache Felix; the aim of this document is to share those details with you.
For the essentials on Enhancement Chains and Engines, please read:
http://incubator.apache.org/stanbol/docs/trunk/components/enhancer/ and then
http://incubator.apache.org/stanbol/docs/trunk/components/enhancer/engines/

Preparations: Maven and NetBeans

For creating Enhancement Engines you will need the latest version of NetBeans (currently 7.1.2), which you can download from here: http://netbeans.org/downloads/index.html
Also, EEs are packaged using Maven, which you can get from here: http://maven.apache.org/ . However, if you are using Ubuntu 12.04 like I do, you can just install Maven from the Software Centre, or from the terminal with the command: sudo apt-get install maven. This gives you Maven 3.0.4. The minimum version is 2.2.1, which is part of the Ubuntu 10.04 distro, and I can confirm that it also works.

Fortunately, NetBeans has built-in support for maven, so no other steps are necessary.
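One NetBeans setting does need changing, though: under Tools → Options → Formatting → Java → Imports, switch off star imports (both the “Class count to use star import” and the “Members count to use star import” options). Otherwise the maven-scr-plugin will fail with “Unable to load class” errors.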

Download, Compile, Package and Start Stanbol

You can download and use Stanbol by following the steps described here:
http://incubator.apache.org/stanbol/docs/trunk/tutorial.html
However, in the line
export MAVEN_OPTS="-Xmx512M -XX:MaxPermSize=128M"
you should probably use -Xmx700M instead, because with 512M I occasionally got out-of-heap-space exceptions. It does not happen every time, only once in a while.
There are two reasons why we need Stanbol. First, obviously, we want to try out our newly written EEs. Second, as of today the Stanbol 0.10.0 jars are not in the Maven repositories yet, so we will take them from the deployment. (The Stanbol 0.9.0 jars are in Maven, but they won't work with 0.10.0 because of some package name changes.)
Compiling and packaging Stanbol will take a while, so be patient. Once the mvn clean install command finishes without errors, you can carry on to the next step.
(* At this point I ran into problems with the CELI test cases, which gave me SOAP errors; however, after re-running mvn install the issue did not come up again.)

Creating an OSGI bundle in NetBeans

Stanbol is built up from OSGI bundles. If you are not familiar with OSGI, do not worry, you will like it by the end of this tutorial. Like everything else, our example Enhancement Engine will be written and deployed as an OSGI bundle.

  1. Start creating a new project
  2. Click on Maven->OSGI Bundle
  3. Fill in name, group id, version
  4. Then click on finish.

This will create an OSGI bundle project with an Activator class. We won’t rely on that class, because the service we are going to create will be registered not by an Activator but through the maven-scr-plugin. Therefore, we delete Activator.java.

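Also, we remove the corresponding Bundle-Activator line from the pom.xml (under Project Files); in the snippet below it is simply commented out. In addition, we add a declaration to export the Java package we will work in (use the package that holds your own engine class):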

<configuration>
   <instructions>
      <!--<Bundle-Activator>com.example.enhacnementengine.Activator</Bundle-Activator> -->
      <Export-Package>
          hu.sztaki.testenhancer;version=${project.version}
      </Export-Package>
   </instructions>
</configuration>

While we are in pom.xml, we make another necessary modification. Above the existing org.apache.felix plugin (the one containing the configuration just edited), we insert the following:

<plugin>
   <groupId>org.apache.felix</groupId>
   <artifactId>maven-scr-plugin</artifactId>
   <executions>
     <execution>
        <id>generate-scr-scrdescriptor</id>
        <goals>
           <goal>scr</goal>
        </goals>
     </execution>
   </executions>
</plugin>

This plugin adds an OSGI-INF directory to the jar file we will generate. In that directory it generates a descriptor that registers our EnhancementEngine once the bundle is uploaded to Apache Felix.
After these modifications, the pom.xml contains both the maven-scr-plugin entry and the edited instructions block shown above.

Creating the java code for the enhancer

Create a class named TestEnhancer in the package com.example.enhancementengine (by right-clicking on the package name) and extend the class declaration in the following way:

public class TestEnhancer
 extends AbstractEnhancementEngine
 implements EnhancementEngine, ServiceProperties {
}

Now we need to add some dependencies to our project.

  1. First we add the stanbol 0.10.0-incubating jar. There is a 0.9.0 version in the Maven repositories, but it is not suitable for us, so do not add it to this project!
  2. Right-click on the Dependencies folder of the project (not Java Dependencies!), then click on the Add Dependency item.
  3. Fill in the dialog with the following data: Group ID: org.apache.stanbol, Artifact ID: org.apache.stanbol.enhancer.servicesapi, Version: 0.10.0-incubating
  4. Leave everything else empty and click “Add”.

Under the dependencies a new jar appears with a yellow warning sign. Right-click on this jar and click on “Manually install artifact”. Using the Browse button, locate the proper jar in your Stanbol deployment, e.g. under
launchers/stable/target/classes/resources/bundles/20/org.apache.stanbol.enhancer.servicesapi-0.10.0-incubating-SNAPSHOT.jar

After this, you can add the following imports to your java code:

import org.apache.stanbol.enhancer.servicesapi.EnhancementEngine;
import org.apache.stanbol.enhancer.servicesapi.ServiceProperties;
import org.apache.stanbol.enhancer.servicesapi.impl.AbstractEnhancementEngine;

(Use the Alt-Enter context suggestion function of NetBeans.)
After that, we can create stub implementations of the canEnhance, computeEnhancements and getServiceProperties methods.

Next, we add some annotations that will be used for generating the bundle descriptor:

@Component(immediate = true, metatype = true, inherit=true)
@Service
@Properties(value = {
    @Property(name = EnhancementEngine.PROPERTY_NAME, value = "testenhancer"),
    @Property(name=TestEnhancer.TRIGGER_STRING, value=TestEnhancer.DEFAULT_TRIGGER_STRING),
    @Property(name=TestEnhancer.RDF_PROP, value=TestEnhancer.DEFAULT_RDF_PROP),
    @Property(name=TestEnhancer.WIKI_URL,value=TestEnhancer.DEFAULT_WIKI_URL)
})

The property named EnhancementEngine.PROPERTY_NAME is mandatory; TRIGGER_STRING, RDF_PROP and WIKI_URL are the parameters of our TestEnhancer. We will create a very simple enhancer: if the incoming content contains the TRIGGER_STRING, it adds an RDF triple that points to the WIKI_URL.
In order to make the annotations work, we need some imports.
Press Alt-Enter on @Component and select “Search Dependency at Maven Repositories for Component”.
Add the org.apache.felix.scr.annotations 1.6.0 jar. After that, you will be able to add the imports for @Component, @Service, @Properties and @Property.
After that, we can add the following to the class body:

public static final String TRIGGER_STRING = "hu.sztaki.testenhancer.trigger";
public static final String DEFAULT_TRIGGER_STRING = "Polanyi";
public static final String RDF_PROP = "hu.sztaki.testenhancer.rdfprop";
public static final String DEFAULT_RDF_PROP = "rdf:about";
public static final String WIKI_URL = "hu.sztaki.testenhancer.wiki";
public static final String DEFAULT_WIKI_URL = "http://en.wikipedia.org/wiki/Michael_Polanyi";

This configures our simple enhancer so that whenever the input contains “Polanyi”, it emits a triple stating that the current document is about the philosopher Michael Polanyi. Don’t get confused: three of these six constants define the names of the properties, and the other three define their default values.
Below these 6 lines we add the following:

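private String trigger;
private String rdfProp;
private String wikiUrl;

    @Override
    protected void activate(ComponentContext ctx) throws Exception {
        // Let the base class read the engine name (EnhancementEngine.PROPERTY_NAME).
        super.activate(ctx);
        Dictionary<?, ?> props = ctx.getProperties();
        // Dictionary.get returns Object, so the values must be cast to String.
        this.trigger = (String) props.get(TRIGGER_STRING);
        this.rdfProp = (String) props.get(RDF_PROP);
        this.wikiUrl = (String) props.get(WIKI_URL);
    }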

The private fields hold the actual values of the properties above, which we retrieve when the component is activated.
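
As a side note, the values are read here without any fallback, so a missing property leaves the field null. A minimal sketch of a more defensive lookup, in plain Java with no OSGI types, might look like the following. ConfigSketch and readOrDefault are hypothetical names used only for illustration, and a Hashtable stands in for the Dictionary returned by ctx.getProperties().

```java
import java.util.Dictionary;
import java.util.Hashtable;

// Hypothetical helper for illustration; not part of Stanbol, Felix or OSGI.
final class ConfigSketch {

    /** Returns the trimmed value stored under key, or fallback if it is missing or blank. */
    static String readOrDefault(Dictionary<String, ?> props, String key, String fallback) {
        Object value = props.get(key);
        if (value == null) {
            return fallback;
        }
        String text = value.toString().trim();
        return text.isEmpty() ? fallback : text;
    }

    public static void main(String[] args) {
        // A Hashtable plays the role of the component's configuration here.
        Dictionary<String, Object> props = new Hashtable<>();
        props.put("hu.sztaki.testenhancer.trigger", "Polanyi");

        // The trigger is configured, the wiki URL is not, so the second call uses the fallback.
        System.out.println(readOrDefault(props, "hu.sztaki.testenhancer.trigger", "none"));
        System.out.println(readOrDefault(props, "hu.sztaki.testenhancer.wiki",
                "http://en.wikipedia.org/wiki/Michael_Polanyi"));
    }
}
```

Inside the engine, the same idea would mean passing the DEFAULT_ constants as the fallback values.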

Now we implement the canEnhance and the getServiceProperties methods:
public int canEnhance(ContentItem ci) throws EngineException {
	return ENHANCE_SYNCHRONOUS;
}

This indicates that we enhance in synchronous mode, meaning that the framework will wait until we finish processing.
public Map getServiceProperties() {
        return Collections.unmodifiableMap(Collections.singletonMap(
        ENHANCEMENT_ENGINE_ORDERING,(Object) ServiceProperties.ORDERING_DEFAULT));
    }

This determines the position of the Enhancement Engine in the chain; here we ask for the default ordering.
Finally, we implement the computeEnhancements method.

public void computeEnhancements(ContentItem ci) throws EngineException {
        // get the plain-text part of the content
        Map.Entry<UriRef, Blob> contentPart =
                ContentItemHelper.getBlob(ci, Collections.singleton("text/plain"));
        if (contentPart == null) {
            return; // no text/plain part found, nothing to enhance
        }
        String text;
        try {
            text = ContentItemHelper.getText(contentPart.getValue());
        } catch (IOException e) {
            throw new InvalidContentException(this, ci, e);
        }
        if (trigger != null && text.contains(trigger)) {
            // create a new entity annotation owned by this engine
            UriRef entityAnnotation = EnhancementEngineHelper.createEntityEnhancement(ci, this);

            MGraph model = ci.getMetadata();

            UriRef predicate = new UriRef(rdfProp);
            UriRef object = new UriRef(wikiUrl);

            // link the annotation to the configured URL
            Triple triple = new TripleImpl(entityAnnotation, predicate, object);

            model.add(triple);
        }

    }
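
The decision made inside computeEnhancements can be tried out without a running Stanbol. The following is only a sketch under that assumption: TriggerSketch and its Statement class are hypothetical stand-ins for the Clerezza Triple and UriRef types, and the subject string is a made-up placeholder for the annotation URI that Stanbol would generate.

```java
import java.util.Optional;

// Hypothetical stand-in for the engine's core decision; no Stanbol or Clerezza types involved.
final class TriggerSketch {

    /** Plain subject/predicate/object holder used only for this illustration. */
    static final class Statement {
        final String subject;
        final String predicate;
        final String object;

        Statement(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Returns a statement only if the text contains the trigger string. */
    static Optional<Statement> enhance(String text, String subject, String trigger,
                                       String predicate, String object) {
        if (text == null || trigger == null || !text.contains(trigger)) {
            return Optional.empty();
        }
        return Optional.of(new Statement(subject, predicate, object));
    }

    public static void main(String[] args) {
        String wiki = "http://en.wikipedia.org/wiki/Michael_Polanyi";
        // "urn:example:annotation-1" is a placeholder subject, not a real Stanbol URI.
        Optional<Statement> hit = enhance("A lecture on Polanyi and tacit knowledge",
                "urn:example:annotation-1", "Polanyi", "rdf:about", wiki);
        Optional<Statement> miss = enhance("A lecture on something else",
                "urn:example:annotation-1", "Polanyi", "rdf:about", wiki);
        System.out.println("match found: " + hit.isPresent());
        System.out.println("match found: " + miss.isPresent());
    }
}
```

Note that String.contains is case-sensitive, so with the default configuration “polanyi” in lower case would not trigger the enhancement.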

You will need to search for the dependency for UriRef and other classes in maven. If you want to use logging, use slf4j!

Bundling

  • Right-click on the project and select “Clean and Build”. This produces a jar file containing your bundle in the “target” directory of the project.
  • Now go to the console of your Stanbol instance. On the Bundles tab you can install your bundle into the system.
  • After the bundle is installed, you can configure your enhancer on the Configuration tab.
  • Your enhancer should now show up in the default chain!
