Our goals are ambitious when it comes to Web 3.0: a tidal wave of Semantic Web apps is going to sweep the Internet and… we want to be among them.
Amit Singhal from Google has been straightforward in presenting Big G’s new semantic search to the WSJ, and we won’t get into it here: there are already excellent reviews out there in case you missed the point. This post is about WordLift 2.0 and the work we’ve been doing with the IKS team to make WordPress a better CMS. Moreover, we’ll present our roadmap and the key objectives of our future business on Web 3.0.
In WordLift 1.x the target was to enrich your posts and pages with schema.org mark-up for persons and organizations and we learned three crucial lessons:
- editors’ time is precious and no one wants to wait for a process to be completed,
- selecting relevant entities with multiple selection is not an effective usage-pattern (a lot of our editors missed this step),
- not everyone is famous (or willing) enough to be on Wikipedia, Freebase or The New York Times (and typical blogs don’t have a thesaurus).
With WordLift 2.0 we wanted:
- to move a step forward by adding more terms of the schema.org vocabulary,
- to let editors work directly on entities using WordPress,
- to use named entities in the information architecture of any content-intensive web site,
- to ease the job of the editors by letting Stanbol work in the background and by providing a more consistent UX.
The new schema.org vocabulary supported by WordLift
We now support person, organization, event, product, place, thing and creative work, and we’re building a PHP framework (soon to be released to the community) to support the entire schema.org vocabulary. This will also help us keep the cost of software maintenance under control in the future (the vocabulary is evolving pretty fast these days).
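As a rough illustration of what a schema.org description of one of these types can look like, the sketch below builds a JSON-LD object for an entity. The helper name `entity_to_jsonld` and the sample values are hypothetical, and JSON-LD is only one of the syntaxes schema.org can be written in; this is not WordLift’s actual output format, and it is written in Python rather than in the PHP of the framework mentioned above.

```python
import json

# Entity types named in this post, mapped to their schema.org type names.
SUPPORTED_TYPES = {
    "person": "Person",
    "organization": "Organization",
    "event": "Event",
    "product": "Product",
    "place": "Place",
    "thing": "Thing",
    "creative work": "CreativeWork",
}


def entity_to_jsonld(kind, name, url=None):
    """Build a minimal schema.org JSON-LD dict (hypothetical helper)."""
    if kind not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported entity type: {kind}")
    doc = {
        "@context": "https://schema.org",
        "@type": SUPPORTED_TYPES[kind],
        "name": name,
    }
    if url is not None:
        doc["url"] = url
    return doc


if __name__ == "__main__":
    # Sample entity taken from the names mentioned in this post.
    print(json.dumps(entity_to_jsonld("organization", "Enel"), indent=2))
```

Keeping the type mapping in a single table is one way to follow a fast-moving vocabulary: supporting a new term means adding one entry.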
The entity editor
Entities in the IA of your blog
It is now possible to access the contents of your blog using:
- a blog entity map organized as a matrix of nice-looking boxes with all the entities detected on the blog; in the matrix the size of each entity-box represents the number of occurrences of that entity in the whole corpus of text, and each entity-box points the user to the entity-page (see below);
- a matrix of entity-boxes for each post; also in this case each entity-box points to its entity-page;
- a world-map displaying all the geo-entities of the blog;
- the entity-page with all the posts related to the entity and the information on the entity itself.
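The sizing rule of the entity map can be sketched in a few lines. This is only an illustration under our own assumptions: the function names, the linear scaling and the 1 to 4 size range are hypothetical and do not describe the plugin’s actual code.

```python
from collections import Counter


def count_entities(posts):
    """Count occurrences of each entity across all posts.

    `posts` maps a post id to the list of entity names detected in it,
    with one list item per occurrence.
    """
    counts = Counter()
    for entities in posts.values():
        counts.update(entities)
    return counts


def tile_sizes(counts, min_size=1, max_size=4):
    """Scale occurrence counts linearly to integer tile sizes."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    if lo == hi:
        # Every entity is equally frequent: use the smallest tile for all.
        return {name: min_size for name in counts}
    span = hi - lo
    return {
        name: min_size + round((n - lo) * (max_size - min_size) / span)
        for name, n in counts.items()
    }


if __name__ == "__main__":
    posts = {1: ["Google", "WordPress", "Google"], 2: ["Google", "Enel"]}
    print(tile_sizes(count_entities(posts)))
```

Calling `count_entities` with the entities of a single post would give the per-post matrix, while passing every post gives the blog-wide map.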
Content enrichment in background
When you are writing or editing your posts and pages on WordPress, the changes you make are automatically saved every x minutes. In the lower right corner of the editor, you’ll see a notification of when the entry was last saved to the database.
In order to let editors write and save their posts without waiting for Stanbol to conclude its analysis (as was the case in WordLift from version 1.0 to version 1.6), we attached the enhancement pipeline to the autosave process.
Now for every autosave of either a post or a page, Stanbol will process the data and WordLift will smoothly return the list of named entities in a separate tab of the editor window. The analysis is progressive and continuous, and WordLift keeps track of any adjustment occurring in the text editor (if you change your article, the list of entities will be updated accordingly).
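The idea of refreshing the entity list on every autosave can be sketched as follows. Everything here is an assumption made for illustration: `EntityTab`, `on_autosave` and `toy_analyzer` are hypothetical names, the analyzer is a plain substring matcher standing in for Stanbol, skipping the analysis when the text has not changed is our own addition, and the sketch runs synchronously whereas the real analysis happens in the background.

```python
import hashlib


class EntityTab:
    """Keeps the entity list in sync with the latest autosaved text."""

    def __init__(self, analyze):
        self.analyze = analyze  # callable: text -> list of entity names
        self.last_digest = None
        self.entities = []

    def on_autosave(self, text):
        """Re-run the analysis only when the saved text has changed."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest != self.last_digest:
            self.entities = self.analyze(text)
            self.last_digest = digest
        return self.entities


def toy_analyzer(text, known=("Google", "WordPress", "Stanbol")):
    """Stand-in for the real enhancement engine: plain substring matching."""
    return [name for name in known if name in text]


if __name__ == "__main__":
    tab = EntityTab(toy_analyzer)
    print(tab.on_autosave("Google announced semantic search."))
    print(tab.on_autosave("Google announced semantic search. WordPress is next."))
```

Each call to `on_autosave` plays the role of one autosave event: the editor never waits on the analysis before saving, and the tab simply shows whatever list the latest run produced.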
We need to consolidate the existing release and hit the production environment in the next couple of months. We’re currently planning to test the beta of WordLift 2 on the enterprise blogging platform of Enel (Italy’s first energy utility company), as we did for version 1.6, and to add one large publisher to the list of our selected beta-testers. The aim is to get back to the WordPress community with something relevant, scalable and consistent as soon as possible.
We had the pleasure of sharing the innovation of WordLift 2.0 with the WordPress VIP team in Austin for the SXSW and we’re currently evaluating different options for the marketing plan of this release.
The most exciting goal when looking at WordLift, IKS and the amazing WordPress community is to enable every blogger to create, share and curate named entities on a large scale; with Google moving ahead with Semantic Search, everyone should have the toolbox ready to produce structured data from his/her own content.
Here follows a quick screencast of the entity navigation pattern introduced with v 2.0: entities are represented as tiles, and in the “All Entities” page the size of each tile is given by the number of occurrences of that entity in the entire blog. Moreover, we show the responsiveness of the matrix, which seamlessly adapts to different screen sizes (laptop, iPad and smartphone).