Automated Linking Data with Apache Stanbol

Olivier Grisel — 2024-05-03T14:05+00:00

This talk will introduce the [Stanbol](http://incubator.apache.org/stanbol) project and showcase how it can be integrated into traditional Enterprise Content Management solutions.

Stanbol is an Open Source project under incubation at the Apache Software Foundation. Its goal is to provide Web and CMS developers with a set of HTTP / RESTful services to help them integrate semantic technologies into their products and web sites.
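As a rough illustration of what calling such an HTTP service from a CMS could look like, here is a minimal Python sketch that prepares a plain-text enhancement request. The base URL, the `/enhancer` path, and the `application/rdf+xml` response format are assumptions made for this example, not details given in this abstract; check them against your own Stanbol deployment.

```python
from urllib.request import Request, urlopen

# Assumptions for illustration only: a Stanbol instance reachable locally
# and an enhancer service exposed under this path.
STANBOL_BASE_URL = "http://localhost:8080"
ENHANCER_PATH = "/enhancer"


def build_enhance_request(text, base_url=STANBOL_BASE_URL):
    """Prepare (but do not send) an HTTP POST carrying the text to analyze."""
    return Request(
        base_url.rstrip("/") + ENHANCER_PATH,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Assumed response format: an RDF serialization of the enhancements.
            "Accept": "application/rdf+xml",
        },
        method="POST",
    )


def enhance(text, base_url=STANBOL_BASE_URL):
    """Send the request and return the raw response body as a string."""
    with urlopen(build_enhance_request(text, base_url), timeout=30) as response:
        return response.read().decode("utf-8")
```

Keeping request construction separate from the network call makes the sketch easy to inspect or unit test without a running server; `enhance("Paris is the capital of France.")` would only succeed against a live instance that matches the assumptions above.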

The following Stanbol services are currently under active development:

- Enhancement engines: use Natural Language Processing tools such as [Apache OpenNLP](https://opennlp.apache.org/index.html) to extract knowledge (topics, named entities, facts) from unstructured content and link it to unambiguous URIs from reference knowledge bases;
- Entity Hub: a Linked Data indexing cache built on top of [Apache Solr](https://lucene.apache.org/solr/), [Clerezza](https://incubator.apache.org/clerezza) and [Jena](https://incubator.apache.org/jena/) that comes with precomputed indexes and live connectors to popular knowledge bases such as [DBpedia](http://dbpedia.org), [Geonames](http://www.geonames.org/), [YAGO](https://en.wikipedia.org/wiki/YAGO_%28ontology%29)...
- Content Hub: a faceted search engine based on Solr to search for content using the knowledge automatically extracted by the enhancement engines;
- CMS bridges: lift the structured content of document repositories using the JCR and [CMIS](https://en.wikipedia.org/wiki/Content_Management_Interoperability_Services) access protocols (using [Apache Chemistry](https://chemistry.apache.org)) and store the result in a triple store suitable for [SPARQL](https://en.wikipedia.org/wiki/SPARQL) access;
- Rules engine: based on [Apache Jena](https://incubator.apache.org/jena/), for knowledge refactoring (e.g. converting extracted knowledge into the rich snippet vocabulary for SEO), integrity checks, merging rules, deductive inference...
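To make the SPARQL access mentioned above more concrete, here is a small Python sketch that prepares a query request following the standard SPARQL Protocol (query passed as a `query` URL parameter over GET). The endpoint URL and the use of `dcterms:subject` to link documents to extracted entities are hypothetical choices for this example; the actual endpoint and data model depend on how the triple store is populated.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, used only for illustration.
SPARQL_ENDPOINT = "http://localhost:8080/sparql"

# Hypothetical data model: documents linked to entities via dcterms:subject.
EXAMPLE_QUERY = """
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?doc ?entity
WHERE { ?doc dcterms:subject ?entity . }
LIMIT 10
"""


def build_sparql_request(query, endpoint=SPARQL_ENDPOINT):
    """Prepare (but do not send) a SPARQL Protocol query-via-GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})
```

Passing the returned object to `urllib.request.urlopen` would execute the query against a live endpoint; the sketch stops short of that so it can be read and tested offline.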

Automatically extracting and post-processing structured knowledge from semi-structured content is a key step towards improving the interoperability of user intents and building smarter applications. [Apache Stanbol](http://incubator.apache.org/stanbol) aims to make it as easy as possible to achieve that goal.

Automated Linking Data with Apache Stanbol (SemWeb.Pro) RSS Feed
