Drupal RDFme Plugin

The Data Endpoint feature of RDFme lets you keep all Drupal data, along with any external sources, indexed in a local triple store. In addition, the Data Endpoint exposes all of this data via a generic SPARQL endpoint and lets you create and manage custom services based on SPARQL or SPARQLScript.

Compared to existing RDF and SPARQL modules, RDFme takes a fundamentally different approach to the user interface and to interaction with metadata. RDFme focuses on moving between different configurations, loading and unloading data repeatedly with a variety of mappings. Building on that, we also added features for customizing data indexing policies, as well as APIs to set up data sources and services automatically (for example, from a module installation script).

For an example of how to use the RDFme APIs, including the Data Endpoint API, please see our Gi2MO IdeaStream module. It demonstrates how to automatically set up mappings, datasets and services using a Drupal installation script.

List of main features:

  1. [Datasets] RDFme can index any kind of RDF data in the local ARC2 store. A variety of configurations give different options for updating datasets over time.
  2. [Services] Indexed datasets can be queried through a generic SPARQL endpoint, or accessed as REST services defined with SPARQL or SPARQLScript.
  3. [Endpoint Options] Access to the entire Data Endpoint can be limited using the standard Drupal permissions system. Dataset indexing can be handled in Drupal cron cycles. In addition, the generic endpoint can be turned off entirely.
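As a sketch of how the generic endpoint can be queried from outside Drupal, the snippet below builds a GET request URL for a SPARQL endpoint. The endpoint address and the `format` parameter are assumptions for illustration; the real path depends on where your RDFme installation exposes its endpoint.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address - the real path depends on your RDFme setup.
ENDPOINT = "http://example.org/sparql"

def build_query_url(endpoint, query, fmt="json"):
    """Build a GET request URL for a SPARQL endpoint.

    Most SPARQL endpoints accept the query in a 'query' parameter;
    the output 'format' parameter here is an assumption.
    """
    params = urlencode({"query": query, "format": fmt})
    return f"{endpoint}?{params}"

url = build_query_url(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 10")
print(url)
```

Fetching that URL with any HTTP client would then return the query results in the requested serialization, subject to the permissions configured for the endpoint.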

Setting up datasets:

A dataset can be any RDF-serialized data source, e.g. RDF exported with RDFme, RDF/XML files, remote endpoints accessible over HTTP, etc. RDFme handles every source individually and indexes it in the triple store under a separate graph. Every dataset can be individually updated or fully reindexed.
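Because each dataset lands in its own named graph, a query can be restricted to a single dataset with a GRAPH clause. The helper below is a minimal sketch; the graph IRI is hypothetical (RDFme assigns its own graph names per dataset).

```python
# The graph IRI below is hypothetical; RDFme assigns graph names per dataset.
def scope_to_graph(graph_iri, pattern, limit=25):
    """Wrap a triple pattern in a GRAPH clause so the query touches
    only the named graph holding one dataset."""
    return (
        "SELECT * WHERE {\n"
        f"  GRAPH <{graph_iri}> {{ {pattern} }}\n"
        f"}} LIMIT {limit}"
    )

query = scope_to_graph("http://example.org/graphs/ideas", "?s ?p ?o")
print(query)
```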

In addition, RDFme lets you define an indexing strategy for each dataset:

  • normal – on every index cycle the entire graph is deleted and the data is reindexed from scratch.
  • local – uses the internal API of the local RDFme installation. Local Drupal data is indexed based on the RDFme mappings. During index cycles, only the data that has changed in Drupal is added to the index (the dataset graph is not wiped).
  • RDFme – similar to local, but uses the RDFme REST API to connect to remote Drupal instances that have RDFme installed.
  • file – indexes remote RDF files. On every index cycle RDFme checks whether the file has been modified and updates the graph in the triple store only if changes are detected.
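The change check in the file strategy can be pictured as comparing the remote file's HTTP Last-Modified header against the timestamp of the previous index cycle. This is a hedged sketch of that idea, not the module's actual code (RDFme itself is PHP, and the function name here is made up):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def needs_reindex(last_modified_header, last_indexed):
    """Return True if the remote file changed since the last index cycle.

    last_modified_header: the HTTP Last-Modified header value, or None
    last_indexed: datetime of the previous successful indexing
    If the server sends no Last-Modified header, reindex to be safe.
    """
    if last_modified_header is None:
        return True
    return parsedate_to_datetime(last_modified_header) > last_indexed

# Example: the file changed after the last run, so it should be reindexed.
last_run = datetime(2011, 5, 1, tzinfo=timezone.utc)
print(needs_reindex("Mon, 02 May 2011 10:00:00 GMT", last_run))
```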

Datasets setup screen.

Managing services:

RDFme Services provide an easy interface for saving and sharing queries over the indexed datasets with users, within the limits set by the administrator. Each service is a SPARQL query or a SPARQLScript, which offers functionality similar to SQL stored procedures (conditional statements, loops, variables, etc.). Each service can be managed individually and enabled or disabled on demand.
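Conceptually, a service is just a named, stored query with an on/off switch. The sketch below models that idea in Python; all names are hypothetical and the real module stores services inside Drupal, not like this.

```python
# A minimal, hypothetical model of an RDFme service: a named stored
# query with an enabled flag that can be toggled on demand.
class Service:
    def __init__(self, name, query):
        self.name = name
        self.query = query
        self.enabled = True

    def run(self, execute):
        """Run the stored query via a caller-supplied executor,
        refusing if the service has been disabled."""
        if not self.enabled:
            raise PermissionError(f"service '{self.name}' is disabled")
        return execute(self.query)

svc = Service("latest-ideas", "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5")
result = svc.run(lambda q: f"would execute: {q}")
print(result)
```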

To help administrators write service scripts, we integrated simple code highlighting using the WYSIWYG module and the CodeMirror plugin. To use this feature, your Drupal installation needs a “SPARQL” input format. If the WYSIWYG module is available and its CodeMirror plugin has been installed properly, RDFme will attempt to create the input format automatically during its installation. All the needed files are included in the RDFme bundle distribution.

Services setup screen.

Additional Data Endpoint options:

The options panel of the Data Endpoint lets you configure the following aspects of the module:

  • generic endpoint – a regular SPARQL web service that lets you query all datasets together, using any constructs of the SPARQL language. The options screen lets you disable this feature completely (for example, for security reasons).
  • cron indexing – by default, datasets are indexed only when they are first added. If you want to ensure they are always up to date, use the regular Drupal cron script (cron.php). The options screen lets you enable or disable dataset indexing during Drupal cron cycles.

In addition, access to the generic SPARQL endpoint can be configured through the regular Drupal permissions system.

Data Endpoint options screen.