MappingPedia: A Collaborative Environment for R2RML Mappings

Abstract. Most Semantic Web data is generated from legacy datasets with the help of mappings, some of which are specified declaratively in languages such as R2RML or its extensions, RML and xR2RML. Most of these mappings are kept locally in each organization and, to the best of our knowledge, no shared repository exists that would facilitate the discovery, registration, execution, request and analysis of mappings. Additionally, many R2RML users do not have sufficient knowledge of the mapping language and would probably benefit from collaborating with others. We present a demo of MappingPedia, a collaborative environment for storing and sharing R2RML mappings. It comprises five main functionalities: (1) Discover, (2) Share, (3) Execute, (4) Request, and (5) Analyze.

Currently, a significant number of users have developed R2RML mappings, but these mappings are kept locally in each organization and, to the best of our knowledge, there is no shared repository that would facilitate the registration, discovery, exploration and execution of mappings. Additionally, many R2RML users do not have sufficient knowledge of the mapping language and would probably benefit from collaborating with users who have experience in developing and specifying mappings in similar contexts. MappingPedia provides such an environment, where users may browse mappings that use concepts from their domain of interest or request that other users develop them. The work in [6] points out the difficulties encountered during the development of R2RML mappings and presents two approaches for mapping construction, Ontology-driven and Database-driven; [3] extends them with the Model-driven and Result-driven approaches. MappingPedia is a tool to support collaborative mapping development and is independent of any specific editing approach.
MappingPedia integrates several tools developed for the exploitation of RDF data. The core of the integrated tools is morph-RDB [7], an RDB2RDF engine that follows the R2RML specification. It supports two operational modes: (1) RDF data generation from a relational database or CSV file, and (2) SPARQL-to-SQL query translation according to R2RML mapping descriptions. MIRROR [1] is a system that generates two sets of R2RML mappings: first, a set of mappings similar to the W3C Direct Mappings (https://www.w3.org/TR/rdb-direct-mapping/), and second, a set of R2RML mappings that result from the implicit knowledge encoded in relational database schemas. Loupe [5] is a tool for exploring annotated data sources and their ontologies. Loupe inspects classes, properties and triples to gather explicit vocabulary, class and property usage, and to discover implicit data patterns through a fine-grained set of metrics.
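To make the first operational mode concrete, the following is a minimal sketch of mapping-driven RDF generation from a CSV file. It does not use morph-RDB or real R2RML syntax: the dictionary `PERSON_MAPPING`, the function `generate_triples`, the example.org URIs and the sample data are all hypothetical stand-ins chosen for illustration.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical, simplified stand-in for an R2RML triples map. Real R2RML is
# written in Turtle (rr:subjectMap, rr:predicateObjectMap, and so on).
PERSON_MAPPING = {
    "subject_template": "http://example.org/person/{id}",
    "class": "http://xmlns.com/foaf/0.1/Person",
    "predicate_columns": {
        "http://xmlns.com/foaf/0.1/name": "name",
    },
}


def generate_triples(csv_text, mapping):
    """Return (subject, predicate, object) tuples for every CSV row."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Build the subject URI by filling the template with column values.
        subject = mapping["subject_template"].format(**row)
        triples.append((subject, RDF_TYPE, mapping["class"]))
        for predicate, column in mapping["predicate_columns"].items():
            triples.append((subject, predicate, row[column]))
    return triples


SAMPLE_CSV = "id,name\n1,Alice\n2,Bob\n"

for triple in generate_triples(SAMPLE_CSV, PERSON_MAPPING):
    print(triple)
```

Each row yields one type triple plus one triple per mapped column, which is the general shape of what a triples map with a class and a predicate-object map produces.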

Use Cases
We demonstrate the capabilities of MappingPedia for a set of use cases that take into account three key elements: the dataset, the ontologies used for the mappings, and the mapping files themselves. Figure 1 presents the MappingPedia architecture. MappingPedia consists of two components: the Engine and the Interface. The MappingPedia Engine is responsible for storing mappings as RDF graphs in a Virtuoso server, executing mappings by connecting to morph-RDB, and tracking the evolution of mappings by storing them in a GitHub repository. The MappingPedia Interface provides a web interface for end users and REST interfaces for external applications. Additionally, it stores user data in a MySQL database and calls the Loupe API to gather statistics. Loupe has been extended to analyze not only classes and properties but also the value distribution of properties. This is the case for the R2RML properties rr:class and rr:predicate, whose values are the classes and properties used in mappings.
Fig. 1. Architecture of MappingPedia
- Browse. Users may search for metadata defined in the MappingPedia ontology. The ontology includes properties for datasets and mapping files and reuses the Data Catalog (dcat) and Dublin Core (dc) ontologies. The user may also search by a class or property used in a mapping; for example, a user may search for the mappings that contain foaf:Person. A screenshot of this functionality can be seen in Fig. 2.
- Execute. A user may execute R2RML mappings on MappingPedia using morph-RDB; the output is an RDF dataset. If no mappings have been developed yet, MIRROR can be used to generate the initial mappings. Additionally, the user may specify a SPARQL query using the concepts in the mapped ontologies, and the query will be rewritten into SQL according to the R2RML mapping descriptions.
- Share. A user may register mappings and dataset information.
- Request. A user may request the development of mappings. This resembles a ticket system in which a request may be Open, In progress, Resolved (Fixed, Incomplete, Not Fixed) or Closed.
- Analyze. Statistics are gathered on all the classes and properties of the ontologies that have been used in the mappings (e.g. MappingPedia, R2RML, Data Catalog); statistics may also be requested over a period of time.
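The search by class or property described under Browse can be pictured with a small in-memory sketch. MappingPedia itself stores mappings as RDF graphs in Virtuoso, so this is only an illustration of the lookup logic. `MappingRecord`, `find_mappings_using`, the registry contents and the example.org URI are hypothetical names and data.

```python
from dataclasses import dataclass, field

FOAF = "http://xmlns.com/foaf/0.1/"


@dataclass
class MappingRecord:
    """Hypothetical summary of one registered mapping file."""
    title: str
    classes: set = field(default_factory=set)      # values of rr:class
    properties: set = field(default_factory=set)   # values of rr:predicate


def find_mappings_using(records, term):
    """Return titles of mappings that use `term` as a class or a property."""
    return [r.title for r in records
            if term in r.classes or term in r.properties]


REGISTRY = [
    MappingRecord("employees", {FOAF + "Person"}, {FOAF + "name"}),
    MappingRecord("offices", {"http://example.org/Office"}, {FOAF + "name"}),
]

print(find_mappings_using(REGISTRY, FOAF + "Person"))  # ['employees']
```

In a deployment backed by a triple store, the same question would more naturally be asked as a SPARQL query over the rr:class and rr:predicate values of the stored mappings.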

Request and Analyze Functionalities
In the demo we will show the functionalities of MappingPedia with an emphasis on two cases: mapping request and mapping analysis.
Mapping request. There are two actors in a mapping request: the data owner and the mapping creator. The data owner wants to publish his dataset as Linked Data. He creates a request for mappings and provides information on his dataset. A mapping creator may assign himself a request for mappings, in which case the dataset owner is notified. Once the mappings have been uploaded, the owner is notified again. He may then execute the mappings and, according to the outcome of the execution, decide to close the request or reopen it.
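The request workflow above can be read as a small state machine. The sketch below is one possible reading, not MappingPedia's implementation: the status names come from the text, but the allowed transitions, the class `MappingRequest`, its method names and the notification messages are assumptions, and the Resolved sub-states (Fixed, Incomplete, Not Fixed) are left out for brevity.

```python
from enum import Enum


class Status(Enum):
    OPEN = "Open"
    IN_PROGRESS = "In progress"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


# Assumed transitions: the text names the states but not the allowed moves.
ALLOWED = {
    Status.OPEN: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.RESOLVED},
    Status.RESOLVED: {Status.CLOSED, Status.OPEN},
    Status.CLOSED: set(),
}


class MappingRequest:
    def __init__(self, owner, dataset):
        self.owner = owner
        self.dataset = dataset
        self.creator = None
        self.status = Status.OPEN
        self.notifications = []  # messages addressed to the data owner

    def _move(self, target, message=None):
        if target not in ALLOWED[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value} to {target.value}")
        self.status = target
        if message:
            self.notifications.append(message)

    def assign(self, creator):
        # A mapping creator takes the request; the owner is notified.
        self._move(Status.IN_PROGRESS, f"{creator} took the request")
        self.creator = creator

    def upload_mappings(self):
        # Mappings are uploaded; the owner is notified again.
        self._move(Status.RESOLVED, f"{self.creator} uploaded mappings")

    def close(self):
        self._move(Status.CLOSED)

    def reopen(self):
        self._move(Status.OPEN)


request = MappingRequest(owner="alice", dataset="bus-stops.csv")
request.assign("bob")
request.upload_mappings()
request.close()
print(request.status.value, request.notifications)
```

Keeping the transition table separate from the methods makes it easy to adjust the workflow, for example to allow reopening a closed request.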
Mapping analysis. The objective of mapping analysis is to provide a general overview of the mappings, together with analytics about them, to a user who is browsing. MappingPedia uses Loupe profiling services to generate information such as the number of mappings available, the top classes being mapped, the top properties being mapped, and the number of columns mapped for each mapping. Currently, MappingPedia is populated with the 685 English DBpedia mappings that are available in the RML format. The analysis of those mappings is presented in the "Statistics" option. A screenshot of this functionality is shown in Fig. 3.
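As a rough illustration of the "top classes" and "top properties" statistics, the sketch below counts rr:class and rr:predicate values across a handful of mappings. It does not call Loupe. The `MAPPINGS` dictionary, the function `top_terms` and the prefixed names used as sample values are made up for the example.

```python
from collections import Counter

# Hypothetical extract: for each mapping, the values of rr:class and
# rr:predicate that occur in it.
MAPPINGS = {
    "mapping-a": {"classes": ["dbo:Person"],
                  "predicates": ["foaf:name", "dbo:birthDate"]},
    "mapping-b": {"classes": ["dbo:Person", "dbo:Place"],
                  "predicates": ["foaf:name"]},
    "mapping-c": {"classes": ["dbo:Place"],
                  "predicates": ["foaf:name", "dbo:populationTotal"]},
}


def top_terms(mappings, key, n=3):
    """Return the n most frequent terms under `key` with their counts."""
    counts = Counter()
    for usage in mappings.values():
        counts.update(usage[key])
    return counts.most_common(n)


print("mappings available:", len(MAPPINGS))
print("top classes:", top_terms(MAPPINGS, "classes"))
print("top properties:", top_terms(MAPPINGS, "predicates"))
```

Restricting the input to mappings registered within a date range would give the per-period statistics mentioned under Analyze.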

Conclusions and Future Work
We have presented MappingPedia, a collaborative environment that integrates various R2RML-based tools for the purpose of discovering, sharing, requesting and executing R2RML mappings. MappingPedia is in its first version, and we envision several features that we will implement. In the future we will integrate an R2RML mapping editor so that a mapping creator is able to work directly on MappingPedia once he has been assigned a mapping request. We will also integrate MappingPedia with data characterization tools in order to propose mappings based on the content of a dataset. Finally, we plan to integrate MappingPedia with RML and xR2RML engines.