Ontotext

Reason-able Views

Reason-able views (RAV) represent a practical approach to reasoning with the web of linked data. A reason-able view is an assembly of independent datasets that can be used as a single body of knowledge, an integrated dataset, with respect to reasoning and query evaluation. The integrated dataset is designed to meet some criteria for "reasonability", that is, it has specific qualities with respect to a specific reasoning task and language, for example "consistent with OWL Lite" or "allows RDFS entailment within O(n) time and space".

A linked data reason-able view (linked RAV) can be considered a special case of a RAV where:

  • All the datasets in the view represent linked data
  • A single reasonability criterion is imposed on all datasets
  • Each dataset is connected to at least one of the others
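
The third condition above can be checked mechanically once the links between datasets are known. The following is a minimal sketch, assuming the datasets are identified by name and the links are given as (source, target) pairs; the function name and the dataset name "Isolated" are hypothetical and chosen only for illustration.

```python
def unconnected_datasets(datasets, links):
    """Return the datasets that are not linked to any other dataset in the view."""
    known = set(datasets)
    connected = set()
    for source, target in links:
        # A link counts only if it joins two different datasets of the view.
        if source != target and source in known and target in known:
            connected.add(source)
            connected.add(target)
    return known - connected


# Hypothetical view: DBpedia links to Geonames, "Isolated" links to nothing.
datasets = ["DBpedia", "Geonames", "Isolated"]
links = [("DBpedia", "Geonames")]
violations = unconnected_datasets(datasets, links)
print(sorted(violations))
```

An empty result would mean that every dataset in the view is connected to at least one of the others.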

Considering the size of the LOD datasets, in order to make query evaluation and reasoning practically feasible, the integrated dataset of a linked RAV should be loaded in a single repository (even if it employs some sort of distribution internally). Such a linked RAV can be considered an index that caches parts of the LOD cloud and provides access to the datasets included in it, much as web search engines index WWW pages and facilitate their usage.

As a final practical consideration, to allow for caching and indexing, linked RAVs should include only datasets that are more or less static; this excludes various types of wrappers or virtual datasets, where RDF is generated in answer to retrieval requests (one can make an analogy with the dynamic part of the WWW).

Standard Methods of Inference

The standard methods of sound and complete inference with respect to a relatively rich flavor of the First Order Predicate Calculus (FOPC) are practically inapplicable to a web of linked data. Some of the major obstacles are:

  • The most popular FOPC fragments, such as the Description Logics (DL), count on a "closed-world" assumption and on models developed under centralized control. This is irrelevant in a web context. Sound and complete inference over LOD-type data is highly prone to inconsistency, which renders the results of such inference useless.
  • The semantics of languages like DL has prohibitively high computational complexity because it requires "satisfiability" checks. As a result, the most scalable published experiments with sound and complete DL reasoning remain below 10 million statements, which is not enough for datasets of LOD size.
  • Some of the LOD datasets (or some parts of them) are unsuitable for reasoning. Some data publishers seem to use the OWL and RDFS vocabulary without regard for its formal semantics, so the result of inference over such datasets is of questionable utility. For instance, one dataset contains a subject hierarchy encoded via the relation rdfs:subClassOf, with cycles that are tens of concepts long. Any reasoner following the standard semantics of rdfs:subClassOf will infer that all the concepts in such a cycle are equivalent. This does not seem to be the intention of the publishers.
  • Reasoning with data distributed across different web servers is possible but much slower than reasoning with local data. The fundamental reason is the so-called "remote join" problem known from distributed database management systems (DBMS).
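
The effect of an rdfs:subClassOf cycle, mentioned in the list above, can be illustrated with a small sketch. It assumes the hierarchy is given as (subclass, superclass) pairs and applies only the transitivity of rdfs:subClassOf; the class names under the ex: prefix are hypothetical, and the naive fixpoint loop is meant for illustration, not for LOD-scale data.

```python
def subclass_closure(edges):
    """Transitive closure of rdfs:subClassOf over (subclass, superclass) pairs."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a subClassOf b and b subClassOf d imply a subClassOf d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def equivalent_pairs(edges):
    """Pairs of distinct classes that are subclasses of each other."""
    closure = subclass_closure(edges)
    return {(a, b) for (a, b) in closure if a != b and (b, a) in closure}


# Hypothetical subject hierarchy containing a cycle of three concepts.
hierarchy = [
    ("ex:Art", "ex:Culture"),
    ("ex:Culture", "ex:Society"),
    ("ex:Society", "ex:Art"),
]
for pair in sorted(equivalent_pairs(hierarchy)):
    print(pair)
```

Every pair of concepts on the cycle ends up mutually subsumed, which under the standard semantics makes them equivalent classes. With cycles tens of concepts long, the same collapse affects every concept in the loop.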

Linking the Linked Data

Reasoning has the potential to enhance the interlinking between linked data datasets, as long as it ensures enforcement of the semantics of the links. For instance, consider the link between the identifiers for Vienna in DBpedia (dbpedia:Vienna) and in Geonames (geonames:2761369), together with the statement linking Vienna to the corresponding high-level administrative region in Austria (geonames:2761367):

dbpedia:Vienna owl:sameAs geonames:2761369
geonames:2761369 gno:parentFeature geonames:2761367

From these two statements, simple inference derives the statement:

dbpedia:Vienna gno:parentFeature geonames:2761367

This would allow the connection between the DBpedia entry of Vienna and the Geonames description of Austria to appear when exploring dbpedia:Vienna or to be considered during query evaluation.
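
The Vienna inference can be reproduced with a few lines of code. This is a minimal sketch, assuming statements are represented as plain (subject, predicate, object) tuples of prefixed names and that owl:sameAs is handled by substituting equal identifiers in subject and object position. The function name is hypothetical, and a production reasoner would use a far more efficient strategy than this naive fixpoint loop.

```python
SAME_AS = "owl:sameAs"


def same_as_closure(triples):
    """Return the input triples plus those implied by owl:sameAs substitution."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # owl:sameAs is symmetric, so collect the equalities in both directions.
        pairs = set()
        for s, p, o in inferred:
            if p == SAME_AS:
                pairs.add((s, o))
                pairs.add((o, s))
        new = set()
        for a, b in pairs:
            for s, p, o in inferred:
                if s == a:
                    new.add((b, p, o))  # replace the subject with its equal
                if o == a:
                    new.add((s, p, b))  # replace the object with its equal
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


statements = [
    ("dbpedia:Vienna", SAME_AS, "geonames:2761369"),
    ("geonames:2761369", "gno:parentFeature", "geonames:2761367"),
]
derived = ("dbpedia:Vienna", "gno:parentFeature", "geonames:2761367")
print(derived in same_as_closure(statements))
```

The closure also contains the symmetric and reflexive owl:sameAs statements, which follow from the same substitution rule.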