GraphDB Users Ask: What’s the Difference Between SPARQL and FedX Federation?

TESTED ON: GraphDB 9.10

November 25, 2021 4 mins. read GraphDB Q&As

ONTOTEXT ANSWER:

There are a few ways to federate different data sources with GraphDB and Ontotext Platform. Two of them achieve the same goal: SPARQL repositories can be federated either with standard SPARQL tools or with FedX. This raises the logical question: what is the tradeoff between the two?

Standard SPARQL federation is a manual tool, controlled with the SERVICE keyword. With it, you choose to execute part of the query against a remote repository. You have to know the address of the remote repository and, if it is secured, its username and password. You also have to write the query manually and know the data model of the remote repository. The SERVICE clause is treated as a subquery.

There are three ways to achieve standard SPARQL federation with GraphDB:

  • With the SERVICE keyword and the full address of the repository, such as SERVICE <http://user:pass@gdb-host.com/repositories/test>. Notice the use of basic authentication in the URL. That’s the only way to supply credentials with this approach. The problem is that the credentials may leak into the GraphDB logs.
  • If your repository is local – on the same GraphDB instance – you can use internal federation with SERVICE <repository:test>. This way you skip the HTTP overhead. If there’s a lot of data being transferred, this is preferable to the previous option.
  • If you have a secure remote repository but don’t want to risk leaking your password, or you want to give access to other users without them needing to know the password, you can create a “proxy” repository and then access it via the local repository mechanism described above.
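The three options above differ only in the address that follows the SERVICE keyword. As a minimal sketch, the Python helper below builds the query text for each case. The helper name, the example.org predicates and the proxy repository id "test-proxy" are hypothetical, and the remote URL assumes GraphDB's usual /repositories/<id> endpoint path.

```python
def federated_query(endpoint: str, local_pattern: str, remote_pattern: str) -> str:
    """Build a SELECT query whose remote_pattern runs inside a SERVICE clause."""
    return (
        "SELECT * WHERE {\n"
        f"  {local_pattern}\n"
        f"  SERVICE <{endpoint}> {{\n"
        f"    {remote_pattern}\n"
        "  }\n"
        "}"
    )


# Hypothetical triple patterns, used only for illustration.
LOCAL = "?person <http://example.org/worksFor> ?org ."
REMOTE = "?org <http://example.org/name> ?orgName ."

# 1. Remote repository over HTTP; the credentials sit in the URL and may end up in logs.
remote = federated_query("http://user:pass@gdb-host.com/repositories/test", LOCAL, REMOTE)

# 2. Internal federation with a repository on the same GraphDB instance.
internal = federated_query("repository:test", LOCAL, REMOTE)

# 3. A "proxy" repository (hypothetical id) addressed with the same internal syntax.
proxy = federated_query("repository:test-proxy", LOCAL, REMOTE)

print(internal)
```

Only the endpoint changes between the three variants; the query body, and the need to know the remote data model, stay the same.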

It is important to order your query properly. Remember that in SPARQL, subqueries are executed first. Imagine a case where your query has two connected parts: one part returns 6 results, the other returns 600. If you get the 600 results first and then narrow them down with the 6, you perform much more work than in the inverse case. Therefore, you want the part that returns 6 results to execute first. Since a SERVICE clause is treated as a subquery and subqueries are executed before the main query, you want the 6-result part inside the SERVICE clause and the overall query run on the repository with the 600 results, if possible.
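The 6-versus-600 argument can be illustrated with a toy in-memory model. This is only a sketch of the reasoning, not of how GraphDB evaluates queries: the row layout, the function names and the "rows touched" cost measure are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical data: 6 rows on the small side, 600 on the large side, joined on "org".
small_part = [{"org": f"org{i}"} for i in range(6)]
large_part = [{"person": f"person{i}", "org": f"org{i}"} for i in range(600)]

# Stands in for the index the large repository already maintains on its own data.
large_index = defaultdict(list)
for row in large_part:
    large_index[row["org"]].append(row)


def large_first(large, small):
    """Materialise every large-side row, then keep the ones matching the small side."""
    fetched = list(large)
    wanted = {row["org"] for row in small}
    joined = [row for row in fetched if row["org"] in wanted]
    return joined, len(fetched)


def small_first(small, index):
    """Evaluate the small side first, then look up only the matching large-side rows."""
    joined = []
    lookups = 0
    for row in small:
        lookups += 1
        joined.extend(index.get(row["org"], []))
    return joined, lookups


result_a, cost_a = large_first(large_part, small_part)
result_b, cost_b = small_first(small_part, large_index)
print(cost_a, cost_b)  # 600 6
```

Both strategies return the same six joined rows, but the first touches 600 intermediate rows and the second only 6.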

FedX federation takes all the manual work away. You define a federated repository and specify which remote repositories it should connect to. You can provide it with credentials. The remote repository can be any SPARQL-enabled repository, including other federated repositories. You just write your query as you usually would, without worrying about how the data is partitioned across the repositories or about query structure.

The drawback of this ease of access is that FedX has to perform all the work that you usually do manually. There’s logic to decide which part of the query should be sent to which repository, to structure it properly to avoid the query ordering issues, then to transfer the data and join it. That could be a lot of work, and FedX might get it wrong, just like you sometimes write an unoptimized query. That’s why you can expect it to have worse performance than the manual execution of the same query.
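To give a feel for the first of those decisions, here is a toy sketch of choosing which member repository could answer which part of a query. It is not FedX's actual implementation: the member names, the predicate sets and the select_sources function are hypothetical, and a real engine has to discover this information from the members at query time.

```python
from typing import Dict, List, Set, Tuple

TriplePattern = Tuple[str, str, str]

# Hypothetical federation members, each described by the predicates it holds.
MEMBERS: Dict[str, Set[str]] = {
    "repoA": {"ex:worksFor"},
    "repoB": {"ex:name", "ex:locatedIn"},
}


def select_sources(
    patterns: List[TriplePattern], members: Dict[str, Set[str]]
) -> Dict[TriplePattern, List[str]]:
    """Map each triple pattern to the members whose data could match its predicate."""
    plan: Dict[TriplePattern, List[str]] = {}
    for pattern in patterns:
        predicate = pattern[1]
        plan[pattern] = sorted(
            name for name, predicates in members.items() if predicate in predicates
        )
    return plan


# The user writes one plain query; the federation layer splits it per member.
query = [
    ("?person", "ex:worksFor", "?org"),
    ("?org", "ex:name", "?orgName"),
]
plan = select_sources(query, MEMBERS)
for pattern, sources in plan.items():
    print(pattern, "->", sources)
```

Even in this simplified form, the planner still has to order the two sub-results and join them on ?org, which is the work you would otherwise do by hand with SERVICE.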

To sum it up, if you have low-level access to the remote repositories and want to optimize your queries, you would usually prefer standard SPARQL federation with the SERVICE keyword. If you prefer ease of access and have a lot of users who are not familiar with the way in which the data is modelled, FedX is the better choice.

