GraphDB Users Ask: Where Can We Deploy GraphDB And What Are Some Best Practices?

TESTED ON: GraphDB 10.2

June 22, 2023 4 mins. read GraphDB Q&As

ONTOTEXT ANSWER:

Oracle famously claims that 3 billion devices run Java. This means that there are 3 billion devices that can run GraphDB, including every single Blu-ray player. You probably wouldn’t want to deploy your database on a Blu-ray player, but you can.

This flexibility means that you can deploy on any cloud, any VM and any OS. Most of our users start out by testing on their own workstations. Working from your own machine is fine even for large datasets and complex computations – you can run a repository with hundreds of millions of statements on a workstation. Once you are ready to move to a more substantial deployment, the key question is what capabilities you need.

Let’s assume you want to run the database in a high-availability setup. To achieve this, we offer GraphDB Enterprise, which supports running in a cluster. The cluster has three key features to keep in mind:

  • Data replication – all your data is present on all GraphDB instances in the cluster. If one instance fails, all the others are perfectly capable of maintaining operation alone.
  • Eventual consistency – to speed up processing while maintaining the ACID principles, writes happen in order, but reads can happen at any time. If you need it, you can configure strong consistency instead.
  • Leader elections – in the cluster, the leader is responsible for internal load balancing of the requests and for “testing” all updates before approving them. Elections are automated and internal.

From these factors, you can establish the following best practices:

  • You want all your workers to have the same hardware parameters since they will all do the same work on the same data.
  • For optimal reliability, you want all workers on separate instances, in separate availability zones.
  • The cluster operates best on an odd number of instances, to prevent election “deadlocks”.
  • We ship GraphDB with an optional external proxy, which keeps track of the status of the cluster and always directs requests to the leader, reducing HTTP overhead. For extra resilience, you can deploy the external proxy in a cluster of 2 or more instances and load balance requests toward it.
  • The GraphDB cluster operates on two ports – the HTTP port and gRPC port. Internal traffic should be allowed on both. External traffic could be limited to the HTTP port only.
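
The advice about an odd number of instances can be illustrated with a little arithmetic. The sketch below assumes that the leader is elected by a strict majority of the nodes, as in Raft-style consensus protocols; the function names are made up for this illustration and are not part of GraphDB.

```python
def quorum(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Number of nodes that can fail while a majority can still be formed."""
    return nodes - quorum(nodes)


# Compare cluster sizes: going from an odd size to the next even size
# raises the quorum but does not raise the number of tolerated failures.
for n in range(2, 8):
    print(f"{n} nodes: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

Under this assumption, a fourth node tolerates no more failures than three nodes do, and an even-sized cluster can be split into two halves where neither side holds a majority.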

Regardless of your deployment model, some general advice is:

  • Have easy access to your logs. This can be done with tools such as Datadog, Prometheus, etc.
  • Get acquainted with the GraphDB reporting tools.
  • Do not panic! If a node is down, don’t start destroying and restarting the cluster. Only take drastic measures if the cluster is inoperable and not healing automatically.
  • Make regular backups.
  • When deploying with SSO, a JWT decoder and the browser’s inspector tool go a long way.
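
As an illustration of the JWT decoder tip, here is a minimal sketch of what such a decoder does, using only the Python standard library. It reads the header and payload of a token without verifying the signature, so it is suitable for debugging only. The function name and the sample token contents are hypothetical.

```python
import base64
import json


def decode_jwt_unverified(token: str) -> tuple[dict, dict]:
    """Return (header, payload) of a JWT WITHOUT checking its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")

    def decode_segment(segment: str) -> dict:
        # JWT segments are base64url-encoded without padding; restore it first.
        padded = segment + "=" * (-len(segment) % 4)
        return json.loads(base64.urlsafe_b64decode(padded))

    return decode_segment(parts[0]), decode_segment(parts[1])


def _encode(obj: dict) -> str:
    """Helper that builds a sample segment for the demo below."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical token assembled locally so the example is self-contained.
sample = ".".join([
    _encode({"alg": "none"}),
    _encode({"sub": "demo-user", "roles": ["ROLE_USER"]}),
    "signature-placeholder",
])
header, payload = decode_jwt_unverified(sample)
print(header)
print(payload)
```

When an SSO login fails, comparing standard claims such as `iss`, `aud` and `exp` in the decoded payload against what your identity provider and GraphDB are configured to expect is usually a good first step.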

When deploying on cloud infrastructure, there are some additional considerations to make:

  • You can use your cloud’s load-balancing capabilities instead of the external proxy. This can make your setup easier while sacrificing a bit of performance.
    • With the external proxy, the requests would be routed like this: Internet → Proxy → Leader → Overall cluster.
    • With the load balancer, the requests would be routed like this: Internet → Load balancer → Cluster → Leader → Overall cluster. That’s one extra hop if the load balancer does not redirect toward the current leader by chance.
  • You can also use the cloud provider’s load balancer for SSL termination. GraphDB can serve requests over HTTPS, but it may be easier to skip that configuration and let the load balancer handle it.
  • You could deploy GraphDB instances in different regions. In such a case, keep in mind that latency and bandwidth can become a big issue. There’s a lot of internal communication.
  • Disk IOPS are often a bottleneck with cloud providers, and IOPS problems are hard to analyze.
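
To get a feel for the routing trade-off between the external proxy and a cloud load balancer, the sketch below estimates how often a request takes the extra hop. It assumes the load balancer picks a cluster node uniformly at random and knows nothing about the current leader; real load balancers may behave differently, and the function name is hypothetical.

```python
from fractions import Fraction


def extra_hop_probability(nodes: int) -> Fraction:
    """Chance that a randomly chosen node is NOT the leader."""
    if nodes < 1:
        raise ValueError("cluster needs at least one node")
    return Fraction(nodes - 1, nodes)


# Share of requests that land on a follower first and are then
# forwarded to the leader, for a few typical cluster sizes.
for n in (3, 5, 7):
    p = extra_hop_probability(n)
    print(f"{n} nodes: {p} of requests take the extra hop (about {float(p):.0%})")
```

Under this assumption, most requests take the extra hop, and the share grows with cluster size. That is the performance cost you trade for a simpler setup.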

These steps are answers to the most common issues that arise when operating your cluster. However, there are many variables that make each deployment unique. Get in touch about your use case and we can ensure you are making the most of your GraphDB!

