Read about the significant advantages that knowledge graphs can offer the data architect trying to bring a Data Fabric to their organization.
Through 2022, the application of graph processing and graph databases will grow at 100% annually to accelerate data preparation and integration, and enable more adaptive data science. GARTNER, INC: ‘DATA FABRICS ADD AUGMENTED INTELLIGENCE TO MODERNIZE YOUR DATA INTEGRATION’ (EHTISHAM ZAIDI ET AL, DECEMBER 2019)
Humans were stuck on this planet until they devised a way to travel at 11 km per second, the escape velocity of Earth.
It’s only very recently that humans figured out how to go that fast. It’s hard and it’s expensive. According to this article, it costs $54,500 for every kilogram you want to send into space. Think of the money you’ll save if you go before the holiday season bingeing! That was until commercial space companies like SpaceX took a different approach. It has been suggested that their Falcon 9 rocket has lowered the cost per kilo to $2,720. A cost reduction by a factor of 20 is an astounding accomplishment for any industry, but it’s especially noteworthy for escaping our gravitational fetters. How did they do it?
They did it by doing a lot of hard and expensive work and then not throwing it in the ocean. They reuse the initial boosters and other parts of the rocket to achieve that incredible 11 km per second. The problem of wasting time, effort and resources is so common that it’s a cliché.
Data analysis is an area where time and effort are spent over and over, only for the data and the development work to be chucked into the ocean once the job is done. This is particularly true of the initial data preparation stage of any analysis. Before data analysts can get to work, they have to gather the data, because we have long passed the age when all the data we need sits in just one place.
Next, the data needs to be sifted through to understand what you have and, most importantly, what you are missing. After that, the data needs to be cleaned: removing errors, filling in missing information and harmonizing the various data sources so that they are consistent. Once that is done, the data can be transformed and enriched with metadata to facilitate analysis.
Finally, the data is uploaded and ready to be used for the actual analysis part of the data analysis task. What’s worse is that, according to one survey, data analysts spend the majority of their time preparing the data rather than, you know, analyzing it. The same survey also found, unsurprisingly, that they aren’t too happy about it.
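To make those preparation steps concrete, here is a minimal sketch in Python using pandas. The file names, column names and harmonization rules are invented for illustration; real pipelines are far messier, which is exactly the point.

```python
# Hypothetical sketch of the preparation steps above; all names are invented.
import pandas as pd

# 1. Gather: the data no longer sits in one place.
crm = pd.read_csv("crm_export.csv")   # assumed columns: cust_name, country_code
erp = pd.read_csv("erp_export.csv")   # assumed columns: companyName, country

# 2. Sift: see what you have and, crucially, what is missing.
print(crm.isna().sum())
print(erp.isna().sum())

# 3. Clean and harmonize: align schemas, normalize values, fill gaps.
crm = crm.rename(columns={"cust_name": "name", "country_code": "country"})
erp = erp.rename(columns={"companyName": "name"})
erp["country"] = erp["country"].map({"Germany": "DE", "France": "FR"})  # toy mapping
combined = pd.concat([crm, erp], ignore_index=True)
combined["country"] = combined["country"].fillna("unknown")

# 4. Transform and enrich: add metadata useful for analysis, then hand off.
combined["n_sources"] = combined.groupby("name")["name"].transform("size")
combined = combined.drop_duplicates(subset="name")
combined.to_csv("prepared.csv", index=False)
```

All of this effort typically lives in one-off scripts like the above and is thrown away when the analysis ends.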
Finding relationships in combinations of diverse data, using graph techniques at scale, will form the foundation of modern data and analytics. This applies to knowledge graphs, to data fabrics, NLP, explainable AI, … GARTNER, INC: ‘Top 10 Trends in Data and Analytics, 2020’ (RITA SALLAM ET AL, May 2020)
Knowledge graphs represent a collection of interlinked descriptions of concepts and entities. These concepts use other concepts to describe each other. The connections made through these descriptions create context. It’s context that enriches meaning and enables understanding.
A knowledge graph can be used as a database, because it structures data that can be queried through a language such as SPARQL. It can be treated as a graph, a set of vertices and edges, to which you can apply graph operations and optimizations such as traversals and transformations. And it is a knowledge base, because the data in it bears formal semantics, which can be used to interpret the data and infer new facts. These semantics enable humans and machines to derive new information without introducing factual errors into the dataset.
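As a rough illustration of those three views over the same data, here is a small sketch using Python’s rdflib library. The ex: namespace and the study data are invented for the example.

```python
# Sketch: one rdflib graph acting as database, graph and knowledge base.
# The ex: namespace and the study data are invented for this example.
from rdflib import Graph, Namespace, Literal, RDF, RDFS

EX = Namespace("http://example.org/")
g = Graph()
g.bind("ex", EX)

# Edges of the graph double as rows of the database.
g.add((EX.Phase3Study, RDFS.subClassOf, EX.ClinicalStudy))  # formal semantics
g.add((EX.study42, RDF.type, EX.Phase3Study))
g.add((EX.study42, RDFS.label, Literal("Study 42")))

# Database view: query it with SPARQL.
rows = g.query("""
    SELECT ?study ?label WHERE {
        ?study a ex:Phase3Study ; rdfs:label ?label .
    }""", initNs={"ex": EX, "rdfs": RDFS})
for study, label in rows:
    print(study, label)

# Graph view: traverse the edges directly.
for s, p, o in g.triples((EX.study42, None, None)):
    print(s, p, o)

# Knowledge-base view: an RDFS reasoner could use the subClassOf axiom to
# infer (ex:study42 rdf:type ex:ClinicalStudy) without touching source data.
```

Nothing needs to be copied between the “database”, the “graph” and the “knowledge base”: the same triples serve all three roles.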
Knowledge graphs help with data analysis in a number of ways. The use of metadata, and especially semantic metadata, creates a unified, standardized means to seamlessly fuse diverse, proprietary and third-party data in a format based on how the data is used, rather than on what format it is in or where it is stored.
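Here is a toy sketch of that idea, with invented source and field names: a small mapping from source-specific fields to shared terms lets differently shaped records be fused and consumed through one vocabulary, regardless of where each record came from.

```python
# Toy sketch of semantic fusion; sources, fields and vocabulary are invented.
# Each source keeps its own shape; a mapping table relates source fields
# to shared terms, so consumers ask questions in one vocabulary.
FIELD_MAP = {
    "crm":     {"cust_name": "name", "cust_country": "country"},
    "partner": {"companyName": "name", "hq_country": "country"},
}

def to_shared_vocabulary(source: str, record: dict) -> dict:
    """Rewrite a source record into the shared vocabulary."""
    mapping = FIELD_MAP[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

fused = [
    to_shared_vocabulary("crm", {"cust_name": "Acme Ltd", "cust_country": "DE"}),
    to_shared_vocabulary("partner", {"companyName": "Umbrella SA", "hq_country": "FR"}),
]
print(fused)  # both records now share one schema keyed by usage, not by origin
```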
A knowledge graph provides centralized information control for enterprises at a time when it is no longer possible to integrate all transactional information due to its extreme volume, velocity and variety. Having a formal, machine- and human-readable definition of enterprise-level models that describe the important concepts shared across all business departments, and reaching agreement on common metadata, reference data and master data entities, has enormous value.
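What such a shared, formal definition can look like, sketched with an invented ex:Customer model: the Turtle snippet below is readable by people through its labels and comments, and by machines through any RDF parser.

```python
# Sketch of a definition that is both human and machine readable.
# The ex:Customer model is invented for illustration.
from rdflib import Graph

ttl = """
@prefix ex:   <http://example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:Customer a rdfs:Class ;
    rdfs:label "Customer" ;
    rdfs:comment "A party that has purchased, or may purchase, our products." .
"""

g = Graph()
g.parse(data=ttl, format="turtle")    # machines load the shared model
print(g.serialize(format="turtle"))   # humans read the very same definition
```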
Why are knowledge graphs important?
The reuse of first-stage boosters saved space travelers a factor of 20 in cost. In the world of knowledge graphs, we’ve seen factors of 100!
Ontotext worked with a global research-based biopharmaceutical company to solve the problem of inefficient search across vast and dispersed sources of unstructured data. It’s not rocket science, but biopharmaceutical research is just as complex. The company was facing three different data silos holding half a million documents full of clinical study data. Researchers trying to design new clinical studies first had to trudge through days of tedious processing, combing through result sets of 1,000 to 10,000 hits to identify the relevant clinical studies.
By using Ontotext’s knowledge graph technology, they were able to achieve a number of benefits: quicker access to data and more useful search results, which ultimately enabled better evidence-based decision-making and the efficient design of new clinical studies. Most impressively, the time it took to retrieve the information required to answer regulatory questions dropped from four person-days to less than one.
Knowledge graphs are at the heart of Ontotext Platform. In the past, to ensure application developers didn’t have to deal with the complexities and nuances of knowledge graphs, it was necessary to build a middleware layer with lots of APIs and canned SPARQL queries underneath. That came with cost overheads, and the app server layer could mushroom into a bigger development task than implementing the knowledge graph itself.
Ontotext’s decades of experience have led to a better and simpler way. The use of GraphQL and Shapes ensures that application developers can blissfully avoid hacking SPARQL, without the bloated app server middleware layer.
It ensures your enterprise can take advantage of the semantic approach while avoiding backend API development, and it provides tools that simplify data consumption and processing. Strict adherence to open standards avoids unpalatable vendor lock-in and maximizes interoperability with third-party tools and data.
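As a hedged sketch of what this looks like from the application developer’s side, the snippet below posts a GraphQL query over HTTP. The endpoint URL and the study type and fields are assumptions for illustration, not the platform’s actual schema; in practice, the GraphQL schema would be generated from your own shapes.

```python
# Hypothetical sketch: an app queries the knowledge graph through GraphQL.
# The endpoint URL and the study type/fields are assumptions for illustration.
import requests

query = """
{
  study(where: {phase: "3"}) {
    id
    title
    condition { label }
  }
}
"""

resp = requests.post(
    "https://graph.example.com/graphql",   # assumed endpoint
    json={"query": query},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["data"])  # plain JSON back; any SPARQL stays behind the endpoint
```

The developer writes a query against business concepts and gets plain JSON back; the knowledge graph machinery stays behind the curtain.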