How to Take Back 40-60% of Your IT Spend by Fixing Your Data

Creating a semantic graph foundation helps your organization become data-driven while significantly reducing IT spend

November 3, 2023 | 11 min read | Brandon Richards, Sumit Pal, Michael Atkin

Organizations that quickly adapt to changing market conditions have a competitive advantage over their peers. Achieving this advantage depends on their ability to capture, connect, integrate, and convert data into insight for business decisions and processes. This is the goal of a “data-driven” organization. However, in the race to become data-driven, most efforts have resulted in a tangled web of data integrations and reconciliations across a sea of data silos that adds up to between 40% and 60% of an enterprise’s annual technology spend. We call this the “Bad Data Tax”. Not only is this expensive, but the results often don’t translate into the key insights needed to deliver better business decisions or more efficient processes.

This is partly because integrating and moving data is not the only problem. The data itself is stored in a way that is not optimal for extracting insight. Unlocking additional value from data requires context, relationships, and structure, none of which are present in the way most organizations store their data today. 

Solution to the Data Dilemma

The good news is that the solution to this data dilemma is quite simple. It can be accomplished at a fraction of what organizations spend each year supporting the vast industry of data integration workarounds. The pathway forward doesn’t require ripping everything out, but rather building a semantic “graph” layer across data to connect the dots and restore context. It will, however, take effort to formalize a shared semantic model that can be mapped to data assets and to turn unstructured data into a format that can be mined for insight. This is the future of modern data and analytics and a critical enabler of getting more value and insight out of your data.
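To make the idea of a semantic layer concrete, here is a minimal sketch in plain Python. It assumes two hypothetical silos (a CRM export and a billing export) that describe the same customer under different column names, plus a hypothetical mapping from each silo's columns to a shared vocabulary. None of these names come from a real system or product, and a production deployment would typically use RDF and a triplestore rather than Python tuples.

```python
# Two hypothetical silos describing the same customer with different schemas.
crm_rows = [{"cust_id": "C-17", "full_name": "Acme Ltd", "acct_mgr": "J. Doe"}]
billing_rows = [{"customer": "C-17", "plan": "Enterprise", "balance_usd": 1200.0}]

# Hypothetical mappings from each silo's columns to a shared semantic model.
CRM_MAPPING = {"full_name": "hasName", "acct_mgr": "managedBy"}
BILLING_MAPPING = {"plan": "hasPlan", "balance_usd": "hasBalanceUSD"}


def to_triples(rows, key_column, mapping):
    """Convert rows into (subject, predicate, object) triples."""
    triples = set()
    for row in rows:
        # Both silos resolve to the same subject identifier.
        subject = f"customer:{row[key_column]}"
        for column, predicate in mapping.items():
            if column in row:
                triples.add((subject, predicate, row[column]))
    return triples


def describe(graph, subject):
    """Collect everything the graph states about one subject."""
    return {p: o for s, p, o in graph if s == subject}


# The union of both triple sets is the (toy) semantic layer.
graph = to_triples(crm_rows, "cust_id", CRM_MAPPING) | to_triples(
    billing_rows, "customer", BILLING_MAPPING
)
profile = describe(graph, "customer:C-17")
print(profile)
```

The source tables are left untouched. Only the mappings and the shared identifiers are new, which is the sense in which the layer sits across existing data rather than replacing it.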

This shift from a relational to a graph approach has been well documented by Gartner, which advises that “using graph techniques at scale will form the foundation of modern data and analytics” and “graph technologies will be used in 80% of data and analytics innovations by 2025.” Most of the leading market research firms consider graph technologies to be a “critical enabler.” And while there is a great deal of experimentation underway, most organizations have only scratched the surface in a use-case-by-use-case fashion. While this may yield great benefits for the specific use case, it doesn’t fix the causes behind the “Bad Data Tax” that organizations are facing. Until executives begin to take a more strategic approach with graph technologies, they will continue to struggle to deliver the needed insights that will give them a competitive edge.

Modernizing Your Data Environment

Most organizations have come of age in a world dominated by technology. There have been multiple technology revolutions that have necessitated the creation of big organizational departments to make it all work. In spite of all the activity, the data paradigm hasn’t evolved much. Organizations are still managing data using relational technology invented in the 1970s. While relational databases are the best fit for managing structured data workloads, they are poorly suited to ad hoc inquiry and scenario-based analysis.

Data has become isolated and mismatched across repositories and silos due to technology fragmentation and the rigidity of the relational paradigm. Enterprises often have thousands of business and data silos, each based on proprietary data models that are hard to identify and even harder to change. This has become a liability that diverts resources from business goals, extends time-to-value for analysts, and leads to business frustration. The task now before leadership is to fix the data itself.

Fixing the data is possible with graph technologies and web standards that share data across federated environments and between interdependent systems. This approach has evolved to ensure data precision, flexibility, and quality. Because these open standards are based on granular concepts, they become reusable building blocks for a solid data foundation. Adopting them removes ambiguity, facilitates automation, and reduces the need for data reconciliation.

Data Bill of Rights

Organizations need to remind themselves that data is simply a representation of real things (customers, products, people, and processes) where precision, context, semantics, and nuance matter as much as the data itself. For those who are tasked with extracting insight from data, there are several expectations that should be honored: that the data is available and accessible when needed, is stored in a format that is flexible and accurate, retains the context and intent of the original data, and is traceable as it flows through the organization.

This is what we call the “Data Bill of Rights”. Providing this Data Bill of Rights is achievable right now without a huge investment in technology or massive disruption to the way the organization operates.

Strategic Graph Deployment

Many organizations are already leveraging graph technologies and semantic standards for their ability to traverse relationships and connect the dots across data silos. These organizations are often doing so on a case-by-case basis covering one business area and focusing on an isolated application, such as fraud detection or supply chain analytics. While this can result in faster time-to-value for a singular use case, without addressing the foundational data layers, it results in another silo without gaining the key benefit of reusability.
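Traversing relationships across silos can be illustrated with a breadth-first search over edges contributed by different source systems. The sketch below is only an illustration: the entities, predicates, and source systems are hypothetical, and a real deployment would run such a query inside a graph database rather than in application code.

```python
from collections import deque

# Hypothetical edges, each contributed by a different source system.
EDGES = [
    ("account:A1", "ownedBy", "person:P1"),            # from a CRM silo
    ("person:P1", "sharesAddressWith", "person:P2"),   # from a KYC silo
    ("person:P2", "owns", "account:A9"),               # from a payments silo
]


def find_path(edges, start, goal):
    """Breadth-first search that treats every edge as traversable both ways."""
    neighbors = {}
    for s, p, o in edges:
        neighbors.setdefault(s, []).append((p, o))
        neighbors.setdefault(o, []).append((f"inverse({p})", s))
    queue = deque([(start, [start])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, nxt in neighbors.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [predicate, nxt]))
    return None  # no connection between start and goal


path = find_path(EDGES, "account:A1", "account:A9")
print(" -> ".join(path))
```

No single silo contains the full path between the two accounts. It only becomes visible once the edges from all three sources sit in one graph.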

The key to adopting a more strategic approach to semantic standards and knowledge graphs starts at the top with buy-in across the C-suite. Without this senior sponsorship, the program will face an uphill battle against organizational inertia, with little chance of broad success. With this level of support, however, the likelihood of getting sufficient buy-in across all the stakeholders involved in managing an organization’s data infrastructure increases dramatically.

While starting as an innovation project can be useful, forming a Graph Center of Excellence will have an even greater impact. It can give the organization a dedicated team to evangelize and execute the strategy, score incremental wins to demonstrate value, and leverage best practices and economies of scale along the way. The team would be tasked with both building the foundation and prioritizing graph use cases against organizational priorities.

One key benefit from this approach is the ability to start small, deliver quick wins, and expand as value is demonstrated. There is no getting around the mandate to initially deliver something practical and useful. A framework for building a Graph Center of Excellence will be published in the coming weeks.

Scope of Investment Required

Knowledge graph advocates admit that a long tail of investment is necessary to realize its full potential. Enterprises need basic operational information including an inventory of the technology landscape and the roadmap of data and systems to be merged, consolidated, eliminated, or migrated. They need to have a clear vision of the systems of record, data flows, transformations, and provisioning points. They need to be aware of the costs associated with the acquisition of platforms, triplestore databases, pipeline tools, and other components needed to build the foundational layer of the knowledge graph.

In addition to the plumbing, organizations also need to understand the underlying content that supports business functionality. This includes the reference data about business entities, agents, and people, as well as the taxonomies and data models covering contract terms and parties, the meaning of ownership and control, notions of parties and roles, and so on. These concepts are the foundation of the semantic approach. They might not be exciting, but they are critical because they are the scaffolding for everything else.

Initial Approach

When thinking about the scope of investment, the first graph-enabled application can take anywhere from 6 to 12 months from conception to production. Much of that time needs to be invested in getting data teams aligned and mobilized, which underscores the essential nature of leadership and the importance of starting with the right set of use cases. The initial use case needs to be operationally viable, solve a real business problem, and be important to the business.

With the right strategic approach in place, the first delivery is infrastructure plus pipeline management and people. This gets the organization to an MVP, including an incremental project plan and rollout. The second delivery should consist of the foundational building blocks for workflow and reusability. This will prove the viability of the approach.

Building Use Cases Incrementally

The next series of use cases should be based on matching functionality to capitalize on concept reusability. This will enable teams to shift their effort from building the technical components to adding incremental functionality. This translates to 30% of the original cost and a rollout that could be three times faster. These costs will continue to decrease as the enterprise expands reusable components – achieving full value around the third year.
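The effect of reuse on cost and rollout time can be worked through with simple arithmetic. The sketch below takes the 30% cost ratio and the threefold speedup from the paragraph above and holds them constant for every later use case, which is a simplification since the article expects costs to keep falling. The first-use-case cost ($1.0M), its duration (9 months), the number of use cases, and the assumption that use cases run one after another are all hypothetical inputs chosen for illustration.

```python
def project_rollout(first_cost, first_months, use_cases,
                    cost_ratio=0.30, speedup=3.0):
    """Return (total_cost, total_months) for use cases delivered in sequence.

    The first use case pays full price. Each later one is assumed to cost
    `cost_ratio` of the first and to take `first_months / speedup`.
    """
    if use_cases < 1:
        raise ValueError("use_cases must be at least 1")
    later = use_cases - 1
    total_cost = first_cost + later * first_cost * cost_ratio
    total_months = first_months + later * first_months / speedup
    return total_cost, total_months


# Hypothetical inputs: a $1.0M first use case taking 9 months, then 4 more.
cost, months = project_rollout(1_000_000, 9, use_cases=5)

# Baseline for comparison: every use case built from scratch.
baseline_cost, baseline_months = 5 * 1_000_000, 5 * 9

print(f"With reuse:    ${cost:,.0f} over {months:.0f} months")
print(f"Without reuse: ${baseline_cost:,.0f} over {baseline_months} months")
```

Under these assumed inputs, five use cases cost less than half of what they would if each were built from scratch, and the gap widens with every use case added.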

The strategic play is not the $3-$5 million for the first few domains, but the core infrastructure required to run the organization moving forward. It is absolutely possible to continue adding use cases incrementally, but that is not necessarily the best way to capitalize on the digital future. The long-term cost efficiency of a foundational enterprise knowledge graph (EKG) should be compared to the costs of managing thousands of silos. For a big enterprise, this can be measured in hundreds of millions of dollars, before factoring in the value proposition of enhanced capabilities for data science and complying with regulatory obligations to manage risks.

Business Case Summary

Organizations are paying a “Bad Data Tax” of 40% to 60% of their annual IT spend on the tangled web of integrations across their data silos. To make matters worse, following this course does not help an organization achieve its goal of being data-driven. The data itself has a problem: it is traditionally stored in rows, columns, and tables that lack the context, relationships, and structure needed to extract insight.

Adding a semantic graph layer is a simple, non-intrusive solution to connect the dots, restore context, and provide what is needed for data teams to succeed. While the Bad Data Tax alone quantifiably justifies the cost of solving the problem, it scarcely scratches the surface of the full value delivered. The opportunity cost side, though more difficult to quantify, is no less significant with the graph enabling a host of new data and insight capabilities (better AI and data science outcomes, increased personalization and recommendations for driving increased revenue, more holistic views through data fabrics, high fidelity digital twins of assets, processes, and systems for what-if analysis, and more). 

While most organizations have begun deploying graph technologies in isolated use cases, they have not yet applied them foundationally to solving the Bad Data Tax and fixing their underlying data problem. Success will require buy-in and sponsorship across the C-suite to overcome organizational inertia. For best outcomes, create a Graph Center of Excellence focused on strategically deploying both a semantic graph foundation and high-priority use cases. The key will be in starting small, delivering quick wins with incremental value and effectively communicating this across all stakeholders.

While initial investments can start small, expect initial projects to take from 6-12 months. To cover the first couple of projects, a budget between $1.5-$3 million should be sufficient. The outcomes will justify further investment in graph-based projects throughout the organization, each deploying 30% faster and cheaper than early projects through leveraging best practices and economies of scale. 
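To put these figures side by side, the following sketch compares the quoted initial budget with one year of the Bad Data Tax. The 40% to 60% range and the $1.5-$3 million budget come from this article. The $100 million annual IT spend is a hypothetical input, so substitute your own figure.

```python
def bad_data_tax(annual_it_spend, low=0.40, high=0.60):
    """Return the (low, high) range of spend attributed to the Bad Data Tax."""
    return annual_it_spend * low, annual_it_spend * high


# Hypothetical enterprise with a $100M annual IT budget.
annual_it_spend = 100_000_000
tax_low, tax_high = bad_data_tax(annual_it_spend)

# Budget range for the first couple of projects, as quoted in the article.
budget_low, budget_high = 1_500_000, 3_000_000

# Least favorable comparison: largest budget against smallest tax estimate.
worst_case_share = budget_high / tax_low

print(f"Bad Data Tax: ${tax_low:,.0f} to ${tax_high:,.0f} per year")
print(f"Initial budget is at most {worst_case_share:.1%} of one year's tax")
```

Even in the least favorable pairing, the initial budget stays below a tenth of one year's estimated tax for an enterprise of this assumed size.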


The business case is compelling – the cost to develop a foundational graph capability is a fraction of the amount wasted each year on the Bad Data Tax alone. Addressing this problem is both easier and more urgent than ever. Failing to develop the data capabilities that graph technologies offer can put organizations at a significant disadvantage, especially in a world where AI capabilities are accelerating and critical insight is being delivered in near real time. The opportunity cost is significant. The solution is simple. Now is the time to act.

Ontotext is developing a playbook for the design and implementation of a foundational semantic graph layer and strategic graph deployment to help clients turn graph theory into operational reality. Future articles will explore the importance of a Graph Center of Excellence, along with use cases, initiation points, requirements for governance, and approaches for successful implementation. Feedback is welcome.

Brandon Richards

General Manager APAC region at Ontotext

Brandon has spent the last eight years helping hundreds of enterprises across APAC on their graph technology journey at both Ontotext and Neo4j. Prior to that, he spent 6 years at Oracle working with strategic accounts and founded and managed a commercial real estate strategy firm in Silicon Valley for over 4 years. He is currently based in Kuala Lumpur, Malaysia.

Sumit Pal

Strategic Technology Director at Ontotext

Sumit Pal is an ex-Gartner VP Analyst in the Data Management & Analytics space. He has more than 30 years of experience in the data and software industry, in roles spanning startups to enterprise organizations, building, managing, and guiding teams and building scalable software systems across the stack, from the middle tier and data layer to analytics and UI, using Big Data, NoSQL, DB internals, data warehousing, data modeling, and data science.

Michael Atkin

Managing Director at Content Strategies LLC

Michael Atkin has been an analyst and advocate for data management since 1985. His experience spans from the foundations of the information industry to the adoption of semantic technology. He has served as an advisor to financial institutions, global regulators, publishers, consulting firms and technology companies.