
Linked Leaks: A Smart Dive into Analyzing the Panama Papers

May 20, 2016 | 5 min read | Atanas Kiryakov


What do David Cameron, Pedro Almodovar and Leo Messi have in common? No, the Argentinian footballer doesn’t star in the Spanish director’s latest movie. Neither does the UK prime minister. These three people – alongside thousands of other rich and powerful celebrities, business executives and politicians – have been linked to companies in the Panama Papers leak in recent weeks.

Listen to our webinar recording: Diving in Panama Papers and Open Data to Discover Emerging News to see how to empower your data analytics with Open Data.

‘The Biggest Leak in History’

When the news of the 2.6TB of data on shell companies broke in early April, it went viral immediately and has been trending ever since. Revenue agencies and government officials around the world pledged to fight tax avoidance through tax havens, which, though not illegal, serve as the secret coffers that rich and powerful one-percenters use to reduce their tax rates.

A month later, on May 9th, the International Consortium of Investigative Journalists (ICIJ), which broke the news, released a searchable database of more than 300,000 entities from the Panama Papers and Offshore Leaks investigations.

The names of David Cameron and Lionel Messi do not appear in the Panama Papers. In the wake of the leak, though, Cameron admitted that before becoming prime minister in 2010, he had owned shares in a tax-haven fund set up by his late father.

On the other hand, Messi is believed to have avoided taxes via the company Mega Star Enterprises, which he reportedly owns together with his father Jorge Horacio Messi. And, finally, Almodovar said at the Cannes Film Festival that he was one of the least important names cited in the Panama Papers.

Panama Papers Dataset Enriched by Linked Data Portal

For two months now, journalists and the general public have been wondering who else is in the Panama Papers and which shareholders are connected with which corporations in which countries. Searching a database for one name or organization at a time, however, may prove tedious and enormously time-consuming.

Using the ICIJ database content and other open data sources, we at Ontotext created Linked Leaks, a linked data knowledge graph of the Panama Papers. The project enriches the data with semantics, links the dataset to other Linked Open Datasets, and provides richer findings when searching through the Panama Papers.

The Knowledge Graph portal also encourages data analytics enthusiasts, journalists and developers to dive into and dig for additional information in the Panama Papers.

Linked Leaks lets you run various types of analytics queries to discover relationships between companies, shareholders, countries and chains of control. The Linked Leaks demonstration service gives an all-new perspective on the Panama Papers, linking the leaked data to open-data information about countries and geographical regions.

Linked Leaks, which contains more than 22 million RDF statements, also serves as a kind of ‘Investigative Reporting Workbench’: it lets users ask smart questions in SPARQL and showcases the role of Linked Data in investigative data journalism. Analytics enthusiasts can also freely download the Linked Leaks data in RDF for on-premises analytics and for building applications on top of the data.

Putting the Panama Papers into Context

The Linked Leaks Knowledge Graph, published according to the Linked Open Data principles, already links the Panama Papers to information on countries and geographical regions from DBpedia and GeoNames, and links to more datasets will be added.

These datasets support all sorts of discovery and analytics queries, for example: companies related to a given shareholder (person or organization), including control relationships; companies that control other companies in the same country through a company in an offshore zone; or the most popular offshore jurisdictions.
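To make these three kinds of questions concrete, here is a minimal sketch in plain Python over a handful of made-up statements. It does not use the Linked Leaks data or vocabulary: the company and person identifiers, the predicate names (`country`, `controls`, `shareholderOf`) and the set of offshore jurisdictions are all assumptions chosen for illustration.

```python
from collections import Counter

# Toy (subject, predicate, object) statements. Every identifier and predicate
# name here is hypothetical; none comes from the Linked Leaks dataset.
TRIPLES = [
    ("co:Alpha", "country", "UK"),
    ("co:Beta", "country", "UK"),
    ("co:Gamma", "country", "FR"),
    ("co:Delta", "country", "DE"),
    ("co:Shell1", "country", "PA"),
    ("co:Shell2", "country", "VG"),
    ("co:Shell3", "country", "PA"),
    ("co:Alpha", "controls", "co:Shell1"),
    ("co:Shell1", "controls", "co:Beta"),
    ("co:Gamma", "controls", "co:Shell2"),
    ("co:Shell2", "controls", "co:Delta"),
    ("person:Ann", "shareholderOf", "co:Alpha"),
]
OFFSHORE = {"PA", "VG"}  # assumed set of offshore jurisdictions


def objects(subject, predicate):
    """All objects of statements with the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def country_of(company):
    found = objects(company, "country")
    return found[0] if found else None


def companies_for_shareholder(holder):
    """Companies the holder has shares in, plus everything those control."""
    seen = []
    frontier = objects(holder, "shareholderOf")
    while frontier:
        company = frontier.pop()
        if company not in seen:
            seen.append(company)
            frontier.extend(objects(company, "controls"))
    return sorted(seen)


def same_country_via_offshore():
    """(A, offshore company, B) where A and B share a country."""
    results = []
    for a, p, mid in TRIPLES:
        if p != "controls" or country_of(mid) not in OFFSHORE:
            continue
        for b in objects(mid, "controls"):
            if a != b and country_of(a) is not None and country_of(a) == country_of(b):
                results.append((a, mid, b))
    return results


def popular_offshore_jurisdictions():
    """Offshore jurisdictions ranked by number of registered companies."""
    counts = Counter(o for s, p, o in TRIPLES if p == "country" and o in OFFSHORE)
    return counts.most_common()
```

In a real knowledge graph the same patterns would be expressed declaratively in SPARQL and evaluated by the database; the loops above only show what each question asks of the data.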


‘The Game of Queries’ in Linked Leaks

By asking smart questions in SPARQL in Linked Leaks, anyone can get richer findings from an investigative search of the Panama Papers.

Now let’s take a look at a few sample queries.
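A query in this spirit might look like the sketch below, which ranks jurisdictions by the number of entities registered in them. It is an illustration only: the endpoint URL is a placeholder, and the `ex:` prefix and the `ex:jurisdiction` property are assumed names, not the actual Linked Leaks vocabulary. The code just builds the HTTP request in the way the SPARQL 1.1 Protocol describes for GET queries (a URL-encoded `query` parameter) and sends nothing.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint, NOT the real Linked Leaks service URL.
ENDPOINT = "https://example.org/sparql"

# Hypothetical vocabulary: ex:jurisdiction is an assumed property name.
QUERY = """
PREFIX ex: <https://example.org/leaks#>
SELECT ?jurisdiction (COUNT(?entity) AS ?entities)
WHERE {
  ?entity ex:jurisdiction ?jurisdiction .
}
GROUP BY ?jurisdiction
ORDER BY DESC(?entities)
LIMIT 10
"""


def build_request(endpoint, query):
    """Build a SPARQL-over-HTTP GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
```

Against a real endpoint, passing `request` to `urllib.request.urlopen` and parsing the response as JSON would give the result rows under `results` and then `bindings`, following the SPARQL JSON results format.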

As you can see, many sorts of interlinked cross-queries can be asked in the Linked Leaks graph database. Ontotext is just starting to explore the possibilities of asking smart questions about the Panama Papers and is working to enrich Linked Leaks with new relations, additional mappings and new sample queries to fine-tune the interpretation and analysis of the raw data.

We at Ontotext also plan to map this data to the Financial Industry Business Ontology (FIBO), so that one can query and analyze the data using its semantics.

Follow #LinkedLeaks on Twitter and post your #LinkedLeaks questions and queries!

Dive into our webinar: Diving in Panama Papers and Open Data to Discover Emerging News to see a live demo of news and open data analytics!


CEO at Ontotext

Atanas is a leading expert in semantic databases, author of multiple signature industry publications, including chapters from the widely acclaimed Handbook of Semantic Web Technologies.
