
Fighting Fake News: Ontotext’s Role in EU-Funded Pheme Project

April 27, 2017 7 mins. read Milena Yankova

The Fake News Problem

Over the past year, one issue has been dominating both the social media and the political realms: fake news. Twitter and Facebook users have been showered with all kinds of sensational reports, images and links claiming one breaking news scoop after the other.

The vast and rapid increase of fake news – often shared and spread via social media – has the potential to alter people’s perceptions or reinforce their existing beliefs on one issue or another. Sensational fake news posts have often trumped credible, well-sourced, critical coverage, as users get lost among the heaps of conflicting and contradictory information they are exposed to every day.

The Proliferation of Fake News

At the end of March, researchers at Oxford University published a report which found that, in the ten days leading up to the U.S. presidential election, Twitter users based in the battleground state of Michigan shared as many links to fake news (or, as the researchers called it, ‘junk news’) as links to news from reputable professional news organizations.

“The number of links to junk news alone is roughly equivalent to the number of links to professionally researched journalism,” the Oxford researchers said in their report, which examined the Twitter behavior of nearly 140,000 Michigan-based potential voters.

Prior to the election, polls had predicted a win for the Democratic candidate Hillary Clinton in the state. Donald Trump won the state by the narrowest of margins to become the first Republican candidate to win Michigan since 1988.

Before and after the U.S. election, and a few months earlier at the time of Brexit, the issue of fake news was dominant in the global and regional digital and political debates. It still is.

People who want to look critically at the information they come across are searching for ways to detect which news is fake, which sources to trust, and to what extent they can rely on the original source of a story.


Project PHEME – Detecting and Verifying Rumours On Social Media

Before ‘fake news’ became the latest buzzword, we at Ontotext had already started working, in January 2014, alongside eight other partners on an EU-funded project aimed at creating a computational framework for the automatic discovery and verification of information quickly and at scale.

Project PHEME – Computing Veracity Across Media, Languages, and Social Networks, which launched in January 2014 and finished on March 31, 2017, focused on modeling, identifying, and verifying phemes (internet memes with added truthfulness or deception), as they spread across media, languages, and social networks. Aptly named after the goddess of rumors and fame in Greek mythology, Pheme, the project was aimed at developing a smart way to alert users to rumors and misinformation.

Ontotext’s Role in PHEME – Semantic Technology for Separating Rumors from Facts

As a partner in the PHEME project, one of Ontotext’s main contributions has been its semantic graph database GraphDB, which served as a semantic repository with scalable lightweight reasoning. Datasets from the Linked Open Data (LOD) cloud such as FactForge, DBpedia, OpenCyc, and Linked Life Data were used as the factual knowledge sources.
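The repository pattern described above can be illustrated with a toy example. This is a minimal sketch, not GraphDB’s actual API: a hand-rolled in-memory triple set with forward-chaining over `rdf:type` and `rdfs:subClassOf`, standing in for the scalable lightweight reasoning GraphDB performs over datasets such as DBpedia.

```python
# Toy semantic repository with lightweight reasoning (illustrative only).
# GraphDB does this at scale over RDF; here a tiny in-memory triple set
# shows how an asserted fact ("Obama is a Politician") also yields the
# inferred facts ("Obama is a Person", "Obama is an Agent").

triples = {
    ("dbr:Barack_Obama", "rdf:type", "dbo:Politician"),
    ("dbo:Politician", "rdfs:subClassOf", "dbo:Person"),
    ("dbo:Person", "rdfs:subClassOf", "dbo:Agent"),
}

def materialize(graph):
    """Forward-chain subclass reasoning until no new triples appear."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in inferred:
            if p == "rdf:type":
                # rdfs9: x type C, C subClassOf D  =>  x type D
                for s2, p2, o2 in inferred:
                    if p2 == "rdfs:subClassOf" and s2 == o:
                        new.add((s, "rdf:type", o2))
            if p == "rdfs:subClassOf":
                # rdfs11: subClassOf is transitive
                for s2, p2, o2 in inferred:
                    if p2 == "rdfs:subClassOf" and s2 == o:
                        new.add((s, "rdfs:subClassOf", o2))
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred

kb = materialize(triples)
print(("dbr:Barack_Obama", "rdf:type", "dbo:Person") in kb)  # True
```

In a real deployment, these entailments are computed by the database’s ruleset rather than in application code.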

Another major contribution has been an algorithm for rumor classification, which tells users whether a tweet is a rumor or not. However, it is important to point out that the opposite of a rumor is not a fact. Some tweets, such as “I had a beer in the park”, are not rumors, but neither are they claims that need to be proven. By classifying tweets as rumor or not rumor with a confidence score (from 1 to 10), the algorithm aims to identify the ones that are intended to spread a rumor.
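As an illustration only, a naive lexical scorer shows the shape of such a rumor/not-rumor score. The cue words and weights below are invented for the demo; PHEME’s actual classifier is trained on annotated tweets, not hand-picked keywords.

```python
# Hypothetical rumor scoring on a 1-10 scale (illustrative, not PHEME's model).
RUMOR_CUES = {"unconfirmed": 3, "breaking": 2, "allegedly": 3,
              "reportedly": 2, "sources say": 3, "shocking": 2}

def rumor_score(tweet: str) -> int:
    """Return a 1-10 score: higher means more likely to be spreading a rumor."""
    text = tweet.lower()
    score = 1 + sum(w for cue, w in RUMOR_CUES.items() if cue in text)
    return min(score, 10)

print(rumor_score("BREAKING: sources say the mayor has allegedly resigned"))  # 9
print(rumor_score("I had a beer in the park"))  # 1
```

The beer-in-the-park tweet scores at the floor of the scale: it is not a rumor, but neither is it a checkable claim.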

Using its extensive experience in text analytics, Ontotext has also provided concept extraction and enrichment. By interlinking people, organizations and locations from unchecked social-media streams to the rich contextual information about them in our knowledge base, we know who these people are, which organizations they are associated with, and where they are based. This helps us recognize whether something is a rumor or not.
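A hypothetical sketch of this enrichment step, with a tiny invented knowledge base standing in for FactForge or DBpedia (all names and fields below are illustrative):

```python
# Illustrative concept extraction and enrichment: mentions found in a tweet
# are linked to knowledge base entries, pulling in contextual information
# (type, affiliated organization, location). The mini KB is invented.

KNOWLEDGE_BASE = {
    "angela merkel": {"type": "Person", "org": "CDU", "based_in": "Berlin"},
    "cdu": {"type": "Organization", "based_in": "Germany"},
}

def enrich(tweet: str) -> dict:
    """Link known mentions in the tweet to their knowledge base records."""
    text = tweet.lower()
    return {name: info for name, info in KNOWLEDGE_BASE.items() if name in text}

ctx = enrich("Angela Merkel announced a new CDU policy today")
print(sorted(ctx))  # ['angela merkel', 'cdu']
```

The real pipeline uses named-entity recognition and disambiguation against Linked Open Data rather than exact string matching, but the payoff is the same: every mention arrives with its context attached.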


Ontotext Helps Develop Digital Journalism Prototype and Fact-Checking Assistant Hercule

As part of the project, Ontotext and partners also developed an open-source digital journalism prototype that aims to harness the systems being developed within PHEME and present them in a dashboard geared specifically at journalists looking to quickly locate and verify information online.

In addition, Ontotext and its project partners developed the fact-checking assistant Hercule, a web-based portal that aims to help journalists with the daily tasks of sorting and retrieving newsworthy pieces of information from Twitter. With the help of the PHEME named-entity recognition and resolution tools (linking objects to the respective concept in Linked Open Data), and the application of high-confidence classifiers for rumor detection, veracity calculation and “check-worthiness” calculation, each tweet is enriched with new features and concepts.

The tweets are grouped into stories based on their similarity. Each story can be visualized together with the concepts mentioned in the individual tweets (for example, names of persons, organizations, and locations) and the news articles related to it.
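A simplified sketch of such similarity-based grouping follows; the greedy clustering, the Jaccard measure over word sets, and the threshold are all illustrative simplifications, not the project’s actual pipeline.

```python
# Group tweets into stories by lexical similarity (illustrative sketch).

def jaccard(a: set, b: set) -> float:
    """Overlap between two token sets, from 0.0 (disjoint) to 1.0 (equal)."""
    return len(a & b) / len(a | b)

def group_into_stories(tweets, threshold=0.3):
    """Greedy single-pass clustering: attach each tweet to the first story
    whose seed tweet is similar enough, otherwise start a new story."""
    stories = []  # each story is a list of (tweet, token_set) pairs
    for tweet in tweets:
        tokens = set(tweet.lower().split())
        for story in stories:
            if jaccard(tokens, story[0][1]) >= threshold:
                story.append((tweet, tokens))
                break
        else:
            stories.append([(tweet, tokens)])
    return [[t for t, _ in s] for s in stories]

tweets = [
    "Fire reported at the central station",
    "Huge fire at the central station right now",
    "New tax bill passes parliament",
]
print(len(group_into_stories(tweets)))  # 2 stories: the fire and the tax bill
```

Once grouped, each story can then be decorated with the extracted concepts and related news articles described above.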

These features provide a greater context to the story, thereby facilitating the verification of claims on the social network. Users can explore each related concept and news article with a single click. In this way, they can quickly get the information they need in order to fact-check the contents of a tweet. They can also view trends in the information about different concepts to see how often the media has mentioned them.

How Semantic Technology Helps Users Quickly Detect Fake News

Semantic technologies and machine learning algorithms thus help social media users and journalists quickly check how far they can trust a social media post or a piece of news. These technologies assist with analysis and fact-checking: they tell us how a piece of shared content has been referred to, to what extent the information in it has been verified, and whether the same or similar information can be found in reliable, reputable sources.

Technology saves users much of the time they would otherwise spend fact-checking a post in the sea of newly generated content. If it takes you more than five minutes of googling keywords from an unconfirmed report to pull up stories related to the people or organizations mentioned in the post you are fact-checking, chances are you will give up, because it is frustrating and eats into your valuable time. Semantic technology comes to the rescue by referencing the story, providing more information on the topic, and pointing to relevant related stories.

How Ontotext Helps Users Sharpen Their Critical News Skills

On top of sowing fear and confusion, the widespread sharing of fake news lowers people’s trust in the media; only by finding a way to fight it can the media hope to win back that trust.

Although there is no universal remedy for eradicating the spread of fake news once and for all, semantic technology and Ontotext support the fight against its proliferation in an increasingly divided society by fostering a culture of looking critically at every news report and sensational social media post. Helping people stay informed and sharpen their critical judgment is the beginning of the pursuit of truth in our post-truth world.



A bright lady with a PhD in Computer Science, Milena’s path started in the role of a developer, passed through project management, and quickly led her to product management. For her, a constant source of miracles is how technology supports and alters our behavior, engagement, and social connections.
