
Open Data Sources for Empowering Smart Analytics

June 16, 2016 6 mins. read Milena Yankova

ontotext, open data, open data analytics

Finding Open Data sources is a walk in the park: a simple search leads to hundreds of pages of datasets. Governments, NGOs and other organizations keep aggregating and publishing Open Data, and more and more businesses and developers use data analytics to gain insights, predict trends and make data-driven decisions.

Why Open Data?

In recent years, Open Data has opened the doors to easier and more efficient ways of finding and analyzing huge datasets. Yet the true value of Open Data sources lies in finding ways and using tools to explore, analyze and reuse all that content, minimizing effort and maximizing returns.

Once an organization has decided which open datasets it will use, Semantic Technology and more specifically semantic graph databases can help it enrich and classify entities with Linked Open Data, identify relationships between concepts, and disambiguate one concept from another. Linking Open Data is increasingly turning into a means for organizations to stay ahead of competitors.
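To make the idea of linking and disambiguation more concrete, here is a minimal sketch in plain Python. It is not how a semantic graph database works internally. It only illustrates the principle that relationships between entities can help tell apart two concepts sharing the same name. All identifiers with the "ex:" prefix, the tiny dataset and the function names are hypothetical and invented for this illustration.

```python
# Toy graph of (subject, predicate, object) triples. All "ex:" identifiers
# are hypothetical stand-ins for Linked Open Data URIs.
TRIPLES = {
    ("ex:Paris_France", "rdfs:label", "Paris"),
    ("ex:Paris_France", "rdf:type", "ex:City"),
    ("ex:Paris_France", "ex:country", "ex:France"),
    ("ex:Paris_Texas", "rdfs:label", "Paris"),
    ("ex:Paris_Texas", "rdf:type", "ex:City"),
    ("ex:Paris_Texas", "ex:country", "ex:United_States"),
    ("ex:France", "rdfs:label", "France"),
    ("ex:United_States", "rdfs:label", "United States"),
}


def candidates(label):
    """Return every entity whose label matches the mention, sorted by name."""
    return sorted(s for s, p, o in TRIPLES if p == "rdfs:label" and o == label)


def neighbours(entity):
    """Return the objects directly linked from `entity`, ignoring labels."""
    return {o for s, p, o in TRIPLES if s == entity and p != "rdfs:label"}


def disambiguate(label, context_labels):
    """Pick the candidate that shares the most links with the context mentions.

    Returns None when there is no candidate or no candidate is linked
    to anything mentioned in the context.
    """
    context_entities = {e for lbl in context_labels for e in candidates(lbl)}
    scored = [(len(neighbours(c) & context_entities), c) for c in candidates(label)]
    if not scored:
        return None
    best_score, best = max(scored)
    return best if best_score > 0 else None
```

In this sketch, calling disambiguate("Paris", ["France"]) resolves the mention to the entity linked to France, while an empty context leaves the mention unresolved. A real system would rely on far richer signals than a single shared link.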

Government Open Data Sources

Still, the first step in analyzing Open Data is to use the most reliable dataset sources available.

Our search for Open Data sources begins with government data.

Data.gov

The US Government’s Open Data portal Data.gov has 195,000-plus federal and local datasets on topics ranging from agriculture and education to finance and climate.
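For developers who prefer to search the catalog programmatically, the sketch below shows one possible approach. It assumes that the Data.gov catalog exposes the standard CKAN Action API at catalog.data.gov and that responses follow the usual CKAN package_search shape; both should be checked against the current Data.gov documentation. The function names and the sample response are made up for illustration, and the sample is not real Data.gov output.

```python
import json
from urllib.parse import urlencode

# Assumption: the Data.gov catalog serves the standard CKAN Action API here.
CKAN_SEARCH_URL = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(query, rows=5):
    """Return a CKAN package_search URL for a free-text query."""
    return CKAN_SEARCH_URL + "?" + urlencode({"q": query, "rows": rows})


def summarize_results(payload):
    """Extract (total_count, [dataset titles]) from a package_search response."""
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    result = payload["result"]
    # Fall back to the machine name when a dataset has no human-readable title.
    titles = [pkg.get("title") or pkg.get("name", "") for pkg in result["results"]]
    return result["count"], titles


# Hand-written sample in the CKAN response shape; NOT real Data.gov output.
SAMPLE_RESPONSE = json.loads("""
{"success": true,
 "result": {"count": 2,
            "results": [
              {"name": "example-climate-dataset", "title": "Example Climate Dataset"},
              {"name": "example-farm-dataset", "title": "Example Farm Dataset"}
            ]}}
""")
```

Fetching the URL returned by build_search_url("climate") with any HTTP client and passing the decoded JSON to summarize_results would give the total number of matches and the titles of the first few datasets.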

However, datasets sometimes need more visuals and a more user-friendly experience to become easier to search and more time- and cost-efficient to use.

Data USA

In early April 2016 the MIT Media Lab, in cooperation with Deloitte and Datawheel, launched the Data USA website, a “visualization engine of public US Government data that tells stories about America”, as the developers put it. Users can search within any of four categories: locations, industries, occupations or education.

For example, if you type Boston or Pittsburgh in the search field, the website shows an aerial photo of the city with its population, median household income and median age as main statistics. Links below lead to five other categories for each city: economy, demographics, education, housing & living, and health & safety.

The website also features profiles on cross-topics such as ‘Most Common Universities for Computer Science’ or ‘Gender Pay Gap in Connecticut’.

data.gov.uk

In Europe, we find the UK government’s data.gov.uk website, which has aggregated Open Data on topics such as environment, society, towns, and business & economy from government bodies and agencies.

European Data Portal

Europe-wide, we have the European Data Portal developed by the European Commission. This source has nearly 430,000 datasets tagged under various categories, including environment, economy & finance, education, culture & sport.

Global Organizations

On a supra-governmental and supra-continental level, the World Health Organization and UNICEF provide Open Datasets with statistics on hunger, diseases, deaths, and children’s and women’s health. The World Bank provides free and open access to data about development in countries around the globe, featuring economy & growth, health, education, environment and climate change. So does the OECD data portal.

If you are not sure where to begin, OpenDataSoft has compiled a list of more than 2,500 Open Data source portals by country.

Googling Open Data Sources

Still, browsing all these portals one at a time is sometimes tedious and always time- and resource-consuming. Google Public Data – though not as comprehensive as the separate statistics websites – has aggregated some of the most popular and reliable official sources and key economic and health indicators across the world.

For example, a sample search for wages in the US in the ‘Metrics’ menu shows the ‘Compensation of Employees’ report by the U.S. Bureau of Economic Analysis, with charts and comparisons by region or by industry.

Google Trends shows search habits, traffic and search interest over time, with historical data dating back to 2004. It also contains infographics on the search interest in global trending topics such as the US Presidential Elections, the Panama Papers, the Brussels attacks or the Zika virus.

Crowdsourcing for Open Data

Users and developers are not only browsing for data; they are also actively contributing to open databases, with DBpedia and GeoNames being the most notable examples.
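Crowdsourced Linked Open Data such as DBpedia can be queried with SPARQL. The sketch below is one hedged way to prepare such a request and read the answer using only the Python standard library. It assumes DBpedia's public endpoint at https://dbpedia.org/sparql and the W3C SPARQL JSON results format. The query, the function names and the sample response are illustrative, and the sample is hand-written rather than real DBpedia output.

```python
from urllib.parse import urlencode

# Assumption: DBpedia's public SPARQL endpoint is available at this address.
DBPEDIA_SPARQL = "https://dbpedia.org/sparql"

# Illustrative query: the English abstract of a single DBpedia resource.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?abstract WHERE {
  <http://dbpedia.org/resource/Boston> dbo:abstract ?abstract .
  FILTER (lang(?abstract) = "en")
}
"""


def build_request(query, endpoint=DBPEDIA_SPARQL):
    """Return (url, headers) for a SPARQL query sent as an HTTP GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format via content negotiation.
    headers = {"Accept": "application/sparql-results+json"}
    return url, headers


def bindings_to_rows(results):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    variables = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        # A variable may be unbound in a given solution, so skip missing ones.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows


# Hand-written sample in the SPARQL JSON results shape; NOT real DBpedia output.
SAMPLE_RESULTS = {
    "head": {"vars": ["abstract"]},
    "results": {"bindings": [
        {"abstract": {"type": "literal", "xml:lang": "en",
                      "value": "Example abstract text."}}
    ]},
}
```

Sending the URL and headers from build_request(QUERY) with any HTTP client and passing the decoded JSON to bindings_to_rows yields one dictionary per result row, which is easy to load into an analytics tool.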

Corporate Data as Open Data Sources

Businesses and organizations may be increasingly using Open Data sources to support decisions, but they are reluctant to publish their proprietary data except for statutory filings. If you want some basic company info, browsing the websites of the US Securities and Exchange Commission (SEC) or the UK’s Financial Conduct Authority (FCA), to name just these two, is a dull and often unproductive task.

OpenCorporates, a company based at the Open Data Institute (ODI), contains basic data on almost 100,000,000 companies around the world. OpenCorporates has also designed visuals using several sources: filings to the SEC, banking data held by the National Information Center of the Federal Reserve System in the US, and information about individual shareholders published by the official New Zealand corporate registry.

The visualizations show the thousands of subsidiaries, across all countries, of BP, Bank of America, Citigroup, Goldman Sachs, Morgan Stanley, JP Morgan and Wells Fargo.

Another organization, Berlin-based OpenOil, has collected more than 1 million corporate filings related to the oil, gas and mining industries. It has indexed the full text of contracts, company disclosures, news articles and government reports, which allows users to simultaneously check documents from different sources.

Gaining Insights from Open Data

The number of open datasets is only set to grow, and so is the need for organizations to have tools to rapidly analyze data in order to gain the upper hand in a fiercely competitive environment. Linked Open Data and Semantic Technology help organizations boost data analytics by building ranking reports, viewing topics linked implicitly, drawing trend lines, and extending analytics with additional data sources.

Apart from generating economic and social value, Open Data creates new business models and opportunities. More and more organizations are and will be embracing smart analytics to create additional value for their stakeholders, users and customers.

 


A bright lady with a PhD in Computer Science, Milena's path started in the role of a developer, passed through project and quickly led her to product management. For her a constant source of miracles is how technology supports and alters our behaviour, engagement and social connections.
