Learn about the potential semantic data integration carries for piecing massive amounts of data together.
On the Semantic Web, the question “What is something?” is a matter of ontologies rather than of philosophical debate on the essence of being.
An ontology is what computer science uses to tackle the messy matter of meaning: to define what something “is” for a computer program, information technology resorts to ontologies.
One of the most widely used definitions of an ontology reads:
An ontology is a formal, explicit specification of a shared conceptualization.
Cit. What is an Ontology
In this definition, “formal” means machine-readable, “shared” means agreed upon by a group, and “conceptualization” is an abstract model describing a particular field of knowledge.
An ontology’s main purpose is to capture the knowledge about a domain. Generally, it does that by providing sets of machine-readable statements. It also contains links, descriptions and classifications of terms, as well as the explicitly defined relationships between them.
Ontologies can be defined in any formal knowledge representation (KR) language. Historically, they have been defined in all sorts of languages, some of which only veterans remember today: KIF, Conceptual Graphs, OIL (a predecessor of OWL), etc.
In recent years, there has been an uptake of ontologies that express their statements as triples and rely on ontology languages such as RDFS or OWL.
Today’s ontologies conceptualize the world by defining classes (unary logical predicates, i.e., predicates with a single argument) and relationships (binary predicates). This approach is known as object-oriented modeling (the OO model).
It is very close to the Unified Modeling Language (UML), the general-purpose modeling language that is today’s standard for OO modeling. At this level, the only major difference is that RDFS/OWL allows you to define hierarchies of relationships (rdfs:subPropertyOf), while UML does not.
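To make classes, relationships and hierarchies of relationships more concrete, here is a minimal sketch in plain Python. It is not how a triplestore or an RDFS reasoner is implemented: the "ex:" names, the sample facts and the infer function are all invented for illustration, and only two RDFS-style rules are applied (instances inherit superclasses, and statements made with a subproperty also hold for its superproperty).

```python
# Hypothetical mini-vocabulary; every "ex:" name is invented for illustration.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
SUBPROPERTY_OF = "rdfs:subPropertyOf"

# Each statement is a (subject, predicate, object) triple.
triples = {
    ("ex:Woman", SUBCLASS_OF, "ex:Person"),            # class hierarchy
    ("ex:hasMother", SUBPROPERTY_OF, "ex:hasParent"),  # relationship hierarchy
    ("ex:Helen", RDF_TYPE, "ex:Woman"),                # class membership
    ("ex:Helen", "ex:hasMother", "ex:Leda"),           # relationship between two things
}


def infer(triples):
    """Apply two RDFS-style rules repeatedly until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # An instance of a subclass is also an instance of the superclass.
                if p == RDF_TYPE and p2 == SUBCLASS_OF and o == s2:
                    new.add((s, RDF_TYPE, o2))
                # A statement made with a subproperty also holds for the superproperty.
                if p2 == SUBPROPERTY_OF and p == s2:
                    new.add((s, o2, o))
        if new <= inferred:
            return inferred
        inferred |= new


result = infer(triples)
```

After running it, result contains the four stated triples plus two derived ones: that ex:Helen is an ex:Person, and that ex:Helen ex:hasParent ex:Leda. The second derivation is the kind that a hierarchy of relationships makes possible.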
Formal knowledge representation (KR) is about building models of the world, of a particular domain or a problem, which allow for automatic reasoning and interpretation. Such formal models are called ontologies and can be used to provide formal semantics (i.e., machine-interpretable meaning) to any sort of information: databases, catalogs, documents, web pages, etc. The association of information with such formal models makes the information much more amenable to machine processing and interpretation.
Cit. Atanas Kiryakov. In Semantic Web Technologies: Trends and Research in Ontology-based Systems; John Davies (Editor), Rudi Studer (Co-Editor), Paul Warren (Co-Editor). pp. 115-138 John Wiley & Sons, Europe.
A program that wants to compare or combine information across the two databases has to know that these two terms are being used to mean the same thing. Ideally, the program must have a way to discover such common meanings for whatever databases it encounters.
Cit. Scientific American: Feature Article: The Semantic Web: May 2001
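The situation described in the quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the database names (db_a, db_b), their field names, the "ex:" terms and the sample record are all hypothetical, and a real system would discover or declare such mappings through an ontology rather than a hard-coded dictionary.

```python
# Hypothetical mapping from each database's local field name to a shared ontology term.
SHARED_TERMS = {
    ("db_a", "zip_code"): "ex:postalCode",
    ("db_b", "postcode"): "ex:postalCode",
    ("db_a", "full_name"): "ex:name",
    ("db_b", "customer"): "ex:name",
}


def to_shared(source, record):
    """Rename a record's fields to shared terms; unmapped fields keep their local name."""
    return {
        SHARED_TERMS.get((source, field), field): value
        for field, value in record.items()
    }


# The same (made-up) customer as stored by two different databases.
record_a = to_shared("db_a", {"full_name": "Ada Lovelace", "zip_code": "90210"})
record_b = to_shared("db_b", {"customer": "Ada Lovelace", "postcode": "90210"})

# Once both records use the shared terms, they can be compared directly.
same_entity = record_a == record_b
```

Here zip_code and postcode are different words for the same thing. Once both are mapped to the shared term ex:postalCode, the two records become directly comparable, which is the "common meaning" the program needs to find.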
Messy and unresolvable by design as the concepts of understanding and meaning are, they become a bit clearer in the context of machine-readable content.
To handle information on our behalf, algorithms need formal definitions and logical descriptions. So computer programs use ontologies to understand or, more precisely, to compute meaning.
Unlike humans, computer programs need formally represented, explicitly described and categorized knowledge to be able to process the information in it.
If you read the sentence “Did Helena love Paris?”, you are likely to associate it with the Trojan war. But if a computer program reads it, it needs additional information to compute what this statement is about. To “figure out” what is what, it needs an ontology. It also has to refer to interlinked meanings to analyze the terms and the relationships between them.
In other words, an ontology is what computer programs use to “look up” terms, to see their connections with other terms, their place within a bigger context and, ultimately, find their meaning. One such example is how triplestores use ontologies as database schemata.
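The “Did Helena love Paris?” example can be sketched as a lookup. The fragment below is a deliberately simplified illustration: all identifiers (the "ex:" entities, the LABELS table, the resolve function) are hypothetical, and it treats the types a relationship expects as a filter on candidate readings, which is a simplification of how RDFS domain and range declarations are formally used.

```python
# Hypothetical ontology fragment: what each candidate entity is.
ENTITY_TYPES = {
    "ex:Helen_of_Troy": "ex:Person",
    "ex:Paris_of_Troy": "ex:Person",
    "ex:Paris_France": "ex:City",
}

# Surface words and the entities they might refer to.
LABELS = {
    "Helena": ["ex:Helen_of_Troy"],
    "Paris": ["ex:Paris_of_Troy", "ex:Paris_France"],
}

# Assumption for this sketch: "ex:loves" is declared to hold between two persons.
PROPERTY_SIGNATURES = {"ex:loves": ("ex:Person", "ex:Person")}


def resolve(subject_label, prop, object_label):
    """Return the (subject, object) pairs whose types fit what the property expects."""
    domain, range_ = PROPERTY_SIGNATURES[prop]
    return [
        (s, o)
        for s in LABELS[subject_label]
        for o in LABELS[object_label]
        if ENTITY_TYPES[s] == domain and ENTITY_TYPES[o] == range_
    ]


candidates = resolve("Helena", "ex:loves", "Paris")
```

With these made-up declarations, the only reading that survives is the one in which “Paris” is the person from the Trojan war rather than the city, because the relationship is declared to connect two persons.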
The whole point of ontologies is that they make the knowledge from a particular field of expertise explicit, shareable and reusable by a machine algorithm. We need to make meaning explicit for machines because they have to “know” what is what before they can serve us relevant information when we ask for it, or complete a task we give them, such as booking a flight or sifting through thousands of documents for a relationship between two terms.
In Ontologies for Knowledge Management, Andreas Abecker and Ludger van Elst outline three general uses of ontologies: supporting knowledge visualization, supporting knowledge search, retrieval, and personalization, and serving as the basis for information gathering and integration.
With ontologies, computer programs are given sets of machine-readable statements.
[These statements] facilitate communication and sharing of information between different systems.
As a result, they help machines help us with:
Just as our brain connects ideas, thoughts and events into a meaningful web of concepts, so can we build ontologies to perform the same task.
Cit. Finding the Concept, Not Just the Word: A librarian’s guide to ontologies and semantics, p 12
To use, reuse and share the information in various fields, domain experts, in collaboration with computer scientists, define and organize the terms and data of a particular domain, and the relationships between them, into an ontology.
Various organizations and groups of individuals are dedicated to building ontologies, both general-purpose ones with common terms (for example, DC, FOAF and GoodRelations) and domain-specific ones such as FIBO (the financial sector), SNOMED (medicine) and MarineTLO (the marine domain).
Ontologies, together with other semantic technologies, open up opportunities for a new class of tools to power two of the most important activities in our information age:
and
By adding a layer of semantics (i.e., of meaningful relationships) to existing data, ontologies significantly enrich information, making it interoperable, reusable and shareable. In an enterprise context, ontologies bring the value of smooth semantic data integration and improved management, discovery and preservation of domain knowledge.
With the help of ontologies, software agents compare or combine information across different domains of knowledge and expertise and find meaningful connections everywhere. In this way, the opportunities for us to “live, work and learn together” increase exponentially, together with our attempts to tackle the messy matter of meaning.