SIRMA AI LTD. (“СИРМА АИ” ЕООД)
38A,
TEL: (359 2) 9810018, 9812338, FAX: (359 2) 9819058
http://www.OntoText.com, info@OntoText.com
BOR
a Pragmatic DAML+OIL Reasoner
System Documentation
Overview, Installation, Integration
21.11.2002
Stanislav Jordanov, Kiril Simov
BOR, the subject of this document, was developed under the On-To-Knowledge project
(IST-1999-10132). This document contains parts of deliverables D38, D40, and D41,
where BOR was first announced.

Abstract
BOR is a reasoner that complies with the DAML+OIL model-theoretic semantics. Most of the classic reasoning tasks for description logics are available, including realization and retrieval. A few innovative services, such as model checking and minimal ontology extraction, are also implemented. The full set of functional interfaces allows a high level of management and querying of DAML+OIL ontologies stored in a Sesame repository.
BOR integrates smoothly with Ontology Middleware and the Sesame RDF(S) repository, thus providing easy and intuitive (non-DL) interfaces and rich software infrastructure.
Availability
BOR is available free of charge for
non-commercial purposes. For commercial use, please contact info@ontotext.com. BOR
is publicly available for demonstration at the Onto
Contents
2 BOR’s Integration into Sesame – SeBOR
4 Limitations and Known Issues
4.1 Inconsistency: Literal sameClassAs Literal in DAML+OIL schema
4.2 Why we need separate properties for definitions and qualifications?
6 How Much of DAML+OIL is Supported
6.1.1 ‘Well formed’ definitions
6.1.2 Supported restriction types
6.2.1 Abstract and concrete roles
6.2.2 Concrete roles specifics
8.2 SeBOR (BOR integrated with Sesame)
BOR is a Description Logic (DL) reasoner
implemented in Java. It is available both as a standalone application/module
and as a Sesame plug-in.
Further in this document the term `BOR' will refer to the core reasoner module and `SeBOR' (a shortcut for `Sesame-BOR') will refer to the Sesame-integrated-BOR.
There are at least two possible ways to introduce BOR, both of which are views on the same matter: the Description Logic approach and the DAML+OIL approach. This document does not strictly follow either of them. The author hopes the resulting mixture is still useful and does not confuse readers who are unfamiliar with one of the two approaches.
It is worth mentioning that BOR and its structures are closer to the DL approach, while its descendant, SeBOR, is naturally closer to the DAML+OIL approach.
Using BOR with Sesame is as simple as setting up
a new repository with the BorSail included in its sail stack. Because SeBOR operates
on a subset of the DAML+OIL language, a BOR-ed repository is not a normal RDF(S)
repository: you still have full RDF(S) read access to it, but you cannot add
arbitrary RDF(S) statements.
Currently SeBOR lacks any sophisticated GUI for ontology
development – the user should “manually” prepare the ontology and/or instance
data in either DAML+OIL or KRSS format and “upload” it to SeBOR for further
processing.
This further processing is actually “pre-reasoning” – when
new knowledge is uploaded into a BOR-ed repository, SeBOR (re)calculates the
taxonomy[1]
and classifies all the instance resources with respect to the DAML+OIL
semantics. The result is asserted as subClassOf
and type
statements in the repository, for the benefit of “the plain RDF(S) world” (i.e.
the rest of Sesame). More details about the integration are provided in this
section.
From a description logic point of view, RDF(S) has rather
weak means of expressing relationships between classes. Namely, RDF(S) uses
simple named classes (no class expressions), and the only way to relate them is
through the subClassOf property. While the subClassOf
relationship tends to be sufficient to describe an ontology, it is the
lack of class expressions that makes ontology development in a plain RDF(S)
environment such a hard task.
That was the main purpose of BOR’s integration into Sesame:
to ease the process of ontology development and to make the resulting taxonomy and
classification available to Sesame users at all levels.
SeBOR uses the same repository for both the explicit
‘definition’ statements provided by the user(s) and the ‘qualification’ statements
it infers itself. If, for instance, BOR infers while building the taxonomy that
C is a direct sub-class of D, then the statement <C, subClassOf, D>
will be added to the same repository where the
‘original’ statements of the ontology reside. Although this mixture is useful
from Sesame's point of view, it requires an efficient means of
distinguishing statements of these two disjoint types.
For that purpose a new set of properties is introduced with
the schema:
http://www.ontotext.com/otk/2002/07/bor
Properties defined in this schema are said to be
BOR-specific or special properties. Statements utilizing a special property are
said to be special statements. These special predicates are exclusively used in
pre-reasoned statements. BOR treats every non-special explicit statement in the
repository as a definition.
When BOR builds the taxonomy, the immediate super classes of a class are stated using the
bor:subClassOf property. Because this property is defined as a sub-property of rdfs:subClassOf,
the RDFS inference engine, following the model-theoretic semantics, adds the
appropriate implicit statements: <A, rdfs:subClassOf, B> wherever a statement
<A, bor:subClassOf, B> is available, plus the transitive
closure of the rdfs:subClassOf relation. The described approach has the following advantages:
· Clear separation between definitions and qualifications;
· Easy extraction of immediate sub/super-classes of a given class;
· Easy to clear qualification subpart of the repository;
· The structure of the taxonomy is explicitly available to other (only RDF(S) but not DAML+OIL aware) Sesame layers.
Analogously, when the result from the instance
classification is stored in the repository the special property bor:type (a sub-property
of rdf:type)
is used to state the most specific types of an instance.
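The separation described above can be sketched in a few lines. This is only an illustration over an in-memory set of triples, not SeBOR's implementation: the function names and the string-based triple store are assumptions, and the closure loop merely imitates what an RDFS inference engine would do with the sub-property and transitivity rules.

```python
BOR_SUBCLASS = "bor:subClassOf"      # special property written by the reasoner
RDFS_SUBCLASS = "rdfs:subClassOf"    # its super-property in RDFS

def rdfs_subclass_closure(triples):
    """Implicit rdfs:subClassOf statements an RDFS engine would derive."""
    # Sub-property rule: every bor:subClassOf edge is also an rdfs:subClassOf edge.
    edges = {(s, o) for s, p, o in triples if p in (BOR_SUBCLASS, RDFS_SUBCLASS)}
    # Transitivity rule: keep joining edges until nothing new appears.
    while True:
        new = {(a, d) for a, b in edges for c, d in edges if b == c} - edges
        if not new:
            break
        edges |= new
    return {(s, RDFS_SUBCLASS, o) for s, o in edges}

def immediate_superclasses(triples, cls):
    """Immediate super classes are exactly the bor:subClassOf objects."""
    return sorted(o for s, p, o in triples if s == cls and p == BOR_SUBCLASS)

def clear_qualifications(triples):
    """Drop every special (bor:) statement, keeping only definitions."""
    return {t for t in triples if not t[1].startswith("bor:")}

explicit = {
    ("A", "rdf:type", "daml:Class"),   # definition supplied by the user
    ("A", BOR_SUBCLASS, "B"),          # qualifications written by the reasoner
    ("B", BOR_SUBCLASS, "C"),
}
inferred = rdfs_subclass_closure(explicit)
```

Here `inferred` contains `("A", "rdfs:subClassOf", "C")` for the plain RDF(S) layers, while the immediate super classes of A remain a single filter on `bor:subClassOf`.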
A typical usage of SeBOR would be:
1. Upload ontology and instance data through the DAML+OIL or KRSS Import interfaces;
2. SeBOR will be automatically invoked to build the taxonomy, classify instance data and store the results into the repository as already explained;
3. Then repeatedly access the repository in a read-only manner (e.g. RQL queries).
This scenario can be demonstrated as follows. Imagine we
have a DAML+OIL repository (ontology plus some instance data) as presented
below:
After the uploading of the above knowledge, BOR infers that:
· C1 is a super class of both A and B
· C2 is a sub-
· I1 is an instance of C1 and C2, i.e. I:C1 and I:C2
In the diagram, instantiation links are drawn with dashed lines and subsumption links
with solid lines. The black links are explicitly defined, while the orange ones
are those inferred by BOR.
This approach is inefficient for applications in which
update operations are frequent relative to read
operations. However, the integration of BOR with Sesame provides the
easiest-to-use DAML+OIL-aware platform compared to the “competitors”:
o The ontology editor OILEd (http://oiled.man.ac.uk/) coupled with the FACT reasoner (http://www.cs.man.ac.uk/~horrocks/FaCT/). There is no query language within this setup; the only action possible is to check the consistency of an ontology. FACT itself has no DAML+OIL import;
o TRIPLE (http://triple.semanticweb.org/), an RDF query, inference, and transformation language for the Semantic Web. It supports DAML+OIL in combination with the RACER and FACT reasoners. Again, there is no support for scenarios with intensive DAML+OIL modification;
o OpenCyc, "Taxonomic inferences provided by OpenCyc for DAML ontologies", available at http://opencyc.sourceforge.net/daml/daml-taxonomic-inferences.html. A similar approach is taken; however, OpenCyc was not yet available when this document was published.
As a typical back-end tool, SeBOR has little UI of its own. When a BOR-ed repository is used through the Sesame web interface, the menu in the upper frame is extended with a third line of actions (“BOR actions”, as seen in the screenshot below).

· DAML+OIL import – similar to Sesame’s “Add (www)” function. The user supplies the URL of a DAML+OIL document and (optionally) a base URL. SeBOR adds the classes and instance data found in the document to the repository, then recalculates the taxonomy and the classification of instances.
· KRSS Import – similar to DAML+OIL import, except that the file is in KRSS format and is uploaded over HTTP from the client to the server.
· TBox reload – forces SeBOR to reload its TBox (i.e. ontology) from the repository. Additionally, a KRSS dump of the ontology is printed.
· Subonto – extracts (in the form of XML-encoded DAML) the part of the ontology that is relevant to the instance data (i.e. “strips” unused classes).
· Invoke … – enables direct invocation of some of BorSail’s methods. This function is present for debugging purposes.
A BOR-ed repository named “SeBor” is available for
experiments at the OntoText public OMM Server at http://omm.ontotext.com
The “SeBor (Plain)” repository gives a plain RDF(S) view to
the same underlying DB repository.
As expected, a number of issues proved problematic
with respect to DAML+OIL and its schema. Some of them (such as the first one below)
appear to be general problems rather than implementation issues specific to
BOR.
The problem lies in the so-called "Importing terms
from RDF/RDFS" part of the DAML+OIL schema, and it appears even
without the MTS (Model-Theoretic Semantics) of RDFS. The "Importing
terms" definitions, as they currently stand, are wrong, and it seems nobody has
ever imported anything automatically using them.
Here is the problem:
1. It is defined in the schema that
<daml:Class, subClassOf, rdfs:Class>
Simplifying a bit, this is because the RDF(S) notion of class is much weaker: it
includes all sorts of classes, in contrast to the DAML+OIL one, which covers just
"object classes", without datatype classes (such as Literal) and relation
classes (such as Property).
2. What is said in the "Importing terms" part,
<rdfs:Class rdf:ID="Literal">
<sameClassAs rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
</rdfs:Class>
is intuitively OK, and can be distilled into triples as follows:
<daml:Literal, type, rdfs:Class>
<daml:Literal, sameClassAs, rdfs:Literal>
3. However, the domain and range of daml:sameClassAs according to the schema are
<sameClassAs, domain, #Class>
<sameClassAs, range, #Class>
and, as part of the DAML+OIL schema, #Class means daml:Class.
4. From 3, both the MT semantics of RDF(S) and (all sorts of) DAML+OIL semantics support the inference that:
<X, sameClassAs, Y> => <X, type, daml:Class> and <Y, type, daml:Class>
If 3 and 4 correctly represent the intended semantics of sameClassAs,
i.e. something that declares two DAML+OIL "object classes" to be
equivalent, then no one should use it for rdfs:Class-es
without realizing that it implicitly declares them to be daml:Class-es as well.
This is the case with Literal, as follows.
5. From 2 and 4 it follows that:
<rdfs:Literal, type, daml:Class>
which is confusing because it is much stronger (and incorrect) than what we
already know about rdfs:Literal, namely
<rdfs:Literal, type, rdfs:Class>
The very same problem occurs with the way rdfs:Property is imported
in the DAML+OIL schema.
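Steps 2 to 5 can be replayed mechanically. The sketch below is an illustration only: it applies the RDFS domain and range rules to the two imported triples, using a hypothetical helper `apply_domain_range` and plain strings for resources.

```python
def apply_domain_range(triples, schema):
    """Apply the RDFS domain/range rules.

    schema maps a property to a (domain class, range class) pair.
    """
    inferred = set()
    for s, p, o in triples:
        if p in schema:
            domain, range_ = schema[p]
            inferred.add((s, "rdf:type", domain))   # subject falls in the domain
            inferred.add((o, "rdf:type", range_))   # object falls in the range
    return inferred

# Step 2: the triples distilled from the "Importing terms" section.
imported = {
    ("daml:Literal", "rdf:type", "rdfs:Class"),
    ("daml:Literal", "daml:sameClassAs", "rdfs:Literal"),
}

# Step 3: domain and range as declared in the DAML+OIL schema.
declared = {"daml:sameClassAs": ("daml:Class", "daml:Class")}
# Relaxed alternative with rdfs:Class as both domain and range.
relaxed = {"daml:sameClassAs": ("rdfs:Class", "rdfs:Class")}

problematic = apply_domain_range(imported, declared)
harmless = apply_domain_range(imported, relaxed)
```

With the declared schema, `problematic` contains `("rdfs:Literal", "rdf:type", "daml:Class")`, which is the unwanted conclusion of step 5; with the relaxed schema it does not arise.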
What can be done?
o First option: the definition of sameClassAs should be relaxed with respect to the domain and range restrictions, becoming:
<sameClassAs, domain, rdfs:Class>
<sameClassAs, range, rdfs:Class>
It seems to us, however, that this is going to be
problematic with the MTS of DAML+OIL.
o Second option: if the sameClassAs domain and range restrictions cannot be relaxed, then the Importing Terms in the DAML+OIL specification have to be reformulated, especially those for the Literal and Property classes. It can be handled as follows:
<daml:Literal, type, rdfs:Class>
<daml:Literal, equivalentTo, rdfs:Literal>
Even better, we can say more precisely that Literal is a data type
class rather than a general one, replacing the first triple above
with:
<daml:Literal, type, daml:Datatype>
So, finally, we propose that the import of Literal look like:
<Datatype rdf:ID="Literal">
<equivalentTo rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
</Datatype>
For the sake of completeness, it can also be added that
<daml:Literal, subClassOf, rdfs:Literal>
This is a detailed example of the issues discussed in the Design Overview.
Let us consider the following ontology:
(def-c Parent-of-happy-child (some has-child Happy))
(def-c Parent-of-rich-child (some has-child Rich))
(def-pc Proud-parent (and Parent-of-happy-child Parent-of-rich-child))
Its RDF(S) triple representation will be:
(“Happy” rdf:type daml:Class)
(“Rich” rdf:type daml:Class)
(“Parent-of-happy-child” rdf:type daml:Restriction)
(“Parent-of-happy-child” daml:onProperty “has-child”)
(“Parent-of-happy-child” daml:hasClass “Happy”)
(“Parent-of-rich-child” rdf:type daml:Restriction)
(“Parent-of-rich-child” daml:onProperty “has-child”)
(“Parent-of-rich-child” daml:hasClass “Rich”)
(“Proud-parent” rdf:type daml:Class)
(“Proud-parent” rdfs:subClassOf “_0”)
(“_0” rdf:type daml:Class)
(“_0” daml:intersectionOf “_1”)
(“_1” rdf:type daml:List)
(“_1” daml:first “Parent-of-happy-child”)
(“_1” daml:rest “_2”)
(“_2” rdf:type daml:List)
(“_2” daml:first “Parent-of-rich-child”)
(“_2” daml:rest daml:nil)
Where _0, _1, etc. are IDs reserved for anonymous resources. Note that as a result of the
addition of these statements the repository will infer and add a number of
(implicit) statements.
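The list part of the representation above follows a regular pattern, which the following sketch reproduces. It is an illustration under assumptions: `daml_list_triples` is a hypothetical helper, and the `_N` naming of anonymous nodes simply mimics the IDs used in the listing.

```python
def daml_list_triples(members, first_id=1):
    """Encode members as a daml:first / daml:rest chain.

    Returns (head node, triples); an empty list is just daml:nil.
    """
    if not members:
        return "daml:nil", []
    triples = []
    for offset, member in enumerate(members):
        node = f"_{first_id + offset}"
        is_last = offset == len(members) - 1
        next_node = "daml:nil" if is_last else f"_{first_id + offset + 1}"
        triples += [
            (node, "rdf:type", "daml:List"),
            (node, "daml:first", member),
            (node, "daml:rest", next_node),
        ]
    return f"_{first_id}", triples

head, list_triples = daml_list_triples(
    ["Parent-of-happy-child", "Parent-of-rich-child"]
)
```

The returned `head` is the node that the `daml:intersectionOf` statement on `_0` points to.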
Now, let us suppose that at this point BOR is invoked to build
the taxonomy. As this is a rather expensive operation, the result (i.e. the
taxonomy tree) will be stored in the repository. This will add the
following RDF(S) statements to the repository:
(daml:Nothing rdfs:subClassOf “Proud-parent”)
(“Proud-parent” rdfs:subClassOf “Parent-of-happy-child”)
(“Proud-parent” rdfs:subClassOf “Parent-of-rich-child”)
(“Parent-of-happy-child” rdfs:subClassOf daml:Thing)
(“Parent-of-rich-child” rdfs:subClassOf daml:Thing)
Note that the repository will infer the transitive closure
of the subClassOf property. The resulting state of the repository will not be eligible for
loading into BOR again. Take, for example, Proud-parent. The repository
will contain at least the following statements about it:
(“Proud-parent” rdfs:subClassOf “Parent-of-happy-child”)
(“Proud-parent” rdfs:subClassOf “Parent-of-rich-child”)
(“Proud-parent” rdfs:subClassOf “_0”)
There are certain problems:
Problem 1
How can we determine which statements are definitions and which are
qualifications (i.e. pre-reasoned by BOR)?
Problem 2
Extracting the immediate sub/super classes of a given class will be
cumbersome because of the transitive closure of the subClassOf property that the
repository has introduced. (Note: this is not quite right, since a distinction on
the basis of the ‘explicit’ flag is possible.)
The proposed solution
Introduce a number of custom classes
and properties that SeBOR will use to distinguish
definitions from qualifications
(see http://www.ontotext.com/otk/2002/07/bor for the complete list).
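To make Problem 2 concrete, the sketch below recovers the immediate super classes of a class when only the transitively closed subClassOf relation is available. It is an illustration, not part of SeBOR; it assumes the relation is acyclic and irreflexive, and the function name is hypothetical.

```python
def direct_superclasses(closed_edges, cls):
    """Immediate super classes of cls, given transitively closed (sub, super) pairs."""
    supers = {sup for sub, sup in closed_edges if sub == cls}
    # A super class is immediate if no other super class of cls lies below it.
    return sorted(
        sup for sup in supers
        if not any((mid, sup) in closed_edges for mid in supers if mid != sup)
    )

closed = {
    ("Proud-parent", "Parent-of-happy-child"),
    ("Proud-parent", "Parent-of-rich-child"),
    ("Parent-of-happy-child", "daml:Thing"),
    ("Parent-of-rich-child", "daml:Thing"),
    ("Proud-parent", "daml:Thing"),          # added by the transitive closure
}
direct = direct_superclasses(closed, "Proud-parent")
```

Every lookup has to compare each candidate against all other super classes, which is the overhead that a dedicated property for immediate super classes avoids.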
Let us discuss the capabilities of BOR.
BOR has support for:
Currently BOR has no support for:
Let’s try to categorize these requirements:
(enum Color (red green blue yellow))
(subrole hasSon hasChild)
(transitive hasPredecessor)
(range dressColor Color)
There are 3 distinct concept types:
(primitive-concept Human)
(define-primitive-concept Male Human)
(define-concept Female (and Human (not Male)))
Generally the third argument of any concept definition is a concept expression.
(not Human)
(and Human Female)
(or Male Female)
(all hasChild Male)
(some hasChild Female)
(none role Concept) == (all role (not Concept))
(at-least 3 hasChild Male)
(at-least 1 hasChild Female)
(exactly n role Concept) == (and (at-least n role Concept) (at-most n role Concept))
(all age "[18,+oo)")
(some dressColor "red, yellow")
(none role Dataset) == (all role (not Dataset))
From a DL (description logic) point of view, DAML+OIL is quite an expressive language. As
described in the previous paragraph, the DL language utilized by BOR is far from the
expressiveness of DAML+OIL, but it has one major advantage: it is tractable.
This chapter describes the conditions a DAML+OIL source should satisfy in order
to be successfully processed by SeBOR.
The current version of
BOR can handle only unfoldable terminologies. This means a single definition
for each non-primitive concept, no cyclic definition dependencies, and no GCIs (general concept inclusion
axioms).
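The first two conditions can be checked with a depth-first search over the definition dependencies. The sketch below is a minimal illustration and not BOR's own check; the input format (a list of pairs of a concept name and the set of concept names its definition uses) is an assumption. GCIs cannot even be written in this format, because the left-hand side is always a name.

```python
def check_unfoldable(definitions):
    """Return a list of problems; an empty list means the terminology is unfoldable.

    definitions: list of (concept name, set of concept names used in its definition).
    """
    problems = []
    uses = {}
    for name, used in definitions:
        if name in uses:
            problems.append(f"multiple definitions for {name}")
        uses.setdefault(name, set()).update(used)

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    state = {name: WHITE for name in uses}

    def visit(name):
        state[name] = GREY
        for dep in sorted(uses[name]):
            if dep not in uses:           # primitive concept, nothing to unfold
                continue
            if state[dep] == GREY:        # dependency leads back to the current path
                problems.append(f"cyclic definition involving {dep}")
            elif state[dep] == WHITE:
                visit(dep)
        state[name] = BLACK

    for name in sorted(uses):
        if state[name] == WHITE:
            visit(name)
    return problems
```

For example, `check_unfoldable([("A", {"B"}), ("B", {"A"})])` reports a cycle, while a terminology whose definitions only refer to primitive or previously defined concepts yields no problems.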
Due to this
restriction, class definitions should be ‘well formed’. Let’s formalize this:
A definition of a
DAML+OIL class is any statement whose subject is the class resource and whose
predicate is one of the properties:
A well formed class
resource is one that either has a single definition, or has multiple
definitions each of which is a subClassOf statement.
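The well-formedness condition translates directly into a check over the definition statements of each class. The sketch below is illustrative only: the set `DEFINITION_PREDICATES` is an assumed stand-in for the list of defining properties and should be replaced with the actual list, and the function name is hypothetical.

```python
from collections import defaultdict

# ASSUMPTION: a plausible set of defining predicates, used for illustration only.
DEFINITION_PREDICATES = {
    "rdfs:subClassOf", "daml:sameClassAs", "daml:intersectionOf",
    "daml:unionOf", "daml:complementOf",
}

def ill_formed_classes(triples, definition_predicates=DEFINITION_PREDICATES):
    """Classes with several definitions, not all of which are subClassOf statements."""
    definitions = defaultdict(list)
    for s, p, o in triples:
        if p in definition_predicates:
            definitions[s].append(p)
    return sorted(
        cls for cls, preds in definitions.items()
        if len(preds) > 1 and any(p != "rdfs:subClassOf" for p in preds)
    )

sample = [
    ("A", "rdfs:subClassOf", "B"),
    ("A", "rdfs:subClassOf", "C"),    # several subClassOf statements: well formed
    ("D", "daml:unionOf", "_1"),      # a single definition: well formed
    ("E", "daml:unionOf", "_2"),
    ("E", "rdfs:subClassOf", "B"),    # mixed definitions: not well formed
]
offenders = ill_formed_classes(sample)
```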
Note that a well
formed DAML+OIL class definition is trivially converted into an equivalent
concept definition axiom that BOR can handle. The only exception is the
disjointUnionOf statement, which in its general form cannot be converted into an
equivalent concept definition axiom. Consider a class A defined as a disjoint
union of classes B1, B2 … Bn. This is equivalent to stating that A is a union
of B1, B2 … Bn plus stating that B1, B2 … Bn are pairwise disjoint. While
the former can be expressed in BOR’s terms (i.e. as a well formed class
definition), the latter generally cannot.
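The decomposition of a disjoint union can be written out as follows. This is a sketch with a hypothetical helper and tuple-based expressions; it only separates the two parts and does not claim that the disjointness part can be loaded into BOR.

```python
from itertools import combinations

def split_disjoint_union(name, parts):
    """Split 'name is the disjoint union of parts' into its two components."""
    # Expressible part: a concept definition with a union on the right-hand side.
    union_axiom = ("define-concept", name, ("or", *parts))
    # Remaining part: one disjointness requirement per unordered pair.
    disjoint_pairs = list(combinations(parts, 2))
    return union_axiom, disjoint_pairs

axiom, pairs = split_disjoint_union("A", ["B1", "B2", "B3"])
```

For n classes there are n(n-1)/2 pairwise disjointness requirements.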
Note that the
following statements are not handled by BOR:
Restrictions
utilizing the ‘hasValue’ predicate are not supported as they require nominal
classes, which are not currently supported by BOR. Analogously to well formed
class definitions, each restriction is required to have exactly one of the
following properties defined:
· toClass
· hasClass
· minCardinality
· maxCardinality
· cardinality
· minCardinalityQ
· maxCardinalityQ
· cardinalityQ
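A restriction node can be validated against this rule before it is handed to the reasoner. The sketch below is an illustration with a hypothetical function name; it receives the predicates asserted on one daml:Restriction resource and assumes they are written with the `daml:` prefix.

```python
SUPPORTED_KINDS = {
    "daml:toClass", "daml:hasClass",
    "daml:minCardinality", "daml:maxCardinality", "daml:cardinality",
    "daml:minCardinalityQ", "daml:maxCardinalityQ", "daml:cardinalityQ",
}

def check_restriction(predicates):
    """Classify one restriction by the predicates asserted on it."""
    if "daml:hasValue" in predicates:
        return "unsupported: daml:hasValue"       # would require nominal classes
    kinds = [p for p in predicates if p in SUPPORTED_KINDS]
    if len(kinds) != 1:
        return f"expected exactly one restriction kind, found {len(kinds)}"
    return "ok"
```

For instance, `check_restriction(["daml:onProperty", "daml:hasClass"])` returns `"ok"`, while a node carrying both `daml:toClass` and `daml:minCardinality` is rejected.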
BOR is capable of
handling role hierarchies, transitive roles and supports both abstract and
concrete (data type) roles. It is not suited to handle inverse roles and
functional roles (attributes).
Thus the following
properties and classes are not handled by BOR:
As already
mentioned, BOR supports reasoning with concrete data types and roles.
Let’s take a short
DL walk.
In DL, data types are
considered something external to the ontology: they are not concepts. Data
type elements are not individuals, and they are not subject to interpretation by
a model. The only concept expressions that include concrete roles, types and
sets are the universal and existential concrete restrictions. For example (in KRSS
format):
(all likesColor “white, green, red”) – the class of all individuals
whose favorite color(s) are among the three listed colors. In this case the data
type is most probably an enumeration type of all color names. “white, green,
red” is the data set transcript.
(some hasAge “[18,+oo)”) –
the class of all individuals whose age is no less than 18. The data type is
some integer-derived type, probably the subtype of non-negative integers.
As far as BOR is concerned,
each concrete role must have a unique data type range. This way,
when processing a concrete restriction, BOR additionally constrains the data
type with the data set transcript to obtain the data set. However, the internal
representation of data types, data sets and data properties that BOR maintains
differs depending on the data type. The current version of BOR has built-in support
for 3 base data types:
Each concrete data
type is derived from a base data type. For example, you may define the data type
PositiveInteger by further restricting the Integer base type with the “(0,+oo)”
transcript.
In order to define
an enumeration type, its member list must be supplied as a transcript string.
String types cannot
be restricted by a transcript: any string belongs to them. An interesting idea
is to derive more specific string types using regular expressions.
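The two kinds of transcript shown in this section can be interpreted as follows. This is a sketch, not BOR's parser: the function names are hypothetical, and the handling of "-oo" and of closed upper bounds is an assumption extrapolated from the "[18,+oo)" and "(0,+oo)" examples.

```python
import math

def parse_interval(transcript):
    """Turn an integer interval transcript such as "[18,+oo)" into a membership test."""
    text = transcript.strip()
    low_closed = text[0] == "["
    high_closed = text[-1] == "]"
    low_text, high_text = (part.strip() for part in text[1:-1].split(","))
    low = -math.inf if low_text == "-oo" else int(low_text)
    high = math.inf if high_text == "+oo" else int(high_text)

    def contains(value):
        above = value >= low if low_closed else value > low
        below = value <= high if high_closed else value < high
        return above and below

    return contains

def parse_enumeration(transcript):
    """Turn an enumeration transcript such as "white, green, red" into a set of members."""
    return {member.strip() for member in transcript.split(",")}

is_adult_age = parse_interval("[18,+oo)")
is_positive = parse_interval("(0,+oo)")
liked_colors = parse_enumeration("white, green, red")
```

Here `is_adult_age(18)` holds while `is_positive(0)` does not, reflecting the closed and open lower bounds.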
When BOR serializes
a terminology into a Sesame repository, it stores for each data type a bit more
information than expected by the DAML+OIL specification: the
name of the Java class that implements the base data type functionality and the
transcript string used to restrict the base type.
A number of DAML+OIL
features (resources) have no direct analog in description logics and are not
handled by BOR.
These include:
In addition to the
standard XML and
KRSS is an acronym
of Knowledge Representation System Specification. You can find the exact
specification at http://www.bell-labs.com/user/pfps/papers/krss-spec.ps. Note that this is an abstract specification
and any particular system that utilizes it is not required to accept/recognize
all of the specified constructs.
KRSS uses Lisp-like syntax. Each top level expression is referred to as a 'statement'. The following is a formal grammar of the KRSS format recognized by BOR:
krss-file ::= <tbox-stmt>* end-tbox? <abox-stmt>* end-abox? ;;
tbox-stmt ::= (primitive-concept <concept-name>)
    | (define-primitive-concept <concept-name> <concept-expr>)
    | (define-concept <concept-name> <concept-expr>)
    | (subrole <subrole-name> <super-role-name>)
    | (transitive <role-name>)
    | (enum <enum-datatype-name> <member-name>+)
    | (range <role-name> <datatype-name>) ;;
concept-expr ::= <concept-name>
    | (and <concept-expr>+)
    | (or <concept-expr>+)
    | (not <concept-expr>)
    | (all <role-name> <concept-expr-or-dataset-transcript>)
    | (some <role-name> <concept-expr-or-dataset-transcript>)
    | (none <role-name> <concept-expr-or-dataset-transcript>)
    | (at-least <integer> <role-name> <concept-expr-or-dataset-transcript>?)
    | (at-most <integer> <role-name> <concept-expr-or-dataset-transcript>?)
    | (exactly <integer> <role-name> <concept-expr-or-dataset-transcript>?) ;;
concept-expr-or-dataset-transcript ::= <concept-expr> | <dataset-transcript> ;;
abox-stmt ::= (instance <individual-name> <concept-expr>)
    | (related <subject-name> <role-name> <object-name>)
    | (equal <individual-name> <another-individual-name>)
    | (distinct <individual-name> <another-individual-name>) ;;
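Because the syntax is Lisp-like, a small generic reader is enough to get at the structure of a KRSS file. The sketch below is an illustration and not BOR's parser: it reads statements into nested Python lists without validating them against the grammar, keeps quoted transcripts as single atoms, and then rewrites the derived forms `none` and `exactly` into the core constructors they are documented as equivalent to. All function names are hypothetical.

```python
import re

# A quoted transcript, a parenthesis, or a bare symbol.
TOKEN = re.compile(r'"[^"]*"|[()]|[^\s()"]+')

def parse_krss(text):
    """Read KRSS text into nested lists; atoms stay strings."""
    stack, current = [], []
    for token in TOKEN.findall(text):
        if token == "(":
            stack.append(current)
            current = []
        elif token == ")":
            if not stack:
                raise ValueError("unbalanced ')'")
            finished, current = current, stack.pop()
            current.append(finished)
        else:
            current.append(token)
    if stack:
        raise ValueError("unbalanced '('")
    return current

def desugar(expr):
    """Rewrite `none` and `exactly` into `all`/`not` and `at-least`/`at-most`."""
    if not isinstance(expr, list):
        return expr
    expr = [desugar(part) for part in expr]
    if expr and expr[0] == "none":
        _, role, filler = expr
        return ["all", role, ["not", filler]]
    if expr and expr[0] == "exactly":
        _, n, *rest = expr
        return ["and", ["at-least", n, *rest], ["at-most", n, *rest]]
    return expr

statements = parse_krss("(define-concept Female (and Human (not Male))) end-tbox")
```

`statements` then holds one nested list for the definition followed by the bare atom `end-tbox`.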
Here follows an
example of KRSS representation of a DAML+OIL ontology.
(enum SkinColor (white black yellow red albino))
(enum EyeColor (blue brown green hazel black albino))
(range skin-color SkinColor)
(range eye-color EyeColor)
(range age Int)
(def-c indian (some skin-color "red"))
(def-c adult (some age "[18,+oo)"))
(def-c mother-of-an-adult-blue-eyed-indian (some has-child (and adult indian (some eye-color "blue"))))
(def-c target mother-of-an-adult-indian)
end-tbox
The installation of BOR in its two
The standalone version of BOR is available at http://www.OntoText.com/bor/BOR.zip
It comprises the jar file containing the compiled
classes, some sample KRSS files and 3 bat files that ease the launch of
specific tasks.
With the standalone version of BOR you can generally
perform the following tasks:
| satisfiability test | ts krss-file |
| model checking (strict) | mc krss-file |
| model checking (non-strict) | mc -non-strict krss-file |
| taxonomy building | bt krss-file |
An online demo of BOR is
available at
http://62.213.161.156:8888/webor/
It gives a simple web interface
to BOR’s functionality (without taxonomy building). It needs no
installation, just a web browser on the client’s side.
Note: in order to set up SeBOR, an installed version of
Sesame is required.
Note: this will
replace your original webapps/sesame/toolbar.jsp file
Currently any modification to a BOR-ed repository causes the
whole taxonomy and
Further adjustments with respect to the upcoming
OWL and OWL Lite languages are already in progress.
[1]
The minimal sufficient set of direct