OWLIM Version 4.3
OWLIM 4.3 is bundled with Sesame 2.6 to deliver SPARQL 1.1 Federation. This emerging SPARQL standard provides a means to distribute queries over multiple SPARQL end-points. The big advantage of this technology is the ability to enrich query results by 'joining' to other 3rd party resources, e.g. linked data services that provide a SPARQL interface.
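To make the idea of 'joining' to a remote end-point concrete, the sketch below assembles a federated query using the SERVICE keyword defined by SPARQL 1.1 Federated Query. The end-point URL, the ex: vocabulary and the helper name are hypothetical and chosen only for illustration; nothing here is specific to OWLIM.

```python
def build_federated_query(remote_endpoint: str) -> str:
    """Return a SPARQL 1.1 query that joins local data with a remote end-point.

    The ex: vocabulary is a made-up placeholder.
    """
    return f"""PREFIX ex: <http://example.org/schema#>
SELECT ?city ?population WHERE {{
  # Matched against the local repository
  ?city a ex:City .
  # Matched by the remote end-point and joined on ?city
  SERVICE <{remote_endpoint}> {{
    ?city ex:population ?population .
  }}
}}"""


# Hypothetical end-point URL, used only to show the shape of the query.
query = build_federated_query("http://example.org/sparql")
print(query)
```

The query text would then be submitted to the repository like any other SELECT query; the SERVICE block is the only part that is evaluated remotely.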
Many of the advanced features of OWLIM are implemented as plug-ins and the API for building such extensions has been made public for this release. OWLIM plug-ins can communicate and share data via various 'hooks' into the processing pipelines. This allows custom indices to be used, and opens up many possibilities for improving performance and adding novel features.
There have been many other fixes and improvements for OWLIM, including many query optimisations. Query time-limits are now easier to specify - an important feature for protecting against over-use in public-facing services. Also, shutdown handling has been optimised to drastically reduce the time taken to shut down after a large load.
OWLIM Version 4.2
This version has been developed in parallel with the Sesame openRDF framework where Ontotext continue to invest development resources. It is bundled with Sesame 2.5 to deliver SPARQL 1.1 Update. This emerging SPARQL standard provides a much more powerful method to modify RDF databases without the requirement for developers to use frameworks and APIs. SPARQL 1.1 Query conformance has been brought up to the May 2011 working draft, i.e. all the remaining behaviour has been implemented along with all the new SPARQL filter functions.
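As a small illustration of modifying data without a framework-specific API, the sketch below builds a SPARQL 1.1 Update request of the INSERT DATA form. The helper name and the example.org IRIs are hypothetical, and the helper only handles IRI terms (no literals or escaping), so it is a sketch rather than a general-purpose serialiser.

```python
def insert_data_update(triples):
    """Build a SPARQL 1.1 Update 'INSERT DATA' request from (s, p, o) IRI triples."""
    lines = [f"  <{s}> <{p}> <{o}> ." for s, p, o in triples]
    return "INSERT DATA {\n" + "\n".join(lines) + "\n}"


# Hypothetical IRIs, used only to show the shape of the request.
update = insert_data_update([
    ("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
])
print(update)
```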
This version of OWLIM-SE is already in use as the engine behind The National Archives' Semantic Knowledge-base (SKB). This installation is used to store government datasets, the FactForge dataset and metadata totalling almost 12 billion statements with high-performance query-answering. Updating queries in this environment to use SPARQL 1.1 Query features has in some cases shown dramatic hundred-fold improvements in query performance.
Using SPARQL 1.1 features through the Sesame interface of OWLIM (instead of the Jena interface adapter) has shown a dramatic improvement in the explore-update and business-intelligence use-cases of the Berlin SPARQL Benchmark (BSBM). Updated results will be published in the very near future.
OWLIM Version 4.1
This is a maintenance release of the OWLIM family of RDF databases and addresses a number of bug fixes in both OWLIM and Sesame, together with some performance improvements over version 4.0.
This latest version is bundled with Sesame 2.4.2 which includes many fixes for SPARQL 1.1 Query support, especially those relating to aggregate functions and complex GROUP BY expressions. Some issues with property paths have also been dealt with.
OWLIM-SE and OWLIM-Enterprise have also been updated with fixes relating to query evaluation logic and improvements in query performance. Some problems when using named graphs with the Jena adapter have also been addressed.
OWLIM Version 4.0
This is a major development for the OWLIM family of RDF Databases and offers a number of fundamental changes aimed at better differentiation between the editions of the product as well as some important new features and enhancements.
With this release, Ontotext have rebranded their line of OWLIM editions and introduced a new member of the family:
- OWLIM-Lite is the new name of SwiftOWLIM
- OWLIM-SE (Standard Edition) is the new name of BigOWLIM; it is positioned as a single-server product
- OWLIM-Enterprise is an extension of OWLIM-SE, which includes the Replication Cluster capabilities. It is priced and sold as a separate product intended for scalable, high-resilience, mission-critical installations, offering extended online re-configuration and other features which improve manageability.
Scalability to 1 trillion nodes: A brand new indexing mechanism has been developed for OWLIM-SE that uses a configurable bit-size for internal identifiers of nodes in the RDF graph (URIs, blank nodes and literals). For very large datasets, the bit-size can be set to 40 bits, allowing up to 1 trillion unique entities to be stored.
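The "1 trillion" figure follows directly from the identifier width, as this short check shows. The 32-bit row is included purely as a comparison point and is an assumption; the text above only states that the bit-size is configurable and can be set to 40.

```python
def max_identifiers(bits: int) -> int:
    """Number of distinct values representable with the given number of bits."""
    return 2 ** bits


for bits in (32, 40):
    print(f"{bits}-bit identifiers: {max_identifiers(bits):,}")
```

With 40 bits there are 1,099,511,627,776 possible identifiers, i.e. just over one trillion.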
Faster SPARQL 1.1 Query support: Thanks to Ontotext's investment in development resource for the Sesame project, all OWLIM editions now boast support for this evolving SPARQL standard. While handling of SPARQL 1.1 through Jena was fast enough to allow OWLIM to get excellent scores in the latest BSBM evaluation, using SPARQL 1.1 through Sesame makes it even faster and available in all editions of OWLIM. SPARQL 1.1 Query specification offers new features such as: Aggregates, Subqueries, Negation, Expressions in the SELECT clause, Property Paths, Assignment, Short form for CONSTRUCT and an expanded set of functions and operators. Federation will be supported in a subsequent release.
Easy server deployment: every edition of OWLIM includes re-packaged Sesame Web applications for easy deployment of OWLIM servers. Now OWLIM can be deployed simply by copying a WAR file. New OWLIM repository instances can be created using the Sesame Workbench UI administration tool.
There are a number of other new features for specific editions:
- OWLIM-SE now provides a remote notification mechanism that includes transaction information.
- Performance analytics: OWLIM-SE provides information on cache utilization and behavior through a JMX interface. This feature can be useful when investigating loading behavior and query performance.
- OWLIM-Lite is now packaged as a single jar file, with a simpler free-for-use licensing scheme.
BigOWLIM Version 3.5
BigOWLIM 3.5 includes some updates to existing features, plus a range of new ones:
- All OWLIM plug-ins available with Jena interface: All the BigOWLIM advanced features are now fully supported when using BigOWLIM with the Jena framework. This includes RDF Rank, RDF Search, Node Search, RDF Priming and Geo-spatial extensions.
- Remote notifications: A new mechanism to complement the existing high-performance 'in-process' notification mechanism. This new mechanism allows clients to subscribe for the given statement patterns to remote BigOWLIM repository instances.
- Schema editing: Read-only schemas loaded at database initialization time allow very fast deletion of (instance) statements by using the 'fact-retraction' method that computes the necessary inferred statements to delete. A new mechanism is provided with this release that allows 'read-only' schema statements to be modified when necessary.
- Configuration spreadsheet tool: The memory calculator from previous versions has been updated to estimate appropriate BigOWLIM configurations for the specified hardware, dataset characteristics and selected features.
- Query optimizations: Several improvements have been made to query optimization, including the special case when using ORDER BY with LIMIT/OFFSET.
- Online documentation: As well as the PDF format user guides included in the OWLIM distribution zip files, the latest documentation for all editions of OWLIM is now available online.
- Storage files updated automatically: There are minor differences in storage file formats between versions. Versions of files back to 3.1 are now detected and updated automatically.
- owl:sameAs optimization can be disabled: The owl:sameAs optimization can now be switched off using the disable-sameAs configuration parameter. This update might be useful when using the empty or rdfs rulesets.
- Lucene-based full-text search enhancements: Even more fine-grained control over what to include in the indexed RDF molecule. Separate include/exclude lists are now supported for both predicates traversed and entities visited.
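The notes above do not say how the ORDER BY with LIMIT/OFFSET case is optimised. One common approach, sketched here as an assumption and not as a description of BigOWLIM's implementation, is to keep only the first OFFSET + LIMIT rows in a bounded heap instead of sorting the whole result set. The function and field names are hypothetical.

```python
import heapq


def order_by_limit_offset(rows, key, limit, offset=0):
    """Return the rows a full sort would place at [offset, offset + limit)."""
    # Only the smallest (offset + limit) rows can appear in the answer,
    # so there is no need to order the remainder.
    top = heapq.nsmallest(offset + limit, rows, key=key)
    return top[offset:]


rows = [{"name": n, "score": s}
        for n, s in [("a", 7), ("b", 2), ("c", 9), ("d", 4), ("e", 1)]]
page = order_by_limit_offset(rows, key=lambda r: r["score"], limit=2, offset=1)
print([r["name"] for r in page])
```

The saving grows with the size of the result set: the heap never holds more than OFFSET + LIMIT rows, however many rows match the query.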
BigOWLIM Version 3.4
BigOWLIM 3.4 includes many bug fixes, several new features and some updates:
- Jena adapter (BETA): Applications which use the Jena framework or Jena-compliant RDF stores can seamlessly switch to BigOWLIM to take advantage of efficient loading and high-performance reasoning. At the same time, Jena's ARQ engine allows BigOWLIM to handle the latest SPARQL 1.1 extensions (e.g. aggregates). The adapter is still a beta version and has not been rigorously tested for conformance yet, but can be used with Joseki to make queries and has successfully passed BSBM and LUBM benchmarks. The results suggest that for most of the scenarios and tasks BigOWLIM can deliver considerable performance improvements when used as a replacement for Jena's own native RDF backend TDB.
- Geo-spatial extensions: Applications can efficiently make queries involving constraints such as "nearby point" and "within region". Special-purpose indices allow such constraints to be evaluated very efficiently on top of large volumes of location-related data. For example, finding airports within 50 miles of London in the GeoNames dataset (92 million statements, describing more than 6 million geographic features all over the world) becomes 500 times faster when compared to the same query evaluated without the geo-spatial indices.
- OWL2-QL support: This OWL2 profile is based on DL-LiteR, a variant of DL-Lite that does not require the unique name assumption. It is designed to be amenable to implementation on relational databases, due to its suitability for re-writing queries to SQL. This release includes a rule-set for this profile in order to expand the range of standard rule-sets and to give users more flexibility when choosing a balance between complexity of inference and scalability.
- Rule engine enhancements, improving reasoning performance: The rule-engine now supports the ability to use context as part of rule premises and consequences. This allows for more efficient processing of certain RDFS/OWL constructions, particularly those rules using RDF lists. All predefined rule-sets have been upgraded to make use of this new expressiveness. As a result, there is now just a single rule-set for OWL2-RL, where in the last version there was a 'conformant' and a 'reduced' version. The new rule engine has led to an improvement in LUBM loading performance of around 22%.
- Enhanced Lucene-based full text search: More flexibility is enabled for using Lucene full-text search. Users can create multiple customized indices and can decide whether to include URIs or literals, select literals by language tags, and use custom analyzers and scorers. Any number of custom indices can be used within the same query.
- Auto-restore: A configurable policy parameter can be used to specify how the user wishes the repository to start after an abnormal termination. By default, the database restorer tool will be run automatically to return the database to the state prior to the stop event, i.e. to the state after the last committed transaction.
- Simplified 'implicit-only' statement retrieval: When using the Sesame API to return statements, the 'implicit' pseudo-graph is now used. This is simpler and more consistent with query processing than the old method of invoking RepositoryConnection.getStatements() twice.
- Documentation: The distribution package includes two new guides: Replication Cluster Quick Start Guide that has details on installing and configuring a cluster and Performance Tuning Guide that brings together all information for optimizing loading time, inference and query processing.
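To show what the "nearby point" constraint from the geo-spatial item above means, here is a brute-force version that tests every candidate with the haversine great-circle formula. This is the kind of exhaustive scan that a special-purpose index avoids; it is not how BigOWLIM evaluates the constraint. The coordinates are approximate and the function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def nearby(points, centre, max_miles):
    """Brute-force 'nearby point' filter: checks every candidate in turn."""
    return [name for name, (lat, lon) in points.items()
            if distance_miles(centre[0], centre[1], lat, lon) <= max_miles]


# Approximate coordinates (latitude, longitude) in degrees.
london = (51.5074, -0.1278)
airports = {
    "Heathrow": (51.4700, -0.4543),
    "Gatwick": (51.1537, -0.1821),
    "Paris CDG": (49.0097, 2.5479),
}
print(nearby(airports, london, 50))
```

The cost of this scan grows linearly with the number of located features, which is why an index matters at the scale of GeoNames.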
BigOWLIM Version 3.3
BigOWLIM 3.3 consolidates a number of advances and new features, some of which have been available in previous versions as prototype implementations. The most important differences, compared to the previous versions of BigOWLIM are:
- Clustering support: The BigOWLIM software suite includes an additional Replication Cluster component that serves as a Master node for a cluster. Its purpose is to manage and distribute atomic requests (query evaluations and update transactions) to a set of standard BigOWLIM instances.
- Full-text search: Two approaches are implemented in BigOWLIM, a proprietary technique that uses special system predicates and a separate implementation that uses the Lucene text search utility. Both of them enable OWLIM to perform complex queries against character data, which significantly speeds up the query process.
- High performance retraction of statements: BigOWLIM stores explicit and implicit statements (inferred from the explicit statements). When explicit statements are removed from the repository, any implicit statements that rely on the removed statement must also be removed. In this version, removal of explicit statements is achieved by invalidating only those inferred statements that can no longer be derived in any way, which massively improves statement deletion efficiency.
- Powerful and expressive consistency/integrity constraints: Consistency checking rules have two forms - with consequences (used to check that certain inferences have occurred) and without consequences (which indicate inconsistency when the premises are satisfied).
- RDF Rank: This is an algorithm that identifies the more important or more popular entities in the repository by examining their interconnectedness. The popularity of entities can then be used to order query results in a similar way to internet search engines, such as how Google orders search results using PageRank.
- RDF Priming: This is a technique that selects a subset of available statements for use as the input to query answering. It is based upon the concept of "spreading activation" as developed in cognitive science. It is a scalable and customizable implementation of the popular connectionist method on top of RDF graphs. It allows "priming" of large datasets with respect to concepts relevant to the context and to the query.
- Notification: This is a publish/subscribe mechanism for receiving events from a BigOWLIM repository whenever new triples matching a certain graph pattern are inserted. The user of the notifications API registers for notifications by providing a SPARQL query.
- Multi-threaded rule compiler: Using this compiler, the generated inference engine can exploit multi-core and multi-processor hardware to greatly improve inference speed and thus improve load times.
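The RDF Rank item above compares the feature to PageRank. The sketch below is a textbook PageRank-style power iteration over (subject, object) links, included only to illustrate how interconnectedness can be turned into a score; the actual RDF Rank algorithm is not described in these notes, and the function name, damping factor and sample graph are assumptions.

```python
def rank(edges, damping=0.85, iterations=50):
    """PageRank-style scores for the nodes of a directed graph.

    edges: iterable of (source, target) pairs, e.g. subject/object of triples.
    """
    edges = list(edges)
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    count = len(nodes)
    score = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Nodes without outgoing links spread their score evenly over all nodes.
        dangling = sum(score[n] for n in nodes if not out_links[n])
        new = {n: (1 - damping) / count + damping * dangling / count
               for n in nodes}
        for n in nodes:
            for target in out_links[n]:
                new[target] += damping * score[n] / len(out_links[n])
        score = new
    return score


# Three nodes point at "hub", which in turn points back at "a".
edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("hub", "a")]
scores = rank(edges)
print(max(scores, key=scores.get))
```

Scores computed this way could then be used as a sort key for query results, which is the use described in the item above.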
BigOWLIM Version 3.0b5
BigOWLIM 3.0b5 represents the first public release of BigOWLIM that is compliant with Sesame 2.x and uses the new Unified TRREE architecture. The most important differences, compared to the previous versions of BigOWLIM (ver. 0.9.x/2.0) are:
- efficient native support for a "rich RDF model", including named graphs and triplesets;
- SPARQL support, based on the parser of Sesame and proprietary query optimization techniques;
- smooth data loading - in the previous generation, BigOWLIM 0.9.x needed to rebuild its indices after loading some amount of data and this operation caused growing overheads for larger datasets, limiting its scalability. Such an operation is no longer required and loading remains smooth as the datasets grow;
- better scalability, through 40-bit statement identifiers.
BigOWLIM 3.0 sets the new threshold for scalable reasoning: it is the first engine that managed to demonstrate efficient reasoning against 2.7 billion statements. In the framework of the LUBM benchmark BigOWLIM managed to load the LUBM(20000) dataset. The forward-chaining reasoning over this dataset resulted in materialization of about 1.9 billion statements, thus the total amount of statements stored in the repository went up to 4.6 billion. Loading, storing, and indexing (without inference) of the data took as little as 17 hours, demonstrating a minimal slowdown compared to loading 1 billion statements (from 48 KSt./sec. for 1BSt. to 44 KSt./sec. in 2.7BSt). This result indicates excellent scalability in terms of speed. Loading, with inference, took 72 hours, delivering inference speed above 10K st./sec. - unmatched by any competitor at any comparable scale.
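The figures quoted above are mutually consistent, as a quick calculation shows. The only inputs are numbers taken from the paragraph (2.7 billion explicit statements, 44 KSt./sec., 72 hours); the function names are illustrative.

```python
STATEMENTS = 2.7e9  # explicit statements in LUBM(20000), as quoted above


def hours_to_load(statements, rate):
    """Hours needed to load `statements` at `rate` statements per second."""
    return statements / rate / 3600


def rate_per_sec(statements, hours):
    """Average statements per second when `statements` are loaded in `hours`."""
    return statements / (hours * 3600)


print(f"load without inference: {hours_to_load(STATEMENTS, 44_000):.1f} h")
print(f"rate with inference:    {rate_per_sec(STATEMENTS, 72):,.0f} st./sec.")
```

At 44 KSt./sec. the load takes about 17 hours, and 2.7 billion statements in 72 hours works out at a little over 10,000 statements per second, matching the text.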
SwiftOWLIM Version 3.5
This release includes many bug fixes and enhancements, the most significant of these are:
- Online documentation: As well as the PDF format user guides included in the OWLIM distribution zip files, the latest documentation for all editions of OWLIM is now available online.
- Artificial limit on ruleset size removed: Very large custom rule-sets were causing out of memory exceptions or problems during compilation. The size of rule-sets is now practically unlimited.
- Bug Fix - Data loss from abnormal termination: This fix prevents data from being lost after two successive abnormal terminations. This was due to a misidentification of backup files.
- Bug Fix - Synchronization problem: A synchronization problem led to intermittent incorrect query answering of the LUBM-1 functional test. The issue appeared at an average of 1 of 80 runs and it was due to bad read-write synchronization when adding to collections.
SwiftOWLIM Version 3.4
This release includes one new feature and several important bug fixes:
- OWL2-QL: This OWL2 profile is based on DL-LiteR, a variant of DL-Lite that does not require the unique name assumption. It is designed to be amenable to implementation on relational databases, due to its suitability for re-writing queries to SQL. This release includes a rule-set for this profile in order to expand the range of standard rule-sets and to give users more flexibility when choosing a balance between complexity of inference and scalability.
- Bug Fix - Concurrent commits: An important race condition has been eliminated that could cause SwiftOWLIM to enter an infinite loop when multiple concurrent users committed updates simultaneously.
- Bug Fix - Losing data between shutdown and restart: In some circumstances when running on Windows machines, data was being lost after serializing to disk during shutdown. This was due to the case-insensitivity of Windows operating systems. Special care is now taken with the naming of storage files for each of the predicates used in the repository.
SwiftOWLIM Version 3.3
SwiftOWLIM 3.3 includes functionality to bring it in line with what is offered in BigOWLIM 3.3, together with a range of bug fixes and maintenance updates that have occurred over the last year.
- OWL2-RL: full inference support for this OWL2 profile, but without the consistency checks
- Documentation: improved user documentation with new quick start guide
- partialRDFS: this flag has been deprecated and the optimizations made available with extra rule-set options
- Ontology imports: importing ontologies can be achieved using a URL as well as a local pathname
- Better JDK integration: custom rule-sets require the Java compiler, but no longer need tools.jar in the classpath
A number of maintenance updates and bug fixes are included in this release, the most significant of which are:
- Incorrect handling of transactions in SwiftOWLIM
- Getting statements from an invalid context returns the full dataset
- Clearing a single context in SwiftOWLIM clears the whole repository
- Rule '<C> rdf:type <owl:Class> => <C> rdfs:subClassOf <owl:Thing>' not working
SwiftOWLIM Version 3.0b7
SwiftOWLIM 3.0 represents the first public release of SwiftOWLIM, compliant with Sesame 2.x and using SwiftTRREE engine, based on the so-called Unified TRREE architecture. The latter allows a higher level of code sharing between SwiftTRREE and BigTRREE and greater flexibility. An essential difference is that the new TRREE architecture supports the so-called rich RDF data model, which allows for efficient management of named graphs and triplesets (for more information on the data model, please see the data model specification of ORDI).
The major functional changes in SwiftOWLIM 3.0b7 compared to version 2.9.1 can be summarized as follows:
- Sesame 2.x: from version 3.0 onwards SwiftOWLIM is compliant with Sesame 2.x (instead of 1.2.x). The newer version of Sesame comes with serious re-engineering of its architecture and multiple new features. One of the most notable changes is the adoption of a quadruple data model in its APIs; although the fourth element is named "context", it facilitates smooth support for named graphs. Further, it supports a range of new languages and syntaxes, e.g. SPARQL, TRIX, TRIG;
- SPARQL support: multiple optimizations in the basic SPARQL query evaluation support of Sesame allow for better query evaluation performance;
- Single-threaded inference: the multi-threaded inference capability of SwiftOWLIM 2.9 is still not implemented in the new architecture;
- Instant initialization: the contents of the repository (including the inferred statements) are kept in a proprietary binary format, which allows for instant initialization;
- Multi-threaded rule compiler: this makes it possible to launch multiple OWLIM repositories, which use different custom rulesets, within a single process;
- Multiple bug-fixes and improvements: earlier beta versions of SwiftOWLIM 3.0 have been provided to partners and pilot customers, which helped with early identification of various issues and corresponding fixes.
SwiftOWLIM Version 2.9.1
Major changes in SwiftOWLIM version 2.9.1, 10 September 2007, compared to version 2.9.0:
- 1000 properties fix: an internal limitation of TRREE for handling up to 1000 unique properties was removed.
- Fix for using custom rule-sets under OSGI: two JVM properties (Dtrree.jar.file and Dopenrdf-model.jar.file) are now considered to allow usage of custom rule-sets in environments that use custom class-loader schemes, e.g. the OSGI frameworks.
- Minor extensions of the OWL support: rules and axioms added to the rule-set owl-maxRules_builtin.pie to support the reflexivity of owl:sameAs and the fact that all OWL classes are sub-classes of owl:Thing and their instances are owl:Thing-s.
- Minor fix in the owl-max rule-set: an incorrect rule caused significant degradation in performance for some datasets when the partialRDFS parameter is set to false.
- partialRDFS versions of the rule-set files discarded: in previous versions, there was a pair of rule-set files for each of the predefined rule-sets, except empty - one version with partialRDFS optimizations and one without them. The versions with the optimizations are now excluded because they can be derived automatically, following the behavior of the TRREE rule compiler, which is documented.
- Sesame 1.2.7 bundled in the release.
SwiftOWLIM Version 2.9.0
Major changes in SwiftOWLIM version 2.9.0, 12 June 2007, compared to version 2.8.4:
- Multi-threaded inference: loading speed improves 37-71% on a dual-CPU (4-core) server, depending on the rule-set; 33% speed up on a desktop machine (P4 with hyper-threading);
- Improved transaction isolation: corresponding to READ COMMITTED level in RDBMS;
- Transitive closure optimization: the materialization of the “closure” of transitive properties can be switched off. This prevents the generation of O(N²) implicit statements for a chain of N individuals connected through a transitive property. This optimization dramatically improves the scalability and performance on datasets with long “chains” of transitive properties;
- Stack-safe inference: versions 2.8.3 and 2.8.4 offered a “stack-safe” mode that allowed very “deep” inference chains to be handled, but OWLIM was slower in this mode. Now the reasoning algorithm is stack-safe without a performance penalty or the need for a specific configuration parameter;
- Improved management of implicit and explicit statements: separate retrieval of explicit and implicit statements is straightforward;
- Rule compiler fix: now it can process rules with virtually unlimited number of premises.
- Getting-started introduced: a sample application setup (incl. source code, binaries, scripts, and configurations), allowing for easy bootstrapping of applications that use OWLIM;
- WordNet: a sample application loading W3C's RDF/OWL representation of WordNet is provided;
- Distribution improvements: OWLIM is now packed with all libraries necessary to run it; numerous improvements to the accompanying scripts make running OWLIM trivial.
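The quadratic growth mentioned in the transitive closure item above can be checked on a small example. The sketch materializes the closure of a single chain and counts the statements; it illustrates why the closure is expensive, not how OWLIM computes or avoids it. The function names are hypothetical.

```python
def transitive_closure(pairs):
    """Return every (x, y) such that y is reachable from x via the given pairs."""
    successors = {}
    for a, b in pairs:
        successors.setdefault(a, set()).add(b)

    closure = set()
    for start in successors:
        stack = list(successors[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(successors.get(node, ()))
    return closure


def chain(n):
    """Explicit statements linking n individuals in a single chain."""
    return [(i, i + 1) for i in range(n - 1)]


for n in (10, 100):
    explicit = chain(n)
    total = len(transitive_closure(explicit))
    print(f"N={n}: explicit={len(explicit)}, implicit={total - len(explicit)}")
```

A chain of N individuals has N - 1 explicit statements but N(N - 1)/2 statements in its closure, so a tenfold increase in N produces roughly a hundredfold increase in implicit statements.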
BigOWLIM Version 0.9.2-Beta
Major changes in BigOWLIM version 0.9.2-Beta, 4 Oct. 2006, compared to version 0.9-Beta:
- Query evaluation fixes: a few problems, related to proper handling of Sesame construct queries, were fixed in BigTRREE; they were detected after a bug report from a user;
- Equivalence classes support fixes: some owl:sameAs statements were not properly inferred in the previous version;
- Initialization from file images fixed: some bugs related to the generation of B-Nodes and some of the in-memory structures were fixed;
- Temporary file creation fixed: improper handling of relative storage folder name was causing problems with temporary file creation;
- The new features from SwiftOWLIM v.2.8.4, were introduced in the BigOWLIM version as well: semantics customization support; command line parameters; fixes in the owl-max rule-set; Linux shell scripts were added.
An update to the distribution package of SwiftOWLIM v.2.8.4 also took place on 30 Sept. 2006 - it includes updated documentation and some fixes to the accompanying scripts.
SwiftOWLIM Version 2.8.4
Major changes in SwiftOWLIM version 2.8.4, 16 Sept. 2006, compared to version 2.8.3:
- Custom inference: the TRREE rule compiler became part of the distribution, which allows using custom rule-sets for inference (see section 6.3 of the System Documentation for more detailed information). This way one can specify semantics that best fits the particular application in terms of expressivity and performance;
- Command line parameters: some of the OWLIM parameters can now be passed through the command line. In the previous versions, they could be specified only as SAIL parameters in the system.conf file of Sesame or programmatically.
- Minor fixes in the owl-max rule-set: they allow for covering some extra cases of A-Box reasoning and eliminate most of the cases when B-Nodes have been generated.
- Linux shell scripts: Linux scripts were added to the distribution, which help to control (start/stop) a standalone version of OWLIM and to run tests. In the previous versions, such scripts were available only for Windows.
SwiftOWLIM Version 2.8.3
Major changes in version 2.8.3, compared to version 2.8.2:
- Improved concurrency: several improvements took place to allow swift handling of hundreds of simultaneous users.
- Stack-safe mode: a new stackSafe parameter switches the engine into a slower mode that prevents the stack overflows that could occur for some datasets and ontologies in the standard mode.
- Namespace fix: improper handling of namespaces in queries and elsewhere was fixed.
- Serialization fix: the main storage file was serialized in NTriples, disregarding the dataFormat parameter – fixed, Turtle and RDF-XML are properly supported now.
- Persistence control fix: the noPersist parameter was not supported properly – fixed; this parameter switches off any persistence, i.e. OWLIM runs 100% in-memory.
- eLUBM benchmark: eLUBM is an extended version of the LUBM benchmark, developed by IBM’s IODT team, to allow evaluation of more comprehensive reasoning over OWL DL and Lite. eLUBM is provided with OWLIM as an extension of the standard LUBM benchmark.
Version 2.8.2
Major changes in version 2.8.2, compared to version 2.8:
- TRREE: OWLIM uses the TRREE engine for in-memory reasoning and query evaluation. TRREE is a newer version of the IRRE engine that was part of OWLIM v.2.8.
- 7 different inference modes: OWLIM can be configured to work with one of three pre-built sets of rules that support respectively the semantics of RDFS, OWL Horst, and a specific fragment we name owl-max (combining OWL Lite with unrestricted RDFS). These rulesets can be altered to “partial-rdfs” mode, when some of the normative RDFS entailments are discharged for performance reasons. In addition, the entailment is made optional, so that it is possible to switch it off completely and to use OWLIM as a plain RDF store.
- Extended OWL support: owl:oneOf, owl:minCardinality, owl:maxCardinality, owl:cardinality; partial OWL-Lite T-Box (schema-level) reasoning added.
- Configurable index size: allows the user to manage the trade-off between required RAM and performance.
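One reading of the "7 different inference modes" item above that accounts for the count is three rule-sets, each in a full and a "partial-rdfs" variant, plus the mode with entailment switched off. The enumeration below spells this out; the labels are illustrative and are not actual OWLIM configuration values.

```python
from itertools import product

# Illustrative labels only; not actual configuration values.
RULESETS = ["rdfs", "owl-horst", "owl-max"]
VARIANTS = ["full", "partial-rdfs"]

modes = [f"{ruleset} ({variant})" for ruleset, variant in product(RULESETS, VARIANTS)]
modes.append("empty (no entailment)")

for mode in modes:
    print(mode)
print(len(modes))
```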
Version 2.8
Major changes in version 2.8, compared to version 2.0:
- IRRE: OWLIM uses IRRE for in-memory reasoning and query evaluation. Sesame's standard in-memory SAIL, implementing the RDFSchemaRepository, is no longer used.
- Upload and reasoning speed up: there is a major improvement to the upload and reasoning speed due to using IRRE.
- Persistence: Persistence implementation was changed, but it is still compatible with the previous releases of OWLIM in terms of storage formats and SAIL configuration options.
- Multi-threading: The inference process is not multi-threaded and not even thread-safe. It requires special attention if used in a multi-thread context. OWLIM still uses multi-threading for other functions (e.g. persistence).
- Extended support for OWL: intersectionOf, unionOf, AllDifferent, someValuesFrom are already handled in this version.
Version 2.0
Major changes in version 2.0, compared to version 1.0:
- Sesame 1.2.1: compliance with Sesame release 1.2.1 (the previous version was compatible with Sesame 1.1).
- Upload speed up: there is a major improvement to the upload speed, through caching of add-triple operations.
- Persistence alignment with Sesame: persistence organization is now converged to the standard Sesame mechanism for synchronization of the repository contents with its persistence file.
- Concurrent multi-thread inference: it delivers serious improvements of the inference (and thus repository modification) speed for machines with multiple processors or Hyper-Threading.
- Extended support for OWL: support for allValuesFrom, hasValue, equivalentClass was added; the support for equivalentProperty was improved.
- RMI access enhancements: the SailAccessor interface was enriched with some methods to retrieve the access rights of a repository and extract/export its content.