Kids of all ages have two things in common: they all tend to make clutter and mess, and they would all love to have that clutter and mess cleared up for them. And while younger kids oftentimes have their mommies and daddies clean up the mess, grown-up kids prefer to use tools. GraphDB is one such tool: it cleans up messes of data.
Data is the word grown-up kids use when they talk about facts and statements. Data, or pieces of information, can be facts or statements about just about anything: people, animals, plants, countries, events, measurements, you name it. These facts and statements are recorded with words, numbers, symbols and signs.
Here are four examples:
These four statements look easy to understand and use. But what if you didn’t have only four of them? What if there were hundreds of billions of them scattered across different places? Imagine the mess if you needed to find some information in them.
If the pieces are not well organized and classified, finding and using them would be harder than finding and playing with the toys in even the messiest kid’s bedroom.
If you have ever made a mess in your room, you probably know how hard, exhausting and not fun at all it is to clean, declutter and put things into their places. At the same time, you know how rewarding it is to have things cleaned up and to be rid of the mess. When all things are neatly filed and sorted, it is easier to retrieve anything that’s important, isn’t it?
The same goes for messes of facts and statements and for cleaning them up. Putting order into messes of data helps grown-up kids store and find all the kinds of facts and statements they have gathered over time on their computers and on the Web without worrying that they will lose or overlook something.
At the end of the day, this is why anyone would ever go to the trouble of decluttering and cleaning a mess (be it data or toys):
The way GraphDB tidies information is by putting a label on each and every piece of the information and then storing the labeled pieces in places where they can be very easily reached.
To understand how GraphDB works, imagine a robot with a magnetic arm that is set to automatically clean the mess in your room and thus makes it easy for you to find any toy within seconds, fetching it to you when you ask for it or showing you where you can get it by yourself.
First, the robot uses the magnetic arm to pick up pieces of information. Then the robot labels the pieces and loads them onto train cars. The labels contain more descriptions (adults call this semantic metadata) and the train cars help each piece of information go anywhere, anytime it is needed (adults call them URIs).
Train cars usually travel in triples (two things connected by a third), and this is also why GraphDB is sometimes called a triplestore: it stores data in triples, two train cars connected by a coupling that is labeled, too.
For example, the statement “Mike has a Tyrannosaurus Rex” would travel in two train cars, one loaded with Mike and the other with Tyrannosaurus Rex, joined by a coupling named “has” (adults use couplings to express relationships between things, in this case possession).
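For grown-up kids who like to peek inside the train, here is a minimal Python sketch of that statement as a triple. The `Triple` name and the plain words in the cars are assumptions made for illustration; a real triplestore such as GraphDB identifies things with URIs rather than bare words.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Two train cars joined by a labeled coupling (hypothetical helper)."""
    subject: str    # the first train car
    predicate: str  # the labeled coupling between the cars
    obj: str        # the second train car


# "Mike has a Tyrannosaurus Rex" as one triple.
statement = Triple("Mike", "has", "Tyrannosaurus Rex")

# A toy "triplestore" is simply a collection of such triples.
store = {
    statement,
    Triple("Tyrannosaurus Rex", "is a", "dinosaur"),
}

print(statement.subject, statement.predicate, statement.obj)
```

Every fact has the same three-part shape, and that is what lets the robot file any kind of statement in the same way.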
GraphDB tidies up not only to help people get rid of the mess but also to discover things. When all the pieces of information are stored in the train cars, they can be very easily found and assembled in many combinations. Guess what assembles them! A locomotive (adults call it a SPARQL query). Whenever someone needs some kind of information, they send a locomotive to connect the needed train cars (with the information pieces in them) and bring them the right information in seconds.
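To make the locomotive a little less mysterious, the following Python sketch shows the basic idea behind such a query: describe a pattern and collect every triple that fits it. This is only a toy stand-in and not SPARQL itself; the `send_locomotive` function and the sample toys are invented for illustration.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (first car, coupling, second car)

store: Set[Triple] = {
    ("Mike", "has", "Tyrannosaurus Rex"),
    ("Mike", "has", "red train"),
    ("Tom", "has", "green dragon"),
}


def send_locomotive(
    store: Set[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return all triples matching the pattern; None means 'any value'."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# "Which things does Mike have?"
mikes_things = send_locomotive(store, subject="Mike", predicate="has")
print(mikes_things)
```

Leaving a slot empty is what makes the question open-ended: the locomotive brings back every train car that can fill the gap.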
If GraphDB were a physical robot rather than a program on your computer, and it had to clear a mess of toys in your room instead of a mess of information pieces, it would tidy it up in 3 steps:
After the robot declutters your room and organizes everything in these three steps you will be able to do a lot more with your toys. You will be able to ask the robot to find a toy with whatever words you might want to use. For example, you can ask for the toy you played with two days ago, or for the dinosaur you played with yesterday, that has green wings and blows fire.
And there’s even more. You will also be able to ask the robot to give you all the toys that don’t belong to you. If the robot has labeled a toy with a label: “Belongs to Tom”, the robot will automatically know whether a toy is yours or not (adults call this inference).
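Here is a rough Python sketch of that kind of reasoning, assuming one simple made-up rule: if a toy is labeled as belonging to someone else, it does not belong to you. Real triplestores derive new facts with formal rule sets, so the `infer_not_owned_by` function and the sample labels below are purely illustrative.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (toy, coupling, owner)

labels: Set[Triple] = {
    ("fire truck", "belongs to", "Tom"),
    ("green dragon", "belongs to", "Tom"),
    ("Tyrannosaurus Rex", "belongs to", "Mike"),
}


def infer_not_owned_by(labels: Set[Triple], me: str) -> Set[Triple]:
    """Derive new 'does not belong to' facts from the existing labels.

    Assumes every toy carries exactly one 'belongs to' label.
    """
    return {
        (toy, "does not belong to", me)
        for toy, coupling, owner in labels
        if coupling == "belongs to" and owner != me
    }


# Nobody ever wrote "does not belong to Mike" on a label;
# the robot works it out from the labels it already has.
inferred = infer_not_owned_by(labels, "Mike")
toys_to_give_back = sorted(toy for toy, _, _ in inferred)
print(toys_to_give_back)
```

The new facts were never stored by hand, which is the whole point of inference: the labels already there are enough to answer a question nobody prepared for.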
Let’s say your daddy is a journalist. He is writing an article about Tyrannosaurus Rex. Throughout his research, he has gathered tons of information about dinosaurs and in particular about Tyrannosaurus Rex. But the thing is, with so many facts and statements on his Kindle, and on his voice recorder, he has a really hard time finding the specific ones he needs for the article.
In this case, GraphDB will help him store and organize everything in one place and then help him quickly and easily find the most relevant information for his article – facts, images, sounds, similar articles, related topics.
Thus your GraphDB will give your daddy access to any type of information from anywhere – from his computer and from the Web. Daddy will be able to explore, connect and find new facts and statements about Tyrannosaurus Rex.
To recap, when all the information pieces are labeled and stored in their places, they become ready to travel across the Web and across computers and connect with other pieces of information (adults call these train cars, labeled and…
When one uses GraphDB, they can quickly and easily find things and also do a bunch of other cool stuff such as:
By and large, this is what GraphDB does and can be used for. For a more detailed explanation, suitable for grown-up kids, check Ontotext’s Fundamentals on the subject: What is RDF Triplestore?
If you know someone who would love to have their mess of data cleaned and neatly stored and managed, tell them they can download GraphDB Free and see what a semantic graph database can do for their data.
And don’t worry, you can’t break GraphDB. Just don’t feed it biscuits or milk or jelly. Only data. Any data.