So, the gauntlet has been thrown down. Apparently, I'll get showered with Amazon vouchers if I produce a tutorial that meets the following criteria. But I'd like to challenge the criteria because they make no sense.

Install an RDF store from a package management system on a computer running either Apple’s OSX or Ubuntu Desktop.

You already have one. In-memory storage. You don't need a triple store yet. If you don't understand RDF, you don't need a triple store. If you don't understand JSON, you don't need MongoDB, right? If you don't understand XML, you don't need eXist.

To start parsing JSON, you don't need to install MongoDB or CouchDB. To start parsing RDF data, you don't need a triple store.

This step seems to presume that the way people work with RDF data goes something like this:

  1. Download data.
  2. Put data in (non-memory) triple store.
  3. Query triple store.
  4. Parse return from triple store and turn into some kind of nested array/hash structure.
  5. Use data in nested array/hash structure.

That's certainly one way of doing it. But it makes about as much sense as:

  1. Download JSON data.
  2. Put data into MongoDB.
  3. Query MongoDB.
  4. Use data returned from MongoDB.

That's a bit stupid.

Why not just do:

  1. Download JSON data.
  2. Parse and use.

Well, you can do that in RDF too.

  1. Download RDF data.
  2. Turn into object structure and use.
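
Here's a rough sketch of that two-step route with nothing but the Python standard library, just to show that "in-memory storage" really is enough to get going. The data is made up (the example.org IRIs are placeholders), the download step is replaced by an inline string, and the parser is a deliberately crippled one that only understands IRIs and plain literals. In real life you'd let a library like rdflib do the parsing.

```python
import re

# Simplified N-Triples line: <subject> <predicate> (<object> | "literal") .
# Assumption: this is NOT a full N-Triples parser. No blank nodes, datatypes,
# language tags or escapes, and IRIs and literals both end up as plain strings.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)")\s*\.$')


def parse_ntriples(text):
    """Return the graph as a set of (subject, predicate, object) tuples."""
    graph = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError("unsupported line: %r" % line)
        s, p, o_iri, o_lit = match.groups()
        graph.add((s, p, o_iri if o_iri is not None else o_lit))
    return graph


def triples(graph, s=None, p=None, o=None):
    """Yield every triple matching the pattern; None acts as a wildcard."""
    for triple in graph:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical data standing in for whatever you downloaded.
DATA = '''
<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
<http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .
<http://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .
'''

graph = parse_ntriples(DATA)
names = sorted(o for _, _, o in triples(graph, p=FOAF + "name"))
print(names)
```

No server, no store, no installation. The graph lives in a set for as long as your script runs.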

Or you can do the SPARQL route.

  1. Construct SPARQL query.
  2. Post to SPARQL server.
  3. Interpret result.

Install a code library (again from a package management system) for talking to the RDF store in either PHP, Ruby or Python.

Again. You don't need a triple store, yet. If you can't work out how to handle RDF data, why do you need an RDF store? The cart is before the horse. I do a lot of RDF hacking, and only very rarely do I actually need to start up a triple store. In fact, I have helped publish a fair quantity of RDF and I don't use a triple store. (Why? Well, we have a perfectly functional MySQL database.)

Now, you can install a good RDF library that doesn't need to talk to a triple store but can if you really want to.

If you are using Python, you type the following into your command line:

sudo easy_install -U "rdflib>=3.0.0"

If you are using Ruby, you type the following into your command line:

sudo gem install linkeddata

If you are doing PHP, I dunno, try ARC2. It's used in Drupal apparently.

Programmatically load some real-world data into the RDF datastore using either PHP, Ruby or Python.

Python tutorial.

Ruby tutorial.

My contributions to both of these articles are released under the Creative Commons Attribution-ShareAlike 3.0 license.

Programmatically retrieve data from the datastore with SPARQL using either PHP, Ruby or Python.

Well, if you want to query using SPARQL, you use the HTTP library in your programming language. For Ruby, that's net/http or Curb.

In Python, you can use SPARQLWrapper.

sudo easy_install sparqlwrapper

Then follow the instructions in the second half of the "Getting data from the Semantic Web" Python tutorial.
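
If you'd rather see the plain "use your HTTP library" version, here's a minimal sketch using only Python's standard library. It assumes an endpoint that accepts a form-encoded POST with a query parameter and can return the SPARQL JSON results format. The endpoint URL is a placeholder, the sample response is invented, and the function names are mine, not part of any library. Nothing here touches the network; the actual request is left as a comment.

```python
import json
import urllib.parse
import urllib.request

QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE { ?person foaf:name ?name } LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Steps 1 and 2: wrap the query in a form-encoded POST request."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def parse_bindings(json_text):
    """Step 3: flatten SPARQL JSON results into a list of plain dicts."""
    results = json.loads(json_text)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in results["results"]["bindings"]
    ]


# Hypothetical endpoint; substitute a real one.
request = build_sparql_request("http://example.org/sparql", QUERY)

# To actually send it:
#   with urllib.request.urlopen(request) as response:
#       rows = parse_bindings(response.read().decode("utf-8"))

# Invented response, shaped like the SPARQL JSON results format.
SAMPLE_RESPONSE = """
{"head": {"vars": ["name"]},
 "results": {"bindings": [
   {"name": {"type": "literal", "value": "Alice"}},
   {"name": {"type": "literal", "value": "Bob"}}
 ]}}
"""

rows = parse_bindings(SAMPLE_RESPONSE)
print(rows)
```

Note that flattening each cell to its value throws away whether it was an IRI or a literal. That's fine for a quick script, less fine if you care about the difference.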

Convert retrieved data into an object or datatype that can be used by the chosen programming language (e.g. a Python dictionary).

It doesn't work like that.

The whole point (sigh, I have to explain this so often) is that it's a blimmin' graph. It doesn't just magically turn into a Python dictionary. Just like XML documents don't magically turn into Python dictionaries. Nor do JPEGs or a whole lot of other data.

But all is not lost.

If you follow the above-linked tutorials for the Python or Ruby libraries, you'll learn how to get data out of in-memory RDF graphs. The Python tutorial uses a filtered graph approach, while the Ruby tutorial uses a BGP matching approach. Both will allow you to construct a nested array/hash structure from RDF data. To see roughly how this works, have a look at this example Python code.
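
For a flavour of what "construct a nested array/hash structure" means, here's a small sketch of my own (not the code from either tutorial). It uses made-up data held as plain tuples so it runs without any library, and the describe and person helpers are hypothetical names. Notice that Alice knows Bob and Bob knows Alice: the graph has a cycle, which is exactly why it can't turn into a dictionary by itself. You have to decide which properties you want and how deep to follow them.

```python
from collections import defaultdict

FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical in-memory graph: a set of (subject, predicate, object) tuples.
GRAPH = {
    ("http://example.org/alice", FOAF + "name", "Alice"),
    ("http://example.org/alice", FOAF + "knows", "http://example.org/bob"),
    ("http://example.org/bob", FOAF + "name", "Bob"),
    ("http://example.org/bob", FOAF + "knows", "http://example.org/alice"),
}


def describe(graph, subject):
    """Collect one subject's outgoing edges as {predicate: [objects]}."""
    description = defaultdict(list)
    for s, p, o in graph:
        if s == subject:
            description[p].append(o)
    return {p: sorted(objects) for p, objects in description.items()}


def person(graph, subject):
    """Build the dictionary we want, following foaf:knows one level only."""
    props = describe(graph, subject)
    return {
        "name": props.get(FOAF + "name", [None])[0],
        "knows": [
            describe(graph, friend).get(FOAF + "name", [None])[0]
            for friend in props.get(FOAF + "knows", [])
        ],
    }


alice = person(GRAPH, "http://example.org/alice")
print(alice)  # {'name': 'Alice', 'knows': ['Bob']}
```

The same shape works with a real library: swap the loop in describe for whatever triple-matching call your library offers.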

Now, this doesn't answer the specification as set. But that's because the specification is misguided. Triple stores are a second lesson, for once you've gotten data off the 'net, parsed it and fiddled around with it. You only need them for persistence or querying or whatnot. As I said, I use mine so rarely that it's not an essential lesson. If you learn, say, rdflib, you'll find that you do about 95% of your RDF mangling without ever touching a triple store. And once you know how to use something like rdflib and SPARQL, it's pretty easy to pick up a triple store on the occasions when you do want one.

If you still want to know about triple stores, even after I've said all that, I can still do that. But, trust me, walk before you can run. (And the cynical person inside me says "walk before you complain you can't run".)
