Transforming CSV data to RDF with Grafter
Part of my work is to develop pipelines to transform already existing Open Data (Usually CSVs in some data portal, like CKAN) into RDF and hopefully Linked Data. If I have to do the transformation myself, interactively, I normally use Google Refine with the RDF plugin. However, what I need now is a batch pipeline that I can plug into a bigger Java platform.
Therefore, I’m looking at Grafter. Even though I have never programmed in Clojure (or any other functional language whatsoever!), Grafter’s approach seems very sensible and intuitive. Additionally, I have always wanted to use Tawny-OWL, so probably it will be easier if I learn a bit of Clojure with Grafter first. Coming from Java/Perl/Python, the functional approach felt a bit weird in the beggining, but it actually makes more sense when defining pipelines to process data.
I have gone through the Grafter guide using Leiningen in Ubuntu 14.04. So far so good (I had to install Leiningen manually though, since Ubuntu’s Leiningen package was very outdated). In order to run the Grafter example in Eclipse (Mars), or any other Clojure program, one needs to install first the CounterClockWise plugin. Note that if you want to also use GitHub, like me, there is bug that prevents the project from being properly cloned, when you choose the “New project wizard”: I cloned with the General project wizard, copied the files from another Grafter project, and surprisingly it worked (trying to convert the project to Leiningen/Clojure didn’t work!).
My progress converting data obtained in Gipuzkoa Irekia to RDF can be seen at GitHub. Also, I’m aiming at adding Data Cube SPARQL constraints as Clojure test, here.