Three levels of reproducibility: Docker, Galaxy, Linked Data
[Originally posted at LinkedIn]
I have just stumbled upon this thread on why one should use Galaxy (https://www.biostars.org/p/50034/). One of the reasons posted is reproducibility, but Galaxy only solves one level of reproducibility, "functional reproducibility" (what I did with the data). There are at least two other levels, one "below" Galaxy and another one "above" Galaxy:
- Below: the computational environment (operating system, library dependencies, binaries).
- Above: semantics. What the data means.
Each level can be addressed with its own technology:
- Computational: Docker.
- Functional: Galaxy.
- Semantics: URIs, RDF, SPARQL, OWL.
The resulting stack, from top to bottom:
3.- Semantics: what the data means.
2.- Functional: what I did with the data.
1.- Computational: how I did it.
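
To make the three levels concrete, here is a minimal Python sketch of how a single analysis could be described at all of them at once. It is only an illustration under stated assumptions: the `AnalysisRecord` class, the Docker image name, the Galaxy workflow name and the `example.org` URIs are all hypothetical, and neither Docker nor Galaxy is actually called. The only real identifier is the standard `rdf:type` URI.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisRecord:
    """One analysis described at the three levels of reproducibility."""

    # 1. Computational: how I did it (hypothetical Docker image reference).
    docker_image: str
    # 2. Functional: what I did with the data (hypothetical Galaxy workflow name).
    galaxy_workflow: str
    # 3. Semantics: what the data means, as (subject, predicate, object) URIs.
    triples: list = field(default_factory=list)

    def to_ntriples(self) -> str:
        """Serialise the semantic layer as N-Triples, one statement per line."""
        return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in self.triples)


# All values below are made up for illustration.
record = AnalysisRecord(
    docker_image="example/analysis-env:1.0",
    galaxy_workflow="variant-calling-example",
    triples=[
        (
            "http://example.org/dataset/1",
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
            "http://example.org/vocab/SequencingRun",
        ),
    ],
)

print(record.to_ntriples())
```

The point of the sketch is that the levels are independent: the same workflow could run in a different image, and the same triples could describe data produced by a different workflow, so each level has to be recorded separately for the analysis to be fully reproducible.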