Alberto Pepe edited Rule 4. Publish workflow as context.md
almost 11 years ago
Commit id: 318b713a294552a32bf444d474d299d8a3f385f3
diff --git a/Rule 4. Publish workflow as context.md b/Rule 4. Publish workflow as context.md
index edd41bf..004c7d6 100644
--- a/Rule 4. Publish workflow as context.md
+++ b/Rule 4. Publish workflow as context.md
...
# Rule 4. Publish workflow as context.
Traditionally, what computer and information scientists call "workflow" has been captured in what scientists call the "methods" and/or "analysis" section(s) of a scholarly article, where data collection, manipulation, and analysis processes are described. Today, nearly every study uses computer software to carry out the bulk of its workflow, but rarely is the end-to-end process described in a paper captured in just one software package. Thus, while directly publishing code is critical (see Rule 6), publishing a description of your processing steps offers essential context for interpreting and re-using data. In the future,
we envision that the most useful workflow documentation will be part of an electronic provenance record that "automagically" links together all the pieces that led to a result: the data citation (Rule 2), the pointer to the code (Rule 6), the workflow (this Rule), and a scholarly paper.
Systems that document workflow in a way that lets them plug into provenance visions like this one are best, so keep an eye out for such systems in your field. For the time being, you can use a system that documents the versioning of data, workflow, code, and even the article writing process (see Appendix). Web services that encapsulate workflow are a good way to capture provenance. In life sciences, systems like [Taverna](http://www.taverna.org.uk/) and [Kepler](https://kepler-project.org/) are good examples. Other standardized workflow documentation systems are offered by "notebooks" within some software packages, such as the [Mathematica](http://www.wolfram.com/mathematica/) and [iPython](http://ipython.org/notebook.html) notebooks. Systems (see Rule 2) that offer HDL and DOI identifiers for data can, and do, offer those identifiers for workflow files as well. At a minimum, provide a simple sketch of data flow across software, indicating how intermediate data and final results are generated, along with the parameter values used in the analysis. Keep in mind that even if the data used are not "new," in that they come from a well-documented archive, it is still important to document the archive query that produced the data you used, along with all the operations you performed on the data after they were retrieved.
Just as in Rules 1 through 3, keeping better track of workflow, as context, will likely benefit you and your collaborators enough to justify the loftier, more altruistic, goals espoused here.
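
As an illustration of the "simple sketch of data flow" described in this Rule, the following Python snippet shows one possible way to write down an archive query, the processing steps, their intermediate and final files, and the parameter values in a single machine-readable record. It is only a minimal sketch: the class names (`Step`, `WorkflowRecord`), the file names, the script names, and the query string are all hypothetical and are not taken from any of the systems mentioned above.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Step:
    """One processing step: which software ran, on what, with which settings."""
    name: str
    software: str
    inputs: list
    outputs: list
    parameters: dict = field(default_factory=dict)


@dataclass
class WorkflowRecord:
    """A plain description of data flow for one analysis (hypothetical schema)."""
    title: str
    archive_query: str  # the exact query used to retrieve the source data
    steps: list = field(default_factory=list)

    def add_step(self, step):
        self.steps.append(step)
        return self

    def final_outputs(self):
        """Return outputs that no step consumes, i.e. the final results."""
        consumed = {path for step in self.steps for path in step.inputs}
        return [path for step in self.steps for path in step.outputs
                if path not in consumed]

    def to_json(self):
        # asdict() also converts the nested Step objects into dictionaries.
        return json.dumps(asdict(self), indent=2)


# Hypothetical two-step analysis that spans two software packages.
record = WorkflowRecord(
    title="Example analysis (hypothetical)",
    archive_query="SELECT ra, dec, mag FROM catalog WHERE mag < 18",
)
record.add_step(Step("clean", "Python script clean.py",
                     ["raw.csv"], ["clean.csv"], {"drop_missing": True}))
record.add_step(Step("fit", "R script fit.R",
                     ["clean.csv"], ["fit_results.csv"],
                     {"model": "linear", "alpha": 0.05}))

print(record.to_json())
```

A JSON file like this can be versioned and deposited alongside the data and code, so that a reader can see which file came from which step even when the workflow crosses several software packages.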