Alberto Pepe edited green_OS.md  over 11 years ago

Commit id: a42f3d307556e1af8f56b1fc8b97e8e5c273cd7a


The problem at hand is that the type of science we conduct today does not fit in the format and scope of the scholarly article. The code to assemble and statistically analyze a dataset, the workflows employed to visualize that dataset as a plot, and the dataset itself are three examples of research materials which cannot realistically fit in an article as we know it, both for their size and for their scope. To overcome this crucial problem, libraries, governments, and funding bodies are starting to require data and other ancillary materials to be distributed alongside papers so that the entire lifecycle of research can be reconstructed. By analogy with the flavors of Open Access, this strategy --- of providing access to scientific sources after the publication of a scholarly article --- can be thought of as the "green road" to Open Science. The green road to Open Science, however advantageous, is not without its shortcomings. At least two problems make green-flavored Open Science tortuous and impractical. The first has to do with curation. Depositing research materials alongside publications often means that full text and data will live in different repositories, hosted by different bodies, under different regulations and practices. Making sense of and curating the conceptual links that exist among papers and research materials is already difficult today. Was this plot generated using dataset one or dataset two? Was the data analyzed using the first or the second version of the code? Answers to these questions may simply be impossible to obtain today, let alone in a few decades. The second problem has to do with incentives. Those familiar with the recent NSF Data Management Plans --- which mandate publication of data sources alongside published papers resulting from NSF-funded research --- know very well that complying with the mandate was a big headache. 
Once a paper is published, authors have very little incentive to publish data and the full sources of their research. **Asking authors to deposit their data a posteriori, after having authored its host article --- the green road to Open Science --- is a partial and unsustainable solution.**