e -nlibrary.
Establish methods meetings and technical meetings as side events or parallel tracks of scientific meetings.
Engineer when it is necessary (good, useful code that you want to be reproducible). Don't over-engineer: there is no need to finalize your code when you're just trying out an idea.
11:30-12:30 Tim Head, Wild Tree Tech, Zurich, Switzerland
As a PhD student, don't try to launch your own software project. Good software is hard. Instead, join an existing project.
How to find a project? Ask a friend, a colleague. It's good to have people around to answer your questions. Then ask Google using a keyword and "GitHub".
How to choose a project is harder. Look at how many people have contributed to it, then at the last time someone contributed to it. If you want to work with companies, or create a company in the future, then do not contribute to a GPL project; prefer one under a permissive licence such as MIT.
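The two activity checks above (number of contributors, date of the last contribution) can be sketched as a small helper. This is only an illustration: the function name, the thresholds and the example numbers are assumptions, not values given in the talk.

```python
from datetime import date, timedelta


def looks_maintained(contributors: int, last_commit: date, today: date,
                     min_contributors: int = 5, max_idle_days: int = 180) -> bool:
    """Rough health check for a candidate project.

    The default thresholds are illustrative assumptions; tune them to your field.
    """
    # Someone contributed recently enough.
    recently_active = (today - last_commit) <= timedelta(days=max_idle_days)
    # More than a handful of people have contributed.
    enough_people = contributors >= min_contributors
    return recently_active and enough_people


# Hypothetical numbers, as you might read them off a project's GitHub page.
today = date(2020, 1, 1)
print(looks_maintained(120, date(2019, 12, 15), today))  # True
print(looks_maintained(2, date(2016, 3, 1), today))      # False
```

The numbers themselves have to be looked up by hand (or through the hosting site's API); the sketch only encodes the decision rule.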
Why contribute back? Often people think: I'm too busy. But turning your personal fix into a permanent change will save you from fixing the code each time there is an "official" update.
Creating software is a craft: be an apprentice for a long time before you take responsibility for maintaining software. Contributing to a well-known project lets you benefit from the reputation of that project.
This is a long tempr
Instructions for the workshop
16:30-17:00 Michel Jaccard, id est avocats
Intellectual property covers many different notions, but they fall mainly into two groups: patents and copyrights.
"Open science: as open as possible, as closed as necessary." However:
- the data needs to belong to someone legally.
- open science is not the right solution for everyone.
- Creative Commons (CC) licences are more flexible.
- The solutions used today are temporary: they need to be improved.
Friday 29th
Open science in different fields.
Earthquake science:
- datasets range in size from 100 GB to 1 TB
- stored in 3 formats: raw data, processed data, most useful data (the one most people are interested in)
- stored on Zenodo and figshare
- open data takes less than 5% of their funding
Systems biology and genetics:
- open science has always been part of genetics to avoid patenting of genes
- own webpage with links to data on databases and algorithms
- there are many citations to the datasets (or to the articles for which the data was first collected)
- Genohm: tool to help manage lab data (http://www.genohm.com)
Biometrics:
- BEAT: open science platform (http://www.beat-eu.org)
- aims to link each experiment in a paper to its data
- possibility to manage data as teams
- you can modify the data on the platform and re-calculate it
Climate science:
- problem: a ridiculously small percentage of papers are open access
- no information about how data was collected
Music science:
- problem: copyright (the songs belong to the artists)
- platforms where the artists upload songs in open access
- once you have your dataset, you still need to clean it to remove background noise
Swiss science data center:
- aims to bridge the gap between the people who collect data and those who develop algorithms
Scientific IT at EPFL:
Quantum mechanical simulation:
- data automation with AiiDA
Chemistry:
- aim: store information as quickly as possible on databases.
- developed specific online tools for chemistry.
- JavaScript software registry: http://www.npmjs.com
- automated code testing (can be synchronized with GitHub): http://travis-ci.org
- want to see what the whole thing looks like? it's here --> http://www.c6h6.org (:
Libraries:
- aim: organise information (data collection)
- since the 19th century: everyone should have access to data
- now, they need to select the information they hold (journal subscription fees increase by 6% each year)
- historically, there has been a natural shift towards open access in libraries
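To get a feel for what a 6% yearly increase in subscription fees means, here is a minimal sketch of the compound-growth arithmetic. It assumes the rate stays constant and compounds once a year, which the notes do not state; the function names are made up for the example.

```python
import math

ANNUAL_INCREASE = 0.06  # the 6% yearly rise quoted in the notes


def relative_cost(years: int, rate: float = ANNUAL_INCREASE) -> float:
    """Cost after `years`, relative to today's cost, at a constant yearly rate."""
    return (1 + rate) ** years


def doubling_time(rate: float = ANNUAL_INCREASE) -> float:
    """Years until the cost doubles under constant compound growth."""
    return math.log(2) / math.log(1 + rate)


print(f"after 10 years: x{relative_cost(10):.2f}")    # after 10 years: x1.79
print(f"doubling time: {doubling_time():.1f} years")  # doubling time: 11.9 years
```

Under these assumptions a fixed budget covers roughly half as many subscriptions after about twelve years, which is why libraries have to select.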
CERN hardware:
- it is written in CERN's founding document that the data it produces should be open (and that was back in 1953!)
- open hardware: important because hardware design is easy and useful.
- but it's important to have "coopetition": a balance between competition and cooperation
Acknowledgements