[Image: Svalbard Global Seed Vault (cropped)]

Fantin Reichler and 1 more

1st Keynote: No (Open) Science without Data Curation: Five lessons from the study of Data Journeys (Sabina Leonelli)

Open Research as an opportunity, including scientific infrastructures, governance, and how this should be credited and disseminated:
https://www.datastudies.eu/publications
https://icsu.org/cms/2017/04/open-data-in-big-data-world_long.pdf

Open Research on three aspects:
- Global Scope
- Systemic Reach
- Local Implementation

FAIR data improves your research at many levels. BUT there are requirements to make data FAIR, such as:
- coordination of data infrastructures
- making data accessible on many platforms
- etc.

Awareness of Open Science and its tools is still very low in the scientific community (EU Working Group on Education and Skills under Open Science, 2017)
https://www.garnetcommunity.org.uk/sites/default/files/GARNet_Paper_nplants201786-1.pdf
https://www.datastudies.eu/publications

It is important for researchers to have some knowledge of the tools/methods for making their data FAIR. Most importantly, some people - us - can share expertise about these tools/methods and help ease the confusion researchers might feel while putting FAIR into practice.

Focus on qualitative data: databases, the example of plant science, data re-use cases.

Data journey example: TAIR (not FAIR) https://www.arabidopsis.org/
- preparing specimens
- preparing and performing imaging
- data storage
- dissemination
- ...
- analysis

Epistemic troubles:
- research data collected represent highly selected data types
- selection based on political-economic conditions of sharing
- peer review structure unclear
- misalignment between it and research needs
- no sustainable plans for maintenance
- ...

Lessons learnt at a general level:
- Context-specific data curation is key to data re-use
- Long-term maintenance is key to trustworthiness (updates, long-term policy)
- Which data and why?
- data & materials (connect digital data with data in the physical world)
- Role of ethics, humanities & social sciences in data management (increase quality and reusability)
http://press.uchicago.edu/ucp/books/book/chicago/D/bo24957334.html

---------------------------------------------------------------------------------------------------

PLENARY: RESEARCH PAPERS

Measuring FAIR Principles to Inform Fitness for Use
Carolyn Hank, University of Tennessee

Past paper on http://datacurationprofiles.org/ => 10.1002/pra2.2016.14505301046

"Fitness for use" > focus on the "Reusable" aspect of FAIR

Method: interviews
- Job-related demographics, with questions such as 'What is your current job title?', 'How many years have you worked in this institution?', 'How many years have you worked in the discipline?', etc.
- Findability > 'How did you find the data?', 'DOI?', 'metadata?'
- Accessibility > 'How did you access the data?', 'Open format?', 'Was the data free?', 'Was the metadata accessible?'
- Interoperability > 'Was the data in a usable format?', 'Encoded?', 'Machine-actionable?'
- Reusability > 'Were the metadata sufficient?', etc.

Potential implications: data can be FAI, but R requires more research
=> create new knowledge of how scientists access and use data
=> produce a framework to enable re-use

----------------------------------------

Giving datasets context: a comparison study of institutional repositories that apply varying degrees of curation
(Amy Koshoffer, Cincinnati, USA)

Questions:
1. How do the metadata vary for each institution?
2. Completeness of metadata
3. Do curated datasets have more documentation?
4. Do curated datasets have more DOIs?
5. Keywords

What is curation?
- appraisal/selection
- check/run files: includes code review, review of sensitive information (damn, she talks too fast!)

4 universities: one repository per institution, 20 datasets per repository.
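The study's use of the Mann-Whitney U test (mentioned in the results below) can be illustrated with a minimal sketch. The completeness scores here are made up for illustration; a real analysis would use scipy.stats.mannwhitneyu to also obtain a p-value.

```python
# Illustrative sketch (made-up numbers): comparing metadata-completeness
# scores of curated vs. uncurated datasets with the Mann-Whitney U test.
# Pure-Python U statistic; use scipy.stats.mannwhitneyu for p-values.

def mann_whitney_u(xs, ys):
    """Return the Mann-Whitney U statistic for sample xs vs. sample ys."""
    # U counts, over all pairs (x, y), how often x beats y (ties count 0.5).
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

# Hypothetical completeness scores (fraction of metadata fields filled in).
curated   = [0.80, 0.75, 0.90, 0.85, 0.70]
uncurated = [0.55, 0.60, 0.50, 0.65, 0.75]

u = mann_whitney_u(curated, uncurated)
# U ranges from 0 to len(curated) * len(uncurated) = 25;
# values near 25 mean curated scores are consistently higher.
print(u)  # 23.5
```

A U close to its maximum (or minimum) suggests the two groups differ systematically, which is the kind of evidence the study was looking for when comparing curated and uncurated repositories.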
Comparison with mandatory / non-mandatory metadata for each university.

Results

Question 1:
- all universities use a title in the metadata (for instance), but each understands something different by it
- all datasets had more than the minimum metadata required

Question 2:
- 53% completeness, but different for every institution she looked at
- use of the Mann-Whitney U test: https://fr.wikipedia.org/wiki/Test_de_Wilcoxon-Mann-Whitney
- no optional use of supplementary metadata in either curated or uncurated repositories --> Does curation really have an impact then? Not sure the curation service does.

Question 3:
- curation does have an impact on documentation

Question 4:
- all support DOIs, but in different ways; there might be other factors to take into account besides the curation process

Conclusion:
- The curation process may have had a measurable impact, BUT other factors may be influential too
- Curation > more documentation & more readme.txt files

---------------------------------------------------------------------------------

Complexities of digital preservation in a virtual reality environment: the case of Virtual Bethel
University of Indianapolis, Angela Murillo

"CHUUUUUUUUURCH!" (E. Blumer, 2018)

Creation of a VR space for a church > the question is: how do we preserve this VR space?
- at the time, there was an archive (documents/physical objects), but recently the building was sold
- 3D virtual space of the church + a learning space (history of the building)
https://comet.soic.iupui.edu/bethel/

Preservation challenges:
- nature of 3D data
- VR operation

Types of data: pre-production / production / post-production + files that link the three phases.
Updates: 40 GB to 60 GB (mainly for the creation of learning spaces).

The problem is that, so far, there is no preservation framework for VR objects.
Use of the NDSA Levels of Digital Preservation standard:
http://ndsa.org/activities/levels-of-digital-preservation/

Essentially: work in progress...
---------------------------------------------------------------------------------------------------

PARALLEL SESSIONS
ENABLING AND MEASURING FAIR (Fantin)

Are research datasets FAIR in the long run? - Dennis Wehrle, Freiburg

Spoiler alert: there's no definite answer to this question.

- Picked 10 public repositories from the 1800+ listed on Re3data.org
- For each of these 10 repositories, selection of 10 datasets
- Limitations for the datasets:
  - Open
  - etc. (too fast)

Use of Harvard's File Information Tool Set (FITS), which contains 12 analysis tools.
Test dataset: 237 GB to analyse, which represents 5h20 of processing; that would mean 85 days of processing for the whole sample (100 datasets), so they took shortcuts (too fast to note).

FITS results: no result / single result / conflicting result / unknown result
- Aggregation of identically named formats
- Unification of "unknown result"; still, there were 28 conflicts (2,150 files) to post-process
- In the end: approx. 145 formats identified (lower estimate) - a few files were still unidentified
- Images: PNG/JPG
- Text-encoded formats: CSV, XML, RTF, HTML
- Script/source code: readable with a text editor, base64-encoded in XML, JavaScript in HTML, references to external data in (X)HTML
- Problematic "text files": unknown binaries, MATLAB, SPSS, octet stream

SUSTAINABILITY
Format divisions:
- high probability (plain text, PDF/A)
- medium probability (open formats such as OpenOffice)
- low probability (.doc, proprietary formats)

Applied from data formats to datasets: most of the datasets (approx. 3/4) had LOW probability.
Advice to dataset creators: change their formats.
Result: single-file format migration may not be sufficient; in fact, most datasets are heterogeneous.

Lessons learnt:
- Data services shouldn't refuse "bad" (poorly ranked) file formats, but help researchers create workflows to embed them in a long-term preservation process.
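The high/medium/low format ranking above can be sketched in a few lines. The format-to-rating mapping below is an illustrative assumption, not FITS output, and the roll-up rule (a dataset inherits its weakest file's rating) is one plausible reading of why most heterogeneous datasets ranked low:

```python
# Hedged sketch of a format-sustainability roll-up: each file format gets
# a long-term preservation rating (high / medium / low probability), and a
# dataset is only as sustainable as its weakest file. The mapping below is
# illustrative, not the talk's actual data.

RATING_ORDER = {"high": 2, "medium": 1, "low": 0}

# Assumed example mapping, loosely following the talk's categories.
FORMAT_RATING = {
    "txt": "high", "csv": "high", "pdfa": "high",
    "odt": "medium", "xml": "medium",
    "doc": "low", "mat": "low", "sav": "low",  # proprietary formats
}

def dataset_rating(file_formats):
    """A heterogeneous dataset inherits the worst rating of its files."""
    # Unknown formats are treated as low probability, matching the
    # talk's caution about unidentified binaries.
    ratings = [FORMAT_RATING.get(f, "low") for f in file_formats]
    return min(ratings, key=lambda r: RATING_ORDER[r])

print(dataset_rating(["csv", "txt"]))         # high
print(dataset_rating(["csv", "xml", "doc"]))  # low
```

Under this rule a single low-probability file drags down an otherwise clean dataset, which is consistent with the observation that migrating individual file formats is not sufficient for heterogeneous datasets.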
- Involvement of dataset creators is necessary.
- The tools mentioned (FITS) have weak support.

Enabling FAIR Data in the Earth and Space Sciences
Shelley Stall, American Geophysical Union

AGU's position on data tends to respect the FAIR principles.
A survey was taken; the top issues are the following, without surprise:
- data complexity
- finding relevant existing data
- (too fast to note the rest)

Storytelling: a student had his computer stolen, and the data was only on it. Later the publication was retracted for this reason (because the data weren't deposited anywhere).

A new funder's grant is taking shape: to get it, your data has to be FAIR.

AGU services:
- streamline data policies
- help researchers find support
- DMP support
- etc.

Data Management Training Clearinghouse: bit.ly/DMTC_events : http://dmtclearinghouse.esipfed.org/ (not an AGU project, but a community project) => online learning resources.
Face-to-face meetings: rd-alliance.org - include your organization. sstall@agu.org

------------------------------------------------------------------------------------------------------------------

PARALLEL SESSIONS
Cross-institutional and national data services (Eliane)

Lisa R. Johnston - Data Curation Network: A Cross-Institutional Staffing Model for Curating Research Data
- Building the Data Curation Network - all universities in the USA
Idea: collaboratively sharing data curation staff
- How would we deal with conflicting policy issues?
- What do researchers actually need our help with? Will they care if curation is distributed?
- Can I trust someone else to curate our data? What about quality control?
Start with: 9 institutions (all of them contributing curators), 19 data curators, 1 project coordinator, 1 program director, 8 DCN representatives, 2 admin leads.
Day 1: business meeting. Days 2-3: curator training / network.

Process: Ingest, Appraise and Select, DCN, Facilitate Access, Preserve Long-term
DCN: Review, Assign, CURATE, Mediate, Approve
- Check files and metadata
- Understand and run files
- Request missing information
- Augment metadata
- Transform file formats
- Evaluate for FAIRness
= CURATE

Assessment: is a network approach to curating research data more efficient?
Indicators: number of datasets, frequency, variety, efficiency.
Are curated data more valuable?
Indicators: track reuse indicators, implement a DCN registry, apply badges and metadata to signal that datasets curated by the DCN are FAIR.

Making everything available: British Library Research Services and Research Data Strategy
Rachael Kotarski, British Library

- New department making everything available: "research services" - change management portfolio.
Research data strategy: make research data business as usual (this is not the case at the moment); users will be able to use research data via tools.
http://blogs.bl.uk/digital-scholarship/2017/08/announcing-the-new-british-library-research-data-strategy.html

Four themes:
- data management: documented data management processes, British Library data management plans, data management plan engagement, data management training
- data creation: generating data at the Library, advising on data creation, clarifying the approach to data collection, engaging and linking with others
--> idea for EPFL: add a few data sources to the website, like the BL's?
https://data.bl.uk/ : also digitized content
- data archiving and preservation: preserving library data, sharing data preservation expertise, data preservation services for third parties, digital shared storage
- data discovery, access and reuse: discovery for library data, third-party data discovery, new models of data access, tools and skills for data exploration, DataCite UK

This cannot be done alone!

International Research Infrastructure - funder or partner? Angeletta Miranda Leggio (ANDS)
- Working together with other existing groups
- A lot of collaborations at local levels
- Funded projects like: Open Access to Marine Data, Open River, PetaJakarta
Do you see ANDS as a funder, provider or partner?
Too fluffy... I do not really know what to make of it.

-------------------------------------------------------------------------------------------

LUNCH. It was really good.

---------------------------------------------------------------------------------------------

Minute madness

- Metro Fun - Train the Trainer, R. Schneider (vote 1). Of coooooooooose! The Bible holds the answer to eeeevrything! There goes my time... Juggling during RDM trainings.
- Data Citation in Social Sciences (worth a look)
- FDMentor: www.forschungsdaten.org/index.php/FDMentor (more than one minute)
- Maredata - "uanc apon a taim, thea wea several" (best accent!)
- RDM Iberia
- HODs/rd: research data harvester based on repositories... "Gugeell" = Google
- Holistic RDM service, Hannover
- Jisc RDM toolkit for the international community ("expörts in se field")
- Grace: exploring the cost and scalability of research data management services (Göttingen) https://www.sub.uni-goettingen.de/en/projects-research/project-details/projekt/grace/ (vote 2)
- Building a research data management training community ("efaluation foorm")
- How federated research data infrastructures work
- Crosswalk - resurrecting data back from the dead
- The long tail of data
- Agile data ecosystem
- Data sharing workflow for large datasets with Globus (the shortest of the poster presenters)
- Data Processing Pipeline - Finnish National Preservation Service (a little bit taller than the speaker before) (worth a look)
- Defining Library Capacity for Big Data Curation (the tallest of all the poster presenters) (worth a look)
- Research data management courses: overview and gap analysis
- Scientific data science service at Brown University (worth a look)
- Springer Nature Research Data Service ("äreeund" = around) (*boooooo*)
- Curriculum RDM at Toronto University
- Supporting open research using KiltHub: https://kilthub.figshare.com/
- RDM for PhDs (icecreeeeeeam!!!!!)
- Preservation of Canadian research data (service model)
- Scaling up data management services with metadata in gene sequencing (worth a look)
- Surveying data management practices among neuroimaging researchers (worth a look)
- FORSbase (HER NAME IS ELIANE!!!!!)
- New online course from the DCC on delivering RDM services

--------------------------------------------------------------------------------------------

Demonstrations (Fantin and Eliane go to the same session, since we already know DMPonline by heart)

The Arctic World Archive: https://www.piql.com/arctic-world-archive/
Piql is a Norwegian preservation company.
A digital vault designed to protect the most valuable data from wars, catastrophes and cyberattacks.

Nathalie Lambeng and 3 more

Abstract

Most research intensive institutions provide some form of data management support. However, the form in which these services are offered and how extensive these are differ and are often difficult to compare. Objective comparison of the different types of services is needed to evaluate the effectiveness of the diverse approaches and to make informed decisions about their usefulness. In this practice paper, we discuss a collaborative effort between Delft University of Technology (TU Delft), École Polytechnique Fédérale de Lausanne (EPFL), University of Cambridge and University of Illinois, which resulted in the development of a short survey to assist institutions in increasing the effectiveness of their data management support services and their evolution.

Different approaches to a common goal

Informal discussions between the research data service teams of TU Delft, EPFL, University of Cambridge and University of Illinois revealed that each institution had undertaken a different approach in designing their data support services. TU Delft has a central research data support team at the Library1, which is also part of a consortium of four Dutch technical universities (4TU)2. In addition, TU Delft is embarking on a Data Stewardship project, which will provide disciplinary support for data management embedded at faculties (Teperek et al., “Data Stewardship – addressing disciplinary data management needs”, abstract submitted to IDCC18). EPFL has a central data management support team, which provides generic, as well as on-demand, tailored training and data consultations to the research community3. This team is also assisted by liaison librarians, who know the data management needs of their faculties and help the central support to shape their service to meet disciplinary requirements. EPFL is also an active player in the national Digital Lifecycle Management project4.
The University of Cambridge, in addition to a small central team supporting researchers in data management, also has a dedicated programme of Data Champions - researchers volunteering their time to advocate for good data management in their local communities5. The University of Illinois has a central data management support team and is also part of a national network of subject-specific data curators6. Despite these differences, the goal of the four service providers is the same: to improve data management practice within their research communities. How can we therefore compare how good our approaches are towards achieving our common goal?

Evaluation of existing measures

Members of the four institutions first reviewed existing tools to assess data management support services. We first looked at the Research Infrastructure Self Evaluation Framework (RISE) survey created by the Digital Curation Centre7. However, we thought that this framework was more suitable for assessing the maturity of the data services offered. We then looked at the Data Asset Framework (DAF) used by several UK institutions8. The DAF survey is a comprehensive tool that allows institutions to assess researchers’ data management practice and identify gaps in service provisions; thus in principle, it should meet our requirements. However, the DAF survey consists of over sixty questions, which was not compatible with the repeated assessment we plan to do. We therefore decided to follow its general principle, but do something simpler and less resource-intensive.

Short survey on data management practice

Based on the DAF survey, we came up with a list of ten multiple choice questions that we found essential to reflect on researchers’ data management practice. By limiting the number of questions to ten and by ensuring these were multiple choice, we thought that first we were respectful of researchers’ time, and secondly, the approach would allow for results standardisation and comparison.
In addition to a commonly agreed set of questions, each institution was able to add their own specific questions to obtain more granular information about the different research units and to get feedback about specific services provided to their research communities.

Anticipated outcomes

TU Delft and EPFL will launch the survey in October 2017, and will be followed by the University of Cambridge and the University of Illinois. We anticipate that the first comparative results will be available at the beginning of 2018. We expect that the results of the survey will provide a useful initial assessment of current data management practices across research communities, which will highlight to institutions where the biggest gaps are and where more work is needed. The results will help understand the different disciplinary needs and the maturity of subject-specific data management practice, thus allowing a more targeted approach. In addition, comparing the results between the institutions will hopefully highlight strengths and weaknesses of the different approaches they took in developing their data management support and will hopefully lead to best practice exchange.

Limitations

As with any other methodology based on surveys, there are limitations to our approach, which will affect the type of conclusions that can be drawn. First, the respondents will be self-selected, and therefore may not be representative of the research communities we are trying to sample. Secondly, institutions need to be cautious interpreting potentially different results for diverse groups of respondents, as these might not be directly related to the quality or availability of data support services and might be affected by external factors, such as community norms, specific funders’ policies, influence of local authorities, etc.
Finally, the limited number of questions used in the survey limits the depth of possible conclusions about data management practices. Nonetheless, we believe that the benefits of our lightweight data management practice assessment make the approach worth testing.

Next steps

The initial results, expected in early 2018, will allow us to evaluate whether the survey allows for comparative assessment of data management practice. If the survey proves to be suitable for such measurements, we will continue to use it to regularly evaluate the maturity of researchers’ data management practice at our respective institutions. Additionally, we plan to share the survey under a CC BY licence to enable others to use the tool for their assessments and to allow comparisons and collaborations with other institutions.

References

1. Research Data Management. TU Delft. Available at: https://www.tudelft.nl/library/themaportalen/research-data-management/. (Accessed: 19th October 2017)
2. 4TU.ResearchData: Home. Available at: http://researchdata.4tu.nl/home/. (Accessed: 19th October 2017)
3. ResearchData | EPFL. Available at: https://researchdata.epfl.ch/. (Accessed: 19th October 2017)
4. Home :: DLCM. Available at: https://www.dlcm.ch/. (Accessed: 19th October 2017)
5. Higman, R., Teperek, M. & Kingsley, D. Creating a Community of Data Champions. bioRxiv 104661 (2017). doi:10.1101/104661
6. Johnston, L. R. et al. Data Curation Network: A Cross-Institutional Staffing Model for Curating Research Data. (2017).
7. RISE, a self-start tool for research data management service review | Digital Curation Centre. Available at: http://www.dcc.ac.uk/news/rise-self-start-tool-research-data-management-service-review. (Accessed: 15th October 2017)
8. Johnson, R., Parsons, T. & Chiarelli, A. Jisc Data Asset Framework Toolkit 2016. (Zenodo, 2016). doi:10.5281/zenodo.177876

Aude Dieudé and 10 more

Action plan for the DLCM website

Deadline for the new website: end of July 2017
Contact: Bineta Ndiaye, UNIGE (bineta.ndiaye@unige.ch)

General design
- Please check the DLCM branding guidelines
- Pay attention to contrast (which is currently very bad)
- Search functionality
- Filters

Menu
Menu on top: Home, Project, Services, Partners, Contact
For inspiration, take a look at: http://dhlab.unibas.ch/activity/

Content of menu tabs

Home
- What is DLCM? Add the data life cycle in the style of: http://www.data-archive.ac.uk/create-manage/life-cycle
- Key resources and services available from DLCM
- Upcoming events (upcoming trainings will be integrated here as well): https://www.dlcm.ch/events/category/events/
- News: https://www.dlcm.ch/category/news/

Project
- Project organization: https://www.dlcm.ch/about-us/project-organization/
- Updates

Services
Every service as a tile, as in: https://www.vigiswiss.ch/fr/data-center/
Services to be promoted so far:
- DMP
  - DLCM checklist
  - DLCM template
  - To learn more (other resources)
- Policy
  - DLCM template
  - To learn more (other resources)
- Tools
  - Collection (?) or a few assorted tools which we support?
  - Coordination desk
  - ELN service: LabKey, SLIMS(?), openBIS
  - To learn more (other resources)
- Long-term preservation
  - Cost model (with date estimate)
  - To learn more (other resources)
- Expert network
  - DLCM expert map
  - DLCM consulting: https://www.dlcm.ch/consulting/

Partners
- All project partners, to be found here: https://www.dlcm.ch/about-us/partners/
- SwitchEngine
- SDSC
- Other?

Contact
Good idea: https://www.favre-guth.ch/a-propos/team
Another good idea: http://dhlab.unibas.ch/team/
Names and generic contacts of those responsible for each track.
At the moment, not all have photos; that's why name and contact are enough.

Eliane Blumer and 3 more

_First brainstorm_

AUDE DIEUDÉ, ANA SESARTIC, PIERRE-YVES BURGI, ELIANE BLUMER

WHAT WE ACTUALLY WANTED TO DO...

With this background, this workshop aims at providing a unique opportunity to share insights, experiences and best practices regarding innovative approaches to the long-term preservation of research data. The Swiss DLCM project - with its experience gained from concrete implementations, and which involves research units, IT services and libraries at the various partner institutions - will serve as a springboard for promoting and animating the discussion and debate across countries and continents. The workshop will target large communities of researchers nationally and internationally.

HERE ARE SOME FIRST IDEAS FROM MY SIDE...

We have in total 2x90 min = 180 min.

1. Do an interactive exercise right at the beginning by asking participants about their experiences in RDM (30 min)
- Introduction round, where everyone tells their background & motivation for attending the workshop.
- The "placemat method" (used in our training course) proved to be an effective way to gather people's RDM experiences (see https://de.wikipedia.org/wiki/Placemat_Activity and http://www.humber.ca/centreforteachingandlearning/instructional-strategies/teaching-methods/classroom-strategies-designing-instruction/activities-and-games/place-mat.html)
- (if you need ideas for icebreaker activities, http://www.icebreakers.ws/ has a good overview)

2. Give a short overview of what has happened in DLCM on all the points above (30 min)
- Insights
- Experiences
- Best practices
- Concrete implementations

3. Split into smaller groups (if we have enough participants), give them our concrete implementations (see below) and let them discuss/compare them with other existing examples; if necessary, share other examples if they do not have any of their own (45 min)
- specifications
- Active Data Research Management (ADRM)
- checklist
- policy

4.
Make a plenary discussion (45 min)

TO DO
- What material do we need to prepare?
- Marketing material (which one? from which institutions?)
- Will we give handouts to the people, summarizing what's been done at DLCM, including contacts & a pointer to the website?
- Flipcharts and the like will be available on the premises, I assume?