Fantin Reichler

1st Keynote: No (Open) Science without Data Curation: Five lessons from the study of Data Journeys (Sabina Leonelli)
--> Open Research as an opportunity, including scientific infrastructures, governance, and how this should be credited and disseminated:
https://www.datastudies.eu/publications
https://icsu.org/cms/2017/04/open-data-in-big-data-world_long.pdf
Open Research on three aspects:
- Global Scope
- Systemic Reach
- Local Implementation
FAIR data improves your research at many levels, BUT there are requirements to make data FAIR, such as:
- coordination of data infrastructures
- making data accessible on many platforms
- etc.
Awareness of Open Science and its tools is still very low in the scientific community (EU Working Group on Education and Skills under Open Science, 2017)
https://www.garnetcommunity.org.uk/sites/default/files/GARNet_Paper_nplants201786-1.pdf
https://www.datastudies.eu/publications
It is important for researchers to have some knowledge of the tools/methods to make their data FAIR. The most important thing is that some people - us - can share expertise about these tools/methods with them, and help ease the confusion that researchers might feel while putting FAIR into practice.
Focus on qualitative data:
- Databases, example of plant science
- Data re-use cases
Data journey example: TAIR (not FAIR) https://www.arabidopsis.org/
- preparing specimens
- preparing and performing imaging
- data storage
- dissemination
- ...
- analysis
Epistemic troubles:
- RD collected represent highly selected data types
- selection based on political-economic conditions of sharing
- peer review structure unclear
- misalignment between it and research needs
- no sustainable plans for maintenance
- ....
Lessons learnt at a general level:
- Context-specific data curation is key to data re-use
- Long-term maintenance is key to trustworthiness (updates, LT policy)
- Which data and why?
- Data & materials (connect digital data with data in the physical world)
- Role of ethics, humanities & social sciences in data management (increase quality and reusability)
http://press.uchicago.edu/ucp/books/book/chicago/D/bo24957334.html
---------------------------------------------------------------------------------------------------
PLENARY: RESEARCH PAPERS
Measuring FAIR Principles to Inform Fitness for Use
Carolyn Hank, University of Tennessee
Past paper on http://datacurationprofiles.org/ => 10.1002/pra2.2016.14505301046
"Fitness for use" > focus on the "reusable" aspect of FAIR
Method: interview
- Job-related demographics, with questions such as 'What is your current job title?', 'How many years have you worked in this institution?', 'How many years have you worked in the discipline?', etc.
- Findability > 'How did you find the data?', 'DOI?', 'Metadata?'
- Accessibility > 'How did you access the data?', 'Open format?', 'Was the data free?', 'Was the metadata accessible?'
- Interoperability > 'Was the data in a usable format?', 'Encoded?', 'Machine-actionable?'
- Reusability > 'Were the metadata sufficient?', etc.
Potential implications: data can be FAI, but R requires more research
=> create new knowledge of how scientists access and use data
=> produce a framework to enable re-use
----------------------------------------
Giving datasets context: a comparison study of institutional repositories that apply varying degrees of curation
(Amy Koshoffer, Cincinnati, USA)
Questions:
1. How do the metadata vary for each institution?
2. Completeness of metadata
3. Do curated datasets have more documentation?
4. Do curated datasets more often have DOIs?
5. Keywords
What is curation?
- appraisal/selection
- check/run files: includes code review, review of sensitive information
(damn, she talks too fast!)
4 universities: one repository per institution
20 datasets per repository.
Comparison of mandatory / non-mandatory metadata for each university
Results
Question 1:
- all universities use a title in the metadata (for instance), but each of them understands something different by it
- all datasets had more than the minimum required metadata
Question 2:
- 53% completeness, but different for every institution she looked at
- use of the Mann-Whitney U test: https://fr.wikipedia.org/wiki/Test_de_Wilcoxon-Mann-Whitney
- no use of optional supplementary metadata in curated or non-curated repositories --> Does curation really have an impact then? Not sure that the curation service does.
Question 3:
- curation does have an impact on documentation
Question 4:
- all support DOIs, but in different ways. There might be other factors to take into account than the curation process
Conclusion:
- The curation process may have had a measurable impact, BUT more factors may be at play
- Curation > more documentation & more readme.txt
---------------------------------------------------------------------------------
Complexities of digital preservation in a virtual reality environment: the case of Virtual Bethel
University of Indianapolis, Angela Murillo
"CHUUUUUUUUURCH!" E. Blumer, 2018
Creation of a VR space for a church > the question is: how do we preserve this VR space?
- at the time, there was an archive (documents/physical objects), but recently the building was sold
- 3D virtual space of the church + learning space (history of the building)
https://comet.soic.iupui.edu/bethel/
Preservation challenges:
- nature of 3D data
- VR operation
Types of data: pre-prod / prod / post-prod + files that link the three phases
Updates: 40 GB to 60 GB (mainly for the creation of learning spaces)
The problem is that, until now, there has been no VR object preservation framework
Use of the NDSA Standard for Levels of Digital Preservation
http://ndsa.org/activities/levels-of-digital-preservation/
Essentially: Work in progress .... progress ... progress.... progress...
progress...
---------------------------------------------------------------------------------------------------
PARALLEL TALKS
ENABLING AND MEASURING FAIR (Fantin)
Are research data sets FAIR in the long run - Dennis Wehrle, Freiburg
Spoiler alert: there's no definite answer to this question
Pick 10 public repositories among the 1800+ in Re3data.org
For each of these 10 repositories, selection of 10 datasets
Limitations for the datasets:
- Open
- etc. (too fast)
Use of Harvard's File Information Tool Set (FITS), which contains 12 analysis tools.
Test dataset: 237 GB to analyse, which represents 5h20 of processing; that would mean 85 days of processing for the whole sample (100 datasets), so they took shortcuts (too fast to note)
FITS results: no result / single result / conflicting result / unknown result
Aggregation of identically named formats
Unification of "unknown result"; still, there were 28 conflicts (2150 files) to post-process
In the end: approx. 145 formats identified (lower estimate) - a few files were still unidentified
- Images: png/jpg
- Text-encoded formats: CSV, XML, RTF, HTML
- Script/source code: readable with a text editor, base64-encoded in XML, JavaScript in HTML, references to external data in (X)HTML
- Problematic "text files": unknown binaries, Matlab, SPSS, octet stream
SUSTAINABILITY:
Format divisions:
- high probability (plain text/PDF/A)
- medium probability (open formats such as OpenOffice)
- low probability (.doc, proprietary formats)
Applied from data formats to datasets: most of the datasets had LOW PROB (3/4 approx.)
Advice to dataset creators: change their formats
Result: single file format migration may not be sufficient. As a matter of fact, most datasets are heterogeneous.
Lessons learnt:
- Data services shouldn't refuse "bad file formats" (poorly ranked ones), but help researchers create workflows to embed them in the LT preservation process.
Involvement of dataset creators is necessary.
- the tools mentioned (FITS) have weak support

Enabling FAIR Data in the Earth and Space Sciences
Shelley Stall, American Geophysical Union
AGU's position on data tends to respect the FAIR principles
A survey was taken; the top four issues are, unsurprisingly, the following:
- data complexity
- finding relevant existing data
- TOO
- FAST
Storytelling: a student had his computer stolen, and the data was only on it. Later the publication was retracted because of this (because the data weren't deposited anywhere)
A new funder's grant is being set up: to get it, your data has to be FAIR
AGU services:
- streamline data policies
- help researchers find support
- DMP support
- etc.
Data Management Training Clearinghouse: bit.ly/DMTC_events : http://dmtclearinghouse.esipfed.org/ (not an AGU project, but a community project) => online learning resources.
Face-to-face meeting: rd-alliance.org
Include your organization: sstall@agu.org
---------------------------------------------------------------------------------------------------
PARALLEL TALKS
Cross-institutional and national data services (Eliane)
Lisa R. Johnston - Data Curation Network: A Cross-Institutional Staffing Model for Curating Research Data
- Building the data curation network - all universities in USA
Idea: collaboratively sharing data curation staff
- How would we deal with conflicting policy issues?
- What do researchers actually need our help with? Will they care if curation is distributed?
- Can I trust someone else to curate our data? What about quality control?
Start with: 9 institutions (all of them contributing curators), 19 data curators, 1 project coordinator, 1 program director, 8 DCN representatives, 2 admin leads
Day 1: Business Meeting
Day 2-3: Curator Training/Network
Process: Ingest, Appraise and Select, DCN, Facilitate Access, Preserve Long-term
DCN: Review, Assign, CURATE, Mediate, Approve
CURATE =
- Check files and metadata
- Understand and run files
- Request missing information
- Augment metadata
- Transform file formats
- Evaluate for FAIRness
Assessment:
- Is a network approach to curating research data more efficient? Indicators: number of datasets, frequency, variety, efficiency
- Are curated data more valuable? Indicators: track reuse indicators, implement a DCN registry, apply badges and metadata to signal that datasets curated by the DCN are FAIR

Making everything available: British Library Research Services and Research Data Strategy: Rachael Kotarski, British Library
- New department: everything available under "research services"
- change management portfolio
Research data strategy: make research data business as usual (this is not the case at the moment); users will be able to use research data via tools
http://blogs.bl.uk/digital-scholarship/2017/08/announcing-the-new-british-library-research-data-strategy.html
Four themes:
- data management: documented data management processes, British Library data management plans, data management plan engagement, data management training
- data creation: generating data at the library, advising on data creation, clarifying the approach to data collection, engaging and linking with others --> idea for EPFL: add a few data sources to the BL website?
https://data.bl.uk/ : also digitized content
- data archiving and preservation: preserving library data, sharing data preservation expertise, data preservation services for third parties, digital shared storage
- data discovery, access and reuse: discovery for library data, third-party data discovery, new models of data access, tools and skills for data exploration, DataCite UK
This cannot be done alone!!!

International Research Infrastructure - funder or partner? Angeletta Miranda Leggio (ANDS)
- Working together with other existing groups
- A lot of collaborations on local levels
- Funded projects like: Open Access to Marine Data, Open River, PetaJakarta
Do you see ANDS as funder, provider or partner?
Too fluffy .... I do not really know what to make of it
---------------------------------------------------------------------------------------------------
LUNCH. It was really good.
---------------------------------------------------------------------------------------------------
Minute madness
Metro Fun - Train the Trainer R. Schneider (vote 1)
Of coooooooooose! The Bible holds the answer to eeeevrything!
There goes my time....joggling during RDM trainings.
Data Citation in Social Sciences (to look at)
FDMentor www.forschungsdaten.org/index.php/FDMentor (more than one minute)
Maredata - uanc apon a taim, thea wea several (best accent!)
- RDM Iberia
HODs/rd: research data harvester based on repositories ...
Gugeell=Google
Holistic RDM service Hannover
Jisc RDM toolkit for international community (expörts in se field)
Grace: exploring the cost and scalability of research data management services (Göttingen) https://www.sub.uni-goettingen.de/en/projects-research/project-details/projekt/grace/ (2 votes)
Building a research data management training community (efaluation foorm)
How federated research data infrastructures work
Crosswalk - Resurrecting data back from the dead
Long-tail of data
Agile data eco-system
Data sharing workflow for large datasets with Globus (the shortest presenter of the posters)
Data Processing Pipeline - Finnish National Preservation Service (a little bit taller than the speaker before) (to check out)
Defining Library Capacity for Big Data curation (the tallest presenter of all the posters) (to check out)
Research data management courses: overview and gap analysis
Scientific data science service at Brown University (to check out)
Springer Nature Research Data Service (äreeund=around) (*buuuuuuh*)
Curriculum RDM at Toronto University
Supporting Open research using KiltHub https://kilthub.figshare.com/
RDM for PhD (icecreeeeeeam!!!!!)
Preservation of Canadian research data (service model)
Scaling up data management services with metadata in gene sequencing (to check out)
Surveying data management practices among neuroimaging researchers (to check out)
Forsbase (HER NAME IS ELIANE!!!!!)
New online course, deliver RDM services from DCC
---------------------------------------------------------------------------------------------------
Demonstrations (Fantin and Eliane go to the same session, since we know DMPOnline by heart)
The Arctic World Archive https://www.piql.com/arctic-world-archive/
Piql is a Norwegian preservation company
Digital vault designed to protect the most valuable data from wars, catastrophes and cyberattacks.


TU Delft 30/11/2017
1. TU Delft university presentation
Largest tech university in the Netherlands (founded 1842) - more than 20000 students, 8 faculties, >5000 staff.
Open science as one of the main stakes for the old and also for the new rector
Curiosity and openness as anchor values; being confident and renewing as aspiration values; the foundation value is connecting -> promise: let knowledge flow freely ("freedom to excel")
2020 milestones: relevant scientific information should be findable, searchable, reusable, divisible. -> 46% publications OA
4 main areas of work:
- discovery and delivery
- data at work (they started 7-8 years ago)
- publication and impact (negotiations with publishers in order to comply with OA; contracts with OA publishers; OA fund)
- library environment (individual and group working spaces)
-> there is also the R&D group for innovative and future-proof services.
Organisation: research services; education services; resources; overarching (R&D, liaison & policy support, communication). 124 people in the library (92 FTE).
Open research, education, innovation, campus -> values for 2018-2024 (Open Science PhD training @Delft)
2. TU Delft Library collection
Content and licensing are not managed by the same people. They barely buy paper books. Change of acquisition policy in 2003; 1.000.000 paper books in the basement; PDA from 2010 (-> mediated PDA, with a motivation from the reader, and someone in the library who checks whether the ebook is already accessible via another provider; there is also a price limit - no more than 200 USD/book).
Missions -> support university education and research programs AND anticipate new multidisciplinary research fields.
Journals in the framework of consortium deals. Impossible to have more tailored (and smaller) collections (even if it is done with the other TUs). Focus only on the requests/needs of users (not on turnaways :-)).
System: via OCLC
Main issues:
- budget cuts: split between core collection and additional collection -> difference in the way of financing
- publishers' policy (see above)
- less HR for collection tasks (no subject librarians): no direct links with faculties anymore. Twice a year: meeting with faculties to collect wishes and needs (for book acquisition only: PDA/ProQuest; evidence-based acquisition/Elsevier).
- insights into usage: need for better numbers (which faculty is using what -> this is the main info they would like to have). Implementation of EZproxy (can create more problems compared to IP range). Through ezPAARSE they can monitor and identify major user groups.
3. TU Delft Research services (35 people)
Strategic framework 2018-2024 (to be published in November :-)) for TU Delft: "Impact for a better society" (slogan of the university). The present slogan is "Freedom to excel".
Big focus on the main worldwide issues (energy, resources, etc.).
Open science as systemic change - all is about sharing!
Enhance scientific research:
- Enlarge dissemination
- open FAIR data
- open courses
- open software
- Stimulate cooperation (instead of competition; new rewarding system; new metrics)
- Work on citizen participation (scholarly comm & involving people in science)
The service offers full support to the research cycle.
Scientific information: from subscriptions to OA; from just in case to just in time; from ownership to providing access.
Data sharing and archiving: 4TU.Centre for RD; digital lab notebook (put everything in one system... ongoing, not easy work); open software (less published; problem of recognition)
Publishing support: academic visibility (all TU Delft outputs are made available - at least metadata - with institutional/faculty brand); TU Delft Open -> proper publisher (already published journals, mainly architecture); research intelligence (new ways of evaluating research).
Governance of scientific process -> it is a need because:
- they had issues on scientific integrity in the Netherlands
- Archiving research and education
- Repository, data archive, lab notebooks. [Labservant (?) -> chemistry domain -> CHECK!]
4. Open Access promotion and financing in the Netherlands
Promotion kit -> How-to guide; Making an impact with OS (-> course for PhD https://www.tudelft.nl/onderwijs/opleidingen/phd/doctoral-education-programme-de/training-programme/r1a3-making-an-impact-with-open-science/); POLICY ON OA; National OA monitor (now mature).
TU Delft policy on OA publishing is pretty easy -> as a TU Delft author, you have to publish at least the final accepted version of a peer-reviewed article, with the required metadata, in the TU Delft Institutional Repository.
Most common question: how to finance OA? At the national level -> OFFSET AGREEMENTS WITH PUBLISHERS. There are around 8000 journals with an OA discount. Diagram in the library catalogue (library.wur.nl) for journals, with info related to the journal (APC discount, Green OA - info from Sherpa/Romeo -, impact, recent articles).
Pre-paid models (specific to TU Delft): Copernicus, MDPI, PLOS, IOS Press, Frontiers, PeerJ.
Classic TU Delft OA fund (starting in 2008) - individual financial support: around 380 sponsored publications (increase of requests for OA ebooks).
Repositories @TU Delft:
- Research repository -> core one
- Education repository
- Cultural heritage repository
Different repositories because of different purposes and metadata, and different (or no) involvement of publishers.
OA TU Delft publications: 2013: 15%; 2015: 30%; 2016: 44%. Ambition for 2018 -> 60%.
Copyright Information Point TU Delft, with a dedicated website. -> list of useful links at the end of presentation (if shared).
5. Research analytics toolbox
Why: gain insights into the positioning of a research field, identify patterns and trends within research fields, ...
-> research analytics in Delft provides researchers and faculties with a toolbox (data sources and powerful, easy-to-use analysis and visualization tools).
Need for better use of data science tools in order to provide appropriate responses to these questions (e.g.: datasets available for bibliometrics; position of my group in relation to competitors; etc.).
3 steps in the process: data collection, analysis, communication. These steps are offered in an integrated manner: AIDA (Automatic IDentification of reseArch trends), offering the toolbox out there, and the library provides the basics to let the researchers DIY.
- Data collection: from where? WOS, Scopus, Pure, Altmetrics, CWTS
- Data analysis: some of the databases already have analytical features, but Excel is also pretty powerful. Some of these tools need extra methodologies or TDM.
- Communication: the main purpose of the rest of the work (Aida.tudelft.nl)
Altmetrics project @ TU Delft -> this toolbox (and related tools and databases) relies on the traditional citation system. To go further along the path of Open Science, they are running a test/exercise with Altmetrics. Choice of 900 DOIs, and analysis creating a link between citations and altmetrics scores.
TU Delft library provides research analytics support (check library website) for this toolbox to individual researchers, faculties, policy makers. (Almetrics.com -> check).
6. Data stewardship (slides on the blog)
- Policy framework: develop a system where every faculty can develop its own policies (DMP, PhD training, obligatory deposit) within a general policy framework.
- Culture of data stewardship:
What issues? How to build responsible RDM?
- Responding to strategic drivers from funders & recognition and rewards for OS
- Ensuring transparency and reproducibility
- Not losing data
What actions? Data stewardship project embedded in the faculties (first contact point in each faculty) and extra support from data officers in the library.
(1 representative per faculty (0.5 FTE))
Role of data stewards ->
- Assisting in the planning, collection, management and publication of data in new and ongoing projects
- Helping to write DMPs
- Outreach and advocacy
- Understanding trends
- Running faculty-specific training
- Providing advice on specific issues (mainly publication)
- Quantitative and qualitative metrics to assess the success of data stewardship
- Direct link with senior management in the faculty
Infrastructure: 4TU.Research Data -> data archive provision to researchers (guaranteed for 15 years). Future plans: API to allow programmatic access to the archive; greater functionality for sharing data during the project; different publishing options; ongoing work on technical infrastructures.
Reviewing the metadata (not the data!) of each dataset (title, link to the article, ORCID, etc.). If there is no readme file (methodology, acronyms, ...), the dataset is rejected. The back office cannot check the reliability of the data itself. The time spent is 1 hour per dataset (more or less 150 datasets/year).
Ongoing project to make data deposit mandatory for PhD students at the end of their studies.
Process of outsourcing the infrastructure: too expensive to maintain the current one (HR and costs only to maintain the current structure, not to improve it/make it evolve). -> Figshare or Mendeley Data (TU Delft is in the middle of the tender procedure to choose).
Embedded data stewards, systematic and discipline-based training, ...

Rotterdam - Erasmus University 01/12/2017
1. Strategy University Library
(Gert Geris, Deputy director, research intelligence, research outputs)
University: 28000 students, 3000 staff, social sciences, biomedical sciences, humanities. Budget 588M €
Library: 900 places, 75 employees (62 FTE), 900'000 volumes -> depts.: Academic services (supporting education and research), Library learning center, Information provisioning.
Mission: support the creation and dissemination of knowledge.
Promise: content manager of the University
Bridge between customers and library.
Creation of new services:
- creation together with customers, but there is still a gap in, or poor knowledge of, them. Need for a new way of building new services.
- Research intelligence
RESEARCH SERVICES:
- Grant support
- Research Intelligence (data scientists, bibliometricians, ...)
- Legal
- RDM
- Valorisation
-> Build communities around these axes, and build services and communication within these communities. IT and other support depts. are also involved.
Other university library services: information literacy, data service center, research evaluation assessment service. Services are put in several communities (among the ones mentioned before).
Research intelligence community: creation of the community -> by invitation (?)
Missions:
- Monitor and manage research performance (quality, relevance, strategy, visibility, ...)
- Support international collaboration networks
- Improve funding win-rate
- Monitor global research trends (multidisciplinary developments and emerging topics)
- Responsible metrics (analysis reports and support for research quality assurance)
- Action to enhance dissemination
Further developments: funding intelligence; RD intelligence (data citations, mapping the use of RD through research output analysis)
Many tools are used for that (data collection and analysis tools).
Last summer a Research Intelligence Network Netherlands was established (CHECK!!!). LDE research intelligence initiative (Leiden, Delft, Erasmus).
2018 goals: identity management; research evaluation; training in the use of SciVal, InCites, Altmetrics; support in publishing OA; development of a predictive modeling service.
2. ACADEMIC SERVICES department
Research and education support (also tailor-made), Front office, Liaison, RDM, ... Both "traditional" and "new/research-oriented" ones (OA, OS, RDM, research intelligence, copyright, privacy law); training in academic skills.
Specialists: faculty liaison, information specialist, data librarians, data intermediary, web coordinator, specialist for licences and OA.
Focus 2018: digitalization, intensify cooperation with faculties.
Strong move to digital content in 2015 (250.000 fewer books - removed from the collections).
5-year strategy: merging of different libraries, and working processes accordingly.
National consortium -> UKB, 13 libraries, for negotiations (from 55 to 75 licences in 4 years). Transition to the WorldShare platform (check SURFmarket).
Modifications in budget allocation, loans, usage of resources (printed and paper) -> need for a shift in the people hired (new skills and competencies needed).
For electronic resources, different models are used according to what fits the services (Pick & Choose, EBS, PDA, short-term loans). Enhanced use of statistics. Pilot on e-studybooks. 2018: only e-acquisitions, unless you have a very good excuse :-)
3. RDM, FAIR and Open science
Central policy starting with the FAIR principles and a baseline protocol (main elements to be put in RDM responsible reflection -> 15 points). Implementation in the faculties of the protocol based on FAIR.
Data services -> information provision, legal support, practical advice for storage, sharing and privacy, support for DMPs and for data paragraphs in grant and research proposals, Erasmus Data Service Centre (EDSC -> best practice example of how to organize RD services)
EDSC: largest portfolio of economic and financial databases, access to on-site DBs, guidance and support for DBs (based on a voucher system), workshops for students, manual & FAQs, facilitation of student events.
2018-2021 strategy: providing access to more DBs (TDM); subject as a base to organize a workshop; boost and more explicit role in the research data infrastructure (help more in the active phase of research); ...
Open Science -> data, education, standards, access. How to get people on board? How to transform principles into real practice?
OS priorities:
- Providing access to more and different content (new databases, TDM, GIS analysis; collect and stimulate the use of Open Data). (FAIR -> accessible, interoperable; OS: empowerment)
- Focus on subject-oriented support and development of the EDSC knowledge center
- Boosting the EUR research data infrastructure (lead the iRODS workgroup in the RDM community). (FAIR: findable and accessible; OS: collaboration, transparency and sharing)
- Expansion of EDSC services (EDSC should become a data collector and producer; draw up a service for TDM; support for the use and analysis of large datasets with Google BigQuery and Google Cloud Platform). (FAIR -> interoperable and reusable; OS: collaboration and empowerment)
- Intensify internal and external cooperation: stronger collaboration with the Erasmus Data Science community; participate in a blockchain project (FAIR: I, R; OS: collaboration and empowerment).