Christian Pagé

and 2 more

Researchers and end users of climate data face a challenge when they analyze the data they need. Data volumes are increasing very rapidly, and downloading all the needed data is often no longer possible. Most climate analysis tools for research and applications must use very large datasets, often distributed among several data centres and spread across a large number of files. This is especially true when the data are stored in a federated architecture like the ESGF. One of these tools is icclim (https://github.com/cerfacs-globc/icclim), a flexible Python software package to calculate climate indices and indicators. The tool adheres as much as possible to metadata conventions such as CF, and also implements provenance information. It aims to provide increasing support for all FAIR aspects. It is designed with performance and optimisation in mind, because the goal is to provide on-demand calculations for users. It implements most of the international standard climate indices, such as those of ECAD, ETCCDI and ET-SCI, including the correct methodology for calculating percentile indices using the bootstrapping method. It has also been validated against R.Climdex (https://cran.r-project.org/web/packages/climdex.pcic/index.html). The new 5.x version of icclim is based on functions from the xclim Python library, which was inspired by earlier versions of icclim but uses xarray and dask for data access and processing. icclim is also a candidate software package for calculating climate indices in the C3S toolbox (https://cds.climate.copernicus.eu/cdsapp#!/toolbox). icclim is integrated in the IS-ENES C4I 2.0 platform (https://climate4impact.eu/), using a Jupyter notebook collection in a SWIRRL environment (Software for Interactive Reproducible Research Labs, https://gitlab.com/KNMI-OSS/swirrl).
Having access to this type of analysis tool is very useful, and seamless integration with front-ends like C4I enables the use of those tools by a larger number of researchers and end users. This project (IS-ENES3) has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement N°824084.
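As a rough illustration of what a climate index calculation involves, the sketch below counts "summer days" (SU, the ETCCDI index defined as the annual number of days with a daily maximum temperature above 25 °C) from a short list of daily values. It is plain Python and does not use icclim or xclim; the function name `summer_days_per_year`, the input format and the sample data are assumptions made for illustration only.

```python
from datetime import date

SU_THRESHOLD_C = 25.0  # ETCCDI "summer days" threshold, in degrees Celsius


def summer_days_per_year(daily_tmax):
    """Count, per year, the days whose maximum temperature exceeds 25 degC.

    daily_tmax: iterable of (date, tmax_celsius) pairs (hypothetical format).
    Returns a dict mapping each year present in the input to its SU count.
    """
    counts = {}
    for day, tmax in daily_tmax:
        # Register the year even if none of its days exceed the threshold.
        counts.setdefault(day.year, 0)
        if tmax > SU_THRESHOLD_C:  # strictly greater than the threshold
            counts[day.year] += 1
    return counts


# Made-up sample values, for illustration only.
sample = [
    (date(2020, 7, 1), 26.3),
    (date(2020, 7, 2), 24.9),
    (date(2020, 7, 3), 25.0),  # exactly 25 degC is not counted
    (date(2021, 1, 15), 3.2),
    (date(2021, 7, 10), 31.0),
    (date(2021, 7, 11), 28.4),
]
print(summer_days_per_year(sample))  # {2020: 1, 2021: 2}
```

A production tool works on gridded, multi-file datasets and also handles units, missing values and metadata, none of which this sketch attempts.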

Christian Pagé

and 4 more

Researchers and end users of climate data face a challenge when they analyze the data they need. Data volumes are increasing very rapidly, and downloading all the needed data is often no longer possible. Most climate analysis tools for research and applications must use very large datasets, often distributed among several data centres and spread across a large number of files. This is especially true when the data are stored in a federated architecture like the ESGF. One of these tools is icclim (https://github.com/cerfacs-globc/icclim), a flexible Python software package to calculate climate indices and indicators. The tool adheres as much as possible to metadata conventions such as CF, and also implements provenance information. It aims to provide increasing support for all FAIR aspects. It is designed with performance and optimisation in mind, because the goal is to provide on-demand calculations for users. It implements most of the international standard climate indices, such as those of ECAD, ETCCDI and ET-SCI, including the correct methodology for calculating percentile indices using the bootstrapping method. It has also been validated against R.Climdex (https://cran.r-project.org/web/packages/climdex.pcic/index.html). The new 5.x version of icclim is based on functions from the xclim Python library, which was inspired by earlier versions of icclim but uses xarray and dask for data access and processing. icclim is also a candidate software package for calculating climate indices in the C3S toolbox (https://cds.climate.copernicus.eu/cdsapp#!/toolbox). Another example of a complex analysis tool used in climate research and adaptation studies is a tool to follow storm tracks. In the context of climate change, it is important to know whether storm tracks will change in the future, in both frequency and intensity. Storms can cause significant societal impacts, hence it is important to assess future patterns.
These tools are integrated in the IS-ENES C4I 2.0 platform (https://climate4impact.eu/), using a Jupyter notebook collection in a SWIRRL environment (Software for Interactive Reproducible Research Labs, https://gitlab.com/KNMI-OSS/swirrl). Having access to this type of complex analysis tool is very useful, and integrating such tools with front-ends like C4I enables their use by a larger number of researchers and end users. This project (IS-ENES3) has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement N°824084.
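The percentile indices mentioned in this abstract compare daily values against a threshold derived from a reference period. The sketch below shows only that basic idea, under strong simplifying assumptions: a single threshold is computed from the whole base sample, and the bootstrapping procedure and calendar-day windows used by the standard indices are deliberately left out. The function names and sample values are hypothetical and do not come from icclim or xclim.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional position in the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def exceedance_percentage(base_period, target_period, q=90.0):
    """Percentage of target values strictly above the base-period percentile.

    Simplified on purpose: no bootstrapping and no per-calendar-day threshold.
    """
    if not target_period:
        raise ValueError("target_period must not be empty")
    threshold = percentile(base_period, q)
    exceeding = sum(1 for value in target_period if value > threshold)
    return 100.0 * exceeding / len(target_period)


# Made-up values: the 90th percentile of 1..11 is 10, and two of the four
# target values lie strictly above it.
base = list(range(1, 12))
target = [5, 10, 11, 12]
print(exceedance_percentage(base, target))  # 50.0
```

When the target period overlaps the base period, a naive threshold of this kind introduces a known inhomogeneity, which is the problem the bootstrapping method addresses.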

Christian Pagé

and 6 more

End users of climate data now struggle to access the data they need for their research because of the rapid increase in data volumes. The whole climate data archive is expected to reach an estimated volume of 30 PB in 2018 and up to 2000 PB in 2022. On-demand data processing solutions located as close as possible to the data storage are emerging, thanks to newly developed standards, provenance and infrastructures. In Europe, several initiatives are taking place to support scientific on-demand data analytics at the European scale. They offer great potential for interoperability. One example is the DARE e-science platform (http://project-dare.eu), designed for efficient and traceable development of complex experiments and domain-specific services on the Cloud. In addition, the IS-ENES (https://is.enes.org) consortium has developed a platform to ease access to climate data for the climate impact community (C4I: https://climate4impact.eu). The platform is based on existing standards (ISO and OGC), such as WPS (Web Processing Service). DARE will integrate services from the EUDAT CDI, enabling generic access and cross-domain interoperability, as well as providing compliance and integration with the future EOSC platform. The DARE platform will use containerization technologies, so that it can be easily deployed on heterogeneous architectures. A scientific pilot has been designed within the DARE project for the ENES community (climate domain). Its objective is to enable seamless delegation of computationally intensive on-demand calculations from the IS-ENES C4I interface to the DARE platform. The DARE architecture and the solutions being implemented will be presented, along with the generic and agile approach taken to implement the pilot.

Christian Pagé

and 8 more

Researchers and end users of climate data face a challenge when they analyze the data they need. Data volumes are increasing very rapidly, and downloading all the needed data is often no longer possible. It can also be complex to install, configure and use some advanced analysis tools on such large datasets. This is especially true when the data are stored in a federated architecture like the ESGF. An example of a complex analysis tool used in climate research and adaptation studies is a tool to follow storm tracks. In the context of climate change, it is important to know whether storm tracks will change in the future, in both frequency and intensity. Storms can cause significant societal impacts, hence it is important to assess future patterns. Having access to this type of complex analysis tool is very useful, and integrating such tools with front-ends like the IS-ENES climate4impact (C4I) platform would enable their use by a larger number of researchers and end users. Integrating this type of complex tool is not an easy task. It requires significant development effort, especially if one of the objectives is also to adhere to FAIR principles. The DARE Platform enables research developers to implement scientific workflows more rapidly. This work presents how such a complex analysis tool has been implemented so that it can be easily integrated with the C4I platform. The DARE Platform also provides easy access to e-infrastructure services like EUDAT B2DROP to store intermediate or final results, as well as powerful provenance-powered tools to help researchers manage their work and data. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreements N°824084 and N°777413.

Christian Pagé

and 5 more

Researchers using climate data face a challenge when analyzing the data they need. Data volumes are increasing very rapidly, and downloading all the needed data is often no longer possible. A platform called climate4impact (C4I) has been designed and developed to enhance the use of research data, to support researchers with analytics and to support other climate portals. It is currently under development within the European project IS-ENES3 and builds on developments from the previous IS-ENES projects, CLIPC and C3S-Magic. C4I offers a front-end and standard services (with APIs) on top of the climate data infrastructure, and it can be visited at https://climate4impact.eu. The processing services provided by the current version include climate indicator calculations, country-based statistics and polygon extraction. C4I makes use of the DKRZ Birdhouse framework, which is an extendable and modular processing framework based on PyWPS. Data is obtained from various ESGF nodes using secure OPeNDAP. C4I provides a personal basket where users can upload their own data and do research with the provided tools. The software is open, reusable, modular and packaged. Components are available as Docker containers to enable easy re-use. The on-demand calculations currently take place on the front-end server, which is not scalable and can lead to performance problems. Within the DARE project, delegation of the calculations to the DARE Platform using the DARE API has been implemented and tested in a prototype, using EUDAT B2DROP as an intermediate storage service. Note that the DARE Platform, as well as the EUDAT B2 Services, should be interoperable with the European Open Science Cloud (EOSC). This prototype service delegation will be made operational during the upcoming year.
In the IS-ENES3 project, the web portal will be redesigned with a completely new architecture using a micro-services and containerized approach, building on experience gained during the previous projects. The next version of the portal will be built using the React framework, which allows the creation of large web applications that can update data without reloading the page. We are actively seeking input from current as well as potential users at this time, to make the next version of C4I useful to as many people as possible. The material presented here is made possible because the IS-ENES3 project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement N°824084.
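Since the processing services described above rely on a PyWPS-based framework, the following minimal sketch shows how a WPS Execute request can be assembled as a URL. It assumes a WPS 1.0.0 service that accepts key-value-pair (KVP) GET requests; the endpoint `https://wps.example.org/wps`, the process identifier `climate_index` and its inputs are hypothetical placeholders, not actual C4I services.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_wps_execute_url(endpoint, process_id, inputs):
    """Build a WPS 1.0.0 KVP Execute URL from simple literal inputs.

    inputs: dict of input name -> literal value. Complex or referenced
    inputs need additional encoding that this sketch does not cover.
    """
    # In KVP encoding, literal inputs are given as name=value pairs
    # separated by semicolons inside the DataInputs parameter.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "DataInputs": data_inputs,
    })
    return f"{endpoint}?{query}"


# Hypothetical endpoint, process name and inputs, for illustration only.
url = build_wps_execute_url(
    "https://wps.example.org/wps",
    "climate_index",
    {"index": "SU", "slice_mode": "year"},
)
params = parse_qs(urlsplit(url).query)
print(params["DataInputs"])  # ['index=SU;slice_mode=year']
```

The sketch only builds the request; sending it, handling authentication and parsing the XML response are left out.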