Arthur Grundner

and 3 more

A promising method for improving the representation of clouds in climate models, and hence climate projections, is to develop machine learning-based parameterizations using output from global storm-resolving models. While neural networks can achieve state-of-the-art performance, they are typically climate model-specific, require post-hoc tools for interpretation, and struggle to predict outside of their training distribution. To avoid these limitations, we combine symbolic regression, sequential feature selection, and physical constraints in a hierarchical modeling framework. This framework allows us to discover new equations diagnosing cloud cover from coarse-grained variables of global storm-resolving model simulations. These analytical equations are interpretable by construction and easily transferable to other grids or climate models. Our best equation balances performance and complexity, achieving performance comparable to that of neural networks ($R^2=0.94$) while remaining simple (with only 13 trainable parameters). It reproduces cloud cover distributions more accurately than the Xu-Randall scheme across all cloud regimes (Hellinger distances $<0.09$), and matches neural networks in condensate-rich regimes. When applied and fine-tuned to the ERA5 reanalysis, the equation exhibits superior transferability to new data compared to all other optimal cloud cover schemes. Our findings demonstrate the effectiveness of symbolic regression in discovering interpretable, physically consistent, and nonlinear equations to parameterize cloud cover.
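The Hellinger distance quoted above measures how far apart two probability distributions are, on a scale from 0 (identical) to 1 (no overlap). The following is a minimal sketch of the standard discrete definition applied to two histograms. The function names and the example bin counts are hypothetical and are not taken from the study, which may bin or normalize its cloud cover distributions differently.

```python
import math


def normalize(counts):
    """Turn non-negative bin counts into probabilities that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [c / total for c in counts]


def hellinger_distance(p_counts, q_counts):
    """Discrete Hellinger distance between two histograms on the same bins.

    H(P, Q) = sqrt( 0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2 ), with 0 <= H <= 1.
    """
    if len(p_counts) != len(q_counts):
        raise ValueError("histograms must use the same bins")
    p = normalize(p_counts)
    q = normalize(q_counts)
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


# Hypothetical cloud cover histograms (counts per cloud cover bin).
reference_hist = [120, 45, 30, 25, 80]
scheme_hist = [110, 50, 35, 20, 85]
print(hellinger_distance(reference_hist, scheme_hist))
```

A value near zero indicates that the scheme reproduces the reference distribution closely, which is the sense in which the abstract reports distances below 0.09.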

Aytaç PAÇAL

and 5 more

Extreme temperature events have traditionally been detected assuming a unimodal distribution of temperature data. We found that surface temperature data can be described more accurately with a multimodal rather than a unimodal distribution. Here, we applied Gaussian Mixture Models (GMMs) to daily near-surface maximum air temperature data from the historical and future Coupled Model Intercomparison Project Phase 6 (CMIP6) simulations for 46 land regions defined by the Intergovernmental Panel on Climate Change (IPCC). Using the multimodal distribution, we found that temperature extremes, defined based on daily data in the warmest mode of the GMM distributions, are getting more frequent in all regions. Globally, a 10-year extreme temperature event relative to 1980-2010 conditions will occur 15 times more frequently in the future at a Global Warming Level (GWL) of 3.0 °C. The frequency increase can be even higher in tropical regions, such that 10-year extreme temperature events will occur almost twice a week. Additionally, we analysed the change in future temperature distributions under different GWLs and found that hot temperatures are increasing faster than cold temperatures in low latitudes, while cold temperatures are increasing faster than hot temperatures in high latitudes. The smallest changes in temperature distribution can be found in tropical regions, where the annual temperature range is small. Our method captures the differences in geographical regions and shows that the frequency of extreme events will be even higher than reported in previous studies.
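To make the idea of extremes defined in the warmest mode concrete, the sketch below takes already-fitted mixture components, picks the one with the highest mean, derives a return level from it, and compares how often that level is exceeded under a shifted future mode. Everything here is an assumption for illustration: the component parameters, the number of samples per year attributed to the warm mode, and the conversion from return period to exceedance probability are hypothetical and are not the study's actual procedure or results.

```python
from statistics import NormalDist


def warmest_mode(components):
    """Return the (weight, mean, std) component with the largest mean."""
    return max(components, key=lambda c: c[1])


def return_level(mode, return_period_years, samples_per_year):
    """Temperature exceeded on average once per return period.

    Assumption: each sample in the mode exceeds the level independently
    with probability 1 / (return_period_years * samples_per_year).
    """
    _, mean, std = mode
    p_exceed = 1.0 / (return_period_years * samples_per_year)
    return NormalDist(mean, std).inv_cdf(1.0 - p_exceed)


def frequency_ratio(threshold, mode_ref, mode_future):
    """How many times more often the threshold is exceeded in the future."""
    p_ref = 1.0 - NormalDist(mode_ref[1], mode_ref[2]).cdf(threshold)
    p_future = 1.0 - NormalDist(mode_future[1], mode_future[2]).cdf(threshold)
    return p_future / p_ref


# Hypothetical two-component mixtures: (weight, mean in deg C, std in deg C).
reference = [(0.4, 5.0, 4.0), (0.6, 24.0, 3.0)]
future = [(0.4, 7.0, 4.0), (0.6, 26.5, 3.0)]

ref_mode = warmest_mode(reference)
fut_mode = warmest_mode(future)
# Hypothetical: 200 days per year are attributed to the warm mode.
threshold = return_level(ref_mode, return_period_years=10, samples_per_year=200)
print(threshold, frequency_ratio(threshold, ref_mode, fut_mode))
```

In practice the component parameters would come from fitting a mixture model to the regional temperature data. They are hard-coded here only to keep the example self-contained.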

Evgenia Galytska

and 6 more

In this study, we apply causal discovery to analyse causal links among key processes that contribute to Arctic-midlatitude teleconnections. First, we calculate the causal dependencies from observations. We then evaluate climate models participating in the Coupled Model Intercomparison Project Phase 6 (CMIP6) by comparing their causal graphs for the period 1979-2019 with those derived from observations. Based on observations, we show that the increase (decline) of near-surface Arctic temperature is associated not only with the reduction (increase) of sea ice over the Barents and Kara seas, but also with the strengthening (weakening) of atmospheric blocking over central Asia. We show that the near-surface westerly winds are strongly associated with the phase of the North Atlantic Oscillation (NAO). Observations show that the phase of the NAO is connected with the polar vortex (PV), which is affected by the strengthening of the poleward eddy heat flux at 100 hPa. The results for the CMIP6 historical simulations are in good agreement with the observations but reveal a negative connection between near-surface Arctic temperature and sea ice over the Barents and Kara seas, which was not found in observations during December-January-February 1979-2019. Moreover, climate models simulate a more robust link between Arctic temperature and Barents and Kara sea ice towards the end of the century. The analysis of future changes in the Arctic-midlatitude teleconnections during cold seasons 2059-2099 also reveals that the connection between the Aleutian Low and the poleward eddy heat flux is expected to become more robust than in the analysed past.
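One simple way to compare a model's causal graph with the observational one is to treat each graph as a set of directed, lagged, signed links and score their overlap. The sketch below uses an F1 score for this. The abstract does not state which comparison metric the study uses, so the metric is an assumption, and the variable names, lags, and signs in the example are hypothetical placeholders, not reported results.

```python
def link_f1(reference_links, candidate_links):
    """F1 score of the candidate graph's links against the reference graph.

    Each link is a hashable tuple, here (source, target, lag, sign).
    A link counts as matched only if all four entries agree.
    """
    reference = set(reference_links)
    candidate = set(candidate_links)
    if not reference and not candidate:
        return 1.0  # two empty graphs agree trivially
    true_positives = len(reference & candidate)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(candidate)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical links: (source, target, lag in months, sign of the dependence).
observed_graph = [
    ("heat_flux_100hPa", "polar_vortex", 1, "-"),
    ("polar_vortex", "nao", 1, "+"),
    ("arctic_temp", "sea_ice_bk", 1, "-"),
]
model_graph = [
    ("heat_flux_100hPa", "polar_vortex", 1, "-"),
    ("polar_vortex", "nao", 1, "+"),
    ("sea_ice_bk", "arctic_temp", 1, "-"),
]
print(link_f1(observed_graph, model_graph))
```

Because direction is part of the link, a model that reverses a dependence is penalized both for the missing link and for the spurious one.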

Arthur Grundner

and 5 more

A promising approach to improve cloud parameterizations within climate models and thus climate projections is to use deep learning in combination with training data from storm-resolving model (SRM) simulations. The Icosahedral Non-Hydrostatic (ICON) modeling framework permits simulations ranging from numerical weather prediction to climate projections, making it an ideal target to develop neural network (NN)-based parameterizations for sub-grid scale processes. Within the ICON framework, we train NN-based cloud cover parameterizations with coarse-grained data based on realistic regional and global ICON SRM simulations. We set up three different types of NNs that differ in the degree of vertical locality they assume for diagnosing cloud cover from coarse-grained atmospheric state variables. The NNs accurately estimate sub-grid scale cloud cover from coarse-grained data whose geographical characteristics are similar to those of their training data. Additionally, globally trained NNs can reproduce sub-grid scale cloud cover of the regional SRM simulation. Using the game-theory-based interpretability library SHapley Additive exPlanations, we identify an overemphasis on specific humidity and cloud ice as the reason why our column-based NN cannot perfectly generalize from the global to the regional coarse-grained SRM data. The interpretability tool also helps visualize similarities and differences in feature importance between regionally and globally trained column-based NNs, and reveals a local relationship between their cloud cover predictions and the thermodynamic environment. Our results show the potential of deep learning to derive accurate yet interpretable cloud cover parameterizations from global SRMs, and suggest that neighborhood-based models may be a good compromise between accuracy and generalizability.
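The game-theoretic idea behind SHapley Additive exPlanations can be illustrated by computing exact Shapley values for a tiny model. The sketch below does not call the SHAP library, which relies on approximations and background datasets for real networks. It enumerates all feature subsets directly, replacing absent features with a single baseline value. The toy model, its coefficients, and the feature values are hypothetical and unrelated to the trained NNs.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature for the prediction model(x).

    Features outside a coalition are set to their baseline value.
    The cost grows as 2^n, so this is only feasible for a few features.
    """
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def toy_cloud_cover(features):
    """Hypothetical stand-in for a cloud cover model (not a physical scheme)."""
    humidity, cloud_ice, temperature = features
    return 0.6 * humidity + 0.3 * cloud_ice * humidity - 0.1 * temperature


x = [0.9, 0.5, 0.2]         # hypothetical normalized inputs for one grid cell
baseline = [0.5, 0.0, 0.0]  # hypothetical reference state
print(shapley_values(toy_cloud_cover, x, baseline))
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the property that makes per-feature contributions comparable across models.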