Large urban centers such as the Metropolitan Area of São Paulo (MASP) are affected by air pollution, especially by inhalable particulate matter (PM10). Persistent exceedance events (PEE) are defined as exceedance events that last for many consecutive days and occur simultaneously at many air quality monitoring stations across the MASP. This study aims to develop a predictive model for the occurrence of PEE in the MASP based on surface meteorological variables. Hourly PM10 concentrations from 12 air quality monitoring stations in the MASP between 2005 and 2021 were provided by the São Paulo State Environmental Agency (CETESB). Daily data on surface meteorological variables were provided by the IAG/USP meteorological station. PEE were identified using the following criteria: exceedance events that occurred simultaneously at 50% or more of the monitoring stations and persisted for at least 5 consecutive days. PEE occurrence was represented as a time series of a binary variable. The resulting daily dataset had 6204 rows and 13 attributes, with no missing values. The dataset was divided into a training set (80%) and a test set (20%). A logistic regression model was fitted, with PEE occurrence (positive = 1) as the target variable. The Variance Inflation Factor and stepwise feature selection were applied to obtain an optimized subset of predictors. Model performance was assessed using the ROC curve and a confusion matrix. Results indicate that PEE can be satisfactorily predicted from surface meteorological variables using logistic regression. As next steps, we intend to extract easy-to-communicate classification rules to support the development of warning systems for poor air quality conditions in the MASP.
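The PEE criteria stated above (at least 50% of stations in exceedance, for at least 5 consecutive days) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the per-station daily exceedance flags have already been computed; the abstract does not state the PM10 threshold or averaging used for that step, so it is left out. The function name `flag_persistent_events` and its parameters are hypothetical.

```python
from typing import List, Sequence


def flag_persistent_events(
    exceed: Sequence[Sequence[bool]],
    min_station_fraction: float = 0.5,  # "at least 50%" of stations
    min_run_days: int = 5,              # "at least 5 consecutive days"
) -> List[int]:
    """Return a daily 0/1 series marking persistent exceedance events.

    exceed[d][s] is True when station s is in exceedance on day d.
    How that flag is derived from hourly PM10 is an assumption left
    to the caller; it is not specified in the abstract.
    """
    # Step 1: a day is "widespread" when enough stations exceed at once.
    widespread = [
        len(day) > 0 and sum(day) >= min_station_fraction * len(day)
        for day in exceed
    ]

    # Step 2: keep only runs of widespread days that last long enough.
    pee = [0] * len(widespread)
    run_start = None
    # A trailing False acts as a sentinel that closes a run ending on the last day.
    for i, flag in enumerate(widespread + [False]):
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if i - run_start >= min_run_days:
                for j in range(run_start, i):
                    pee[j] = 1
            run_start = None
    return pee


# Toy example with 4 stations and 8 days (made-up flags, not study data).
toy = (
    [[True, True, False, False]] * 5    # 5 days with 2 of 4 stations exceeding
    + [[True, False, False, False]]     # 1 day with only 1 of 4 stations
    + [[True, True, True, False]] * 2   # 2 days with 3 of 4: run too short
)
print(flag_persistent_events(toy))  # [1, 1, 1, 1, 1, 0, 0, 0]
```

The resulting 0/1 list corresponds to the binary time series used as the target of the logistic regression; the two thresholds are exposed as parameters so that other definitions of persistence can be tried.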

Lucas Bauer

and 5 more

The Amazon rainforest has a great influence on the global energy balance and carbon fluxes, and is responsible for the net removal of approximately 4 million tons of carbon per year via photosynthetic activity. Climate change and deforestation affect the carbon budget in Amazonia, transforming CO2 sink areas into sources. Given the complexity of the factors that govern carbon exchange in the Amazon and its influence on biological processes, Data Science strategies can promote a better understanding of the main environmental factors under different scenarios and can also support public policies to mitigate the effects of global warming. This study aims to identify the environmental factors that determine the temporal variability of carbon exchanges between the biosphere and the atmosphere in the Tapajós National Forest, in the Amazon. Data Science strategies were applied to an integrated dataset of ground-based energy and carbon flux measurements and remote sensing data, covering the period between 2002 and 2006. The specific objective is to assess the influence of a selected set of environmental variables on the variability of carbon exchanges. To that end, an artificial neural network (ANN) classification model was developed to identify the environmental variables with the greatest impact on carbon source, sink and neutrality conditions. The average global score of the ANN model was 65%. The predictor variables with the greatest impact on the carbon sink condition were radiation at the top of the atmosphere, sensible and latent energy fluxes, and leaf area index. Thus, the ANN model, combined with an ensemble of Data Science strategies, can improve the understanding of the variability of CO2 fluxes and be a powerful tool to promote new knowledge.