Randall Etheridge

and 3 more

Private groundwater wells are a potentially unmonitored source of contaminants that can harm the health of millions of people throughout the United States. Models that predict potential exposure to contaminants such as nitrate could guide sampling efforts and allow residents to take action to reduce their risk. Machine learning models have successfully predicted nitrate contamination from geospatial information such as proximity to nitrate sources or soil type, but previous models have not considered meteorological factors that change over time. In this study, we test random forest (regression and classification) and linear regression models that predict nitrate contamination of wells from rainfall and temperature records over the previous 180 days. We trained and tested models for (1) all of North Carolina, (2) each geographic region in North Carolina, (3) a three-county region with a high density of animal agriculture, and (4) a three-county region with a low density of animal agriculture. All regression models had poor predictive performance (R² = 0.04) for all areas tested. The random forest classification model for the coastal plain region showed fair agreement (Cohen’s kappa = 0.23) when predicting whether contamination occurred. All other classification models had slight or poor predictive performance. Our results show that temporal changes in rainfall and temperature alone are not enough to predict nitrate contamination in most areas of North Carolina but show potential in the coastal plain region.
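The agreement statistic reported above can be illustrated with a short sketch. Cohen's kappa compares the observed agreement between predicted and measured classes with the agreement expected by chance, kappa = (p_o - p_e) / (1 - p_e). The function names, the toy labels, and the verbal bands below are assumptions for illustration: the bands follow the commonly used Landis and Koch convention, which matches the terms "poor", "slight", and "fair" used in the abstract, but the abstract does not state which convention the authors applied.

```python
from collections import Counter


def cohens_kappa(observed, predicted):
    """Cohen's kappa for two equal-length sequences of class labels."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    # Observed agreement: fraction of wells where prediction matches measurement.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both inputs contain one identical class")
    return (p_o - p_e) / (1.0 - p_e)


def agreement_label(kappa):
    """Verbal band for a kappa value (assumed Landis and Koch style convention)."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical toy data: 1 = contaminated, 0 = not contaminated.
measured = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
modelled = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(measured, modelled)
print(round(kappa, 3), agreement_label(kappa))
```

In this toy case the raw accuracy is 0.8, yet kappa is much lower because most wells fall in the uncontaminated class and would agree by chance. This is why a chance-corrected statistic is informative when contamination is rare.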