Authorea

blasbenito edited materials_and_methods.tex over 9 years ago

Commit id: 1fb2c5f0039619c2be2e91e4a255deb84ff74ebf


\textbf{Species distribution modeling} After carefully considering the nature of our data (lack of absences, low number of presences, unknown bias, and overdispersion), we chose generalized linear models (GLMs) calibrated with the quasibinomial family and a weighted background as our species distribution modeling method. GLMs are simple and therefore more tractable, allow a better understanding of the drivers of species distributions, and are easily generalized to new datasets \cite{Merow1267}. Calibrating the GLM with the quasibinomial family allows us to deal with overdispersed data. The weighted background compensates for the lack of absences by providing a comprehensive sample of the ecological conditions available across the study area. However, the geographical distribution of the background points affects modeling outcomes (CITATION VAN DER WAAL), and several studies (CITATIONS) recommend restricting the background to areas accessible to the species by dispersal. In our case, there were no data to develop such criteria, and we therefore explored the sensitivity of the model to different sets of background points generated inside buffers of increasing radius around the presence records (100, 200, 400, 800 and 1600 km). To compensate for potential taphonomic or geographical bias, and following ecological niche theory (CITATION), we assumed that the species responses to the environmental factors were Gaussian. GLMs link model structure to hypotheses by allowing users to define the shape of the response curves and the important interactions between variables. To include this assumption in the modeling process, we configured the GLMs to use second-degree polynomials, with formulas of the form \textit{response \textasciitilde{} poly(variable1, 2) + poly(variable2, 2) + ...}. We did not consider interactions. One drawback of this approach arises when the number of presences is low. 
As a general rule, at least five presence points per predictor are required in a GLM fit to avoid overparameterization (CITATION), but this number rises to ten when second-degree polynomials are used to fit the model. In our case, with six predictors and up to (NUMBER OF POINTS) points, fitting a single model with all predictors would have led to a severely overparameterized model. To overcome this problem, we used the \textit{dredge} function of the R package \textit{MuMIn} (CITATION) to generate all the GLM equations combining the six predictors in groups of one, two, and three, resulting in 41 different equations (EQUATIONS IN APPENDIX!). We calibrated one model for each combination of equation and background radius, obtaining 205 different models.
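
As an illustrative sketch of the model structure and model count described above, each candidate GLM can be written in terms of a subset $S$ of the six predictors. The symbols $p_i$, $x_{ij}$, $\beta$ and $S$ are notation introduced here for illustration only, and the logit link is an assumption (it is the default for the quasibinomial family in R). The raw-power form is shown for readability; \textit{poly} uses orthogonal polynomials by default, which changes the coefficients but not the fitted curve.
\begin{equation}
\mathrm{logit}(p_i) = \beta_0 + \sum_{j \in S} \left( \beta_{j1} x_{ij} + \beta_{j2} x_{ij}^{2} \right)
\end{equation}
Under this form, a negative $\beta_{j2}$ gives a unimodal, bell-shaped response to predictor $j$, with its optimum at $x_j = -\beta_{j1} / (2 \beta_{j2})$. Restricting $S$ to one, two or three of the six predictors, and crossing the resulting equations with the five background radii, gives the counts reported above:
\begin{equation}
{6 \choose 1} + {6 \choose 2} + {6 \choose 3} = 6 + 15 + 20 = 41, \qquad 41 \times 5 = 205
\end{equation}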