blasbenito edited materials_and_methods.tex  over 8 years ago

Commit id: a6558ace1472d8758cbca966d8aacdd6e622a807


Random Forest, a machine learning method based on ensembles of classification or regression trees \cite{Breiman20015}, is regarded as a robust method to assess variable importance \cite{Cutler20072783}. We applied the randomForest R function \cite{randomforestcitation} to assess the influence of the environmental factors on habitat suitability at the continental scale. This analysis also provided a set of response curves (produced with the plotmo R library, \cite{plotmocitation}), useful to understand the effect of each predictor on habitat suitability.

The analyses described so far cannot evaluate whether a particular variable is important in a particular region. To gain further insight into this question, and to understand how habitat suitability is shaped by the different predictors at the local scale, we defined \emph{local scale} as the average home range of Neanderthals. According to \citet{Daujeard201232}, based on the transportation of raw lithic materials, the regional mobility range of Neanderthals during the Middle Palaeolithic was around 50 kilometers \cite{FéblotAugustins1993211}.

Consequently, we divided the study area into cells of 50 by 50 kilometers and, for each of these cells containing more than 30 original cells (cells at the resolution of the habitat suitability map and the predictors), we fitted one linear regression model per environmental predictor, using habitat suitability as the response variable. For any given predictor, we considered the R-squared to be an indicator of its importance at the local scale, and the coefficient to be an indicator of its effect on habitat suitability. We interpreted near-zero coefficients linked to low habitat suitability as regions with extreme values for the given predictor, while near-zero coefficients linked to high habitat suitability were interpreted as optimum habitat.
Positive coefficients indicated a positive (but still sub-optimum) effect of the given predictor on habitat suitability, while negative coefficients indicated that the predictor values were beyond the optimum (e.g. too hot, too wet). All results with p-values higher than 0.05 were recoded to no-data to reduce noise in the subsequent analyses.

We selected 44 European localities with different values of habitat suitability (see Table 1), and applied recursive partitioning trees (rpart library, \cite{rpartcitation}), using habitat suitability as the response variable, to group them in three different ways: 1) using the values of the environmental variables as predictors, to group the localities according to their environmental similarity; 2) using the local R-squared values as predictors, to group the localities according to the importance of the predictors; 3) using the local coefficients as predictors, to group together the localities where the environmental variables have similar effects.
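
As a compact restatement of the local regressions described above, and assuming they were fitted by ordinary least squares, the model for a 50 by 50 kilometer cell $c$ and a predictor $j$ can be sketched as follows. The symbols $S_i$, $x_{ij}$, $n_c$, $\beta_{0,cj}$ and $\beta_{1,cj}$ are notation introduced here only for illustration.
\begin{equation}
S_i = \beta_{0,cj} + \beta_{1,cj}\, x_{ij} + \varepsilon_i, \qquad i = 1, \dots, n_c, \quad n_c > 30,
\end{equation}
where $S_i$ is the habitat suitability of original cell $i$, $x_{ij}$ is the value of predictor $j$ in that cell, and $n_c$ is the number of original cells inside $c$. Under these assumptions, the local importance of predictor $j$ would be
\begin{equation}
R^2_{cj} = 1 - \frac{\sum_{i=1}^{n_c} (S_i - \hat{S}_i)^2}{\sum_{i=1}^{n_c} (S_i - \bar{S}_c)^2},
\end{equation}
with $\hat{S}_i$ the fitted value and $\bar{S}_c$ the mean habitat suitability within $c$, and its local effect would be the slope $\beta_{1,cj}$. Both quantities are treated as no-data when the associated p-value is higher than 0.05.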