\documentclass[10pt,a4paper]{article}
\usepackage{fullpage}
\usepackage{setspace}
\usepackage{parskip}
\usepackage{titlesec}
\usepackage[section]{placeins}
\usepackage{xcolor}
\usepackage{breakcites}
\usepackage{lineno}
\usepackage{hyphenat}
\usepackage{times}
\PassOptionsToPackage{hyphens}{url}
\usepackage[colorlinks = true,
linkcolor = blue,
urlcolor = blue,
citecolor = blue,
anchorcolor = blue]{hyperref}
\usepackage{etoolbox}
\makeatletter
\patchcmd\@combinedblfloats{\box\@outputbox}{\unvbox\@outputbox}{}{%
\errmessage{\noexpand\@combinedblfloats could not be patched}%
}%
\makeatother
\usepackage[round]{natbib}
\let\cite\citep
\renewenvironment{abstract}
{{\bfseries\noindent{\abstractname}\par\nobreak}\footnotesize}
{\bigskip}
\titlespacing{\section}{0pt}{*3}{*1}
\titlespacing{\subsection}{0pt}{*2}{*0.5}
\titlespacing{\subsubsection}{0pt}{*1.5}{0pt}
\usepackage{graphicx}
\usepackage[space]{grffile}
\usepackage{latexsym}
\usepackage{textcomp}
\usepackage{longtable}
\usepackage{tabulary}
\usepackage{booktabs,array,multirow}
\usepackage{amsfonts,amsmath,amssymb}
\providecommand\citet{\cite}
\providecommand\citep{\cite}
\providecommand\citealt{\cite}
% You can conditionalize code for latexml or normal latex using this.
\newif\iflatexml\latexmlfalse
\providecommand{\tightlist}{\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}%
\AtBeginDocument{\DeclareGraphicsExtensions{.pdf,.PDF,.eps,.EPS,.png,.PNG,.tif,.TIF,.jpg,.JPG,.jpeg,.JPEG}}
\usepackage[utf8]{inputenc}
\usepackage[ngerman,english]{babel}
\usepackage[printfigures]{figcaps}
\begin{document}
\title{Spatial analysis showing that deprived people are more exposed to
road-traffic accidents: a case study in the Municipality of Vernier}
\vspace{-1em}
\date{}
\begingroup
\let\center\flushleft
\let\endcenter\endflushleft
\maketitle
\endgroup
\sloppy
\par\null
\selectlanguage{ngerman}\textbf{Nyffeler Cécile}
\textbf{Ecole Polytechnique Fédérale de Lausanne}
\section*{Introduction}
{\label{212335}}
Road-traffic accidents are the ninth leading cause of death worldwide
(Murray and Lopez, 1997). It is therefore important to understand which
societal groups are most exposed to this hazard. A clear association has
been found between the probability of being injured in a car crash and
the poverty of the person hit (Aguero-Valverde and Jovanis, 2006). This
finding is supported by other studies; Siddiqui et al. (2012), for
instance, likewise associated lower median household incomes with a
higher probability of road-traffic accidents.

The aim of this paper is to assess whether those findings hold at the
communal level, that is, whether people living in poorer neighborhoods
are more exposed to car crashes than those living in wealthier parts of
the municipality. The commune of Vernier, Switzerland, was selected
because it is a highly contrasted municipality and might thus be
representative of what happens at larger scales.
\section*{Data}
{\label{731576}}
Several vector layers and text files were used for this analysis. The
vector layers were all provided by the open-data service of the Canton
of Geneva (SITG). Geographical point data containing the accident
locations and housing addresses were used, as well as polygon layers
defining the extent of the municipality and of the inhabited areas. The
inhabited areas were characterized using a hectometric
(100 m \(\times\) 100 m) grid. The demographic data on allowances were
probably produced by the Swiss Federal Statistical Office (FSO); the
data set does not state its source explicitly.
\section*{Methods}
{\label{370409}}
The correlation between road accidents and poverty was investigated
using QGIS and GeoDa, following the ``QGIS User Guide'' (Sherman et al.,
2004) and the ``GeoDa User Guide'' (Anselin, 2003) respectively.

The address point vector file was imported into QGIS and intersected
with the municipality borders in order to keep only the information
relevant to Vernier. The Excel data sheets containing the addresses of
the households receiving allowances (housing assistance and health
insurance subsidies) were imported as tables into QGIS. Using a common
identifier (IDPADR) between the allowance tables and the address vector
file, joins were performed to obtain the geographical locations of the
people receiving financial help. The number of people receiving health
insurance subsidies was then counted per cell of inhabited area (that
is, per 100 m \(\times\) 100 m region). The number of people receiving
housing assistance was summed in the same way. Finally, centroids of the
population grid cells containing the allowance data were created.

The vector file containing the accident points was then imported into
QGIS. A distance matrix analysis was performed to find the minimal
distance between each of the newly created centroids and the accident
points. This produced a table giving, for each cell, the economic status
of the population (poorer areas being characterized by a higher number
of people receiving allowances) and the minimal distance to the closest
accident.

This joined table was then imported into GeoDa, where several spatial
analyses were performed. Scatter plots of the accident variable against
the housing assistance and health insurance subsidy variables were
produced first. A simple multivariate linear regression followed, using
the allowance variables as independent variables to explain the
dependent accident variable. Finally, a multivariate regression with a
spatially weighted dependent variable was run, using Queen contiguity of
order 1 to create the weights file.
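
The two regressions described above can be written out explicitly. The
notation below is not taken from the original analysis and is introduced
here only for illustration: \(d_i\) denotes the minimal distance from
the centroid of cell \(i\) to the nearest accident, \(h_i\) and \(s_i\)
the number of people in that cell receiving housing assistance and
health insurance subsidies, and \(w_{ij}\) the Queen contiguity weights,
assumed here to be row-standardized. The linear model would then read
\[
d_i = \beta_0 + \beta_1 h_i + \beta_2 s_i + \varepsilon_i ,
\]
while a spatial lag specification, assumed here to be the form taken by
the spatially weighted regression, adds the weighted average of the
distances observed in the neighboring cells:
\[
d_i = \rho \sum_{j} w_{ij}\, d_j + \beta_0 + \beta_1 h_i + \beta_2 s_i + \varepsilon_i ,
\qquad
w_{ij} = \frac{c_{ij}}{\sum_{k} c_{ik}} ,
\]
where \(c_{ij} = 1\) if cells \(i\) and \(j\) (with \(i \neq j\)) share
an edge or a corner and \(c_{ij} = 0\) otherwise; for a cell without any
neighbor all weights are taken as zero. The coefficient \(\rho\)
measures how strongly the distance in one cell depends on the distances
in its neighbors, and with \(\rho = 0\) the spatial model reduces to the
linear one.
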
The predicted values obtained by the linear and spatial regressions were
then imported back into QGIS, where the predicted minimal distances to
the nearest accident could be mapped. The number of classes used to
depict these estimated values was chosen with the help of the
Huntsberger index:
\(\text{number of classes} = 1 + 3.3\cdot\log_{10}\left(\text{number of spatial units}\right) = 1 + 3.3\cdot\log_{10}\left(404\right)\)
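
As a worked evaluation of the formula above, and assuming that the
logarithm is taken to base 10 (the coefficient 3.3 is approximately
\(1/\log_{10} 2\), which makes the rule equivalent to \(1 + \log_2 n\)):
\[
1 + 3.3 \cdot \log_{10}(404) \approx 1 + 3.3 \cdot 2.606 \approx 9.6 ,
\]
which suggests about 10 classes once rounded to the nearest integer. The
rounding is an assumption, since the number of classes finally retained
is not stated here.
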
\section*{Results}
{\label{786541}}
Scatter plots were produced with the minimal distance to an accident as
the dependent variable and the number of people receiving housing aid
and health insurance subsidies as independent variables. The results
were poor, with R\textsuperscript{2} values of 0.032 and 0.067 for
housing aid and health insurance subsidies respectively. Such low values
indicate that no meaningful correlation can be established between these
variables. The Moran's I values, given by the regression slopes of the
standardized data, were both negative (\(I = -0.179\) for housing aid
and \(I = -0.260\) for health insurance subsidies).\selectlanguage{english}
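
The values reported above can be cross-checked against each other. For a
regression of one standardized variable on another, the slope equals the
Pearson correlation coefficient \(r\), and its square equals the
coefficient of determination. Assuming that the reported slopes come
from such a regression of the standardized distance on the standardized
allowance counts:
\[
(-0.179)^2 \approx 0.032 , \qquad (-0.260)^2 \approx 0.068 ,
\]
which agrees with the reported R\textsuperscript{2} values of 0.032 and
0.067 up to rounding. In other words, each allowance variable taken
alone accounts for roughly 3\% and 7\% of the variance in the distance
to the nearest accident.
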
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/housingaid-mindist1/housingaid-mindist1}
\caption{{Minimal distance to an accident as a function of the number of
people receiving housing aid
{\label{445285}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/healthins-mindist/healthins-mindist}
\caption{{Minimal distance to an accident as a function of the number of
people receiving health insurance subsidies
{\label{607300}}%
}}
\end{center}
\end{figure}
In an attempt to improve the explanatory power of the model, a
multivariate regression approach was then adopted. As before, the
distance to the nearest road-traffic accident was taken as the dependent
variable, while the two allowance variables served jointly as
independent variables. A summary of the output is presented in
Fig. 3.\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/linear-reg/linear-reg}
\caption{{Summary of the output of the multivariate linear regression
{\label{309719}}%
}}
\end{center}
\end{figure}
The resulting R\textsuperscript{2} value of 0.068 shows virtually no
improvement over what was obtained using the scatter plots. The largest
residual was 335.33 m and the smallest 0.2 m, illustrating the high
variability in the prediction of the minimal distance to the nearest
road-traffic accident. In Fig. 4, which maps the residuals of the
multivariate linear regression, the more extreme values appear to be
located at the edges of the municipality of Vernier or in isolated
inhabited zones.\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/residue-ols/residue-ols}
\caption{{Standard deviation map of the residuals of the multivariate linear
regression
{\label{366716}}%
}}
\end{center}
\end{figure}
The map presented in Fig. 5 shows that, even though the
R\textsuperscript{2} value does not qualify the model as appropriate for
estimating the distance to the nearest accident, it still gives a
reasonable picture of the situation.\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/map-linear/map-linear}
\caption{{Quantile map of predicted minimal distance using multivariate linear
regression
{\label{771607}}%
}}
\end{center}
\end{figure}
Regions containing a high concentration of accidents generally show
redder colors, meaning that the nearest accident is predicted to be
closer to these cells. The light blue regions, particularly in the south
of the municipality, do however show some erroneous information. Since
these zones are located where the residuals were highest, this
discrepancy is not surprising.

Considering Tobler's first law of geography, which states that near
things are more related than distant things (Tobler, 1970), taking the
neighboring cells into consideration seems important. A multivariate
regression with a spatially weighted dependent variable was therefore
run. Different weighting methods were tested, the best-performing one
being Queen contiguity of order 1. The results obtained using this
scheme are presented in Fig. 6.\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/spatial-lag/spatial-lag}
\caption{{Summary of the output of the multivariate spatial regression
{\label{236228}}%
}}
\end{center}
\end{figure}
The coefficient of determination is much higher, with an
R\textsuperscript{2} of 0.47. Even though this improves on the other
methods considered so far, it is still not high enough to make for a
good model. This is shown by the residuals, which remain large, the
largest having a magnitude of 320.2 m. The standard deviation map in
Fig. 7 shows once again that extreme values tend to be situated where
the number of neighboring cells is lower, that is, next to the border of
the municipality or in sparsely inhabited areas.\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/residue-spalag/residue-spalag}
\caption{{Standard deviation map of the residuals of the multivariate spatial
regression
{\label{132472}}%
}}
\end{center}
\end{figure}
The distances to the nearest road-traffic accident, as predicted from
the locations of the homes of people receiving allowances, are presented
in the choropleth map of Fig. 8. A discernible association exists
between the warmth of the color and the concentration of accidents
(warmer colors meaning shorter distances to the nearest road-traffic
accident). Accidents took place in almost all dark red cells, with the
exception of a few cells situated either at the border of the
municipality or in isolated places. There are however several errors in
the colder-colored cells: several cells contain accidents but display
predicted distances greater than 80 m to the nearest accident. This is
not possible, since the centroid lies 50 m from each edge of the cell
and no point of a 100 m \(\times\) 100 m cell is more than about 71 m
(half the diagonal) from its centroid.\selectlanguage{english}
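
The bound behind this argument can be made explicit. In a square cell of
side \(a = 100\) m, the centroid lies \(a/2 = 50\) m from each edge, and
the farthest points of the cell are its corners, at half the diagonal:
\[
d_{\max} = \frac{a\sqrt{2}}{2} = 50\sqrt{2}\ \text{m} \approx 70.7\ \text{m} .
\]
Any accident located inside a cell is therefore at most about 70.7 m
from that cell's centroid, so a predicted distance above 80 m for such a
cell necessarily overestimates the true minimal distance. This assumes
that the grid cells are exact 100 m squares and that the centroids were
computed from the full cells.
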
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/map-spatial/map-spatial}
\caption{{Quantile map of predicted minimal distance using multivariate spatial
regression
{\label{965074}}%
}}
\end{center}
\end{figure}
\section*{Discussion}
{\label{134476}}
Judging by the scatter plots, the distance from one's home to the
nearest accident is apparently not linked to the economic status of the
person living there. However, since a negative Moran's I expresses
negative autocorrelation, had the R\textsuperscript{2} values been more
conclusive, one could already have hypothesized that inhabited areas
with fewer allowances are situated at greater distances from the nearest
accidents. This would indicate that poorer people are more at risk from
road-traffic accidents.

This initial hypothesis is reinforced by the results of the multivariate
regressions. Even though the coefficients of determination remain fairly
low, a clear improvement can be noticed when the neighboring cells are
taken into account. A higher value of R\textsuperscript{2} indeed means
that the model approximates the values of the dependent variable better
from the independent ones.
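
For reference, the coefficient of determination discussed here is
commonly defined as
\[
R^2 = 1 - \frac{\sum_i (d_i - \hat{d}_i)^2}{\sum_i (d_i - \bar{d})^2} ,
\]
where, in notation introduced here for illustration, \(d_i\) is the
observed minimal distance for cell \(i\), \(\hat{d}_i\) the distance
predicted by the model and \(\bar{d}\) the mean observed distance. A
value of 0.068 thus means that about 7\% of the variance in the
distances is reproduced by the model, against about 47\% for a value of
0.47. One caveat, stated as an assumption about the software output
rather than something verified here: when a spatial lag model is
estimated by maximum likelihood, the reported R\textsuperscript{2} is
usually a pseudo-R\textsuperscript{2} that is not strictly comparable
with the one from ordinary least squares, so the gain should be read as
indicative.
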
The difference between the maps shown in Fig. 5 and Fig. 8 illustrates
this improvement. The distances predicted by the spatial regression
(Fig. 8) do indeed represent the actual distances to the road-traffic
accident locations somewhat better.

Isolated cells and the border of the municipality may cause problems for
the regressions, as can be seen in the standard deviation maps (Fig. 4
and 7). Also, since no lower outliers are present in any of these maps,
it seems that both the multivariate linear regression and the
multivariate spatial regression tend to overestimate the distance to the
nearest accident. This is also clearly shown in the quantile maps
(Fig. 5 and 8), which depict some neighborhoods in cold colors even
though accidents occurred within their cell boundaries.
\section*{Conclusions}
{\label{447222}}
The analysis performed on the municipality of Vernier showed that, when
spatial dependence is taken into account, the distance to road-traffic
accidents can be correlated with the wealth of the neighborhood: the
wealthier the area, the farther away the closest accident. This agrees
with the results found in the literature.

Even though they corroborate what has been reported in other studies,
the results of this paper should be interpreted with care. The best
coefficient of determination found, using multivariate spatial
regression, is still quite low and indicates that the precision of the
model, which estimates the distance to the nearest accident from the
allowances received at each location, is still poor.

The spatial model may perform this poorly because of the limited amount
of data used and the restricted zone under analysis. The high
variability in the population of the municipality of Vernier may also
explain these discrepancies. Future studies might want to focus on
somewhat larger areas.
\section*{References}
{\label{715805}}
Aguero-Valverde, J., Jovanis, P.P., 2006. Spatial analysis of fatal and
injury crashes in Pennsylvania. Accident Analysis \& Prevention 38,
618--625.

Anselin, L., 2003. GeoDa 0.9 User's Guide. Urbana, IL 61801.

Murray, C.J., Lopez, A.D., 1997. Mortality by cause for eight regions of
the world: Global Burden of Disease Study. The Lancet 349, 1269--1276.

Sherman, G.E., Sutton, T., Blazek, R., Luthman, L., 2004. Quantum GIS
User Guide.

Siddiqui, C., Abdel-Aty, M., Choi, K., 2012. Macroscopic spatial
analysis of pedestrian and bicycle crashes. Accident Analysis \&
Prevention 45, 382--391.

Tobler, W.R., 1970. A computer movie simulating urban growth in the
Detroit region. Economic Geography 46, 234--240.

DOI of the datasets used: 10.5072/zenodo.147367

DOI of the article: \href{https://www.authorea.com/dois/pending}{10.22541/au.151173685.50941386}
\selectlanguage{english}
\end{document}