City of New Orleans Emergency Medical Services Resource Optimization

NYU Center for Urban Science and Progress Capstone Project

Team Members:

Alexis Soto-Colorado, Adriano Yoshino, Connor Chen, Matt Sloane


Martin Traunmueller, Boyeong Hong, Constantine E. Kontokosta


City of New Orleans Office of Performance and Accountability

Executive Summary

The number of emergency medical services (EMS) incidents within the City
of New Orleans has increased on an annual basis in recent years, while
available EMS resources have not increased in step to meet this rising
demand. This dynamic has resulted in declining quality of service
metrics regarding City EMS services, such as increased EMS response
times and failure to respond to calls, resulting in the assignment of
those EMS requests to other EMS operators.

In order to correct for these deficiencies, the City of New Orleans has
requested a data-driven analysis of EMS optimization that configures the
scheduling and availability of EMS resources in the most efficient
manner with regard to the ability to respond to calls. To achieve this,
a two-fold analysis was developed: the first part predicts future EMS
incidents within the City of New Orleans (based on historical EMS
incident data and a host of other relevant data) via various statistical
regressions, and the second part uses those predictions to estimate the
ability of the City of New Orleans to respond to an EMS incident given
various parameters (i.e., date, hour of day, and number of ambulances in
service).

Given that literal lives hang in the balance with regard to the ability
to respond to EMS incidents effectively, the analysis presented herein
is a critical first step in optimizing EMS services within the City of
New Orleans via the data driven analysis of quantified metrics related
to such services.


Since 2010, EMS incidents within the City of New Orleans have steadily
increased while the resource capacity (i.e., ambulances with associated
staff) of the New Orleans Emergency Medical Services (NOEMS) has not
increased in step to meet this rising demand. This dynamic has resulted
in a declining quality of service on the part of NOEMS, primarily
evidenced by the following metrics (City of New Orleans, 2016):

  • Increases in “wall time” of over 200 percent from 2010 to 2016. Wall
    time is defined as the number of hours that ambulances and emergency
    medical technicians (EMTs) must wait at a hospital for a patient to
    be off-loaded to hospital staff;

  • Increased use of mutual aid agreements by 31 percent since 2013.
    Currently EMS incidents that NOEMS cannot respond to are transferred
    to other EMS service providers such as Jefferson Parish and Acadian
    Ambulance, based on mutual aid agreements; and

  • Failure of NOEMS to respond to high priority calls within 12 minutes
    80 percent of the time (as a departmental goal; the national goal /
    standard is to respond to such calls within 12 minutes 90 percent of
    the time). This metric is considered a Key Performance Indicator
    (KPI) with regard to EMS performance that NOEMS is regularly failing
    to meet.

With these resource inadequacies and associated service shortcomings in
mind, the City of New Orleans Office of Performance and Accountability
(NOOPA) has requested a data driven analysis of how to optimize the
scheduling of current NOEMS ambulance resources in order to maximize
their effectiveness in responding to EMS requests throughout the City of
New Orleans. Further, NOOPA has requested that the optimization
allow for the consideration of hypothetical additional NOEMS
ambulance resources in order to measure how additional ambulances would
affect NOEMS’s ability to respond to EMS requests.

Such improvements in service would result in a reduction of those EMS
requests transferred to other EMS providers through the aforementioned
mutual aid agreements. Further, the goal of such an analysis would also
be to increase the profitability of NOEMS; by expanding the ability of
NOEMS to respond to more EMS requests (which are billed based on the
nature of the request), the profitability of NOEMS increases as well.
With these goals in mind, a template for a data-driven analysis of NOEMS
resource optimization is in place.

The basic framework of the optimization analysis is two-fold. First, a
prediction model based on various data that will “predict” future EMS
incidents within the City of New Orleans will be developed. Second, an
EMS resource optimization model will be developed that incorporates the
results of this prediction model as well as the specification of
temporal and available ambulance parameters to predict the likelihood of
the inability of NOEMS to respond to an incident (i.e., a disposition
code of “NUA” – No Unit Available, resulting in an outside EMS operator
servicing the call) on a citywide basis. The nuance of this analysis is
described below in the “Methodology” section.

It is noted that, in support of the above-described effort, NOOPA has
provided historical EMS incident data for the City of New Orleans
which documents individual EMS incidents from September 2013 through
June 2017 with various associated descriptive attributes (e.g.,
location, time, nature, etc.). While the particulars of this data will be
described in the “Data Description” section below, it is important to
note that this data will be the primary building block for the study
contained herein.

Literature Review

The literature reviewed in support of this effort fits broadly into
three separate categories:

  1. Literature concerning the traditional metrics of EMS requests (e.g.,
    demographic and socioeconomic indicators associated with an
    increased likelihood of an EMS incident / request);

  2. Literature on the data-driven optimization of public service
    resources, including EMS, police protection, and fire response
    services; and

  3. Documentation of real world examples of the implementation of data
    driven optimization of EMS resources for a given city or location.

Of critical importance to the prediction modeling was the identification
of various factors and indicators that could help predict the location,
timing, and nature of future EMS incidents and requests (in addition to
the provided historical incident data). Pursuant to the “National study
of ambulance transports to United States emergency departments” (2006),
several key demographic and socioeconomic indicators were identified as
associated with a significantly higher rate of ambulance usage. Such
indicators include age, race, and insurance status, among others (all of
which are easily obtainable from public data sources).

A review of data-driven crime prediction also played an important role
in the identification of both prediction indicators to incorporate in the
analysis and effective statistical modeling techniques to utilize
in such prediction efforts. In addition to historical incident data
(whether it be crime, EMS requests, or some other kind of service event
or request) and the above-described demographic indicators, Koster-Hale
(2017) identified several other factors that influence crime prediction
(as a proxy for EMS incident prediction for the purposes of this study),
including income, population counts, weather patterns, and holidays and
other events; such variables were thus identified as necessary for the
purposes of this study’s prediction modeling (described in further
detail in the “Methodology” section below). Koster-Hale also applied
several statistical regression modeling techniques that were considered
relevant to our analysis due to the similarities in crime prediction to
EMS incident prediction, including the Random Forest, Gradient Boost
Regression Tree, L1, and L2 regression models.

The review of literature concerning the prediction of solid waste
generation in New York City also yielded analysis techniques relevant to
this analysis. Johnson et al. (2016) identified the Gradient Boost
Regression Tree technique for the prediction of solid waste generation
based on data indicators similar to those described above. This
statistical regression will be considered for this study’s predictive
modeling undertakings – commitment to it as the primary regression model
will be dependent on its strength in considering all of the data to be

In the study “Assessing an ambulance service with queuing theory”
(2006), optimization via the ERLANG B modeling is applied to EMS
optimization. Specifically, the ERLANG B technique was used to gauge
performance of EMS resources in Chile, South America, both under
existing EMS resource capacity as well as operational enhancements
(including ambulance fleet augmentation). This methodology is critical
to the second portion of this project’s analysis, where, as previously
described, the likelihood of existing and hypothetical NOEMS resource
capacity to respond to predicted future EMS incidents given a specified
date and time of day is modeled.

The evidence for the utilization of data-driven EMS optimization is
mounting. Goodloe et al. (2017) noted significant improvements in
response times to various types of EMS incidents through data-driven
optimizations within Oklahoma City, a city similar to New Orleans in that
it has experienced a significant growth in EMS incidents in recent
years. Recognizing the successful predictive power of data analytics
from previous projects, Cincinnati has begun to implement an EMS
optimization analysis as well (Eidam, 2016).

Data Description

Based on the goals of the previously described EMS optimization and a
review of various literature sources, various data were collected for
the purposes of this analysis, described below and linked in the
“References” section:

  1. NOEMS Historical Incident Data – This data was provided by the
    client (NOOPA), and contains individual records for every recorded
    EMS incident within the City of New Orleans from September 2013
    through June 2017, with approximately 250,000 total records. Each
    individual incident record contains a variety of attributes that have
    significance to this optimization analysis, including location of
    the incident (usually by street address), zip code, police district,
    priority level, timing (i.e., dispatch, arrival, and route times),
    and disposition type (e.g., incident nature, patient refusal, no
    units available, etc.).

  2. New Orleans Open Data Portal – Several datasets were obtained from
    the New Orleans Open Data Portal for the predictive and optimization
    analyses. These data are in the form of either geospatial shapefiles
    or spreadsheet data files, described as follows:

  • Zip Code Tabulation Shapefile – A geospatial data file delineating zip code tabulation areas within the City of New Orleans. There are 17 zip codes within the City of New Orleans pursuant to this shapefile. The individual records associated with the historical incident data indicate the zip code of the incident; by aggregating incident counts to zip codes, the data can be spatially normalized for comparison to other data sets procured for this analysis, such as zip code level census data.

  • Police Districts Shapefile – A geospatial data file delineating police districts within the geography of City of New Orleans. There are 8 police districts within the City of New Orleans pursuant to this shapefile. The individual records associated with the historical incident data indicate the police district of the incident; by aggregating incident counts to police district, the data can be spatially normalized for comparison to other data sets procured for this analysis, such as the aforementioned census data.

  • Business Activity Data Files – Several data files for business activity within the City of New Orleans were obtained from the New Orleans Open Data Portal, including active occupational business licenses data, alcohol beverage licenses data, and live entertainment venues data. This data provides the location of such businesses (and other spatial and business-related details), concentrations of which could play a significant role in the prediction of EMS incidents in the predictive modeling.

  3. U.S. Census American Community Survey (ACS) 2011 – 2015 Data: As
    described in the literature review, a number of demographic and
    socioeconomic indicators have been identified as significantly
    influential in the prediction of public service provider requests
    (EMS or otherwise). As the U.S. Census is the definitive data source
    for geographic level demographic and socioeconomic statistics, the
    indicators identified in the literature review were obtained from
    this source. Zip code geographic level demographic data for
    population, income level, age, racial composition, and health care
    insurance status were obtained.

  4. LEHD Origin-Destination Employment Statistics (LODES) Data: LODES
    data was obtained from the U.S. Census Bureau. Of
    the various information that the LODES data provides, this analysis
    utilized the spatial distribution of employment within the City of
    New Orleans during typical working hours in order to determine the
    influence of such patterns on the likelihood of EMS incidents.

  5. Weather Data – Historical weather data for the City of New Orleans
    for those periods matching the historical incident data was obtained
    from the Weather Underground. A
    specific focus on precipitation events of various intensities is
    considered relevant to our prediction modeling (as well as some
    extreme weather events, such as tornadoes), as such events were
    identified in the literature review as significant influencers in
    crime prediction (serving as a proxy for EMS incident prediction in
    this study).

  6. Holidays and Other Major Events – National (e.g., Memorial Day)
    and local (e.g., Mardi Gras) events were identified as precursors to
    an increased likelihood of EMS incidents / requests, per the
    literature review.


The EMS incident prediction and resource optimization methodology can be
broadly categorized into four separate steps, including: (1) data
preprocessing; (2) prediction modeling; (3) optimization modeling; and
(4) R Studio application development (the client-requested delivery
platform), described in detail below and illustrated in a workflow chart
in Figure 1 at the end of this document.

Step 1: Data Preprocessing

The primary goal of the data preprocessing step is to assemble all of
the disparate data collected and described in the previous section into
a single data file with a common index. The historical incident data
contains three spatial attributes, including a physical location
(usually an address, but this varies), the incident zip code, and the
incident police district. Ideally, the historical incident location data
would be used to aggregate to a relatively small administrative boundary
that is relevant to our analysis (such as police subzones), but due to
the sheer size of the data set, time constraints, missing / incomplete
location descriptions, and / or ambiguous locations, this was not
feasible by the required project delivery date. While police districts
provide a potentially relevant geography of analysis (in that these
districts are an administrative boundary that is relevant to various
types of emergency events), the zip code geography was selected for this
analysis as it provides a more granular level of analysis, allowing for
more spatially refined modeling.

Given this dynamic, the historical incident data was aggregated to the
zip code level (with incident counts for this geography) and indexed by
date, month, and year. Data obtained from the U.S. Census, including the
ACS and LODES data, was joined directly to this aggregated incident
data, as the census data was procured at the zip code level. Using the
date index, other data indicators were then assigned to this data set,
including various weather conditions within the city on a given day
(mean temperature, humidity, wind speed, and weather events such as
precipitation and tornado events) and national and local holiday events.
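
The aggregation and joining described above can be illustrated with a minimal sketch. It is not the study's preprocessing code: the records, attribute names, and values below are hypothetical placeholders, and only the Python standard library is used. It counts incidents per zip code and date, then attaches zip code-level attributes (as with the census data) and date-level attributes (as with weather and holidays).

```python
from collections import Counter
from datetime import date

# Hypothetical incident records: (zip code, incident date). The real data
# has many more attributes; only these two are needed for the counts.
incidents = [
    ("70112", date(2016, 2, 9)),
    ("70112", date(2016, 2, 9)),
    ("70119", date(2016, 2, 9)),
    ("70112", date(2016, 2, 10)),
]

# Hypothetical zip code-level attributes (placeholder values, not census figures).
zip_attributes = {
    "70112": {"population": 1000},
    "70119": {"population": 2000},
}

# Hypothetical date-level attributes (e.g., weather or holiday flags).
date_attributes = {
    date(2016, 2, 9): {"mardi_gras": True},
    date(2016, 2, 10): {"mardi_gras": False},
}


def build_rows(incidents, zip_attributes, date_attributes):
    """Count incidents per (zip code, date) and attach zip- and date-level attributes."""
    counts = Counter(incidents)
    rows = []
    for (zip_code, day), n in sorted(counts.items()):
        row = {"zip_code": zip_code, "date": day, "incident_count": n}
        # Join on the zip code key, then on the date key.
        row.update(zip_attributes.get(zip_code, {}))
        row.update(date_attributes.get(day, {}))
        rows.append(row)
    return rows


rows = build_rows(incidents, zip_attributes, date_attributes)
```

Each resulting row carries one incident count plus every indicator that applies to that zip code and date, which is the shape a regression model expects.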

The resultant aggregated and indexed data set contains approximately
557,000 records. Several choropleth visualizations at the zip code level,
including total incident counts, per capita income, median age, and
population, are provided in Figures 2 through 5 at the end of the
document. The finalized preprocessed data can be obtained via the link in
the “References” section.

Step 2: Prediction Modeling

The EMS incident prediction analysis used statistical regression
analyses to predict the location (by zip code), date, and timing (by
hour of day) of future EMS incidents within the City of New Orleans.
With that approach in mind, within the compiled data set, the incident
counts for each location and time period functioned as the dependent
variable, while the remaining indicators (e.g., demographics and
weather) functioned as independent variables in the employed
regressions.

In order to test the regression models utilized for this effort, the
models utilized the first chronological half of the data (from September
2013 through September 2015) as “training” data. The regression models
were run on this data subset, and the prediction results were then
compared to the second chronological half of the data (October 2015
through June 2017) to test for accuracy.
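
A chronological split of this kind can be sketched as follows. This is an illustrative example rather than the study's code: the function name, the record layout, and the sample rows are assumptions, and the cutoff date simply mirrors the September 2015 boundary described above.

```python
from datetime import date


def chronological_split(rows, cutoff):
    """Return (train, test): rows dated on or before the cutoff, and rows after it."""
    train = [r for r in rows if r["date"] <= cutoff]
    test = [r for r in rows if r["date"] > cutoff]
    return train, test


# Hypothetical daily records spanning the study period.
rows = [
    {"date": date(2013, 9, 1), "incident_count": 3},
    {"date": date(2015, 9, 30), "incident_count": 5},
    {"date": date(2015, 10, 1), "incident_count": 4},
    {"date": date(2017, 6, 30), "incident_count": 6},
]

train, test = chronological_split(rows, cutoff=date(2015, 9, 30))
```

Splitting by date rather than at random keeps later observations out of the training set, so the test mimics predicting genuinely future incidents.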

Based on the literature review, three statistical regression models were
considered and reviewed for their efficacy in assessing the relationship
between these dependent and independent variables, including the Linear,
Random Forest, and Gradient Boost Regression Tree models, due to their
ability to rank the effect of independent variables on dependent
variables for predictive purposes. The results of the modeling using
these regression techniques are indicated below in the “Results”
section. All of these regressions were applied to the data set via
Python statistical analysis packages and modules. The code for this
exercise can be obtained from the publicly accessible repository linked
in the “References” section.

As indicated in the prediction modeling “Results” section, this
prediction effort was broken down into two separate subsections in order
to achieve EMS incident prediction by hour, described as follows:

  1. EMS Incident Prediction by Zip Code and Date – The Random Forest,
    Gradient Boost Regression, and Linear statistical regressions were
    employed to test the predictive power of the identified indicators
    on the number of incidents by zip code and date. The results of the
    statistical regression model that showed the strongest fit (i.e.,
    the highest coefficient of determination, or R-squared value) would
    then be utilized for the prediction of EMS incidents by hour.

  2. EMS Incident Prediction by Date and Hour – Using the modeling
    results of the first step described above, the number of incidents
    per hour given the date was then assessed utilizing the Random
    Forest and Gradient Boost Regression Tree statistical regression
    models. This was predicted on a citywide basis (i.e., not at the zip
    code level) for the purposes of the optimization prediction
    modeling, explained in more detail below.

The results of these modeling efforts are indicated below in the
“Results” section of this report.
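
Because the models are compared by R-squared, it may help to see how that metric is computed and used to pick a model. The sketch below is illustrative only: the function names and the idea of passing each model's test-period predictions in a dictionary are assumptions, not the study's code.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def best_model(actual, predictions_by_model):
    """Score each model's predictions against the held-out counts and return the best name."""
    scores = {
        name: r_squared(actual, predicted)
        for name, predicted in predictions_by_model.items()
    }
    return max(scores, key=scores.get), scores
```

A value of 1 means the predictions match the held-out counts exactly, while a value of 0 means the model does no better than always predicting the mean count.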

Step 3: Optimization Modeling

The optimization modeling analyzes the results of the previously
described prediction modeling via the ERLANG B modeling technique, with
specification of date, time, and available resource capacity, to predict
the likelihood of a “No Unit Available” disposition event on a citywide
basis.

The ERLANG B loss formula was developed for telephone network routing,
to calculate the probability that a call is lost due to a lack of
connecting resources. These resources are the circuits available to
connect calls; once a resource is being used by a call, it cannot be
used by any other subscriber until released. The ERLANG B formula
considers the statistical amount of time (frequency: A) that a resource
is used out of the specified time frame. This amount could be greater
than one, as more than one resource could be required in a given time
frame. The ERLANG B loss formula uses the Poisson probability of
appearance given a frequency (A) and the number of resources available
(C) to calculate a loss probability (B).

The expression below gives us the probability of having n resources
required given a frequency A.

$$P(n) = \frac{A^{n} e^{-A}}{n!}$$

This formula comes from the Poisson probability for stochastic events.
The probability of loss is given by the probability of all resources
being in use out of all the possibilities of resource usage for a given
amount of resources, ranging from 0 to all resources “C.” This is
quantified via the following formula:

$$B = \frac{\dfrac{A^{C} e^{-A}}{C!}}{\sum_{n=0}^{C} \dfrac{A^{n} e^{-A}}{n!}}$$

If we decompose this formula, the expression e^(-A) is repeated in all
terms, so it can be taken out from all of the expressions:

$$B = \frac{\dfrac{A^{C}}{C!}}{\sum_{n=0}^{C} \dfrac{A^{n}}{n!}}$$

For the purposes of this study, the ERLANG B formula was utilized to
calculate the probability of a “No Unit Available” disposition code given
a specified number of ambulances, a median case time (calculated from
the entire historical incident data set provided by the client), and the
median number of EMS incidents by date and hour (which can be based on
either the historical incident data or the EMS prediction modeling
described above).
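
As a minimal sketch of the calculation described above, the ERLANG B loss probability can be computed directly from the simplified form of the formula, in which the e^(-A) terms cancel. The function names are illustrative, and the way the frequency A is derived here (incidents per hour multiplied by the case time in hours, the standard definition of offered load) is an assumption about how the study combined its inputs.

```python
from math import factorial


def offered_load(incidents_per_hour, case_time_hours):
    """Frequency A: expected number of ambulances busy at once (assumed definition)."""
    return incidents_per_hour * case_time_hours


def erlang_b(load, ambulances):
    """Loss probability B for a frequency `load` (A) and `ambulances` resources (C)."""
    if load < 0 or ambulances < 0:
        raise ValueError("load and ambulances must be non-negative")
    # Numerator: the term for all C resources being in use.
    numerator = load ** ambulances / factorial(ambulances)
    # Denominator: the sum of the terms for 0 through C resources in use.
    denominator = sum(load ** n / factorial(n) for n in range(ambulances + 1))
    return numerator / denominator


# Hypothetical inputs: 6 incidents per hour, each occupying an ambulance for 1 hour.
load = offered_load(incidents_per_hour=6.0, case_time_hours=1.0)
probability_with_8 = erlang_b(load, ambulances=8)
probability_with_10 = erlang_b(load, ambulances=10)
```

Evaluating the function for increasing fleet sizes at a fixed load shows how each additional ambulance lowers the likelihood of a “No Unit Available” event, which is the comparison the optimization relies on.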

This ERLANG B prediction modeling approach was used as the basis for the
R Studio application development, the framework for which is described
below.

Step 4: R Studio Application

Based on the aforementioned modeling efforts and pursuant to the stated
request of the client, an R Studio application was developed, predicated
on the ERLANG B methodology described above, that allows for an
investigation of the citywide likelihood of a “No Unit Available”
disposition code based on the user specification of date, time, and
number of ambulances available. This would give the user (specifically,
NOOPA) the ability to optimize EMS resource scheduling in a manner that
reduces the likelihood of a “No Unit Available” disposition code.

As indicated in the description of the ERLANG B methodology, that
formula can incorporate either the historical incident data or results
of the prediction modeling for the purposes of its intended probability
modeling. To that end, the application allows the user to toggle whether
or not the EMS prediction modeling is applied.


Prediction Modeling

As discussed in the “Methodology” section, the first portion of the
prediction modeling effort focused on the statistical prediction of EMS
incidents based on location (zip code) and date. The various resultant
coefficients of the selected regression models employed on the
preprocessed data are indicated below in Table 1.

Table 1 – Prediction Modeling Regression Metrics, EMS Incidents by
Zip Code and Date

As indicated in Table 1, the Gradient Boost Regression Tree statistical
regression provided the strongest fit (i.e., the highest R-Squared
value) with regard to the analysis at hand. However, upon analysis of
the influence of the various independent variables considered in this
step of the prediction modeling, our team recognized an opportunity for
model simplification. See Figure 6 at the end of this document.

As shown in Figure 6, of the 15 independent variables with the highest
influence in this initial prediction modeling effort, only the top 10
factors exhibit measurably significant influence on this prediction,
including live entertainment venues, mean humidity, mean temperature,
date, maximum wind speed, day of the week, month of the year, number of
jobs, the occurrence of Mardi Gras, and number of businesses. Based on
this finding, this modeling effort was re-run with only these
independent variables considered, resulting in the regression
metrics indicated in Table 2.

Table 2 – Prediction Modeling Regression Metrics, Simplified Data Set

**Note:** The linear regression model was not employed on the revised preprocessed data due to its poor performance in the initial modeling effort.

This approach resulted in only slightly decreased R-Squared values for
both of the statistical regression models employed, such that the
revised preprocessed data is considered a viable input for the
prediction modeling. The logic of this decision lies in simplification
and, therefore, reproducibility; the analysis presented herein is more
easily reproducible and updatable if data that is statistically
insignificant to the prediction no longer has to be accounted for.

This prediction modeling approach was then incorporated into the
prediction modeling for the number of EMS incidents per hour given a
predicted number of incidents on a given date. The various resultant
coefficients of the selected regression models employed on the
preprocessed data are indicated below in Table 3.

Table 3 – Prediction Modeling Regression Metrics, EMS Incidents by
Date and Hour

As shown in the table above, the Gradient Boost Regression Tree provides
the strongest fit for this analysis, with an R-Squared value of 0.37.
While this is a less than desirable value, a number of technical and
timing limitations in the undertaking of this study meant that this
result was carried forward for the purposes of the ERLANG B probability
modeling / R Studio application development. This must ultimately be
viewed as an opportunity for further research, given more time and
resources to undertake such a study.

It was noted that several types of events created unusually high EMS
incident counts, such as extreme weather / disaster events (e.g.,
flooding, tornadoes, and fire emergencies) and cultural events (e.g.,
Mardi Gras). These outlier events have the potential to distort the
prediction modeling effort (see Figure 7 at the end of this document for
an illustration of the difference in EMS incidents on the days of such
events). In order to mitigate this distortion, EMS incident count
outliers attributed to such events were reduced to a value equal to the
mean daily incident count plus a value equal to three standard
deviations of the mean daily incident count.
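
The capping rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the function name is hypothetical, the population standard deviation is used (the report does not say which estimator was applied), and every count above the threshold is capped, whereas the study capped only outliers attributed to the identified events.

```python
from statistics import mean, pstdev


def cap_outliers(daily_counts, n_std=3):
    """Reduce any daily count above mean + n_std standard deviations to that threshold."""
    threshold = mean(daily_counts) + n_std * pstdev(daily_counts)
    return [min(count, threshold) for count in daily_counts]


# Hypothetical daily incident counts: twenty ordinary days and one extreme day.
daily_counts = [10] * 20 + [100]
capped_counts = cap_outliers(daily_counts)
```

Ordinary days are left unchanged, while the extreme day is pulled down to the threshold so that it no longer dominates the fitted model.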

R Studio Application

Within the ERLANG B-based R Studio application developed in support of
this study users can specify various temporal and capacity metrics in
both applications to investigate the likelihood of a “No Unit Available”
disposition code at each hour of the specified day.


As indicated in the “Data Description” section, while the historical
incident data provided by the client contains a local location
attribute, usually in the form of an address, this analysis used the zip
code geography as the spatial level of analysis. Due to time and
technical limitations as well as shortcomings of the location data
itself (missing / incomplete locations and location ambiguity), we were
unable to geocode this attribute to geospatial point locations.
Successful geocoding would allow for aggregation of the incident data to
smaller geographic levels, which could potentially permit a more
refined prediction and optimization model. This is discussed further in
the “Opportunities for Future Research” section below.

Data Governance Considerations

With any study that draws on and analyzes various forms of public and
private quantified data there are important issues related to ethical
and prudent use of said data. While the analysis presented herein (or
any data driven analysis, really) certainly offers a multitude of such
issues to be conscious of, several specific issues within two broad
categories assert primacy and are deliberated upon below.

Ethical Considerations

The real world implementation and use of the EMS optimization model
within the City of New Orleans carries significant ethical implications.
While the purpose of the optimization model is to reschedule and
reallocate NOEMS resources in order to improve incident response
efficiency, the optimization is inevitably based on incomplete data.
While the historical incident data, demographic / socioeconomic
indicators and other data incorporated are undoubtedly significant
determinants (if not the most significant) in the prediction of the
location and timing of future EMS incidents, the optimization is
inherently limited in that it would be impossible to identify all of the
factors (quantitative or otherwise) that contribute to the need for EMS
services to incorporate into the modeling. As such, the NOEMS
optimization could potentially disrupt EMS services to adequately
serviced geographies, or could result in the continued marginalization
of geographies currently under-served by NOEMS resources, among other

Measures can be employed to mitigate and control for this reality,
however. As expounded upon in the “Methodology” section, the regression
models were trained using a subset of the historical incident data, the
results of which were compared to a chronologically later subset of that
data in order to test the accuracy of the models. Further, it is
important that the client remain attentive in keeping the model
contextually current. Much like many other urban areas, the City of New
Orleans is a constantly evolving metropolitan area that will inevitably
manifest new considerations in the prediction of future EMS incidents.
For instance, as a coastal city New Orleans will be more susceptible to
the effects of climate change and associated sea level rise (and
resultant new weather patterns). This inevitability will undoubtedly
influence the location and frequency of EMS incidents in the future; an
EMS optimization model predicated on the prediction of the location and
timing of EMS incidents within the City of New Orleans should account
for this evolving reality. It is thus incumbent on the client to ensure
some sort of semi-regular evaluation of the types of data included in
the model to determine whether they remain appropriate or whether newly
identified data should be incorporated as well.

Another ethical concern associated with the real world implementation of
the model arises from the City’s desire to investigate how additional
EMS resource capacity (i.e., additional ambulances) would allow for the
City to reach additional EMS incidents and therefore increase the
service’s profitability (given a standard fee for each EMS request).
Currently, when the City cannot respond to an EMS incident, the request
is assigned to a private EMS operator – those “lost” requests equate to
a missed financial opportunity for the City. While additional EMS
capacity would undoubtedly expand the ability of the NOEMS to respond to
EMS incidents it could previously not fulfill, model prediction will
have to be tempered with real-world performance indicators of EMS
response under the optimization to ensure that EMS incidents usually
thought outside of the City’s response purview would actually be
obtainable. Potential profit in the ability to capture new EMS requests
should not subvert the actual ability of NOEMS to properly service the


Potential biases associated with this study are encountered in the
various datasets that were provided by the client, acquired from public
data sources, or within the analytical methods employed for predictive
purposes. As this analysis does not involve any sampling the possibility
for various biases often associated with surveying techniques are not

The first statistical bias considered is aggregation bias. As described
by Clark and Avery (1976), information loss associated with the
consolidation of data from individual levels to macro levels represents
an aggregation problem. In the context of this analysis, such an issue
may arise in the aggregation of the individual locations of the
historical incident data to larger geographies (in this case, the zip
code level) in order to establish a consistent spatial reference for
comparing the incident data to U.S. Census ACS demographic and
socioeconomic data. Such an aggregation can have the effect of over- or
under-generalizing the data, which is where the bias can manifest, both
at the data and analysis levels.

At a purely data level, this project-specific aggregation can imbue a
geography with arbitrary significance, depending on the frequency of an
event (in our case, an EMS incident) within that geography. By the same
logic, some geographies may in fact contain significant trends
or patterns within the data, but because overall incidents within the
geography are fairly low, aggregation may undervalue said geography’s
importance. Steps can be taken to mitigate this impact, including
normalizing the aggregated data by either total incident counts or
total population counts and / or using relatively small geographies in
which aggregation is limited in its ability to overinflate or
undervalue significance.
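
The population-based normalization described above can be illustrated with a minimal sketch. It assumes incident counts and population counts are already aggregated to the same geography; the function name, the geography labels, and all numeric values are hypothetical and are not taken from the study data.

```python
from typing import Dict


def incidents_per_1000(incidents: Dict[str, int],
                       population: Dict[str, int]) -> Dict[str, float]:
    """Return EMS incidents per 1,000 residents for each geography."""
    rates = {}
    for geography, count in incidents.items():
        pop = population.get(geography, 0)
        if pop <= 0:
            # Skip geographies without a usable population figure.
            continue
        rates[geography] = 1000.0 * count / pop
    return rates


# Hypothetical values for illustration only (not figures from the study).
incident_counts = {"zip_A": 1200, "zip_B": 300}
population_counts = {"zip_A": 40000, "zip_B": 5000}

rates = incidents_per_1000(incident_counts, population_counts)
for geography, rate in sorted(rates.items()):
    print(f"{geography}: {rate:.1f} incidents per 1,000 residents")
```

In this illustration the geography with fewer raw incidents has the higher per-capita rate, which is the kind of pattern that raw aggregated counts alone could undervalue.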

These data-level aggregation biases also carry over into any regression
analyses run on the data. In terms of our EMS analysis, geographic
location becomes an independent variable (whether it be zip code or some
other kind of spatial region), such that the previously discussed biases
present in said geographic aggregations would manifest in the various
regression metrics. Much like the conclusion reached for data-level
aggregation bias, analysis-level aggregation bias could overstate or
understate the value of the geographic location regression coefficient,
skewing the accuracy of the modeling.

Another bias to consider in our analysis of EMS incident prediction is
that of confirmation bias, where the statistical analysis employed
simply confirms pre-existing conceptions of why an event occurs in the
first place. Our team identified preconceptions such as demographic and
socioeconomic characteristics (e.g., poorer and older communities) that
would create a higher likelihood for an EMS incident in a given
geography. Indeed, our literature review identified demographic and
socioeconomic characteristics of geographies, including age, race,
income, and insurance status, as significant contributors to the
likelihood of an EMS request / incident. Validation of our initial
preconceptions reveals an important truth about confirmation bias, in
that bias is not necessarily always a negative. While it may not always
be the case, this initial preconception helped us define a valid and
credible approach to our methodology and analysis based on logical
assumptions about what catalyzes the phenomenon we are studying.


The prediction and optimization modeling and associated application
represent an important first step in the development of a data-driven
analysis of EMS optimization within the City of New Orleans. Through an
accurate prediction of future EMS incidents within the City of New
Orleans, the ability (or inability) of NOEMS to respond to EMS incidents
given various existing and hypothetical parameters (i.e., date, time,
and available ambulance resources) can be accurately assessed. This has
significant implications for EMS scheduling, particularly the development of
strategies to increase the number of EMS incidents NOEMS is able to
respond to. However, as previously mentioned, this only represents a
first step; as an initial building block in the data-driven optimization
of NOEMS resources, this study can be extended in various ways, as
described in the following section.

Opportunities for Future Research

Smaller Geographies of Analysis for Prediction Modeling

As discussed in the “Limitations” section, if the individual incident
records could be geocoded to geographic point locations, the incident
data could then be aggregated to geographies smaller than zip codes;
more granular geographies could help refine the accuracy of the
prediction analyses, which would in turn improve the optimization
modeling. However, in order for the geocoding effort to be
effective, the overall quality of the location attribute would
have to be improved, as many of the provided locations are missing
information or are ambiguous. This could help improve a shortcoming
encountered in the modeling effort; specifically, the low regression
coefficient associated with the EMS prediction by hour modeling.
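
As a rough illustration of aggregating geocoded incidents to a geography smaller than a zip code, the sketch below bins point locations into a regular latitude / longitude grid. The grid is a stand-in chosen for simplicity (the study does not specify which smaller geography would be used), and the function names, cell size, and coordinates are all hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def grid_cell(lat: float, lon: float, cell_deg: float = 0.01) -> Tuple[int, int]:
    """Map a point to the integer index of the grid cell containing it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def count_by_cell(points: Iterable[Point], cell_deg: float = 0.01) -> Counter:
    """Count incidents per grid cell."""
    return Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)


# Hypothetical geocoded incident locations (illustrative values only).
points = [(29.951, -90.071), (29.952, -90.072), (29.975, -90.101)]

counts = count_by_cell(points)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

A cell size of 0.01 degrees is only a placeholder; census tracts or block groups would be a natural alternative where boundary files are available, at the cost of a point-in-polygon step.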

Improved Historical Incident Data

The client-provided historical incident data, while detailed, could be
expanded to include other attributes relevant to both the incidents and
this study. New data of interest include EMS resource response origin
point, more precise incident location data, and identification of the
medical facility the patient was discharged to. These details would
serve to both improve / refine the analysis contained herein as well as
open avenues for more advanced analyses to be applied (e.g., EMS incident
heat map, optimal traffic network routing based on incident location,
etc.). This could help improve a shortcoming encountered in the modeling
effort; specifically, the low regression coefficient associated with the
EMS prediction by hour modeling. These data would be fairly simple to
collect, via the installation of GPS tracking devices on the ambulance

Cost-Benefit Analysis

As discussed in various sections of this paper, a goal of implementation
of the developed optimization model is to capture a greater number of
EMS incidents / requests that NOEMS could previously not respond to,
which are classified as “No Unit Available” and fulfilled via mutual aid
agreements with other EMS operators. In capturing a higher than average
proportion of annual EMS requests via the optimization, NOEMS would
generate more revenue given a fee associated with each fulfilled
request. The total annual revenue could be projected and compared for
existing EMS resources under current scheduling, existing EMS resources
under optimized scheduling, and hypothetical EMS resource capacity under
optimized scheduling. Such financial projections could serve to
legitimize requests by NOEMS (or NOOPA) for additional funding to expand
the EMS ambulance fleet (or, at least, more dedicated maintenance to
keep more ambulances that are a part of the existing fleet in service
more often).
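
The three-way revenue comparison described above could be sketched as follows. This is a minimal illustration covering only the revenue side (fleet and staffing costs are omitted); the per-request fee, annual request volume, and fulfilled shares are hypothetical placeholders rather than NOEMS figures, and the class and function names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    total_requests: int      # annual EMS requests (hypothetical)
    fulfilled_share: float   # fraction of requests NOEMS responds to


def projected_revenue(scenario: Scenario, fee_per_request: float) -> float:
    """Annual revenue = requests fulfilled by NOEMS x fee per request."""
    return scenario.total_requests * scenario.fulfilled_share * fee_per_request


FEE = 500.0  # hypothetical standard fee per fulfilled request

scenarios = [
    Scenario("Existing resources, current scheduling", 60000, 0.90),
    Scenario("Existing resources, optimized scheduling", 60000, 0.93),
    Scenario("Expanded capacity, optimized scheduling", 60000, 0.97),
]

baseline = projected_revenue(scenarios[0], FEE)
for s in scenarios:
    revenue = projected_revenue(s, FEE)
    print(f"{s.name}: {revenue:,.0f} ({revenue - baseline:+,.0f} vs. baseline)")
```

In practice, the fulfilled share for each scenario would come from the optimization model output rather than being set by hand, and any projected revenue gain would be weighed against the cost of the additional ambulances or maintenance.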


Figure 1 – Methodology Workflow Chart

Figures 2 and 3 – Selected Aggregated Census Metrics, 1 of 2

Figures 4 and 5 – Selected Aggregated Census Metrics, 2 of 2

Figure 6 – Influence of Independent Variables within the Statistical Regression Models

Figure 7 – Incident Count Graph Illustrating EMS Incident Outliers



  1. City of New Orleans (2016, March). City of New Orleans Staffing and
    Presentation.

  2. Larkin, G. L., Claassen, C. A., Pelletier, A. J., & Camargo, C. A.
    National study of ambulance transports to United States emergency
    departments: The importance of mental health problems. Prehosp
    Disast Med

  3. Koster-Hale (2017). Rent, Rain, and Regulations--What Predicts
    Crime in Portland, Oregon?

  4. Johnson, N. E. et al. (2017). Patterns of waste generation: A
    gradient boosting model for short-term waste prediction in New York
    City. International Journal of Integrated Waste Management, Science
    and Technology, 62, 3-11.

  5. Singer, M., & Donoso, P. (2008). Assessing an ambulance service
    with queuing theory. Computers and Operations Research, 35(8),
    2549-2560.

  6. Goodloe, J. M. (2017). Oklahoma’s Data-Driven Approach to Urban EMS
    Response Time Reform. Journal of Emergency Medical Services.
    Retrieved from

  7. Eidam, E. (2016, August 9). Cincinnati Predictive Analytics Project
    Takes Aim at Emergency Medical Services. Government Technology.

  8. Clark, W. A. V., & Avery, K. L. (1976). The Effects of Data
    Aggregation in Statistical Analysis. Geographical Analysis, 8,
    428–438. doi:10.1111/j.1538-4632.1976.tb00549.x


  1. U.S. Census Bureau, American Community Survey:

  2. U.S. Census Bureau, Selected Economic Characteristics:

  3. U.S. Census Bureau, LEHD LODES Data: (select LODES 7.0 WAC for

  4. Weather Underground, Historic Weather Data for the City of New

  5. New Orleans Open Data Portal, multiple data files:

    1. Zip Code Geography Shapefile

    2. Police Districts Geography Shapefile

    3. Occupational Business Licenses Data

    4. Alcohol Beverage Outlets Data

    5. Live Entertainment Venues Data

Study Data and Code Repositories

Note that due to NYU protocol, permission to access this repository must be
granted by NYU.
