# Progmosis: Evaluating Risky Individual Behavior During Epidemics Using Mobile Network Data

# Abstract

The ability to analyze, quantify and forecast epidemic outbreaks is fundamental when devising effective disease containment strategies. Policy makers are faced with the intricate task of drafting realistically implementable policies that strike a balance between risk management and cost. Two major techniques policy makers have at their disposal are epidemic modeling and contact tracing. Models are used to forecast the evolution of the epidemic both globally and regionally, while contact tracing is used to reconstruct the chain of people who have been potentially infected, so that they can be tested, isolated and treated immediately. However, both techniques might provide limited information, especially during an already advanced crisis when the need for action is urgent.

In this paper we propose an alternative approach that goes beyond epidemic modeling and contact tracing, and leverages behavioral data generated by mobile carrier networks to evaluate contagion risk on a per-user basis. The individual risk represents the loss incurred by not isolating or treating a specific person, both in terms of how likely this person is to spread the disease and how many secondary infections he or she would cause. To this end, we develop a model, named Progmosis, which quantifies this risk based on movement and regional aggregated statistics about infection rates. We develop and release an open-source tool that calculates this risk based on cellular network events. We simulate realistic epidemic scenarios, based on an Ebola virus outbreak; we find that gradually restricting the mobility of a subset of individuals reduces the number of infected people after 30 days by 24%.

While these results are promising, it is important to underline that this is only initial, foundational work and to stress some key points. First, this paper focuses on a theoretical model, rather than on its actual translation into a real-world system. In particular, centralized deployments of this model would pose several ethical questions, as they would require access to user data. Decentralized deployments, in which user mobility data never leaves the user’s mobile device, are possible and should be preferred, as they fully protect user privacy. Second, results are generated from computer-based simulations, under specific assumptions. Social factors and technical difficulties might greatly affect results obtained in the real world. Third, this risk-assessment tool is not designed specifically for implementing containment measures based on mobility restrictions. For example, it could be used to advise users about the most appropriate behavior given their risk profile (e.g., voluntarily change their own behavior, see a doctor, and similar); users would ultimately choose whether or not to follow the advice. Finally, the simulations were run on call data records from a country that is, according to the WHO, Ebola-free (WHO 2014), and this work has been commissioned neither by Orange nor by any other entity in preparation for a real-world disease outbreak.

# Introduction

The world is facing a number of severe healthcare challenges and, indeed, the recent Ebola outbreak seems to be one of the most worrisome and urgent. Mr David Nabarro, Special Envoy of the UN Secretary-General, said at an informal UN meeting that he had never encountered a challenge like Ebola in the 35 years of his professional life: “This outbreak has moved out of rural areas and it’s coming to towns and cities. It’s no longer just affecting a very well-defined location, it’s affecting a whole region and it’s now impacting the whole world”¹.

Modern transportation systems make it possible for people to travel easily across a country and across the globe but, unfortunately, they make the same possible for diseases. Today’s rich transportation networks enable human disease carriers to move quickly between distant regions, which facilitates the spread of diseases (Merler 2010). In this context, drastic measures like banning transportation to disease-affected areas are difficult to implement, have a high cost and are actually believed to worsen the outbreak (Chamary 2014, Meloni 2011). The need for smaller, targeted interventions matches the increasing availability of large-scale data, especially from mobile networks. The benefit of mobile-phone records in combating quickly spreading diseases like Ebola is unquestionable (Economist 2014).

When an outbreak becomes global, an infected person can be found anywhere, in cities as well as rural areas, and regardless of country boundaries; this might suggest that no place is really safe. However, we argue that some people and places are more exposed to the risk than others.

We propose to use mobile networks to unveil this heterogeneity and to exploit it to our advantage. We envision a system that utilizes the data coming from mobile carriers and, where available, social networks and smartphones, to construct individual-based risk models. The system can assess the risk associated with a person, primarily based on that person’s mobility patterns and, optionally, on other demographic or behavioral indicators that can be inferred from the data. We would like to highlight the characterizing features of the proposed solution: first, it can use data that is readily available (such as cellphone carrier data), and second, it is able to operate under uncertainty (it does not require knowledge of the identity of the infected).

The risk model can be used in several real-world scenarios, especially when an urgent response is required. In particular, the model can be used to answer the following questions. Who should be tested early for signs of the disease, and possibly put into quarantine if positive? Given that vaccinations can be produced and administered only at a certain rate, who should get vaccinated first? Who should receive information about prevention, for example by means of text messages? All these scenarios describe individual-based interventions that are very hard to administer quickly over large populations. The model can prioritize the people to be targeted with the intervention sooner rather than later.

# Motivation

People’s behavior is highly heterogeneous. Existing epidemic models are based on analyses conducted at population level to assess how infectious a disease is, based on the basic reproductive ratio $$r_0$$, i.e., the average number of secondary cases generated by a single infected person. However, several studies have concluded that spreading processes are usually highly heterogeneous and that a few individuals are responsible for a large proportion of the spreading. The presence of these influential spreaders has been investigated for generic networks (Kitsak 2010), as well as in epidemic processes. Superspreading seems to be a common feature of the spread of diseases, and targeted individual-based control measures are much more effective than population-wide measures, as reported by Lloyd-Smith et al. (Lloyd-Smith 2005). For this reason, identifying superspreaders is extremely important in order to contain epidemics.
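To make the point about heterogeneity concrete, the following minimal sketch compares two hypothetical populations with the same average number of secondary cases: one in which everyone is identical, and one in which the individual reproductive number is drawn from a gamma distribution. The values `r0=2.0` and `k=0.2`, the function names and the gamma assumption are illustrative choices made here, not parameters or methods taken from this paper.

```python
import random


def individual_reproductive_numbers(r0, k, n, seed=0):
    """Draw n individual reproductive numbers with mean r0.

    Assumption: individual variation follows Gamma(shape=k, scale=r0/k),
    so a smaller k means a more heterogeneous population.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(k, r0 / k) for _ in range(n)]


def share_from_top(values, top_fraction=0.2):
    """Fraction of total expected transmission due to the top individuals."""
    ranked = sorted(values, reverse=True)
    top_n = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Same mean (2.0 secondary cases per person), very different distributions.
homogeneous = [2.0] * 10_000
heterogeneous = individual_reproductive_numbers(r0=2.0, k=0.2, n=10_000)

print("homogeneous, share of top 20%:  ", share_from_top(homogeneous))
print("heterogeneous, share of top 20%:", share_from_top(heterogeneous))
```

In the homogeneous population the top 20% of individuals account for exactly 20% of expected transmission; in the heterogeneous one they account for most of it, which is why a population-level average alone cannot indicate whom to target.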

Existing techniques, such as contact tracing, are not sufficient. Efforts in fighting disease outbreaks currently focus mainly on contact tracing, as is happening for Ebola (Murphy 2014). Contact tracing works by finding all the people who have been in contact with an infected person, and then interviewing, monitoring and, when necessary, isolating them. The process is repeated for everyone who is found to be infected. While contact tracing can be effective, it has some drawbacks. First, information provided by people might be subject to errors, due to fear, shame, faulty memory or other reasons. Second, contact tracing needs time: it only starts when a person has already been diagnosed with the disease, or at least shows symptoms. Tracing the contacts also takes time: if the disease has an asymptomatic phase or is highly infective, the contacts may well have infected others before they are traced.
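The tracing procedure described above is essentially a breadth-first traversal of a contact graph that only expands through people found to be infected. The sketch below illustrates that idea under strong simplifying assumptions: the contact lists are complete and accurate, and testing is instantaneous and error-free. The names `trace_contacts`, `contacts` and `is_infected`, and the toy data, are hypothetical.

```python
from collections import deque


def trace_contacts(contacts, is_infected, index_case):
    """Trace contacts starting from a diagnosed index case.

    contacts:    dict mapping each person to the people they report having met
    is_infected: callable returning True if a traced person tests positive
    Returns (set of everyone reached by tracing, list of infected people found).
    """
    traced = {index_case}
    infected = []
    queue = deque([index_case])
    while queue:
        person = queue.popleft()
        infected.append(person)
        for contact in contacts.get(person, ()):
            if contact in traced:
                continue
            traced.add(contact)
            # Only contacts who test positive have their own contacts traced.
            if is_infected(contact):
                queue.append(contact)
    return traced, infected


# Toy example: "c" tests negative, so "e" is never reached by tracing.
toy_contacts = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": []}
toy_positive = {"a", "b", "d"}
traced, infected = trace_contacts(toy_contacts, toy_positive.__contains__, "a")
print(sorted(traced), infected)
```

Each step of the loop stands for an interview and a test, which is where the delays and reporting errors discussed above enter in practice.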

Localization techniques have already been used successfully in critical scenarios. Nigeria recently resorted to GPS technology to improve, scale up and speed up contact tracing, repurposing GPS devices used for polio vaccinations (Fasina 2014, Gates 2014). The country’s huge effort resulted in the eradication of Ebola, and Nigeria was declared “Ebola-free” by the WHO¹. While this success story demonstrates how useful location tracking can be in similar scenarios, the very same strategy could not have been used if the epidemic had been in a more advanced state, i.e., if many more people had already been infected. For this reason, we believe it is very important to investigate alternative systems that provide coarser location tracking but cover a large number of individuals.

Medical treatment is scarce and costly. For example, in the case of Ebola, although the disease is seen as a serious challenge by the whole world, vaccines face serious technical and financial hurdles before they can be administered². When a resource such as vaccines is scarce, who should be given priority for vaccination?