A Data-Driven Evaluation of Delays in Criminal Prosecution

Hrafnkell Hjorleifsson, Michelle Manting Ho, Christopher Prince, and Achilles Edwin Alfred Saxby, NYU Center for Urban Science and Progress
Mentors: Federica B. Bianco, NYU Center for Urban Science and Progress
Sponsors: NYU BetaGov/Litmus and the Santa Clara District Attorney’s Office.


The District Attorney’s office of Santa Clara County, California has observed long durations for its prosecution processes. It is interested in assessing the drivers of prosecutorial delays and determining whether there is evidence of disparate treatment of accused individuals in pre-trial detention and criminal charging practices. A recent report from the county's civil grand jury found that only 47% of cases from 2013 were resolved in less than a year, far below the statewide average of 88%. We describe a visualization tool and analytical models to identify factors affecting delays in the prosecutorial process and any characteristics that are associated with disparate treatment of defendants. Using prosecutorial data from January through June of 2014, we find that the time to close the initial phase of prosecution (the entering of a plea), the initial plea entered, the type of court in which a defendant is tried, and the main charged offense are important predictors of whether a case will extend beyond one year. Prosecution durations are not found to differ significantly across racial and ethnic populations, and race and ethnicity do not appear as important features in our models for predicting case durations longer than one year. Further, we find that, in this data, 81% of felony cases were resolved in less than one year, far more than the share reported by the civil grand jury.

Prior Work on Evaluating Prosecutorial Efficiency

Existing measurements of prosecution performance

Previous studies have examined case-processing time as a standardized measurement allowing comparison across jurisdictions (Klemm 1986). To use case-processing time, researchers must first subdivide case timelines into appropriate time frames and restrict the scope to time under the control of the court system (Neubauer 1983). Studies have also shown that case complexities such as prior convictions, mandatory minimums, and the number of defendants may contribute to the length of a case in specific jurisdictions (Luskin and Luskin 1986; Walsh et al. 2015). These findings align with the expectations of prosecutors at the SCC District Attorney’s office and form the basis for our capstone project.
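As an illustration of what subdividing a case timeline can look like in practice, the following is a minimal sketch and not the pipeline used in this study. The case records, the field names (`filed`, `plea`, `disposed`), and the two-phase split are all hypothetical assumptions chosen for the example. It computes the duration of each phase per case and the share of cases resolved in less than one year.

```python
from datetime import date

# Hypothetical case records: field names and dates are illustrative
# assumptions, not drawn from the Santa Clara County data.
cases = [
    {"id": "A", "filed": date(2014, 1, 6), "plea": date(2014, 2, 3), "disposed": date(2014, 9, 15)},
    {"id": "B", "filed": date(2014, 3, 10), "plea": date(2014, 6, 2), "disposed": date(2015, 7, 20)},
    {"id": "C", "filed": date(2014, 5, 1), "plea": date(2014, 5, 15), "disposed": date(2014, 11, 3)},
]


def phase_durations(case):
    """Split one case timeline into two phases and return durations in days."""
    return {
        "filing_to_plea": (case["plea"] - case["filed"]).days,
        "plea_to_disposition": (case["disposed"] - case["plea"]).days,
        "total": (case["disposed"] - case["filed"]).days,
    }


def share_resolved_within(case_list, max_days=365):
    """Fraction of cases whose total duration is strictly less than max_days."""
    totals = [phase_durations(c)["total"] for c in case_list]
    return sum(t < max_days for t in totals) / len(totals)


if __name__ == "__main__":
    for c in cases:
        print(c["id"], phase_durations(c))
    print("share resolved in under one year:", share_resolved_within(cases))
```

A real analysis would also need to exclude intervals outside the court system's control, such as periods when a defendant cannot be located, which this sketch does not attempt.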

In recent years, there have been other data-driven efforts to evaluate and compare court system performance. One such effort is Measures for Justice (2017), an initiative to aggregate and compare the performance of criminal justice systems from arrest to post-conviction for the entire country via an interactive public dashboard. One of the largest challenges for such efforts is that criminal justice data are neither recorded uniformly across local jurisdictions nor publicly available. The solution adopted by Measures for Justice is to reach out individually to jurisdictions to obtain data and then create standardized core measurements for evaluating performance.

In addition to parsing and understanding case timelines, another motivation of this capstone is to determine whether defendant characteristics help explain delays in resolution, which would indicate the presence of disparities. It is widely perceived that racial and ethnic disparities pervade the criminal justice system, and much research has been conducted on biases at the point of arrest and police interaction (Ross 2015). However, no previous work has found the presence of racial disparities in criminal-case processing times.

Previous analytical techniques

Machine learning models can support decision making when large amounts of data are available. To be adopted by policy makers, though, they must be easily interpretable and cost-effective. Previous studies on time to disposition are dominated by linear regression and basic exploratory analysis. The use of machine learning techniques in the field of criminology is just beginning to emerge. Existing work, such as the use of tree-based classifiers to model the outcomes of cases (Katz 2017) and advanced techniques for modeling cost-effective treatment regimes to optimize bail decisions (Lakkaraju 2016), focuses on accuracy of prediction and optimization. Applying advanced models to case-processing time could help prosecutors make decisions that both minimize case length and prioritize fair outcomes.