PUI2016 Extra Credit Project Report: Medicaid Hospital Visits Analysis in New York City

Zhaohong Niu, EstherNiu[github], zn352


This research examines Medicaid beneficiaries' hospital visit data to better understand the Medicaid program in New York City. Using Python and statistical tools, I identify the most common types of chronic disease, compare ER visits with inpatient visits, and use OLS regression to predict the number of beneficiaries from total inpatient admissions. The results show that cardiovascular disease, mental disease, and substance abuse are the most common types of disease among Medicaid beneficiaries, and that a high number of patients with mental diseases and disorders visit the ER. These findings can help distribute medical resources more effectively. The fitted OLS model is: unique number of beneficiaries with inpatient admissions = 0.3899 * total inpatient admissions. This prediction helps in understanding how the number of Medicaid beneficiaries may grow in the future.


This research focuses on how the Medicaid program in New York City helps beneficiaries who suffer from chronic diseases receive treatment. Beyond basic visualization of the most common types of disease that bring Medicaid beneficiaries to the hospital, I want to identify differences within each disease category, predict the future trend in the number of patients, and address other questions related to the effectiveness of the Medicaid program. The results can give the NYC government a deeper understanding of how Medicaid has helped uninsured families treat chronic diseases, and of which types of chronic disease are most common among people who rely on Medicaid. This can inform policy-making aimed at better balancing and distributing Medicaid resources, funds, and vacancies.


I acquired and analyzed two datasets from the New York State Department of Health.

The Medicaid hospital visits dataset (Medicaid Chronic Conditions, Inpatient Admissions and Emergency Room Visits by County, 2012-2014) [1] is the primary data source for my analysis. Published by the New York State Department of Health, the dataset contains the number of visits to the Emergency Room and to the Inpatient Department of every hospital in New York State, as well as the number of inpatient admissions for each year from 2012 to 2014. The definitions of some important fields are given below [5]:

  • Major Diagnostic Category: major disease type
  • Episode Disease Category: disease subordinate to major disease
  • Beneficiaries with Condition: number of Medicaid beneficiaries diagnosed with a certain type of disease
  • Beneficiaries with Admissions: unique number of Medicaid beneficiaries that had at least one inpatient admission
  • Total Inpatient Admissions: total number of inpatient admissions by Medicaid beneficiaries
  • Beneficiaries with ER Visits: unique number of Medicaid beneficiaries that had at least one ER visit
  • Total ER Visits: total number of ER visits by Medicaid beneficiaries

Each type of visit (inpatient, ER) is reported both as the unique number of beneficiaries and as the total number of visits by those beneficiaries, which indicates that beneficiaries with chronic conditions visit hospitals more than once. This distinction is important because the later analysis relies on these definitions. With the number of visits by chronic disease, I can compare visit counts across disease types and within a subgroup.
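
The relationship between total visits and unique beneficiaries can be illustrated with a minimal Python sketch. It is not the code used in this project: the rows below are made-up numbers, the function name visits_per_beneficiary is hypothetical, and only the field names are taken from the list above. Summing unique beneficiaries across rows may also double-count people who appear under more than one condition, so the ratio should be read as a rough indicator only.

```python
# Made-up example rows; keys follow the dataset field names listed above.
rows = [
    {"Major Diagnostic Category": "Cardiovascular",
     "Beneficiaries with Admissions": 100, "Total Inpatient Admissions": 250,
     "Beneficiaries with ER Visits": 200, "Total ER Visits": 300},
    {"Major Diagnostic Category": "Cardiovascular",
     "Beneficiaries with Admissions": 100, "Total Inpatient Admissions": 150,
     "Beneficiaries with ER Visits": 200, "Total ER Visits": 500},
    {"Major Diagnostic Category": "Mental",
     "Beneficiaries with Admissions": 50, "Total Inpatient Admissions": 75,
     "Beneficiaries with ER Visits": 100, "Total ER Visits": 400},
]


def visits_per_beneficiary(rows, total_col, unique_col):
    """Sum both columns per major category and return total / unique."""
    sums = {}
    for row in rows:
        category = row["Major Diagnostic Category"]
        total, unique = sums.get(category, (0, 0))
        sums[category] = (total + row[total_col], unique + row[unique_col])
    # Skip categories with no beneficiaries to avoid dividing by zero.
    return {c: t / u for c, (t, u) in sums.items() if u > 0}


inpatient_ratio = visits_per_beneficiary(
    rows, "Total Inpatient Admissions", "Beneficiaries with Admissions")
er_ratio = visits_per_beneficiary(
    rows, "Total ER Visits", "Beneficiaries with ER Visits")

for category in inpatient_ratio:
    print(category, inpatient_ratio[category], er_ratio[category])
```

A ratio above 1 means that, on average, each beneficiary in that category had more than one visit of that type.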

The other dataset is Medicaid Enrollment Number by Month (2009-2016) [2], which includes demographic information such as number of recipients, race, gender, age group, and plan type. The demographic information allows me to analyze enrollment over the years by different demographic indicators.
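
As a rough sketch of how monthly enrollment might be summarized by a demographic indicator, the Python example below averages the monthly recipient counts within each year and group. The record keys ("year", "month", "gender", "recipients"), the function name mean_monthly_enrollment, and all numbers are assumptions made for illustration; the actual column names in the dataset may differ.

```python
from collections import defaultdict

# Made-up monthly records; the keys are hypothetical, not the dataset's own.
records = [
    {"year": 2014, "month": 1, "gender": "F", "recipients": 100},
    {"year": 2014, "month": 2, "gender": "F", "recipients": 120},
    {"year": 2014, "month": 1, "gender": "M", "recipients": 80},
    {"year": 2014, "month": 2, "gender": "M", "recipients": 100},
    {"year": 2015, "month": 1, "gender": "F", "recipients": 130},
]


def mean_monthly_enrollment(records, indicator):
    """Average monthly recipients for each (year, indicator value) pair."""
    sums = defaultdict(int)
    months = defaultdict(set)
    for rec in records:
        key = (rec["year"], rec[indicator])
        sums[key] += rec["recipients"]
        months[key].add(rec["month"])
    # Divide by the number of distinct months observed for that pair.
    return {key: sums[key] / len(months[key]) for key in sums}


by_gender = mean_monthly_enrollment(records, "gender")
for (year, group), mean in sorted(by_gender.items()):
    print(year, group, mean)
```

Averaging rather than summing the monthly counts avoids counting the same recipient once per month within a year.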