Endogenous Variables
My target variables were the number of healthy venues per person in each PUMA and each PUMA’s average minimum distance to a healthy or unhealthy venue. My first step was to sample 5,000 latitude-longitude points at random across New York City (excluding Staten Island)[3], with the number of points in each PUMA proportional to its area. I fed these coordinates into the location field of Yelp’s Fusion API to query businesses within 800 meters of each point. Sampling in proportion to area ensured that businesses in each PUMA were sampled in keeping with its size, which matters for the validity of the minimum-distance calculation.
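To make the sampling step concrete, here is a minimal sketch of one way to draw points in proportion to area. It is not the code used for this project. The function names, the toy zone areas, and the triangular "zone" are all hypothetical, and real PUMA boundaries would come from a shapefile. The sketch splits the sample budget across zones by area, then uses rejection sampling inside each zone's bounding box. It treats longitude and latitude as flat x/y coordinates, which I am assuming is an acceptable approximation over an area as small as a single PUMA.

```python
import random


def allocate_samples(areas, total):
    """Split `total` samples across zones in proportion to area.

    Uses largest-remainder rounding so the counts sum exactly to `total`.
    """
    total_area = sum(areas.values())
    quotas = {zone: total * area / total_area for zone, area in areas.items()}
    counts = {zone: int(quota) for zone, quota in quotas.items()}
    leftover = total - sum(counts.values())
    # Give the remaining samples to the zones with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda z: quotas[z] - counts[z], reverse=True)
    for zone in by_remainder[:leftover]:
        counts[zone] += 1
    return counts


def point_in_polygon(x, y, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_in_polygon(polygon, n, rng):
    """Draw `n` uniform points inside `polygon` by rejection sampling."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    points = []
    while len(points) < n:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points


# Hypothetical toy data: three zones with made-up areas (arbitrary units).
toy_areas = {"zone_a": 10.0, "zone_b": 30.0, "zone_c": 60.0}
toy_counts = allocate_samples(toy_areas, 5000)

# Hypothetical triangular zone, used only to exercise the sampler.
toy_polygon = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
toy_points = sample_in_polygon(toy_polygon, toy_counts["zone_a"], random.Random(0))
```

Each sampled point would then serve as the center of one 800-meter business query. Sampling uniformly over the whole city and assigning points to PUMAs afterward would give the same area-proportional distribution in expectation. Allocating counts up front fixes each PUMA's sample size exactly.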