Methodology:
After processing the data, I analyzed the pairwise correlations between all variables to identify relationship patterns. I then selected the independent variables with the strongest correlation coefficients and regressed the dependent variable, the ratio of unbanked population, on them.
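The two steps above, correlation screening followed by a linear regression, can be sketched roughly as follows. This is only an illustration: the data is synthetic, and the variable names, the 0.5 correlation threshold, and the helper functions `select_by_correlation` and `fit_ols` are assumptions made for the example, not the data or code used in this analysis.

```python
import numpy as np

# Synthetic stand-in data (NOT the data used in this analysis).
rng = np.random.default_rng(0)
n = 200
unemployment = rng.normal(8.0, 2.0, n)
median_income = 90.0 - 4.0 * unemployment + rng.normal(0.0, 5.0, n)
unrelated = rng.normal(0.0, 1.0, n)  # a variable with no real relationship
unbanked_ratio = (0.2 + 0.02 * unemployment - 0.001 * median_income
                  + rng.normal(0.0, 0.01, n))

# Hypothetical candidate predictors, keyed by name.
candidates = {
    "unemployment": unemployment,
    "median_income": median_income,
    "unrelated": unrelated,
}


def select_by_correlation(candidates, target, threshold=0.5):
    """Keep predictors whose absolute Pearson r with the target meets the threshold."""
    selected = {}
    for name, values in candidates.items():
        r = float(np.corrcoef(values, target)[0, 1])
        if abs(r) >= threshold:
            selected[name] = r
    return selected


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R-squared)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef, float(1.0 - ss_res / ss_tot)


selected = select_by_correlation(candidates, unbanked_ratio)
X = np.column_stack([candidates[name] for name in selected])
coef, r_squared = fit_ols(X, unbanked_ratio)

print("selected predictors and their correlations:", selected)
print("R-squared:", round(r_squared, 3))
```

Screening on the absolute value of the coefficient keeps strong negative relationships as well as strong positive ones.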
The independent variables in this model are strongly correlated with one another, so a model designed to handle multicollinearity, such as ridge regression, would have been a better option. However, the linear regression model had an R-squared of 0.859, which suggests that it fits the data reasonably well. It is also worth noting that the Urban Institute (UI) used regression analysis to build the prediction model in its own analysis.
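For reference, here is a minimal sketch of how ridge regression differs from ordinary least squares when two predictors are nearly collinear. Ridge regression was not used in this analysis; the synthetic data, the helper `fit_ridge`, and the penalty value `alpha=10.0` are all assumptions chosen for illustration.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression on standardized predictors.

    The intercept is left unpenalized by centering y; alpha=0 reduces to OLS.
    Returns the coefficients on the standardized scale and the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    y_mean = y.mean()
    penalty = alpha * np.eye(Z.shape[1])
    beta = np.linalg.solve(Z.T @ Z + penalty, Z.T @ (y - y_mean))
    return beta, y_mean


# Synthetic data with two nearly collinear predictors (illustrative only).
rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(0.0, 1.0, n)
x2 = x1 + rng.normal(0.0, 0.05, n)  # almost a copy of x1
y = x1 + x2 + rng.normal(0.0, 0.5, n)
X = np.column_stack([x1, x2])

beta_ols, _ = fit_ridge(X, y, alpha=0.0)     # no penalty: plain least squares
beta_ridge, _ = fit_ridge(X, y, alpha=10.0)  # hypothetical penalty strength

print("OLS coefficients:  ", beta_ols)
print("ridge coefficients:", beta_ridge)
```

The penalty shrinks the coefficient vector and pulls the two collinear coefficients toward each other, which is why ridge estimates tend to be more stable than least-squares estimates under multicollinearity. In practice the penalty strength would normally be chosen by cross-validation.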
Conclusions:
The correlation analysis showed a strong positive correlation between banking status (measured here as the ratio of unbanked population) and unemployment, and a strong negative correlation between banking status and median income. These findings are in line with the Urban Institute's results: the UI report identifies the Bronx as the borough with the highest unbanked population, the highest poverty and unemployment levels, and the lowest median income. 6