Step 5:
Regularized linear regression was fit with cross-validation on three predictors: year, population and county. Because county is a categorical variable, it was encoded as dummy variables. Two regularized models were compared, Ridge and Lasso; on this data, Ridge regression gave a slightly better result. 5-fold cross-validation was used to guard against over-fitting.
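
The workflow described above could be sketched roughly as follows. This is a minimal illustration, not the original analysis code: it assumes pandas and scikit-learn, and the data frame, the column name `target`, the synthetic data, the `alpha` values and the scaling step are all placeholders, since the write-up does not show the dataset, the response variable or the settings that were used.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): the real dataset is not shown in the text.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "year": rng.integers(2000, 2020, size=n),
    "population": rng.integers(10_000, 1_000_000, size=n),
    "county": rng.choice(["A", "B", "C", "D"], size=n),
})
county_effect = df["county"].map({"A": 0.0, "B": 5.0, "C": -3.0, "D": 2.0})
# "target" is a hypothetical response column made up for this sketch.
df["target"] = (
    0.5 * (df["year"] - 2000)
    + 1e-5 * df["population"]
    + county_effect
    + rng.normal(0, 1, size=n)
)

# Dummy-encode the categorical county column; drop_first removes one
# redundant level so the dummies are not perfectly collinear.
X = pd.get_dummies(
    df[["year", "population", "county"]], columns=["county"], drop_first=True
).astype(float)
y = df["target"]

# 5-fold cross-validation, shuffled with a fixed seed for repeatability.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Standardizing inside the pipeline keeps the penalty comparable across
# features; the alpha values are arbitrary and would normally be tuned.
models = {
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "lasso": make_pipeline(StandardScaler(), Lasso(alpha=0.1)),
}

cv_rmse = {}
for name, model in models.items():
    # scikit-learn reports negated MSE so that larger is better; flip the sign.
    mse = -cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
    cv_rmse[name] = float(np.sqrt(mse.mean()))
    print(f"{name}: cross-validated RMSE = {cv_rmse[name]:.3f}")
```

Comparing the two cross-validated errors is one way to arrive at a statement such as "Ridge gave a slightly better result"; which model comes out ahead depends on the data and on the penalty strengths chosen.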