Algebraic Properties of OLS Statistics

1.  The sum, and therefore the sample average, of the OLS residuals is zero. Mathematically,
\(\sum \limits _{i = 1} ^n \hat u_i = 0\)
This needs no separate proof: since \(\hat u_i = y_i - \hat \beta_0 - \hat \beta_1 x_i\), the first order condition for \(\hat \beta_0\) states exactly that the residuals sum to zero. In words, the beta estimates are chosen to make the residuals add up to zero, though this says nothing about the residual for any particular observation i.
2.  The sample covariance between the regressor and the OLS residuals is zero. This follows from the first order condition for \(\hat \beta_1\):
\(\sum \limits _{i = 1} ^n x_i \hat u_i = 0\)
Because the residuals also sum to zero (property 1), this implies that the sample covariance between \(x_i\) and \(\hat u_i\) is zero.
3.  The point (\(\bar x, \bar y\)) is always on the OLS regression line: plugging \(\bar x\) into \(\hat y = \hat \beta_0 + \hat \beta_1 x\) and using \(\hat \beta_0 = \bar y - \hat \beta_1 \bar x\) gives \(\bar y\). These three properties are verified numerically in the sketch below.
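The following is a minimal sketch of these properties, assuming simulated data and using only numpy; the variable names and parameter values are illustrative, not from any real dataset:

```python
import numpy as np

# Simulated data for illustration only.
rng = np.random.default_rng(0)
n = 100
x = rng.normal(loc=5.0, scale=2.0, size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)

# OLS estimates from the usual formulas:
# beta1_hat = sample cov(x, y) / sample var(x), beta0_hat = ybar - beta1_hat * xbar
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0_hat = y.mean() - beta1_hat * x.mean()

y_hat = beta0_hat + beta1_hat * x   # fitted values
u_hat = y - y_hat                   # residuals

print(np.sum(u_hat))                                # property 1: ~0 (floating-point error aside)
print(np.sum(x * u_hat))                            # property 2: ~0
print(y.mean(), beta0_hat + beta1_hat * x.mean())   # property 3: the two values agree
```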
In sum, we can view OLS as decomposing each \(y_i\) into two parts, a fitted value and a residual. That is,
\(y_i = \hat y_i + \hat u_i\)
The fitted values and residuals are uncorrelated in the sample.
Define the total sum of squares (SST), explained sum of squares (SSE), and residual sum of squares (SSR) as follows:
\(SST \equiv \sum \limits _{i = 1} ^n (y_i - \bar y)^2\)
\(SSE \equiv \sum \limits _{i = 1} ^n (\hat y_i - \bar y)^2\)
\(SSR \equiv \sum \limits _{i = 1} ^n \hat u_i ^2\)
SST is a measure of the total sample variation in the \(y_i\), that is, how spread out the dependent variable is in the sample. If we divide SST by n - 1, we obtain the sample variance of y. Similarly, SSE measures the sample variation in the \(\hat y_i\), the fitted values, and SSR measures the sample variation in the \(\hat u_i\). Writing \(y_i - \bar y = \hat u_i + (\hat y_i - \bar y)\) and expanding the square, the cross term \(2 \sum \limits _{i = 1} ^n \hat u_i (\hat y_i - \bar y)\) vanishes because the sample covariance between the residuals and the fitted values is zero. Thus,
\(SST = SSE + SSR\)
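A quick numerical check of this decomposition, under the same simulated setup as above (repeated here so the snippet runs on its own):

```python
import numpy as np

# Regenerate the illustrative data and OLS fit from the earlier sketch.
rng = np.random.default_rng(0)
x = rng.normal(5.0, 2.0, size=100)
y = 1.0 + 0.5 * x + rng.normal(size=100)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
u_hat = y - y_hat

SST = np.sum((y - y.mean()) ** 2)
SSE = np.sum((y_hat - y.mean()) ** 2)
SSR = np.sum(u_hat ** 2)
print(SST, SSE + SSR)  # equal up to floating-point error
```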

Goodness-of-Fit

So far, we have no way of measuring how well the independent variable explains the dependent variable. It is useful to compute a number that summarizes how well the OLS regression line fits the data. Assuming the total sum of squares is not zero, we can divide the decomposition \(SST = SSE + SSR\) by SST to obtain the R-squared of the regression, or coefficient of determination:
\(R^2 \equiv \frac{SSE}{SST} = 1 - \frac{SSR}{SST}\)
Thus, R-squared is the ratio of the explained variation to the total variation; it is interpreted as the fraction of the sample variation in y that is explained by x. Because SSE can be no larger than SST, R-squared always lies between 0 and 1. If the data points all lie on the same line, OLS provides a perfect fit to the data; in this case, \(R^2 = 1\).
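As a minimal sketch (again on simulated data, with all parameter values illustrative), the two formulas for R-squared agree, and in simple regression R-squared also equals the squared sample correlation between x and y:

```python
import numpy as np

# Same illustrative setup as the earlier sketches.
rng = np.random.default_rng(0)
x = rng.normal(5.0, 2.0, size=100)
y = 1.0 + 0.5 * x + rng.normal(size=100)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

SST = np.sum((y - y.mean()) ** 2)
SSE = np.sum((y_hat - y.mean()) ** 2)
SSR = np.sum((y - y_hat) ** 2)
# SSE/SST, 1 - SSR/SST, and corr(x, y)^2 all give the same R-squared.
print(SSE / SST, 1 - SSR / SST, np.corrcoef(x, y)[0, 1] ** 2)
```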

Units of Measure and Functional Form

So far, we have focused on linear relationships between the dependent and independent variables. Fortunately, it is rather easy to incorporate many nonlinearities into simple regression analysis by using logs.
Why logs? First, a log dependent variable gives a constant percentage effect: for example, an increase in education from 5 to 6 years might raise wage by 8%, and while the percentage effect stays the same for each additional year, the dollar amount of the increase grows as the wage itself grows. Second, taking the log of both variables yields a constant elasticity model. This works because, for small changes, a change in the natural log approximates a proportional change. A minimal sketch of a semi-log (log-level) regression appears below; summarized interpretations for level, semi-log, and log regressions then follow:
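The sketch fits log(wage) on education using simulated data; the 8% figure, the variable names, and all parameter values are assumptions for illustration, not estimates from real data:

```python
import numpy as np

# Simulated semi-log (log-level) wage example; an 8% "true" return
# to education is assumed purely for illustration.
rng = np.random.default_rng(0)
n = 500
educ = rng.integers(5, 18, size=n).astype(float)
log_wage = 0.5 + 0.08 * educ + rng.normal(scale=0.3, size=n)

b1 = np.sum((educ - educ.mean()) * (log_wage - log_wage.mean())) \
     / np.sum((educ - educ.mean()) ** 2)
b0 = log_wage.mean() - b1 * educ.mean()

# In log(wage) = b0 + b1*educ, 100*b1 is the approximate percentage
# change in wage from one more year of education.
print(f"each year of education raises wage by about {100 * b1:.1f}%")
```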