Logistic Regression


Logistic regression is used for classification. The independent variable is quantitative and the dependent variable is binary (0 or 1), i.e. whether or not a data point is in a class. Instead of modeling the response (0 or 1) directly, the logistic model’s output is the probability that a data point belongs to the class.

Figure: probability of default as a function of balance, with the fitted logistic model shown as a blue curve.

The logistic function is fit to the data. Referring to the plot above, the model (blue curve) takes in the balance and outputs the probability of default. Notice that it does not tell you whether the person defaults; it tells you the probability of default. To perform classification, you must choose a cutoff, say 0.5: all values above the cutoff are predicted as default and all values below are predicted as not default.
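The cutoff step can be sketched in a few lines of Python. This is a minimal illustration, not a fitted model: it assumes the standard logistic form \(1/(1+e^{-(\beta_0+\beta_1x)})\), and the coefficient values and the function names `default_probability` and `classify` are hypothetical, chosen only to make the example concrete.

```python
import math

# Hypothetical coefficients, chosen only for illustration (not fitted values).
BETA_0 = -10.0   # intercept
BETA_1 = 0.005   # slope per unit of balance


def default_probability(balance, beta_0=BETA_0, beta_1=BETA_1):
    """Logistic function: probability of default for a given balance."""
    return 1.0 / (1.0 + math.exp(-(beta_0 + beta_1 * balance)))


def classify(balance, cutoff=0.5):
    """Predict default (1) if the probability exceeds the cutoff, else 0."""
    return 1 if default_probability(balance) > cutoff else 0


# The model outputs a probability; the cutoff turns it into a class label.
for balance in (500, 1500, 2500):
    p = default_probability(balance)
    print(f"balance={balance}: p(default)={p:.3f}, predicted class={classify(balance)}")
```

With these assumed coefficients the probability crosses 0.5 at a balance of about 2000, so lower balances are labeled as not default and higher ones as default. Changing the cutoff moves that decision boundary without changing the fitted curve.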

Starting Out

  1. An Introduction to Statistical Learning \cite{James_2013} - section 4.3

  2. Advanced Data Analysis from an Elementary Point of View - chapter 7

  3. An Introduction to Generalized Linear Models \cite{Dobson_2001} - chapter 7

More Detail

What function are we fitting?

Where \(y\in\lbrace0,1\rbrace\), we are fitting the logistic function,

\[{p(y=1;\beta)} = \frac{1}{1 + e^{-(\beta_0+\beta_1x)}}\]

This is the probability that \(y=1\) parameterized by \(\beta_i\), also written as