Logistic regression is used for classification. The independent variable is quantitative and the dependent variable is binary (0 or 1), i.e. whether or not a data point is in a class. Rather than modeling the response (0 or 1) directly, the logistic model's dependent variable is the probability that a data point belongs to the class.

A *logistic function* is fit to the data. Referring to the plot above, the model (blue curve) takes in the balance and outputs the probability of default. Notice that it does not tell you whether the person defaults; it tells you the probability of default. To perform classification, you must choose a cutoff, say 0.5, where all values above the cutoff are predicted as default and all values below it are predicted as not default.
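The cutoff rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a fitted model: the coefficient values `BETA0` and `BETA1` and the helper names `logistic` and `classify` are assumptions made up for the example, chosen only so that the predicted probability crosses 0.5 at a balance of 2000.

```python
import math


def logistic(x, beta0, beta1):
    """Return the modeled probability that y = 1 for predictor value x."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def classify(x, beta0, beta1, cutoff=0.5):
    """Return 1 if the predicted probability is above the cutoff, else 0."""
    return 1 if logistic(x, beta0, beta1) > cutoff else 0


# Hypothetical coefficients for illustration only (not estimated from data).
BETA0, BETA1 = -10.0, 0.005

for balance in (1000, 2000, 3000):
    p = logistic(balance, BETA0, BETA1)
    label = classify(balance, BETA0, BETA1)
    print(f"balance={balance}  p(default)={p:.4f}  predicted class={label}")
```

The model itself only produces the probability; the class label comes entirely from the chosen cutoff, so raising or lowering `cutoff` changes the predictions without changing the fitted curve.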

An Introduction to Statistical Learning \cite{James_2013} - section 4.3

Advanced Data Analysis from an Elementary Point of View - chapter 7

An Introduction to Generalized Linear Models \cite{Dobson_2001} - chapter 7

Where \(y\in\lbrace0,1\rbrace\), we are fitting the logistic function,

\[{p(y=1;\beta)} = \frac{1}{1 + e^{-(\beta_0+\beta_1x)}}\]

This is the probability that \(y=1\) parameterized by \(\beta_i\), also written as
