Formulation
The classic empirical ROC curve is computed by comparing a binary outcome \(Y\) with a continuous measure \(X\), where each observed level of \(X\) is evaluated as a candidate cutpoint discriminating observed \(Y=1\) (positive) from \(Y=2\) (negative). Observations exceeding the candidate cutpoint are classified positive with respect to the continuous measurement, while those less than or equal to the cutpoint are classified negative. As in a \(2\times2\) contingency table, the count of correct classifications among positive outcomes comprises the true positives (\(TP\)) and among negative outcomes the true negatives (\(TN\)). The count of incorrect classifications among negative outcomes comprises the false positives (\(FP\)) and among positive outcomes the false negatives (\(FN\)).
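The classification rule at a single candidate cutpoint can be sketched as follows (a minimal Python sketch, assuming the outcome coding above, \(Y=1\) positive and \(Y=2\) negative; the function name is illustrative, not part of any package):

```python
import numpy as np

def confusion_counts(y, x, cutpoint):
    """Counts for the 2x2 table at one candidate cutpoint.

    Observations with x > cutpoint are classified positive;
    outcome coding: y == 1 is positive, y == 2 is negative.
    Returns (TP, FP, TN, FN).
    """
    pred_pos = x > cutpoint
    tp = np.sum(pred_pos & (y == 1))   # correct among positive outcomes
    fp = np.sum(pred_pos & (y == 2))   # incorrect among negative outcomes
    tn = np.sum(~pred_pos & (y == 2))  # correct among negative outcomes
    fn = np.sum(~pred_pos & (y == 1))  # incorrect among positive outcomes
    return int(tp), int(fp), int(tn), int(fn)
```

For example, with \(y = (1,1,1,2,2,2)\), \(x = (3.2, 2.8, 1.1, 2.9, 0.5, 0.7)\), and cutpoint \(1.0\), the counts are \(TP=3\), \(FP=1\), \(TN=2\), \(FN=0\), giving sensitivity \(3/3\) and specificity \(2/3\).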
These counts are used to compute: sensitivity, the probability that an observation with a positive outcome is correctly classified by a continuous measurement above a candidate cutpoint \(\left(\text{sensitivity}=\text{TP}/[\text{TP}+\text{FN}]\right)\); and specificity, the probability that an observation with a negative outcome is correctly classified by a continuous measurement at or below a candidate cutpoint \(\left(\text{specificity}=\text{TN}/[\text{TN}+\text{FP}]\right)\). Coordinates for the empirical ROC curve are then computed, where the abscissa is \(1-\text{specificity}\) (the false positive rate, FPR) and the ordinate is sensitivity (the true positive rate, TPR). The best cutpoint \(X^{*}\) given the data may be identified from the ROC curve coordinates with a criterion that maximizes TPR and minimizes FPR. Cross-referencing the identified ROC curve coordinate with its observed continuous measurement yields the cutpoint distinguishing the binary outcomes. A variety of cutpoint criteria are available, such as the Youden Index, the Matthews Correlation Coefficient, and Total Accuracy \cite{Youden1950,Matthews1975,metz1978basic}. In addition, the ability of the continuous measurement to discriminate between outcome levels, which is equivalent to the strength of the association between the two, may be represented by the area under the ROC curve (AUC; also known as the c-statistic): the probability that an observation with a positive outcome will have a higher continuous measurement than an observation with a negative outcome.
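The full construction can be sketched end to end: sweep every observed level of \(X\) as a candidate cutpoint, select \(X^{*}\) by the Youden Index \(J = \text{TPR} - \text{FPR}\), and compute the AUC in its Mann--Whitney form as \(\Pr[X_{\text{pos}} > X_{\text{neg}}]\) with ties counted as \(1/2\). This is a minimal Python sketch under the same outcome coding; function names are illustrative:

```python
import numpy as np

def empirical_roc(y, x):
    """Empirical ROC coordinates, one per observed level of x.

    Each unique value of x serves as a candidate cutpoint (x > cut
    classified positive). Returns (fpr, tpr, cuts); y == 1 positive,
    y == 2 negative.
    """
    cuts = np.unique(x)
    pos, neg = x[y == 1], x[y == 2]
    tpr = np.array([(pos > c).mean() for c in cuts])  # sensitivity
    fpr = np.array([(neg > c).mean() for c in cuts])  # 1 - specificity
    return fpr, tpr, cuts

def youden_cutpoint(y, x):
    """Cutpoint X* maximizing Youden's J = TPR - FPR."""
    fpr, tpr, cuts = empirical_roc(y, x)
    return cuts[np.argmax(tpr - fpr)]

def auc_mann_whitney(y, x):
    """AUC (c-statistic) as Pr[X_pos > X_neg], counting ties as 1/2."""
    pos, neg = x[y == 1], x[y == 2]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties
```

With \(y = (1,1,1,2,2,2)\) and \(x = (3.0, 2.0, 1.0, 2.5, 0.5, 0.7)\), seven of the nine positive-negative pairs are correctly ordered, so the AUC is \(7/9 \approx 0.78\), and the Youden Index selects \(X^{*} = 0.7\).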
Since the ROC curve describes the relationship between a binary outcome and a continuous predictor, it is directly related to logistic regression \cite{Lloyd2002,Qin2003,Krzanowski2011}. For the binary outcome, where \(Y=2\) is the reference outcome level, let \(\pi_1 = \text{Pr}\left[Y=1\right]\). The univariate logistic model with continuous predictor \(X\) and linear parameters \(\{\alpha,\beta\}\) is: