The “false positive paradox” and the risks of testing asymptomatic people for COVID-19

Widespread screening of asymptomatic people leads to high numbers of false positives when background prevalence is low, even with accurate tests. During the COVID-19 pandemic, not only has the background prevalence been low (vaccine clinical trial baseline testing finds 0.5-0.6% even during periods of higher prevalence), but the various COVID-19 tests are not very accurate. When inaccurate tests are combined with a low background prevalence, the result is a massive and unacknowledged problem of far more false positive test results than true positive test results, leading also to inaccurate characterization of COVID-19 hospitalizations and deaths.

Tam Hunt, J.D.,1 Blaine Williams, M.D.,2 Daniel Howard, Ph.D.3
1 Univ. of California, Santa Barbara
2 Kaiser Permanente Oahu
3 Independent Researcher
Correspondence to: Tam Hunt, tam.hunt@psych.ucsb.edu

It is well-known that widespread testing of people with a low probability of having the disease at issue will lead to high levels of false positives, even with accurate tests (Skittrall et al. 2020; Bokhorst et al. 2012; Dinnes et al. 2021; Madrigal et al. 2020). This has been described as the "false positive paradox" (Flender 2019). It's a paradox because even quite accurate tests can lead to high levels of false positives when used widely in a population with low actual prevalence of a given disease.
For example, Skittrall et al. 2020 calculated that hypothetically screening 100,000 people chosen randomly from the general UK population in June 2020 would result in 25 times more false positives than true positives (50 false positives and 2 true positives), even with a test thought to have a very high 99.95% specificity.
Widespread screening during previous outbreaks and pandemics has generally not been recommended because of the potential for high false positives. The Centers for Disease Control and Prevention (CDC)'s 2004 guidance from the SARS pandemic (CDC 2004), for example, stated: "To decrease the possibility of a false-positive result, testing should be limited to patients with a high index of suspicion for having SARS-CoV disease."

The debate around screening asymptomatic individuals
The World Health Organization (WHO) and CDC did, however, recommend testing of asymptomatic people early in the COVID-19 pandemic. The CDC revised this guidance in August 2020 to recommend against testing asymptomatic people even after potential exposure, only to reverse course again after public and expert pushback in the U.S. (Fang 2020).
The US Food and Drug Administration (FDA) issued a strongly worded letter to healthcare providers in November 2020 warning about the potential for false positives from antigen testing, describing the problems associated with screening populations with a low background prevalence of COVID-19 (FDA 2020). The letter reminds practitioners that: "As disease prevalence decreases, the percent of test results that are false positives increase." CDC's most recent (March 2021) guidance does, however, still recommend widespread screening, which necessarily includes testing of mostly asymptomatics, despite the widely known issues regarding such policies. CDC's guidance states: "Rapid, point-of-care serial screening can identify asymptomatic cases and help interrupt SARS-CoV-2 transmission. This is especially important when community risk or transmission levels are substantial or high." Many countries, including the UK and the U.S., have engaged in widespread population testing (Mercer and Salit 2021 call COVID-19 testing the "largest global testing programme in history, in which hundreds of millions of individuals have been tested to date").
Not surprisingly, there appears to have been significant internal debate about this important issue in these agencies. Those arguing for screening asymptomatic people seem to have overlooked the extremely high false positive rate that such testing necessarily entails with low background prevalence (Skittrall et al. 2020; Dinnes et al. 2021).
We explain below why it is almost always unwise to test asymptomatics on a widespread basis. Such testing can lead to extremely high levels of false positives even with highly accurate tests. Unfortunately, the available PCR and antigen tests are not very accurate. And inaccurate tests combined with widespread testing of asymptomatics can lead to catastrophically high levels of false positives, as explained below.

The false positive paradox regarding prostate cancer
As an illustration of the issue, let's first look at the history of screening for prostate cancer. In the U.S., screening for prostate cancer was normal practice, under the common-sense notion that it is good to detect and treat illnesses early. What resulted, however, was a high level of false positives due to the false positive paradox, and a growing awareness that most prostate cancers don't grow fast enough to be an issue for the patient, particularly in the elderly.
Consequently, the CDC, American Cancer Society, American Medical Association, and most other groups have stopped recommending widespread testing for prostate cancer, due to the potential for false positives, overdiagnosis, overtreatment, and possible iatrogenic harm. CDC, for example, states at its prostate cancer website (CDC Prostate Cancer website, emphasis in original):

"Possible Harm from Screening: False positive test results: This occurs when a man has an abnormal PSA test but does not have prostate cancer. False positive test results often lead to unnecessary tests, like a biopsy of the prostate. They may cause men to worry about their health. Older men are more likely to have false positive test results.

Possible Harms from Diagnosis: Screening finds prostate cancer in some men who would never have had symptoms from their cancer in their lifetime. Treatment of men who would not have had symptoms or died from prostate cancer can cause them to have complications from treatment, but not benefit from treatment. This is called overdiagnosis."
What is the background prevalence for COVID-19?
Even a test with a very high 99% specificity (a 1% chance of false positives), when used to screen asymptomatic populations with a low background rate of actual infection, will yield high levels of false positives. So what has that background rate been? The J&J vaccine trial's baseline PCR testing found 0.5% positive results at the start of its trial period (discussed further below). Similarly, Baden et al. 2020 found a 0.6% background positive PCR test result among the 30,420 clinical trial participants for the Moderna vaccine, after initial testing. Participants in that trial were selected for being at higher risk of exposure to the virus, and the testing was conducted from late July to late October 2020.
In the UK, the Government's survey of the population in June 2020 found about 1 in 2,200 people with an active infection in the study window, which is 0.05%, an order of magnitude lower than the vaccine trials just mentioned (Connors and Williams 2020).
Voysey et al. 2021, the published results of the AstraZeneca vaccine trial, found 1.65% baseline antibody test positive results among their 20,675 study participants. But antibody testing (also known as serology testing) measures the presence of any past infection, not current infection, so it will necessarily yield a significantly higher number than a snapshot in time of current infections, as PCR or antigen testing provides. The AstraZeneca trial did not include PCR testing at baseline as the Moderna and J&J trials did.

Why we shouldn't test asymptomatics for COVID-19
Common sense would suggest that a test with 99% specificity would return only about 1 false positive result in 100. But this is not how it works. The false positive rate among positive results is far higher when disease prevalence is as low as the studies just cited have found. In other words, the Positive Predictive Value of screening testing is very low when background prevalence is low (Figure 3). Here's why: if we test 1,000 people chosen randomly from a population in which 1% have the illness at issue, and our test is 99% specific to that illness, we will get roughly one true positive and one false positive for every 100 tests. So testing 1,000 people results in about 10 true positives and 10 false positives.
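The 1,000-person arithmetic above can be sketched in a few lines of code. This is a minimal illustration of the hypothetical scenario in the text (1% prevalence, 99% specificity), assuming for simplicity a 100% sensitivity; the function name is ours, not from any testing standard.

```python
def screening_counts(n_tested, prevalence, sensitivity, specificity):
    """Expected (true positives, false positives) from one screening run."""
    infected = n_tested * prevalence
    uninfected = n_tested - infected
    true_positives = infected * sensitivity            # infected, correctly flagged
    false_positives = uninfected * (1 - specificity)   # uninfected, wrongly flagged
    return true_positives, false_positives

tp, fp = screening_counts(n_tested=1000, prevalence=0.01,
                          sensitivity=1.0, specificity=0.99)
ppv = tp / (tp + fp)  # Positive Predictive Value: share of positives that are real
print(round(tp), round(fp), round(ppv, 3))  # 10 10 0.503
```

Even with a perfectly sensitive, 99%-specific test, roughly half of all positive results are false when only 1% of those tested are actually infected.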
In populating the three input cells of the BMJ calculator (at the top of the image), we've conservatively assumed a 1% pre-test probability of active infection, which, based on the data reviewed above, is a higher level of active infection than was found in the large vaccine clinical trials.
We also assumed 58% sensitivity and 99% specificity, the findings of a recent Cochrane meta-analysis combining 64 published studies of antigen test accuracy when used to test asymptomatics (Dinnes et al. 2021).
The result in this scenario is 50% false positives (1 true positive and 1 false positive per 100 people tested, after rounding to whole persons), even with a 99% specificity test. There would be essentially zero false negatives (the expected 0.42 missed infections per 100 tests rounds to zero), so the risk of missing actual infections is not at issue. This result is not surprising, because the numbers are the same as in the hypothetical scenario just discussed. The Dinnes meta-analysis concludes similarly, but for a lower background prevalence and a slightly higher test specificity (99.6%): "At 0.5% prevalence applying the same tests in asymptomatic people would result in [Positive Predictive Value] of 11% to 28% meaning that between 7 in 10 and 9 in 10 positive results will be false positives, and between 1 in 2 and 1 in 3 cases will be missed." 50% is the same as random chance. In other words, this 99% specificity test can do no better than a coin flip when declaring a positive result. So screening in this scenario is not warranted, because data that is no better than a coin flip is not data; it's random chance.
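A sketch of the underlying arithmetic for this antigen-test scenario (not the BMJ tool itself) shows where the rounded per-100 counts come from, using the Dinnes et al. 2021 inputs of 1% pre-test probability, 58% sensitivity, and 99% specificity:

```python
def per_hundred(prevalence, sensitivity, specificity):
    """Expected true positives, false positives, and false negatives
    per 100 people screened."""
    infected = 100 * prevalence
    uninfected = 100 - infected
    tp = infected * sensitivity            # 1 x 0.58 = 0.58 true positives
    fp = uninfected * (1 - specificity)    # 99 x 0.01 = 0.99 false positives
    fn = infected * (1 - sensitivity)      # 1 x 0.42 = 0.42 missed infections
    return tp, fp, fn

tp, fp, fn = per_hundred(0.01, 0.58, 0.99)
# Rounded to whole persons, as a calculator displays them:
print(round(tp), round(fp), round(fn))  # 1 1 0
```

Rounded to whole persons, the scenario yields 1 true positive, 1 false positive, and 0 false negatives per 100 people tested, i.e. the 50% false positive share discussed in the text.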
However, the situation is much worse than this, because neither PCR nor antigen tests come close to a 99% specificity level in practice, for various reasons (Braunstein et al. 2021). Lee 2020 performed a lab analysis of the accuracy of the CDC PCR test, which was widely used in the first months of the pandemic, and found that it had 70% specificity (i.e., a 30% false positive rate) and 80% sensitivity (a 20% false negative rate). This level of inaccuracy matches the CDC's own internal report, which found 33% false results when its PCR test was released in late February 2020, as reported by National Public Radio (Temple-Raston 2020).

Why intuition is a poor guide regarding testing
Intuitively, and in an emergency situation, we may think that a 70-80% accuracy rate is far from perfect but still "good enough." But this is where common sense and intuition get us, and the public, into trouble. If we input these figures into the BMJ calculator, we obtain a catastrophic 30 out of 31 false positives (Figure 2).
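The same per-100 arithmetic, run with the Lee 2020 accuracy estimates (80% sensitivity, 70% specificity) at a 1% pre-test probability, reproduces the 30-out-of-31 result:

```python
# Per 100 people screened at 1% pre-test probability:
infected, uninfected = 1.0, 99.0
tp = infected * 0.80     # 0.8 true positives (80% sensitivity)
fp = uninfected * 0.30   # 29.7 false positives (70% specificity)
print(round(tp), round(fp))  # 1 30
```

Rounded to whole persons, only 1 of the 31 positive results is a true positive; the other 30 are false positives.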
In other words, at a 1% pre-test probability (background prevalence), just one out of 31 positive test results is a true positive. And, again, we have essentially zero false negatives, so the tests are not missing true positives in this scenario. This dynamic is a large part of why there have been so many allegedly asymptomatic carriers of the virus: 1) a "confirmed case" was defined by the CDC as anyone who tested positive (CDC Interim Case Definition 2020); 2) however, with highly inaccurate tests and widespread testing of asymptomatic individuals, the large majority of "cases" seem to have been false positives (Braunstein et al. 2021 makes a similar point). Figure 3 summarizes the false results for the two scenarios already discussed, but over a range of 1-20% pre-test probability. False positives remain very high through 5% and higher background prevalence, and false negatives remain low. The end result of these various policy missteps is that fear of asymptomatic transmission led to the release of rushed and highly flawed tests (Bandler et al. 2020), which were used to screen widely in asymptomatic populations, which then seemed to find large numbers of asymptomatic carriers (who were actually, in the large majority of cases, false positives), creating a vicious cycle of faulty data.
This problem relates to more than just misidentifying positive COVID-19 cases; it is also relevant to data on hospitalizations and death rates. After testing became widely available, it became standard practice to test all patients admitted to hospitals in the U.S., regardless of symptoms. While this may have been a necessary precaution to minimize outbreaks in hospitals, it significantly inflated hospitalizations and deaths attributed to COVID-19. A positive test result was the primary basis for defining COVID-19 hospitalizations and deaths, since no symptoms were required to designate a hospitalization or death as COVID-19-related.
In other words: the CDC and WHO case definitions took the unprecedented step of defining a "confirmed case" as simply a positive lab test result, and most jurisdictions then defined a COVID-19 hospitalization and a COVID-19 death in the same manner. If the large majority of positive test results are false positives, it is therefore necessary to re-examine the pandemic surveillance data chain from the beginning.
We'll close with a homework problem for the reader: think through how to apply this Bayesian reasoning to the J&J baseline PCR testing, which, as mentioned above, found 0.5% baseline PCR test positives at the beginning of the trial period. How many of those positive results were true positives? How many were false positives?
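One way to set up the problem is sketched below. It treats the 0.5% baseline positive rate as the pre-test probability and borrows the Dinnes et al. 2021 asymptomatic antigen-test figures (58% sensitivity, 99% specificity) as assumed test accuracy; these inputs are illustrative assumptions, not the J&J trial's actual PCR accuracy, which the reader must estimate for themselves.

```python
def ppv(prevalence, sensitivity, specificity):
    """Positive Predictive Value via Bayes' rule: the probability that a
    positive result reflects a true infection."""
    tp = prevalence * sensitivity              # P(infected and test positive)
    fp = (1 - prevalence) * (1 - specificity)  # P(uninfected and test positive)
    return tp / (tp + fp)

p = ppv(0.005, 0.58, 0.99)
print(round(p, 2))  # 0.23
```

Under these illustrative assumptions, only about one in four of the trial's baseline positives would be a true positive, consistent with the 11-28% PPV window at 0.5% prevalence quoted from the Dinnes meta-analysis above.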