3. Generation and Pursuit-Worthiness in Medical Diagnosis
A typical diagnostic process begins when a patient arrives at a hospital
or clinic and reports certain symptoms or ailments. Insofar as the
situation allows it, the physician will start by interviewing the
patient and performing a physical examination to gather information
about the patient’s state, how long they have experienced the symptoms
and their broader medical history. Based on these, the physician tries
to generate one or more possible explanations for the salient aspects of
the case. For example, if a patient has uncontrollable hypertension
(high blood pressure), the physician may conjecture that the patient has
renal artery stenosis (narrowing of kidney arteries), since this would
explain the signs.
Our use of the term ‘generation’ here should be understood in a broad
sense. In most cases, medical diagnosis does not involve formulating
completely novel hypotheses. Rather, it will primarily be a case of
recalling already known conditions and realizing that they could
potentially account for the salient signs and
symptoms.5 However, this is not a sharp distinction.
When facing atypical or complex cases, physicians may have to combine
their knowledge of possible diseases in novel ways to explain the
condition of that specific patient.
While physicians will often be able to think of a large number of
theoretically possible diagnoses, it is neither practically possible nor
advisable to consider every single one. Physicians need to pick out a
limited number of hypotheses to focus on. The set of diagnostic
hypotheses actively considered at a given time is called the
differential diagnosis.6 There are good reasons
why physicians need to limit themselves to a relatively narrow
differential diagnosis. First, limitations of working memory preclude
working on too many hypotheses at once (Sox, Higgins and Owens 2013, 9).
Second, actively pursuing too many hypotheses can lead to potentially
harmful over-testing (Richardson et al 1999, 1214-15). Third, in
emergency situations there is no time to test every conceivable
hypothesis. With a patient’s health or life on the line, we need to be
able to effectively, rapidly and efficiently determine the
likeliest cause of their ailments. This requires wisely selecting a
limited range of hypotheses to focus on.
These arguments are often applied to the choice of a differential
diagnosis, but similar points apply already at the generative stage.
Just as it is inadvisable to select too broad a differential
diagnosis, physicians cannot—and should not—try to generate a list
of every single possible explanation before selecting a differential
diagnosis. As argued above, generating hypotheses and selecting them for
pursuit are subject to the same normative considerations. Just as
physicians need to make good choices about which hypotheses to include
in their differential diagnoses and which of these to prioritize for
testing, they must choose how to generate possible diagnoses, as well as
when to stop.
On the grounds of what kinds of considerations, then, should these
decisions be made? In the medical literature, the most popular approach
to the problem of choosing whether to test a given hypothesis is the
so-called threshold approach (Pauker and Kassirer 1980;
Djulbegovic et al 2015). This approach is based on decision-theoretic
models which compare, e.g., a choice between: (i) applying treatment on
the assumption that the hypothesis H is true; (ii) applying a
test for H, and then only applying treatment if the test is
positive; (iii) ceasing to work on H, i.e. neither testing nor treating.
Given quantitative estimates of (a) the reliability of the test, (b) the
likelihood of the salient consequences of treating and testing and (c)
the utility of these consequences, one can derive thresholds for how
probable the hypothesis needs to be in order for it to be most rational
to test, treat or abandon the hypothesis.
Threshold models highlight a number of factors that should be weighed
against each other in clinical decision-making, including: How reliable
are the available tests? How safe/harmful are the tests? How dangerous
would the disease be, if missed? How effective is the available
treatment? How safe/harmful is the treatment in itself? Briefly put, on
this approach, physicians have to consider whether their confidence in
H is high enough for the potential benefits of treating the
disease (if H is true) to outweigh the potential harms of
treating or testing unnecessarily (if H is false).
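
The threshold reasoning described above can be made concrete with a small numerical sketch. The following is a deliberately simplified model in the spirit of the threshold approach, not a reproduction of any published model: it assumes that the likelihoods and utilities of the consequences of treatment have already been collapsed into a single expected net benefit (if H is true) and a single expected net harm (if H is false), and that the test carries a fixed disutility. All names (`Scenario`, `expected_utilities`, `thresholds`, `best_option`) and all numerical values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical inputs; utilities are relative to 'neither test nor treat'."""
    sensitivity: float  # P(test positive | H true)
    specificity: float  # P(test negative | H false)
    benefit: float      # expected net gain from treating when H is true
    harm: float         # expected net loss from treating when H is false
    test_cost: float    # expected disutility of the test itself


def expected_utilities(p: float, s: Scenario) -> dict:
    """Expected utility of each option, given probability p that H is true."""
    false_pos = 1 - s.specificity
    # (i) Treat on the assumption that H is true.
    treat = p * s.benefit - (1 - p) * s.harm
    # (ii) Test for H, and treat only if the test is positive.
    test = (p * s.sensitivity * s.benefit
            - (1 - p) * false_pos * s.harm
            - s.test_cost)
    # (iii) Abandon H: the baseline against which the others are measured.
    return {"treat": treat, "test": test, "abandon": 0.0}


def thresholds(s: Scenario) -> tuple:
    """Probabilities at which 'test' ties with 'abandon' and with 'treat'."""
    false_pos = 1 - s.specificity
    false_neg = 1 - s.sensitivity
    # Solve EU(test) = EU(abandon) for p.
    lower = ((false_pos * s.harm + s.test_cost)
             / (false_pos * s.harm + s.sensitivity * s.benefit))
    # Solve EU(test) = EU(treat) for p.
    upper = ((s.specificity * s.harm - s.test_cost)
             / (s.specificity * s.harm + false_neg * s.benefit))
    return lower, upper


def best_option(p: float, s: Scenario) -> str:
    """Option with the highest expected utility at probability p."""
    utilities = expected_utilities(p, s)
    return max(utilities, key=utilities.get)


# Illustrative values only; they are not drawn from any clinical source.
example = Scenario(sensitivity=0.9, specificity=0.9,
                   benefit=10.0, harm=2.0, test_cost=0.1)
lower, upper = thresholds(example)
print(f"abandon below {lower:.3f}, test up to {upper:.3f}, treat above")
```

On these assumptions, a hypothesis whose probability falls below the lower threshold is best abandoned, one above the upper threshold is best treated without further testing, and one in between is best tested. Note that the sketch evaluates each hypothesis in isolation and only in terms of the direct consequences of testing and treating.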
While these factors are indeed important, we want to highlight a further
type of consideration, which can be called strategic
considerations, that goes beyond the direct consequences of tests and
treatments for the health of the patient. As Peirce (1938-1952, §7.220)
points out, the pursuit-worthiness of a hypothesis also depends on what
we might learn from pursuing the hypothesis even if it turns out to be
false. Testing a hypothesis can have important downstream effects for
later stages of inquiry, in addition to merely confirming or
disconfirming the tested hypothesis.7 For instance, an
imaging study which fails to detect renal artery stenosis may also show
that the adjacent adrenal gland is enlarged, thus instead suggesting
pheochromocytoma (a tumor of the adrenal gland) as the cause of
hypertension. At other times, it can be worth trying to rule out a
potential diagnosis simply to make the diagnostic space more manageable,
i.e. to pre-emptively prune back possibilities that might otherwise
become relevant later on. If testing can be done reliably and without
risk of harm, it can be worth trying to rule out even fairly unlikely
hypotheses early on. Examples of this could include serologic testing
for Lyme disease, fat aspiration for amyloidosis and ferritin levels for
myocardial iron.
Strategic considerations involve reasoning about how pursuing a specific
hypothesis can influence later stages of inquiry, including future
generation of hypotheses. It is this dynamic and intertwining
relationship between hypothesis generation and selection for pursuit
which threshold models, in their current form, fail to capture. Before
making this argument, however, we want to provide a concrete
illustration of our framework, by way of analyzing a detailed clinical
case.