Methods
Some systematic review approaches, such as those produced by the
Cochrane Collaboration, aim to summarize exhaustively the evidence
about the effects of a given intervention. The realist review approach
differs in that identifying the effects of interventions is not the end
result so much as a step toward understanding the causal processes
involved in the production of those effects [1-3]. This
is similar to what the field of evaluation describes as reverse logic
analysis, where the aim is to identify the causal links between
characteristics of given interventions and their outcomes in order to
provide insights on how to produce similar outcomes in the design of new
interventions [4]. We initially expected to be able to
conduct a detailed reverse logic analysis based on the available
scientific literature documenting home care delivery models. However,
the literature identified provided too few insights on the causal
processes involved to allow us to go beyond the identification of three
main characteristics of promising interventions.
Search Method
To maximize the breadth of the search, we relied on three different,
sequential search approaches. The starting point was a keyword-based
search in MEDLINE and CINAHL conducted in June 2019. This search led to
the identification of 1628 non-duplicate references that were reviewed
independently by two reviewers on the basis of title and abstract. Two
inclusion criteria were applied. First, the document had to provide relevant
information on the delivery of case-managed, integrated or
consumer-directed home and community services. Home and community
services could include, but could not be limited to, medical
care. Second, the population receiving the care needed to be
community-dwelling, with either a majority aged 65 years and over, or with a
subsample of persons aged 65 and over for whom results were reported
separately. Among the references identified, 107 were selected for
full-text analysis. Each full text was then independently appraised by
two reviewers, and 35 articles were selected at the end of the first step.
For the second step, the bibliographies of the 35 papers previously
selected were compiled and reviewed to identify potentially relevant
titles. This led to the identification of 94 new references that were
then reviewed according to the same independent two-reviewer process used in the
first step. Of those, 50 documents were selected for full-text review
and 34 were included in the analysis. We also included one paper
independently identified by a co-author. At the end of the second step,
a total of 70 documents had been selected.
The third step was a reverse search in MEDLINE for all articles citing
at least one of the 70 documents identified through the previous two
steps. This led to the identification of 1102 non-duplicate references.
Of those, 71 had already been reviewed previously (recaptures). The
remaining 1034 were processed in the same way as described previously.
Of those, 78 were deemed appropriate for a full-text review and 42 were
retained. At this stage, a second paper provided by a co-author was also
added. The low number of recaptures suggests that the total number of
articles that fit our focus of interest is likely very large [5].
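The capture-recapture reasoning behind this remark can be illustrated with the Lincoln-Petersen estimator, N ≈ n1 × n2 / m. The sketch below is not a calculation reported in this review. It assumes, purely for illustration, that the 1628 + 94 references screened in the first two steps play the role of the first sample, that the 1102 citing references form the second sample, and that the 71 recaptures are their overlap. It also assumes the two samples are independent, which citation-based searching does not guarantee, so the result suggests at most an order of magnitude for retrievable references, not a count of relevant articles.

```python
def lincoln_petersen(n_first: int, n_second: int, n_recaptured: int) -> float:
    """Estimate a population size from two samples and their overlap."""
    if n_recaptured <= 0:
        raise ValueError("at least one recapture is needed for an estimate")
    return n_first * n_second / n_recaptured


# Illustrative assumption: the first sample is every reference screened
# in steps 1 and 2 (this mapping is not stated in the text).
screened_steps_1_and_2 = 1628 + 94
citing_references = 1102  # non-duplicate references found in step 3
recaptures = 71           # step-3 references already reviewed earlier

estimate = lincoln_petersen(screened_steps_1_and_2, citing_references, recaptures)
print(f"Illustrative pool of retrievable references: about {estimate:,.0f}")
```

The fewer the recaptures relative to the size of the two samples, the larger the estimated pool, which is the intuition the paragraph above relies on.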
In the end, 113 documents were included in the analysis (the
complete list can be accessed as a PubMed bibliography at
https://www.ncbi.nlm.nih.gov/myncbi/1Dm-PibJgyqcPF/bibliography/public/).
Figure 1 (below) provides a flowchart for the process.
[Insert figure 1: Search process flowchart]
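The inclusion counts reported for the three steps can be tallied as a quick consistency check. The following is a minimal sketch: the counts come from the text, while the step labels and variable names are illustrative only.

```python
# Number of documents retained at each step, as reported in the text.
included = {
    "step 1: database search": 35,
    "step 2: bibliography review": 34,
    "step 2: co-author suggestion": 1,
    "step 3: citation search": 42,
    "step 3: co-author suggestion": 1,
}

# Running total after the first two steps, then the overall total.
after_step_2 = sum(
    count for label, count in included.items() if not label.startswith("step 3")
)
total = sum(included.values())

print(after_step_2)  # 70
print(total)         # 113
```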
Relevance and strength appraisal
Our approach to full-text appraisal relied on two scores: one for
methodological quality (strong=3, acceptable=2, weak=1) and one for
relevance (highly relevant=3, some relevant elements=2, not
relevant=1). Documents were selected for inclusion at this stage only if
they had a combined score of 3 or above. The inclusion threshold was
deliberately set low to maximize the sensitivity of the search.
Divergences in scoring were resolved by discussion and consensus was
reached in all cases.
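The inclusion rule can be sketched as follows. This is a hedged illustration, not the authors' tooling: it assumes that the "combined score" is the sum of the two scores, which the text does not state explicitly, and the function and constant names are hypothetical.

```python
from itertools import product

INCLUSION_THRESHOLD = 3  # threshold reported in the text


def is_included(quality: int, relevance: int) -> bool:
    """Apply the inclusion rule, assuming 'combined' means the sum of both scores."""
    for score in (quality, relevance):
        if score not in (1, 2, 3):
            raise ValueError("scores must be 1, 2 or 3")
    return quality + relevance >= INCLUSION_THRESHOLD


# Enumerate every (quality, relevance) pair and list those that fall below the threshold.
excluded = [
    (quality, relevance)
    for quality, relevance in product((1, 2, 3), repeat=2)
    if not is_included(quality, relevance)
]
print(excluded)  # [(1, 1)]
```

Under this reading, only a document rated both methodologically weak and not relevant falls below the threshold, which is consistent with a threshold chosen for sensitivity.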
As this system relies on a moving threshold regarding the strength of
the study design, some additional discussion might be warranted. The
type of review we conducted is integrative and iterative in nature.
First, it is integrative, as different types of evidence (descriptions,
typologies, outcome evaluations, etc.) are brought together with the aim
of identifying desirable characteristics of home care delivery models.
Second, it is iterative, as the focus of the review was refined on an
ongoing basis as it progressed. Therefore, some of the documents
included in the analysis offer robust evidence while others are
descriptive or rely on weak study designs. However, when analyzing the
data itself, the strength of the evidence was taken into account in a
contextualized way. For example, study design matters when analyzing
potential links between interventions and outcomes, but it
matters much less than face validity when assessing the usefulness of a
typology of home care delivery models.
Data organization and coding
The documents were coded on an ongoing basis throughout the three phases
according to a modified PICO grid. The main modification to the PICO
format was that the "C" here refers to the causality presumptions made
in the paper (what underlying hypotheses are made or implied linking the
intervention to the expected outcomes?). The other items were usual
elements of the PICO format, including: Population (Who is receiving the
services? For example, health issue, age, insurance status, location,
etc.); Intervention (What services are being offered? Professionals
involved, intensity, duration, etc.); and Outcomes (What outcomes are
described or measured?). We also coded articles by country and type of
method. When relevant, additional information such as formal definitions
of home care was retrieved during coding. Some documents included
in the analysis could not easily be classified within this coding grid
(for example, broadly focused reviews of the literature), and these were
coded on an ad hoc basis. The coding results for each article were 164
words long on average (standard deviation 68 words), for a total of
18585 words. The analysis also relied heavily on multiple iterative
readings of the full text of each document.
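The modified PICO grid described above could be represented along the following lines. This is a minimal sketch under stated assumptions: the class, field names and placeholder entries are hypothetical and do not reproduce the actual coding sheet, and the sample standard deviation is used because the text does not say which variant was reported.

```python
from dataclasses import dataclass, fields
from statistics import mean, stdev


@dataclass
class CodingRecord:
    """One coded document; all field names are hypothetical."""
    population: str        # P: who is receiving the services?
    intervention: str      # I: what services are being offered?
    causality: str         # C: presumed link between intervention and outcomes
    outcomes: str          # O: outcomes described or measured
    country: str = ""
    method_type: str = ""
    notes: str = ""        # e.g. formal definitions of home care, ad hoc coding

    def word_count(self) -> int:
        """Count whitespace-separated words across all fields."""
        return sum(len(getattr(self, f.name).split()) for f in fields(self))


def summarize(records: list[CodingRecord]) -> dict[str, float]:
    """Total, mean and sample standard deviation of words per record."""
    counts = [record.word_count() for record in records]
    return {"total": sum(counts), "mean": mean(counts), "sd": stdev(counts)}


# Placeholder entries only; these are not data from the review.
records = [
    CodingRecord("placeholder text one", "placeholder", "placeholder text", "placeholder"),
    CodingRecord("placeholder", "placeholder", "placeholder", "placeholder", country="placeholder"),
]
summary = summarize(records)
print(summary["total"], summary["mean"])
```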