Abstract:
Objective: High data
quality is essential to ensure the validity of clinical and research
inferences based on it. However, data quality assessments are often
missing even though the data are used in daily practice and
research. Our objective was to evaluate the data quality of our
high-resolution electronic database (HRDB) implemented in our pediatric
intensive care unit (PICU).
Design: A prospective validation study of a HRDB.
Setting: A 32-bed pediatric medical, surgical and cardiac PICU
in a tertiary care freestanding maternal-child health center in Canada.
Population: All patients admitted to the PICU with at least one
vital sign monitored using a cardiorespiratory monitor connected to the
central monitoring station.
Interventions: None.
Measurements and Main Results: Between June 2017 and August
2018, data from 295 patient days were recorded from medical devices and
4,645 data points were video recorded and compared to the corresponding
data collected in the HRDB. Statistical analysis showed an excellent
overall correlation (R2=1), accuracy (100%),
agreement (bias=0, limits of agreement=0), completeness (2% missing
data) and reliability (ICC=1) between recorded and collected data within
clinically significant pre-defined limits of agreement. Divergent points
could all be explained.
Conclusions: This prospective validation of a representative
sample showed an excellent overall data quality.
Key words: Pediatrics; Critical care; Database; Electronic
Health Record; Big data
INTRODUCTION
Over the past two decades, technological and computer advances have been
used extensively to modernize medicine and assist medical teams in daily
practice, as shown by the widespread use of electronic medical records
(EMR) and connected biomedical devices. While the primary purpose of
these systems in health care is patient management, many scientists have
seen them as a way of improving clinical research efficiency and data
analysis (1–4). As a result, many medical databases
(DB) have been built since the beginning of the twenty-first century
(4–6). To improve the quality of research in our fields of expertise,
such as respiratory physiology and the development of clinical decision
support systems (CDSS) (7), we implemented in 2015 an automated
electronic data gathering process in our pediatric intensive care unit
(PICU) (8). This DB was designed to develop and validate virtual or
synthetic patients for cardiorespiratory physiology as well as for CDSS
and data-driven learning systems (8). However, a validation step of the
collected data is necessary before considering this DB suitable for
research purposes (9–11). Indeed, the value of research findings
depends on data quality (12,13). Several guidelines and frameworks have
been developed to evaluate and report the quality of DBs and national
registries and to guide designers of DBs at each step of data
collection (12,14,15). These documents highlighted the need to evaluate
data quality and to compare quality between datasets, and raised the
question of data validity that every scientist or clinician, as a data
user, deals with, whether in day-to-day clinical decision-making or in
medical research (16,17). However, none of these
guidelines provides a detailed validation process that is entirely
suitable for a high-resolution electronic DB (HRDB), defined as a
database that collects more than one data point per minute per variable
and per patient. Moreover, to our knowledge, no published HRDB has
reported a detailed validation procedure and evaluation of the quality
of its data (18–20). This article constitutes the final part of the
validation process of our HRDB (8,11). The purpose of this study was to
assess the quality of the data included in our HRDB and to provide a
generalizable validation method for all HRDBs.
METHODS
This study was a prospective data quality assessment conducted in the
PICU of Sainte-Justine hospital (Montreal, Canada), a pediatric 32-bed
medical, surgical and cardiac ICU in a free-standing tertiary
maternal-child health center. The study was performed between June 2017
and August 2018.
Population
Eligible patients were those admitted to the PICU with at least one
vital sign monitored using a cardiorespiratory monitor connected to the
central monitoring station. Patients were excluded if the presence of a
study observer in the patient’s room was considered incompatible or
inappropriate by the physician or the nurse in charge.
Standard management
As previously reported (8), and as standard practice in our PICU, all
physiological, therapeutic and clinical data from medical devices
available at the bedside of all children admitted to the PICU were
continuously collected in an organized HRDB linked to the EMR, from
admission to discharge from the PICU. Biomedical signals from the
monitors were sampled and recorded every 5 seconds while data from
ventilators and infusion pumps were recorded every 30 seconds. The full
details of the HRDB structure were previously reported (8).
Study protocol
The study was divided into three periods of 14, 16 and 17 days
respectively (convenience samples): the first was dedicated to data from
the monitors, the second to data from the ventilators and the third
to data from the infusion pumps. During the first period, data were
collected on devices that displayed the monitored data outside the
patient’s room, whereas the second and third periods took place at the
bedside. On every study day, a sample of 20% of the children
hospitalized in the PICU who met the inclusion criteria was randomly
selected. A patient could be included more than once. The data
displayed on the medical devices (monitors, ventilators and infusion
pumps) and available at the bedside, such as heart rate or positive
inspiratory pressure (Figure 1), were video recorded. Each day, the
clock of the video recorder was synchronized with the automatically
calibrated clocks of the hospital. Each monitor (IntelliVue MP60,
MP70 and MX800, Koninklijke Philips Electronics, Amsterdam, the
Netherlands) was video recorded for 30 seconds, each ventilator
(Servo-I®, Maquet, Getinge, Sweden) for 90 seconds and
each infusion pump (Infusomat®, B. Braun Medical Inc,
Bethlehem, Pennsylvania, U.S.) was simply photographed. Since ventilator
data are recorded every 30 seconds in the HRDB, 90 seconds was enough to
get at least two consecutive records in the HRDB. Because infusion
pump parameters are only set, and not measured, static pictures were
considered sufficient. The data displayed on the devices were then
manually extracted into a spreadsheet from the pictures or, second by
second, from the video recordings. Data were periodically screened for
aberrant values. These data, collected by one independent observer (AM)
who was not involved in patient care, were considered as the reference
data.
Three types of data from medical devices were collected (Figure 1): 1)
physiologic signals from patient monitors (heart rate, oxygen saturation
and systolic, diastolic and mean blood pressure); 2) respiratory and
ventilator parameters from the ventilator (positive end-expiratory
pressure, peak inspiratory pressure, respiratory rate, respiratory
minute volume); and 3) pharmacotherapy data from the infusion pumps
(e.g., drug names and infusion rates). The corresponding HRDB data were
extracted using structured query language (SQL) and used for comparison
(Figure 1).
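As an illustration only, the pairing of HRDB records with the
second-by-second reference values could be sketched as follows in
Python. The function name, the tuple layout and the optional clock
offset are assumptions made for this example; they do not describe the
SQL extraction or the spreadsheet actually used in the study.

```python
def pair_with_reference(hrdb_records, reference_by_second, offset_s=0):
    """Pair each HRDB record with the video reading taken at the same second.

    hrdb_records: list of (timestamp_s, value) tuples extracted from the database.
    reference_by_second: dict mapping timestamp_s -> value read from the video.
    offset_s: hypothetical known clock offset (HRDB time minus video time).
    """
    pairs = []
    for t, experimental in hrdb_records:
        # .get returns None when no video frame exists for that second
        reference = reference_by_second.get(t - offset_s)
        pairs.append((t, reference, experimental))
    return pairs


# Hypothetical 30-second monitor recording: one heart-rate reading per second.
video = {second: 120 + (second // 10) for second in range(30)}
# Hypothetical HRDB extract: one record every 5 seconds over the same window.
hrdb = [(second, 120 + (second // 10)) for second in range(0, 30, 5)]

pairs = pair_with_reference(hrdb, video)
for t, reference, experimental in pairs:
    print(t, reference, experimental)
```

Keeping unmatched records in the output, with a reference of None,
allows missing data to be counted separately from disagreeing data.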
Endpoints
The primary endpoints were the absolute value of the selected variables
(heart rate (HR) and pulse oximetry (SpO2)) recorded
from the monitors. The secondary endpoints were:
- The absolute value of the selected variables recorded from the
monitors when available: invasive arterial blood pressure (systolic
(SBP), diastolic (DBP) and mean blood pressure (MBP)) and central
venous pressure (CVP)
- The absolute value of the selected variables recorded from the
ventilators: positive end-expiratory pressure (PEEP), positive
inspiratory pressure (PIP), respiratory rate (RR), minute ventilation
(VM), expiratory tidal volume (VE)
- The infusion rate
- The infused drugs’ name
- The recording time of the data
- The missing data or the completeness of the dataset.
Statistical analysis and features’ definition
Reference data were compared to the experimental data simultaneously
collected in the PICU HRDB at a specific time point for each patient.
Continuous variables were expressed as mean ± standard deviation or
median [minimum – maximum], depending on whether they followed a normal
distribution (Shapiro-Wilk normality test), and categorical variables as
count (percentage). Comparisons between experimental and reference data
were made with tests for dependent (paired) samples, as appropriate.
Under the concept of quality lie several features that tend to
delineate the degree to which the HRDB is a true representation of the
reality of the PICU’s data (14,21):
- The accuracy is defined as the closeness of agreement between the
experimental and the reference data. Accuracy refers to both trueness
and precision. Trueness is expressed in terms of bias and corresponds
to the difference between experimental and reference value. Precision
relates to the distribution of the experimental values. The agreement
between experimental and reference data was evaluated for each
parameter by measuring the absolute agreement and the mean difference
(22) and by using the Bland-Altman analysis. Bias and limits of
agreement were calculated with the R statistical package “BlandAltman”
(23) based on both the original method (the difference between the two
paired measurements was plotted against the mean of the two) and the
modified one (the difference between the two paired measurements was
plotted against the value of the reference data) of the Bland-Altman
analysis (24,25). In theory, the data should not be modified between
measurement (monitor) and storage (database), and the accuracy should
be perfect. However, the rounding process could slightly affect the
accuracy evaluation. Moreover, accuracy involves more than just the
data itself: metadata, such as timestamps and patient identifiers,
could also affect accuracy, for example in case of asynchrony.
Acceptable limits of agreement were defined a priori as ±5% of the
mean of the reference.
- The correlation, defined as the association between reference and
experimental data, was evaluated by the determination coefficient
(R2).
- The reliability, defined as the degree to which measurements can be
reproduced, echoes both agreement and correlation between experimental
and reference data. It was evaluated by intraclass correlation
coefficients (ICC) for each parameter. ICC estimates, 95% confidence
intervals and F test results were calculated with the R statistical
packages “irr” (26) and “psych” (22) using a single-measurement,
absolute-agreement, two-way mixed-effects model (27).
- The completeness is related to the amount and the nature of the
missing data and is defined as the extent to which the data that
should have been included were indeed included. To evaluate the
completeness, the data of infusion pumps within the HRDB were compared
to the corresponding data in the EMR. For each selected patient, we
compared the data recorded in the HRDB throughout the day with those
recorded in the EMR for each infusion. Additionally, we selected 14
commonly used PICU drugs with their respective standardized
concentrations (sedative, analgesic and vasoactive drugs) and evaluated
the correlation between the HRDB and the EMR over the study period (from
August 31, 2017, to August 1, 2018).
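The accuracy, correlation and reliability features defined above can be
illustrated with a minimal Python sketch. The study itself used the R
packages cited in the text; the function names and the sample heart
rate values below are hypothetical, and the ICC follows the usual
two-way, absolute-agreement, single-measurement formula rather than any
package-specific implementation.

```python
from statistics import mean, stdev


def bland_altman(experimental, reference):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [e - r for e, r in zip(experimental, reference)]
    bias = mean(diffs)                  # trueness: mean difference
    half_width = 1.96 * stdev(diffs)    # 95% limits of agreement
    return bias, bias - half_width, bias + half_width


def limits_acceptable(lower, upper, reference, fraction=0.05):
    """Check the limits against +/- fraction of the mean of the reference."""
    bound = fraction * mean(reference)
    return -bound <= lower and upper <= bound


def r_squared(experimental, reference):
    """Determination coefficient, computed as the squared Pearson correlation."""
    mx, my = mean(reference), mean(experimental)
    sxy = sum((x - mx) * (y - my) for x, y in zip(reference, experimental))
    sxx = sum((x - mx) ** 2 for x in reference)
    syy = sum((y - my) ** 2 for y in experimental)
    return sxy ** 2 / (sxx * syy)


def icc_agreement_single(experimental, reference):
    """ICC for absolute agreement, single measurement, two-way model."""
    rows = list(zip(experimental, reference))
    n, k = len(rows), 2
    row_means = [mean(row) for row in rows]
    col_means = [mean(experimental), mean(reference)]
    grand = mean(col_means)
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (x - row_means[i] - col_means[j] + grand) ** 2
        for i, row in enumerate(rows)
        for j, x in enumerate(row)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical heart rates (beats/minute): video reference vs. HRDB values.
reference = [120, 118, 135, 142, 99, 101]
experimental = [120, 118, 135, 141, 99, 101]

bias, lower, upper = bland_altman(experimental, reference)
print(bias, lower, upper, limits_acceptable(lower, upper, reference))
print(r_squared(experimental, reference))
print(icc_agreement_single(experimental, reference))
```

In the modified Bland-Altman method, only the plotting axis changes (the
reference value instead of the pair mean); the bias and limits computed
here are the same in both variants.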
All analyses were performed after excluding paired measurements in
which either the experimental or the reference value was missing, in
order to differentiate inaccurate data from missing data. A
p-value < 0.05 was considered statistically significant.
Statistical analyses were performed using open access R software
(version 3.5.1, 2018-07-02, http://cran.r-project.org/).
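The separation of missing data from inaccurate data described above can
be sketched in Python as follows. This is a minimal illustration, not
the code used in the study: the function names and the sample pairs are
hypothetical, and None is assumed to mark a missing value.

```python
def split_pairs(pairs):
    """Separate complete (reference, experimental) pairs from incomplete ones."""
    complete = [(r, e) for r, e in pairs if r is not None and e is not None]
    n_missing = len(pairs) - len(complete)
    return complete, n_missing


def summarize(pairs):
    """Report missing data and absolute agreement as separate percentages."""
    if not pairs:
        raise ValueError("no paired measurements")
    complete, n_missing = split_pairs(pairs)
    if not complete:
        raise ValueError("no complete pairs to compare")
    n_equal = sum(1 for r, e in complete if r == e)
    return {
        "missing_pct": 100.0 * n_missing / len(pairs),
        # agreement is computed on complete pairs only
        "agreement_pct": 100.0 * n_equal / len(complete),
    }


# Hypothetical (reference, experimental) pairs; None marks a missing value.
pairs = [(120, 120), (98, 98), (15, None), (5, 5), (18, 17.5)]
summary = summarize(pairs)
print(summary)
```

Computing agreement only on complete pairs prevents a missing record
from being counted as a disagreement.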
Ethics: The study was approved by the institutional review board
of Sainte-Justine Hospital (reference number 2016-1210, 4061). The
use of the HRDB is regulated by a DB policy validated by the
institutional review board, and no protected health information was
stored in the HRDB or in the video recordings. No patients or
caregivers were recorded in the videos.
RESULTS
Between June 1, 2017, and August 30, 2018, 1378 patients were admitted
to the PICU and 100% were included in the HRDB. During the 47 effective
days of the study, 81 patients were hospitalized in the PICU and all 81
(100%) were included in the HRDB. Data from 70 patients (86%),
representing 295 patient-days, were recorded from medical devices
(Table 1), and 4645 data points were video recorded and compared to the
corresponding data collected in the HRDB (Table 2).
Monitor data validity
Statistical analysis showed an overall excellent correlation, agreement
and reliability, as shown in Table 2. ICCs were considered as excellent
for all the tested variables (Table 2). Bland-Altman analysis showed an
excellent accuracy and precision between recorded and collected data
within clinically significant pre-defined limits of agreement
(Supplemental Digital Content 1).
A single heart rate measurement in the experimental data (0.03%) was
considered clinically different from the reference data (Figures 2 and
3). We documented 74 missing data points (2%), as detailed in
Table 2.
Ventilators’ data validity
Statistical analysis showed an excellent overall correlation, agreement
and reliability (Table 2, Supplemental Digital Content 2). A small but
statistically significant difference was found for the positive
inspiratory pressure (mean difference of -0.022 cmH2O,
p = 0.02). This difference was observed for only a minority of the
data (95.5% of all values were equal). Agreement remained over 90%,
with an excellent correlation between reference and experimental data.
ICCs were considered excellent for all the tested variables (Table
2). Bland-Altman analysis showed excellent accuracy and precision
(Supplemental Digital Content 2).
No data were missing (Table 2).
Infusion pumps data validity
The comparison with the data displayed on the infusion pumps showed an
overall excellent correlation, agreement and reliability (Table 2) with
Bland-Altman analysis showing an excellent accuracy and precision
between recorded and collected data for all the tested variables
(Supplemental Digital Content 2). ICCs were considered as excellent for
all the tested variables (Table 2). Twenty-three infusions (9%) were
not retrieved in the HRDB (Table 2). Nine episodes were related to six
patients without any pharmacological data collected in the HRDB and 14
episodes were related to pump dysfunction. Other minor discrepancies
were noted between the HRDB and the EMR (Table 3). The correlation
between the HRDB and the EMR for the drugs of interest over the study
period is depicted in Figure 4.
Timestamps
A delay was observed between the time-synchronized video recordings and
the data collected from the monitors and the ventilators. This delay was
less than 28 seconds and remained stable across patients. Regarding
infusion pump data, we discovered that the data were not collected in
the HRDB every 30 seconds as expected, but at varying intervals of
between 10 and 40 seconds or when a modification was made. No delays
were observed between the source and the HRDB.
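The text does not describe how the delay was measured. One possible
approach, sketched here in Python under that caveat, is to search for
the constant shift that maximizes the number of exact matches between
HRDB values and the video readings. The function name, the search
window and the sample series are hypothetical.

```python
def estimate_delay(reference_by_second, hrdb_records, max_delay_s=60):
    """Return the delay (in seconds) giving the most exact value matches.

    reference_by_second: dict mapping timestamp_s -> value read from the video.
    hrdb_records: list of (timestamp_s, value) tuples from the database.
    """
    best_delay, best_matches = 0, -1
    for delay in range(max_delay_s + 1):
        matches = sum(
            1
            for t, value in hrdb_records
            if reference_by_second.get(t - delay) == value
        )
        if matches > best_matches:  # strict comparison keeps the smallest delay
            best_delay, best_matches = delay, matches
    return best_delay


# Hypothetical video readings: a steadily changing signal, one value per second.
video = {second: 100 + second for second in range(120)}
# Hypothetical HRDB records stamped 20 seconds after the displayed value.
hrdb = [(t, video[t - 20]) for t in range(20, 120, 5)]

print(estimate_delay(video, hrdb))
```

A signal that varies over the recording window is needed for such a
search to be informative; a constant signal matches at every shift.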
DISCUSSION
Whether in day-to-day clinical care decision-making or in medical
research, the need to evaluate data quality is essential to ensure the
reliability of a DB (9,21,28,29). To our knowledge, this is the first
study to validate PICU data contained in a specific HRDB (20,30). This
article is inseparable from our two previous reports (8,11). The
first report described the data gathering process of our HRDB (8) and
the second gave a comprehensive description of the HRDB’s architecture
and process (11). This third article completes the set: it contributes
to the quality assurance phase and to the quality control phase of the
HRDB (14,31).
As there were no guidelines specifically designed to guarantee
high-resolution data quality (9,14), we developed the first complete
validation procedure. Our validation procedure was inspired by
previously published experiences (9,10,30,32–34) and guidelines
(13–15,28,35) regarding data quality assessment in the field of medical
DB collected at a lower rate or in a restricted area. To evaluate the
quality of the data, we chose to perform an external validation
procedure. We compared our extracted results with the information
displayed on the monitor or the biomedical device (21).
Our study showed an excellent
overall accuracy, completeness and reliability of our HRDB when compared
to displayed data at the bedside at the same time.
Regarding the accuracy of the dataset, we noted only one clinically
significant difference, in a heart rate value. This error was due to a
rapid acceleration of the heart rate (Figure 2). In the video, the heart
rate increased from 118 beats/minute to 154 beats/minute and the HRDB
recorded a single value of 135 beats/minute during the transition. This
suggests that the monitors processed those data and only refreshed the
display at a specific interval (probably between one and two seconds)
and did not show intermediate data. The HRDB, in contrast, recorded an
intermediate value, which explains the magnitude of the difference
between the reference value and the experimental value. Differences
between the HRDB’s data and the reference data were also observed for
PIP. Although statistically significant, these disagreements were not
clinically significant (the maximal difference was 0.5 cmH2O and
concerned only 4.5% of all the collected PIP values; the remaining 95.5%
were strictly equal), as shown by a mean difference of -0.022
cmH2O. Only integers are displayed on the ventilator
screen, and the algorithm that processes the raw values measured by
the ventilator is unpublished. Thus, we suspect that these very minor
differences may be due to a rounding process.
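To illustrate how a rounding step alone could produce differences of
this size, the following Python sketch compares hypothetical stored
values with an integer display. The round-half-up rule and the raw
values are assumptions made for the example; the actual ventilator
algorithm is unpublished.

```python
import math


def displayed(raw):
    """Round half up to the nearest integer (hypothetical display rule)."""
    return math.floor(raw + 0.5)


# Hypothetical stored PIP values (cmH2O), some with a half-unit fraction.
raw_values = [18.0, 17.5, 22.0, 24.5]

# Difference between the stored value and the integer shown on screen.
differences = [raw - displayed(raw) for raw in raw_values]
print(differences)
```

Under such a rule, no stored value can differ from its displayed integer
by more than 0.5, which is consistent with the maximal difference
reported above.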
Regarding the completeness of the dataset, 2% of the data were missing.
Although lower than previously reported (9,14,30), this proportion of
missing data did not meet our expectations for this HRDB, as we had
aimed for 0% missing data. This loss of data was mainly caused by a
systematic error in the data processing. Indeed, we discovered that the
original HRDB structure could only record nine parameters
simultaneously. Consequently, when more than nine parameters were sent,
the additional data were not registered. Once this issue was identified,
we changed our database to an entity-attribute-value structure, in which
each data point is stored as an independent row (36,37).
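A minimal sketch of an entity-attribute-value layout, using Python's
built-in sqlite3 module, shows why such a structure imposes no limit on
the number of parameters stored at one timestamp. The table name, column
names and parameter names are hypothetical and do not reflect the actual
HRDB schema.

```python
import sqlite3


def create_eav_store():
    """Create an in-memory table with one row per data point."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE observation (
            patient_id  TEXT    NOT NULL,  -- entity
            recorded_at INTEGER NOT NULL,  -- timestamp in seconds
            parameter   TEXT    NOT NULL,  -- attribute
            value       REAL               -- value
        )
        """
    )
    return conn


def store_sample(conn, patient_id, recorded_at, readings):
    """Insert every reading of one sample as its own row."""
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?, ?)",
        [(patient_id, recorded_at, name, value) for name, value in readings.items()],
    )


conn = create_eav_store()
# Eleven hypothetical parameters sent at the same instant (more than nine).
readings = {f"param_{i}": float(i) for i in range(1, 12)}
store_sample(conn, "patient-001", 0, readings)

stored = conn.execute("SELECT COUNT(*) FROM observation").fetchone()[0]
print(stored)
```

Because each parameter occupies a row rather than a fixed column, adding
a new parameter requires no change to the table definition.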
Regarding infusion pumps and pharmacological data, the discrepancies
between the experimental and the reference data or the EMR appeared to
be associated with variability in care more than with a failure of the
gathering process. Regarding the 23 infusions missing from the HRDB, we
established that the corresponding infusion pumps were disconnected from
the network, so the data were not sent to the HRDB. This disconnection
of the infusion pumps explained these discrepancies between the EMR and
the experimental data, with all the pharmacological data missing in six
patients. In addition, the large majority of inconsistencies between the
EMR and the experimental data were due to a time difference at the
beginning or the end of the drug infusion. In the EMR, a drug needs to
be ordered before the drug rate can be registered, while in the HRDB,
the rate starts to be registered as soon as the pump is connected to the
network. Furthermore, some medications were not registered in the
patient’s EMR, probably because the physician did not order them.
However, nursing notes confirmed that the drugs were given. In these
situations, the HRDB could be considered more accurate than the EMR. On
two occasions, the name of the fluid differed between the EMR and the
HRDB. However, the name recorded on the pump and the one in the HRDB
were the same, suggesting that the infusion pump drug name was not
modified when the medication was replaced. Finally, on two occasions no
data were recorded over a period when they should have been. These
intervals occurred just before the patient was moved to another room,
and the procedure is to disconnect the pumps before moving the patient.
Although these four situations altered the HRDB accuracy, they were not
due to a HRDB limitation. Lastly, timestamp asynchronies were due to a
server setting that was corrected after this study.
This study’s main limitation lies in the lack of validation of the
complete dataset (10,14,30). We considered several procedures to apply
either during or after the gathering of the HRDB. Given the very high
data gathering rate (about 10,000 data points per minute), it is humanly
impossible to gather and validate the data simultaneously while
building the DB, or even to validate the entire database
retrospectively. Thus, we decided to perform a point-by-point data
analysis on a randomly chosen sample of patients considered
representative of the HRDB (30). In addition, some could argue, rightly,
that we were not able to correct abnormal values or undisplayed data.
However, as this dataset is intended to reproduce the patient’s entire
course in the PICU, abnormal values and undisplayed data should be
considered part of the patient’s course as much as true values (19).
Furthermore, this is a single-institution study, conducted in a center
with an excellent understanding of the value of data quality. Even if
the methodology is transferable to other data, this study only validates
these particular data in this particular HRDB and its results should not
be generalized to other clinically collected data. Finally, although its
impact is probably limited because most of the analyzed data were
electronically captured, we must consider the possibility of a Hawthorne
effect: the observational methodology might have modified the quality of
the data entered in the EMR by the bedside personnel.
CONCLUSION:
This study showed an excellent overall quality of the data included in
the HRDB of our PICU, based on validation procedures performed on a
representative sample. We consider that this study provides
assurance of data quality for future HRDB users, especially
regarding monitor and ventilator data. By reporting and detailing this
data quality validation process, we make it reproducible by any
research team and set a reference for future validation studies of
similar datasets.