Discussion
Amongst the 12 papers identified, the definition of prolonged LOS varied widely, from 2 to 14 days, which limits comparability of the results. There were also variations in study design, firstly in the use of a comparison or control group. The most common method of comparison was a ‘normal’ LOS group versus a ‘prolonged’ LOS group; however, the lack of a standardised definition of ‘prolonged’ means that, although similar methodology was used, results may differ widely. Silberman et al. (25) considered normal LOS to be <2 days and therefore used ITU LOS of <2 days as the control against comparison groups of 2-14 days and >14 days, whereas Hein et al. (5) classified ‘prolonged’ as LOS exceeding 14 days and compared these results with those for LOS <14 days. One study used an age- and sex-matched group from the general population for comparison, rather than patients who had undergone cardiac surgery (27). This too limits comparability between the groups, as there may be multiple unaccounted-for variables that affected long-term survival and QoL; it also goes against a key premise of the cohort study, that both exposed and unexposed groups must be taken from the same source population (30). Two papers made no comparison to a control (9), (26), so external factors cannot be ruled out when considering their results. Soppa et al. (28) described having used an ‘internal control’ but it is not clear what this constitutes. Comparisons were made between ‘Group A’, with LOS of 5-10 days, and ‘Group B’, with LOS >10 days, but the authors do not clarify whether both groups are considered to have experienced prolonged stay. It would seem that both are to be classified as having had a prolonged LOS, as only 4.7% of the cohort of 2250 were categorised into Group A or B, indicating that the remaining 95.3% either had an uneventful post-operative course and subsequent discharge from ITU or died post-operatively; why comparison was not made with those of ‘normal’ ITU LOS instead is not addressed by the authors.
Secondly, there was wide disparity in sample sizes between studies. Silberman et al. (25) recruited a large cohort of 6385 patients, in comparison to Soppa et al. (28) with 108 patients and Barrie et al. (3) with just 35 patients matched with 35 control participants. It seems that Silberman et al. (25) achieved this through a wide window of data collection, retrospectively identifying all patients undergoing cardiac surgery between 1993 and 2011. On the other hand, although Soppa et al. (28) identified 2250 cardiac surgery patients, only 108 participants met the inclusion criteria, as only 4.7% experienced prolonged LOS in ICU. The advantages of a large sample size are that it can provide greater precision in the estimation of treatment effects and is more likely to be representative of the population, thus improving generalisability of the results. Sample size is also a key determining factor for the risk of generating false positive or false negative results (29). However, it can be argued that, on the basis of statistics, there is no reason why a significant result from a cohort of 6385 should be trusted more than a result from a cohort of 108, provided the level of significance is the same. Biau et al. (29) also commented that small trials, if well designed, can still produce reliable estimation of treatment effect; however, this is in reference to RCTs, which are designed to mitigate the effects of external variables and bias through randomisation and blinding, which is not achievable in cohort studies.
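The relationship between sample size and precision of estimation noted above can be illustrated with a minimal sketch. It uses the cohort sizes reported for Soppa et al. (28) and Silberman et al. (25), but the outcome proportion of 0.5 is a purely hypothetical value chosen for illustration and is not taken from any of the reviewed studies. The function name is likewise an assumption, and the simple Wald approximation is used only because it makes the dependence on sample size easy to see.

```python
import math


def wald_ci_half_width(p, n, z=1.96):
    """Half-width of an approximate 95% Wald confidence interval
    for a proportion p estimated from n participants."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical outcome proportion, assumed for illustration only;
# it is not a result from any of the reviewed studies.
ASSUMED_PROPORTION = 0.5

# Cohort sizes as reported in the text for (28) and (25).
for n in (108, 6385):
    half_width = wald_ci_half_width(ASSUMED_PROPORTION, n)
    print(f"n = {n}: 95% CI half-width = +/-{half_width:.3f}")
```

Under these assumptions the interval narrows in proportion to the square root of the sample size, so the larger cohort gives a considerably tighter estimate of the same proportion, although this says nothing about bias arising from study design.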
Cohort studies are either prospective or retrospective in design; the most common approach amongst the studies reviewed was retrospective (employed in seven of the 12 studies). Retrospective cohort studies are less costly and less time-consuming to conduct than prospective studies, but they are also more susceptible to bias (30). Firstly, in studies with lengthy follow-up periods of months to years, such as these, there is a risk of attrition bias arising from loss to follow-up through death, migration, late refusal to participate or losses that occur as a result of the exposure itself (30). In this instance there may have been impairments post-surgery that limited an individual’s ability to participate in long-term follow-up, resulting in potential bias and consequent skewing of data. Most studies achieved 100% follow-up of their participants, although this was easier to achieve for those which were retrospective in design. Those that assessed QoL, which required the completion of questionnaires, were less successful at obtaining complete follow-up. Lagercrantz et al. (27) achieved a 72% response rate and Barrie et al. (3) reported a 17% loss of participants from the prolonged group and a 23% loss from the comparison group, the latter exceeding what is considered to be an acceptable level of loss (below 20%) (30). Secondly, disparity in completeness of follow-up between the exposure group and the control or comparison group can lead to selection bias (30), which occurred in one study where a higher proportion of phone survey responses were received in the control group (3).
QoL was assessed in four of the studies using the Karnofsky Performance Status (KPS) scale, the SF-36 and a combination of the New York Heart Association classification, frailty assessments and the Hospital Anxiety and Depression Scale. The use of different tools will undoubtedly yield different results; however, at present there is no approved cardiac-specific QoL assessment tool. The KPS scale, which was used in two of the studies, has been criticised for lacking sensitivity at the lower end of the spectrum, resulting in inaccurate classification of functional status for patients with greater physical impairments (31). It also does not include an assessment of mental health, which means it does not address the psychological burden that can result from ICU stay, the effects of which have been shown to still be evident years following discharge (14). The KPS scale was also designed for use in oncology patients and therefore, given the lack of a cardiac-specific tool, a generic assessment tool such as the SF-36 may have been more appropriate for this demographic of patients. However, although more widely applicable, generic tools do not include all of the dimensions relevant to a specific patient group, consequently reducing their sensitivity (32) and increasing the possibility of missing factors which may be key to the patient’s perception of their QoL. Additionally, no baseline assessment of pre-operative QoL was made in any of the four studies, which means it is not possible to conclude that QoL was impaired as a result of treatment or prolonged LOS in ITU, as the impairments may have been pre-existing or the result of a comorbidity. Impaired QoL pre-operatively has been shown to be a strong predictor of impaired QoL late after cardiac surgery (32) and therefore may have affected the results. Manji et al. (26) reported good functional survival at one and five years post-operatively, i.e. more than 50% of participants were alive and non-institutionalised. However, although these participants were alive, it is not possible to ascertain whether they would have reported an acceptable QoL or whether they were highly dependent and requiring care.
Analysis of long-term survival or mortality yielded statistically significant results in all 12 studies; however, those utilising the KPS scale as a measure of QoL either did not include the statistical analysis in the published article (3) or reported results that were not statistically significant (27). Lagercrantz et al. (27) used the SF-36 in conjunction with the KPS scale as a QoL measure, and the SF-36 results were found to be statistically significant.
Length of follow-up was adequate for the assessment of long-term outcomes in all of the studies reviewed, ranging from six months to ten years. Although the way in which cardiac surgery is performed has not changed dramatically in the last decade, improvements to pre-, intra- and post-operative care continue to be made in a bid to improve outcomes, and these changes may have influenced ICU practices and the ultimate timing of patient discharge from ICU. In an effort to mitigate this bias, standard operating procedures and local policies were reported to have been followed (5).