3.2 Diagnostic yield of the undiagnosed GDD/ID cohort
In this cohort, GS identified pathogenic variants in 23 families. Highly suspicious variants (i.e., variants reported in the literature but lacking a well-established disease-causing relationship) were found in four families (Table 2). The identified pathogenic variants included small variants, CNVs, and one LOH causing Angelman syndrome. No mitochondrial variants or abnormally expanded repeats were found in the study. The diagnostic yield of GS was high in CMA-only cases (64.3%, 9/14) and lower in ES-only families (12.9%, 8/62) and CMA + ES families (25.0%, 6/24); the difference across the three groups was significant (chi-square = 17.0975, degrees of freedom = 2, p = 0.000194).
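The group comparison above can be checked with a short calculation. The following is a minimal sketch, assuming the reported statistic is a Pearson chi-square test of independence on the 2 x 3 table of diagnosed versus undiagnosed families, without continuity correction; the source does not state which software or variant of the test was used. The names `groups` and `chi_square_2xk` are illustrative and do not come from the study.

```python
import math

# Diagnosed / total families per prior-testing group, as reported in the text.
groups = {
    "CMA only": (9, 14),
    "ES only": (8, 62),
    "CMA + ES": (6, 24),
}


def chi_square_2xk(groups):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table.

    Assumption: no continuity correction is applied.
    """
    total_pos = sum(pos for pos, _ in groups.values())
    total_n = sum(n for _, n in groups.values())
    stat = 0.0
    for pos, n in groups.values():
        # Two cells per group: diagnosed and undiagnosed families.
        for observed, column_total in (
            (pos, total_pos),
            (n - pos, total_n - total_pos),
        ):
            expected = n * column_total / total_n
            stat += (observed - expected) ** 2 / expected
    dof = len(groups) - 1
    return stat, dof


stat, dof = chi_square_2xk(groups)

# With exactly 2 degrees of freedom, the chi-square survival function
# has the closed form exp(-x / 2), so no statistics library is needed.
assert dof == 2
p_value = math.exp(-stat / 2)

for name, (pos, n) in groups.items():
    print(f"{name}: {pos}/{n} = {100 * pos / n:.1f}%")
print(f"chi-square = {stat:.4f}, dof = {dof}, p = {p_value:.6f}")
```

Under these assumptions, the statistic and p-value agree with the reported values to within rounding.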
We examined the pathogenic variants found by GS and compared them with the data from previous CMA/ES tests. Nine scenarios were identified as plausible explanations for the missed CMA/ES diagnoses (Figure 2b): (1) the patient received CMA only and the causative variant was too small to be detected (n=9); (2) the disease-causing gene was reported after study enrollment (n=5); (3) ES failed to detect a pathogenic CNV (n=4); (4) LOH was not called in ES reanalysis (n=1); (5) the variant was improperly annotated (n=1); (6) the patient did not manifest the clinical features when the previous tests were performed (n=1); (7) the mutant allele dropped out in ES (n=1, Figure 3a); (8) the complex variant was not captured well by ES (n=1, Figure 3b); and (9) a 3′ untranslated region (3′UTR) deletion was not captured by ES (n=1).
Of the 23 positive cases, seven (30.4%) could have been solved by reanalyzing data from prior tests: three had previous ES testing, and four had previously been tested with both ES and CMA. Reanalysis would have succeeded in these cases because of updated analysis pipelines, updated annotation databases, and clinical follow-up (Figure 2b). Reanalysis of data from CMA-only cases provided no additional diagnoses.