Anisha Keshavan edited I_asked_colleagues_that_work__.tex  about 8 years ago

Commit id: 72386ccc03fddf64f24b2dc40afd6254a76756e8


       

Colleagues who work on ADNI data at UCSF verified that the same healthy controls were not scanned at multiple sites, and so the method is not directly comparable. Instead, our results were compared to 3 other harmonization efforts. In the first, our between-site ICC results pre- and post-calibration were compared to the between-site ICC results of \cite{cannon2014}. In \cite{cannon2014}, the sites were harmonized and an ADNI phantom was used to correct gradient distortions. The authors ran FSL's FIRST for subcortical segmentation, and a cortical pattern matching algorithm for gray/white segmentation. They then calculated the between-site ICC using variance components, similar to our study. The table below was added to the manuscript: \begin{table}   \begin{tabular}{ c c c c } 

\label{comparetocannon}  \end{table}  The two ROIs that do not compare to \cite{cannon2014} are the thalamus and the white matter volume (WMV). For the thalamus, it is possible that the FIRST algorithm is more reliable at segmenting this structure, while for white matter volume, FreeSurfer is more reliable. Another table, based on \cite{jovicich2013brain}, was included. In that study, sites were not strictly harmonized, but different control subjects were scanned at each site, and the authors used the same FreeSurfer cross-sectional algorithm that we used. Instead of calculating between-site ICCs, they calculated the average within-site ICCs for each ROI. The following table (which is now included in the manuscript) compares our within-site ICCs pre- and post-calibration to the average within-site ICC values of \cite{jovicich2013brain}: \begin{table}   \begin{tabular}{ c c c c } 

Here, we see that the within-site thalamus ICC values fall within the range of \cite{jovicich2013brain}, along with the other ROIs. Again, the calibration done here is to show that when the biases are applied to the data, the results are comparable to the harmonized results, and therefore the variability of biases ($CV_a$) can be trusted.  Finally, the following comparison to \cite{Schnack_2004} was added:  "The data acquisition of our study is similar to that of \cite{Schnack_2004}, in which the researchers acquired T1-weighted images from 8 consistent human phantoms across 5 sites with non-standardized protocols. These scanners were all 1.5T except for one 1T scanner. \cite{Schnack_2004} calibrated the intensity histograms of the images before segmentation with a calibration factor estimated based on the absolute agreement of volumes to the reference site (ICC). After applying their calibration method, the ICC of the lateral ventricle was $\geq 0.96$, which is similar to our pre- and post-calibration result of $0.97$. The ICC for the intensity-calibrated gray matter volume in \cite{Schnack_2004} was $\geq 0.84$, compared to our between-site ICCs of $0.78$ (uncalibrated) and $0.96$ (calibrated). Our between-site ICCs for white matter volume ($0.96$ and $0.98$ for the pre- and post-calibration volumes, respectively) were much higher than those of the intensity-calibrated white matter volume in \cite{Schnack_2004} ($\geq 0.78$). This could be explained by the fact that our cohort of sites is a consortium studying multiple sclerosis, which is a white matter disease, so there may be a bias toward optimizing scan parameters for white matter. Most importantly, the calibration method of \cite{Schnack_2004} requires re-acquisition of a human phantom cohort at each site for each multisite study. 
Alternatively, multisite studies employing our approach can use the results of our direct-volume calibration (the estimates of $CV_a$ for each ROI) to estimate sample sizes based on our proposed power equation and bias measurements without acquiring their own human phantom dataset to use in calibration."