MR Acquisition Protocol
All participants were scanned with a Philips Achieva 3T scanner in the NIH Clinical Center Radiology Department. Whole brain sagittal 3D MRI scans were acquired as follows: 1) 3D T1 weighted MPRAGE (T1): TR = 6.8–7.2, TE = 3.2 ms, TI = 900ms, flip angle = 90, voxel size = 0.75 x 0.75 x 0.8, acceleration factor 2 in slice direction, acquisition time = 7:02 min; 2) 3D T2 weighted FSE (T2): TR = 2500, TE = 225–245, voxel size = 1 x 1 x 2 or 1 x 1 x 1, acceleration factor 2 in slice and phase directions, acquisition time = 5:03 min; 3) 3D FLAIR: TR = 4800, TE = 271–415, TI = 1600, voxel size = 0.9 x 0.9 x 1, acceleration factor 2 in slice direction, 2.6 in phase direction, acquisition time = 6:10 min.
Preprocessing and Feature Generation
Image Registration
All images were initially inspected for excess motion and overall quality. For each individual subject, T
2 and FLAIR images were co-registered to the T
1 image with an affine transformation and a normalized mutual information cost function using the publicly available software package Analysis of Functional NeuroImages (
AFNI)
\cite{Cox1996a}. The resulting images were sampled to the grid of the subject's T
1 image. Registered images were visually assessed for alignment.
Cortical Reconstruction
Cortical reconstruction was performed using standard processes with
FreeSurfer v6.0.0 \cite{Dale1999,Fischl1999}. The T
1 and T
2 images were used as input to FreeSurfer's standard processing pipeline. Briefly, this processing consists of removal of non-brain tissue
\cite{Ségonne2004}, automated Talairach transformation, segmentation of the subcortical white matter and deep gray matter volumetric structures
\cite{Fischl2004,Fischl2002}, intensity normalization
\cite{Sled1998}, tessellation of the gray matter white matter boundary, automated topology correction
\cite{Fischl2001}, surface deformation using intensity gradients to optimize the gray/white and gray/cerebrospinal fluid borders
\cite{dale:99,fischl:99}, then surface inflation
\cite{fischl:99}, registration to a spherical atlas based on individual cortical folding patterns
\cite{HBM:HBM10}, and parcellation of the cerebral cortex
\cite{Desikan2006,Fischl2004}. Results were visually inspected and manually corrected if needed. FreeSurfer output was converted to AFNI compatible format, and cortical surfaces were deformed to a standard mesh consisting of 36,002 vertices per hemisphere.
Intensity Correction
Intensity correction was implemented using an in-house procedure that performs an edge-aware local normalization, consisting of division by a smooth estimate of the local signal mean within each 3D image for each subject (see supplementary materials materials
(link to supp)). The algorithm was implemented in Python, and the source code is publicly available on GitHub as part of the
JEM package.
Feature Generation
For each voxel, in addition to the corrected intensity in each MR contrast, we wished to represent the characteristics of the local neighborhood surrounding each voxel in a manner that was insensitive to the local orientation of the cortical sheet. We therefore created a set of rotationally invariant 3D multi-scale local image filters analogous to the multi-scale oriented filters that have been shown to be: 1) an efficient representation of the local statistics of natural image statistics, 2) an accurate model of neurons in the mammalian visual system, and 3) a reliable model commonly used in computer vision applications to create a robust, efficient, generic basis set for applications such as texture synthesis \cite{Simoncelli2001}. Our rotationally invariant local features consisted of:
- Gaussian: the image blurred by Gaussian kernels of widths 1, 2, and 3 mm;
- Magnitude of the gradient of Gaussians: the magnitude of the first spatial derivative of the image blurred by Gaussian kernels of widths 1, 2, and 3 mm;
- Laplacian of Gaussian: the trace of the spatial second derivative of the image blurred by Gaussian kernels of widths 1, 2, and 3 mm;
- Frobenius Norm of the Hessian: the matrix norm of the spatial second derivative of the image blurred by Gaussian kernels of widths 1, 2, and 3 mm
This resulted in a 39 feature vector at each voxel consisting of the corrected T
1, T
2, and FLAIR image intensities, as well as the four local features at three spatial scales for each MR contrast. Because well-described features of FCDs have been found at or in close proximity to the gray-white junction, features were then sampled onto the FreeSurfer generated gray-white junction surface (see supplementary materials, features figure)
(features figure). The source code is also publicly available on GitHub as part the
JEM package.
Scaling and Dimensionality Reduction
Each resulting surface feature was standardized across the whole cortex within each subject. Dimensionality reduction and whitening was then performed using principal component analysis (PCA) implemented in the
scikit-learn machine learning library in Python
\cite{scikit-learna}. The PCA model was fit to all cortical vertices in all HVs and applied to the resulting feature set for each subject. Fourteen components were kept (90% explained variance).
Normative Model
Based on these features, we created a normative model to characterize the joint probability density function (PDF) of our features across the cortex in our healthy volunteer population to describe the expected range of normal cortical variability. Although our initial PCA components were uncorrelated with zero mean and unit variance, we observed a fair amount of residual structure, such as variable degrees of skew and kurtosis, within and across these features (overview fig). To allow for more straightforward estimations of outlierness, as well as similarities and differences between cortical vertices, we applied an invertible non-linear transformation to map the features into a latent representation where the PDF follows an approximately multivariate normal distribution using the Rotation-Based Iterative Gaussianization (RBIG) procedure \cite{Laparra_2011}. RBIG is a fast iterative procedure consisting of a sequence of pairs of transformations: 1) a non-linear transformation applied to each of the columns (marginals) of the data matrix and 2) a linear transformation applied to the entire data matrix. The non-linear column-wise operation is a univariate Gaussianization that converts percentile scores computed using the rank transformation to standard scores (scikit-learn's QuantileTransformer). The orthogonal transformation is performed using PCA, with all components retained after each iteration. Ten iterations were performed. Following this procedure, the histograms and joint scatter plots of each of the resulting components approximately followed a Gaussian distribution, as expected (overview fig).
Feature Comparison
Because this representation of cortical variability is novel, and theoretically represents an efficient and overcomplete basis from which to represent the local features of each image, we hypothesized that we should be able to use our features to predict the values of other, more commonly used local features such as curvature, sulcal depth, cortical thickness, and gray/white contrast (as calculated using FreeSurfer) or measures of myelination, here calculated by dividing the T1 intensity by the T2 intensity, as in \cite{Glasser_2011}, then sampled onto the gray-white function surface. Using our feature set as the input, quadratic regression models (ordinary least-squares) were estimated for each target metric using scikit-learn. Model training and testing was performed in healthy controls using a 10-fold cross validation procedure. Each fold was trained on all cortical vertices from 25 randomly selected healthy volunteers and tested on all cortical vertices from the remaining 5 subjects. Performance was evaluated for each model using the coefficient of determination \(r{^2}\); effect size was reported as in \cite{cohen1988statistical}: \(r=0.1\) as small, \(r=0.3\) as medium, and \(r=0.5\) as large.
Similarity Estimation Across Subjects
To smooth the data and facilitate comparison across subjects, we created patches centered at every vertex, including all neighboring vertices within a 5 mm radius. Homotopic patches were defined as patches in the same location across subjects; heterotopic patches are in non-overlapping different locations. For each patch \(p\), the center \(\mathbf{m}_p\) was computed by averaging the feature vectors of the vertices within the patch: \(\mathbf{m}_p = \sum_{\mathbf{x}\in p} \mathbf{f}(\mathbf{x})\). The direction \(\mathbf{\hat{u}}_p\), the unit vector pointing in the direction of the patch's center, was computed by dividing the patch's mean feature vector by its length \(\hat{u}_p = \mathbf{m}_p / \| \mathbf{m}_p \|\). In this feature space, the probability of finding a patch with a given average feature vector \(\mathbf{m}_p\) depends only on the magnitude of the feature vector \(\| \mathbf{m}_p \|\), which is also the Mahalanobis distance, \(d\), defined as the distance from the origin to the patch's center with \(d^2\) following a cumulative chi-squared distribution. The similarity between two patches \(p_i\) and \(p_j\) can be assessed using simple metrics, such as 1) the Euclidean distance between their centers \(d_{ij} = \| \mathbf{m}_i - \mathbf{m}_j \|\), and 2) the cosine similarity between their directions \(s_{ij} = \mathbf{\hat{u}_i} \cdot \mathbf{\hat{u}_j}\).
Global Anomaly Detection
In our representation of cortical variability, we hypothesized that cortical lesions, but also possibly some normal cortical regions known to have atypical structural characteristics, would appear as global outliers in our feature space. To identify such regions of normal cortex, we calculated the average Mahalanobis distance for each cortical patch across all HVs (fig). Specific outlier regions were identified by thresholding the average distance map across HVs at a threshold of 2.7, retaining 4.3% of the patches (equivalent to \(p=0.043\)), and clustering with a minimum of 30 nodes. As exemplars, for further analysis we selected 2 of the resulting outlier regions of interest (ROI) in the anterior insula and primary motor cortex.
Directional Outliers and Similarity Maps
Although some brain regions appear to be consistent global outliers across HVs, this does not mean that they are similar to each other. In our representation, similarities or differences in combinations of features can be represented as differences in direction. We explored this using our 2 exemplar outlier ROIs, defining the center and direction of each ROI as the average of the patches within that ROI across all HVs. We similarly defined the average FCD ROI center and direction by averaging all of the patches within each MRI+ FCD mask (n=10), then averaging across the FCDs, to give each lesion equal weighting. Using the average direction of the insula ROI as the x-axis and the average direction of the motor ROI as the y-axis, we plotted the patches within each outlier ROI, the FCD ROI center, and 1000 randomly selected cortical patches for comparison (fig).