Each resulting surface feature was standardized across the whole cortex within each subject. Dimensionality reduction and whitening were then performed using principal component analysis (PCA) as implemented in the scikit-learn machine learning library in Python \cite{scikit-learna}. The PCA model was fit to all cortical vertices in all HVs and applied to the resulting feature set for each subject. Fourteen components were kept (90\% explained variance).
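The standardization and PCA steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact pipeline used here: the synthetic arrays, the array shapes, and the variable names are hypothetical, and the number of retained components is chosen by a 0.90 explained-variance fraction rather than fixed at fourteen.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 3 subjects, 500 vertices each, 20 correlated features.
subjects = [rng.normal(size=(500, 20)) @ rng.normal(size=(20, 20)) for _ in range(3)]

# Standardize each feature across the whole cortex, separately within each subject.
standardized = [StandardScaler().fit_transform(x) for x in subjects]

# Fit a single PCA model on the pooled vertices of all subjects.
# A float n_components keeps the fewest components that reach that variance
# fraction; whiten=True rescales the retained components to unit variance.
pca = PCA(n_components=0.90, whiten=True, svd_solver="full")
pca.fit(np.vstack(standardized))

# Apply the fitted model to each subject's feature set.
reduced = [pca.transform(x) for x in standardized]
```

Fitting on the pooled vertices and then transforming each subject separately keeps every subject in the same component space.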
Although our initial PCA components were uncorrelated with zero mean and unit variance, we observed a fair amount of residual structure, such as variable degrees of skew and kurtosis, within and across these features (overview fig). To allow for more straightforward estimation of outlierness, as well as of similarities and differences between cortical vertices, we used the Rotation-Based Iterative Gaussianization (RBIG) procedure \cite{Laparra_2011}, an invertible non-linear transformation that maps the features into a latent representation whose probability density function (PDF) is approximately multivariate normal. RBIG is a fast iterative procedure consisting of a sequence of pairs of transformations: 1) a non-linear transformation applied to each of the columns (marginals) of the data matrix and 2) a linear transformation applied to the entire data matrix. The non-linear column-wise operation is a univariate Gaussianization that converts percentile scores, computed using the rank transformation, to standard scores (scikit-learn's QuantileTransformer). The linear operation is an orthogonal transformation performed using PCA, with all components retained after each iteration. Ten iterations were performed. Following this procedure, the histograms and joint scatter plots of the resulting components were approximately Gaussian, as expected (overview fig).
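The alternation of marginal Gaussianization and rotation can be sketched as below. This is an illustrative reading of the description above, not the reference RBIG implementation: the function names, the number of quantiles, and the synthetic skewed data are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import QuantileTransformer


def rbig_fit_transform(x, n_iterations=10, seed=0):
    """Alternate marginal Gaussianization and PCA rotation (hypothetical helper)."""
    z = np.asarray(x, dtype=float)
    steps = []
    for _ in range(n_iterations):
        # 1) Column-wise Gaussianization: ranks -> percentiles -> standard scores.
        marginal = QuantileTransformer(
            n_quantiles=min(1000, len(z)),
            output_distribution="normal",
            random_state=seed,
        )
        z = marginal.fit_transform(z)
        # 2) Orthogonal rotation of the whole matrix, all components retained.
        rotation = PCA(n_components=z.shape[1])
        z = rotation.fit_transform(z)
        steps.append((marginal, rotation))
    return z, steps


def rbig_inverse(z, steps):
    """Undo the steps in reverse order (approximate, quantiles are interpolated)."""
    for marginal, rotation in reversed(steps):
        z = marginal.inverse_transform(rotation.inverse_transform(z))
    return z


# Synthetic, strongly skewed and dependent features for demonstration only.
rng = np.random.default_rng(0)
e = rng.exponential(size=(2000, 3))
x = np.column_stack([e[:, 0], e[:, 0] + e[:, 1], e[:, 2] ** 2])

z, steps = rbig_fit_transform(x)
recovered = rbig_inverse(z, steps)
```

Storing each fitted pair of transforms is what makes the mapping invertible, so points in the Gaussianized space can be mapped back to the original features.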
Feature Comparison
Because this representation of cortical variability is novel and, in theory, provides an efficient and overcomplete basis for representing the local features of each image, we hypothesized that our features could be used to predict the values of other, more commonly used local features. These included curvature, sulcal depth, cortical thickness, and gray/white contrast (as calculated using FreeSurfer), as well as a measure of myelination, here calculated by dividing the T1 intensity by the T2 intensity, as in \cite{Glasser_2011}, and then sampled onto the gray-white junction surface. Using our feature set as the input, quadratic regression models (ordinary least squares) were estimated for each target metric using scikit-learn. Model training and testing were performed in healthy controls using a 10-fold cross-validation procedure. Each fold was trained on all cortical vertices from 25 randomly selected healthy volunteers and tested on all cortical vertices from the remaining 5 subjects. Performance was evaluated for each model using the coefficient of determination \(r^2\); effect size was reported as in \cite{cohen1988statistical}: \(r=0.1\) as small, \(r=0.3\) as medium, and \(r=0.5\) as large.
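A quadratic ordinary least-squares model with subject-wise splits can be sketched as follows. The synthetic features and target, the variable names, and the use of ten random 25/5 subject splits (rather than a strict partition into folds) are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import GroupShuffleSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
n_subjects, n_vertices, n_features = 30, 200, 14

# Synthetic stand-ins: Gaussianized features and one target metric per vertex.
features = rng.normal(size=(n_subjects * n_vertices, n_features))
subject_id = np.repeat(np.arange(n_subjects), n_vertices)
target = (
    features[:, 0]
    + 0.5 * features[:, 1] ** 2
    + 0.1 * rng.normal(size=len(features))
)

# Quadratic regression: all degree-2 terms followed by ordinary least squares.
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)

# Each split holds out all vertices from 5 subjects and trains on the other 25.
splitter = GroupShuffleSplit(n_splits=10, test_size=5, random_state=0)
scores = []
for train_idx, test_idx in splitter.split(features, target, groups=subject_id):
    model.fit(features[train_idx], target[train_idx])
    scores.append(r2_score(target[test_idx], model.predict(features[test_idx])))
```

Splitting by subject rather than by vertex keeps vertices from the same brain out of both the training and test sets of a given fold.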
Similarity Estimation Across Subjects
To smooth the data and facilitate comparison across subjects, we created patches centered at every vertex, including all neighboring vertices within a 5 mm radius. Homotopic patches were defined as patches in the same location across subjects; heterotopic patches were defined as patches in different, non-overlapping locations. For each patch \(p\), the center \(\mathbf{m}_p\) was computed by averaging the feature vectors of the vertices within the patch: \(\mathbf{m}_p = \frac{1}{|p|} \sum_{\mathbf{x}\in p} \mathbf{f}(\mathbf{x})\), where \(|p|\) is the number of vertices in the patch. The direction \(\mathbf{\hat{u}}_p\), the unit vector pointing from the origin toward the patch's center, was computed by dividing the patch's mean feature vector by its length: \(\mathbf{\hat{u}}_p = \mathbf{m}_p / \| \mathbf{m}_p \|\). In this feature space, the probability of finding a patch with a given average feature vector \(\mathbf{m}_p\) depends only on the magnitude of the feature vector \(\| \mathbf{m}_p \|\), which is also the Mahalanobis distance \(d\), defined as the distance from the origin to the patch's center, with \(d^2\) following a chi-squared distribution. The similarity between two patches \(p_i\) and \(p_j\) can be assessed using simple metrics, such as 1) the Euclidean distance between their centers, \(d_{ij} = \| \mathbf{m}_i - \mathbf{m}_j \|\), and 2) the cosine similarity between their directions, \(s_{ij} = \mathbf{\hat{u}}_i \cdot \mathbf{\hat{u}}_j\).
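The patch quantities defined above reduce to a few array operations. The sketch below uses hypothetical function names and two tiny hand-made patches in a two-dimensional feature space, chosen so that the results are easy to check by hand.

```python
import numpy as np


def patch_center(vertex_features):
    """Mean feature vector of the vertices in a patch (rows are vertices)."""
    return np.asarray(vertex_features, dtype=float).mean(axis=0)


def patch_direction(center):
    """Unit vector pointing from the origin toward the patch center."""
    return center / np.linalg.norm(center)


def euclidean_distance(center_i, center_j):
    """Distance between two patch centers."""
    return float(np.linalg.norm(center_i - center_j))


def cosine_similarity(center_i, center_j):
    """Dot product of the two patch directions."""
    return float(patch_direction(center_i) @ patch_direction(center_j))


# Toy patches for illustration only.
patch_a = np.array([[2.0, 0.0], [4.0, 0.0]])  # center (3, 0)
patch_b = np.array([[0.0, 3.0], [0.0, 5.0]])  # center (0, 4)

center_a = patch_center(patch_a)
center_b = patch_center(patch_b)

magnitude_a = float(np.linalg.norm(center_a))      # distance from the origin: 3.0
distance_ab = euclidean_distance(center_a, center_b)  # 5.0
similarity_ab = cosine_similarity(center_a, center_b)  # 0.0 (orthogonal directions)
```

Two patches can therefore be equally far from the origin while pointing in unrelated directions, which is why both metrics are reported.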
Global Anomaly Detection
In our representation of cortical variability, we hypothesized that cortical lesions, but also possibly some normal cortical regions known to have atypical structural characteristics, would appear as global outliers in our feature space. To identify such regions of normal cortex, we calculated the average Mahalanobis distance for each cortical patch across all HVs (fig). Specific outlier regions were identified by thresholding the average distance map across HVs at 2.7, which retained 4.3\% of the patches (equivalent to \(p=0.043\)), and clustering with a minimum cluster size of 30 nodes. As exemplars for further analysis, we selected 2 of the resulting outlier regions of interest (ROIs), in the anterior insula and primary motor cortex.
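One way to carry out the thresholding and clustering step is sketched below. Treating clusters as connected components of the surface mesh is an assumption, as are the function name, the edge-list input, and the toy chain of vertices standing in for a cortical mesh; only the threshold of 2.7 and the minimum size of 30 nodes come from the text.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components


def outlier_clusters(mean_distance, edges, threshold=2.7, min_size=30):
    """Label suprathreshold vertices by cluster; 0 means not in a retained cluster."""
    mean_distance = np.asarray(mean_distance, dtype=float)
    n = len(mean_distance)
    above = mean_distance > threshold

    # Keep only mesh edges whose two endpoints both exceed the threshold.
    kept = edges[above[edges[:, 0]] & above[edges[:, 1]]]
    adjacency = coo_matrix(
        (np.ones(len(kept)), (kept[:, 0], kept[:, 1])), shape=(n, n)
    )
    _, component = connected_components(adjacency, directed=False)

    labels = np.zeros(n, dtype=int)
    next_label = 1
    for c in np.unique(component[above]):
        members = above & (component == c)
        if members.sum() >= min_size:  # discard clusters smaller than min_size
            labels[members] = next_label
            next_label += 1
    return labels


# Toy "mesh": a chain of 100 vertices with one large and one small outlier run.
n_vertices = 100
edges = np.column_stack([np.arange(n_vertices - 1), np.arange(1, n_vertices)])
mean_distance = np.full(n_vertices, 1.0)
mean_distance[10:50] = 3.0  # 40 contiguous vertices, retained
mean_distance[70:80] = 3.0  # 10 contiguous vertices, below the size cutoff

labels = outlier_clusters(mean_distance, edges)
```

On a real surface, the edge list would come from the mesh triangles, and each retained label would correspond to one candidate outlier ROI.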
Directional Outliers and Similarity Maps
Although some brain regions appear to be consistent global outliers across HVs, this does not mean that they are similar to each other. In our representation, similarities or differences in combinations of features can be represented as differences in direction. We explored this using our 2 exemplar outlier ROIs, defining the center and direction of each ROI as the average of the patches within that ROI across all HVs. We similarly defined the average FCD ROI center and direction by averaging all of the patches within each MRI+ FCD mask (n=10), then averaging across the FCDs, to give each lesion equal weighting. Using the average direction of the insula ROI as the x-axis and the average direction of the motor ROI as the y-axis, we plotted the patches within each outlier ROI, the FCD ROI center, and 1000 randomly selected cortical patches for comparison (fig).
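The equal-weight lesion average and the two-axis plot coordinates can be sketched as follows. Computing the plot coordinates as dot products with the two ROI directions is an assumption about how the axes were constructed, and the function names and toy three-dimensional vectors are hypothetical.

```python
import numpy as np


def unit(vector):
    """Scale a vector to unit length."""
    vector = np.asarray(vector, dtype=float)
    return vector / np.linalg.norm(vector)


def equal_weight_center(lesion_patch_sets):
    """Average within each lesion first, then across lesions."""
    per_lesion = [np.asarray(p, dtype=float).mean(axis=0) for p in lesion_patch_sets]
    return np.mean(per_lesion, axis=0)


def project(centers, x_direction, y_direction):
    """Coordinates of patch centers along two (not necessarily orthogonal) directions."""
    axes = np.stack([unit(x_direction), unit(y_direction)], axis=1)
    return np.asarray(centers, dtype=float) @ axes


# Toy example: a small lesion (1 patch) and a large lesion (3 patches).
lesion_small = np.array([[2.0, 0.0, 0.0]])
lesion_large = np.array([[0.0, 2.0, 0.0]] * 3)

fcd_center = equal_weight_center([lesion_small, lesion_large])  # (1, 1, 0)
pooled_center = np.vstack([lesion_small, lesion_large]).mean(axis=0)  # (0.5, 1.5, 0)

# Hypothetical ROI directions standing in for the insula and motor averages.
insula_direction = np.array([5.0, 0.0, 0.0])
motor_direction = np.array([0.0, 1.0, 0.0])
coords = project(fcd_center[None, :], insula_direction, motor_direction)
```

Comparing the equal-weight center with the pooled center shows why the two-stage average matters: pooling all patches would let the larger lesion dominate.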