T-SNE visualization of large-scale neural recordings

George Dimitriadis1,*, Joana Neto1, Adam R. Kampff1

1Sainsbury Wellcome Centre, UCL, London, UK


Electrophysiology is entering the era of ‘Big Data’. Multiple probes, each with hundreds to thousands of individual electrodes, are now capable of simultaneously recording from many brain regions. The major challenge confronting these new technologies is transforming their raw data into physiologically meaningful signals, i.e. single unit spikes. Sorting the spike events of individual neurons from the spatiotemporally dense raw data is a problem that has attracted much attention (Rey 2015, Rossant 2016), but is still far from solved. Current methods still rely on human input and thus become infeasible as the size of the data sets increases exponentially.
Here we introduce the t-distributed stochastic neighbor embedding (t-SNE) dimensionality reduction method (Van der Maaten 2008) as a visualization tool in the spike sorting process. t-SNE embeds the n-dimensional extracellular spikes (n = number of features into which each spike is decomposed) into a low (usually two) dimensional space. We show that such embeddings, even when starting from different feature spaces, create obvious clusters of spikes that can be easily visualized and manually delineated with little effort and a high degree of precision. We propose that these clusters represent single units and test this assertion by applying our algorithm to labeled data sets from both hybrid (Rossant 2016) and paired juxtacellular/extracellular recordings (Neto 2016). We have released a graphical user interface (GUI) written in Python as a tool for the manual clustering of the t-SNE embedded spikes and for an informed overview and fast manual curation of results from other clustering algorithms. Furthermore, the generated visualizations offer evidence in favor of the use of probes with higher density and smaller electrodes. They also graphically demonstrate the diverse nature of the sorting problem when spikes are recorded with different methods and arise from regions with different background spiking statistics.


It is neuroscience dogma that the brain’s computational mechanics are implemented by the complex dynamics of its spiking neural networks. As a consequence, detailed knowledge of the spiking activity of “as-many-neurons-as-possible” during behavior is seen as essential to understanding how the brain receives and transforms information. Electrophysiological methods that allow spiking activity to be recorded extracellularly have been among the most significant tools for exploring the correlations between behavior and neural activity, and there remains a constant push to record from more neurons, for longer times (cite: Bence), from a host of neural regions, in diverse physiological conditions, and from many species. This trend was recently accelerated by new extracellular recording probes that extend the standard single electrode and tetrode devices (Recce 1989) by using microfabrication and integrated electronics to produce devices with thousands of recording sites (Ruther 2015, Alivisatos 2013).
The new generation of recording tools brings with it the challenge of extracting meaningful physiological signals from the resulting (big) data sets. In the case of extracellular probe recordings, that usually means transforming the measured voltages at the electrode sites into the spiking activity of the nearby neural units. The importance of accurate spike sorting stems from a number of ideas on how cell spiking contributes to brain function. Sorting competency is required, for example, to test ideas like sparse coding in memory functions (Chaudhuri 2016) or the diverse responses of single neighboring cells to different inputs, as in the theories of concept cells (Rey 2015) and place cells (Redish 2001). Taking the next step in closed loop experiments and brain machine interfaces also requires providing feedback based not only on multi-unit firing rates but on successfully registering complex spatio-temporal spiking patterns defined in real time (Wood 2004). Sorting is therefore required both for an accurate (sparse, unbiased) picture of neural activity and for online applications.
The original efforts in spike sorting utilized the development of the tetrode in order to differentiate the spiking signals of nearby neurons (Gray 1995, Fee 1996, Wehr 1999). Since then, the realization that denser electrode configurations allow the same cell to be sensed by multiple electrodes, each with a different attenuation, thus making sorting easier (Lewicki 1998, Buzsáki 2004), has also pushed modern probe designs towards higher electrode densities. Today new methodologies have evolved to work with the next generation of multi-electrode probes and to address the problem of the exploding size and complexity of the data sets (Rossant 2016, Rey 2015a). Nonetheless, the basic idea of the spike sorting pipeline remains the same (Fig 1A). The (filtered) data go through a process of spike detection that has traditionally relied on thresholding the raw signal. The multi-unit activity generated is then passed through a dimensionality reduction method which transforms the space-time spike matrices into a smaller set of features. The most commonly used dimensionality reduction techniques are principal component analysis (PCA) (Harris 2000) and wavelet decomposition (Hulata 2002, Quiroga 2004, Takekawa 2010) for offline sorting, and geometric/spike shape methods (Gerstein 1964, Lewicki 1998) for online sorting. More recent approaches even combine the two offline methods to generate an optimum set of features for further analysis (Rey 2015). Finally, a clustering method is employed to automatically group together the spikes from an isolated single unit in the still high dimensional space of the decomposed features. Techniques commonly used for this clustering are k-means (Wood 2004a), mixtures of Gaussians fitted with an expectation maximization algorithm (Wood 2004a) and template matching (Zhang 2004, Wang 2006). An overview of different techniques for detection, feature extraction and classification is given in Bestel et al. (Bestel 2012).
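As an illustration of the detection stage of this pipeline, the following is a minimal sketch of threshold-based spike detection on a single filtered channel. It is not the implementation used by any of the cited methods. The function names, the noise estimate (median absolute value divided by 0.6745), the threshold multiplier `k`, the `refractory` gap and the window sizes are all assumptions chosen for the example.

```python
from statistics import median


def detect_spikes(trace, k=4.0, refractory=3):
    """Return sample indices of negative threshold crossings in a filtered trace.

    Assumptions (not from the text): the noise standard deviation is estimated
    as median(|x|) / 0.6745, the threshold is -k times that estimate, and a
    crossing closer than `refractory` samples to the previous one is skipped.
    """
    sigma = median(abs(x) for x in trace) / 0.6745
    threshold = -k * sigma
    spikes = []
    last = -refractory
    for i in range(1, len(trace)):
        # A crossing happens when the signal goes from above to at/below threshold.
        crossed = trace[i - 1] > threshold >= trace[i]
        if crossed and i - last >= refractory:
            spikes.append(i)
            last = i
    return spikes


def extract_waveforms(trace, spikes, before=2, after=3):
    """Cut a fixed window around each crossing, dropping windows that hit the edges."""
    return [trace[i - before:i + after] for i in spikes
            if i - before >= 0 and i + after <= len(trace)]
```

The windows returned by `extract_waveforms` correspond to the spike snippets that a feature extraction step (e.g. PCA or wavelet decomposition) would then reduce to a smaller set of features before clustering.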
Methods that are currently under development follow a different route, in which the event detection, feature extraction and clustering steps are realized in a single template matching step (Pachitariu 2016, Yger 2016). These methods parallelize better and are proving very capable of handling millions of spikes arising from recordings of hundreds to thousands of channels.
In all cases, the automated clustering algorithms operate on a number of dimensions that scales linearly with the number of channels of the recording probe. For the more recent multi-channel probes, this feature space usually has hundreds of dimensions. Such high dimensional spaces make both manual clustering and the manual supervision and quality assurance of the automated algorithms’ results prohibitive. The t-SNE dimensionality reduction technique was designed to reduce such high dimensional data sets to 2 or 3 dimensions in a way that visualizing them can offer meaningful insights into their original high dimensional structure (Van der Maaten 2008). Embedding techniques, like t-SNE, transform the positions of points in a high dimensional space to positions in a lower dimensional (usually 2D) space. This reduction necessarily loses some information. Each embedding technique decides which aspects of the original structure to keep and which to ignore. t-SNE focuses on ensuring that the local structure (i.e. the ordering of distances between nearby points) remains intact, while it ignores the global structure (i.e. large distances in the t-SNE space are not representative of the distances in the original space). A good mental picture of how t-SNE achieves this is to think of all points as objects connected to each other with spring-like forces. In the original space these forces are in equilibrium. When the points are transferred (randomly at first) into the 2D space, the forces between them start both pulling and pushing until a new equilibrium is reached (Fig 1B). Points that are close in the original space are attracted to each other until they get roughly equally close in the 2D space, while points that are far apart in the original space are repelled by each other if they find themselves close in the 2D space.
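The spring picture can be made concrete with the cost function of the original t-SNE formulation (Van der Maaten 2008), summarized here for reference. The symbols are introduced for this summary only: $x_i$ are the $N$ points in the original feature space, $y_i$ their positions in the embedding, and $\sigma_i$ a per-point bandwidth set by the perplexity parameter.

```latex
\begin{align}
p_{j|i} &= \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
                {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N} \\
q_{ij} &= \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
               {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}} \\
C &= \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}} \\
\frac{\partial C}{\partial y_i} &= 4 \sum_{j} \left(p_{ij} - q_{ij}\right)
    \left(y_i - y_j\right) \left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}
\end{align}
```

Each term of the gradient acts like a spring between points $i$ and $j$. Under gradient descent, a pair with $p_{ij} > q_{ij}$ (closer in the original space than in the embedding) is pulled together, while a pair with $p_{ij} < q_{ij}$ is pushed apart.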
This ability of the t-SNE algorithm to also repel points that happen to be close in the 2D space but are not close in the original space (a solution to the crowding problem of embedding methods) underlies the informative 2D plots it generates.
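The attraction and repulsion described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the code of the released tool: the function name `tsne_forces` is hypothetical, the high dimensional similarities `P` are assumed to be given (symmetric and summing to 1 over all pairs), and practical details such as the perplexity search, early exaggeration and momentum are omitted. The gradient follows the standard t-SNE formulation (Van der Maaten 2008).

```python
def tsne_forces(P, Y):
    """Gradient of the t-SNE cost with respect to each embedded 2D point.

    P[i][j] holds the pairwise similarities in the original space; Y is a list
    of (x, y) positions in the embedding. A gradient descent step moves each
    point against its gradient.
    """
    n = len(Y)
    # Unnormalized Student-t kernel between every pair of embedded points.
    w = [[0.0 if i == j else
          1.0 / (1.0 + (Y[i][0] - Y[j][0]) ** 2 + (Y[i][1] - Y[j][1]) ** 2)
          for j in range(n)] for i in range(n)]
    total = sum(map(sum, w))
    grad = []
    for i in range(n):
        gx = gy = 0.0
        for j in range(n):
            if i == j:
                continue
            q = w[i][j] / total
            # c > 0 (p > q): descent pulls i towards j; c < 0: pushes i away.
            c = 4.0 * (P[i][j] - q) * w[i][j]
            gx += c * (Y[i][0] - Y[j][0])
            gy += c * (Y[i][1] - Y[j][1])
        grad.append((gx, gy))
    return grad
```

Repeatedly updating every point with `y_i -= learning_rate * grad[i]`, where `learning_rate` is a step size chosen by the user, is the pulling and pushing towards a new equilibrium that the spring analogy describes.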