Young Investigator Abstracts 2017

Modern machine learning far outperforms GLMs at predicting spikes

Ari S. Benjamin
Department of Biomedical Engineering, Northwestern University, Evanston, IL, 60208

Neuroscience has long sought encoding models that accurately predict neural activity from stimuli and tasks, and generalized linear models (GLMs) are a typical
choice. Modern machine learning techniques have the potential to perform better. Here we directly compared GLMs to three leading regression methods:
feedforward neural networks, gradient boosted trees, and stacked ensembles that combine the predictions of several methods. We predicted spike counts 
in macaque motor (M1) and somatosensory (S1) cortices from reaching kinematics, and in rat hippocampal cells from open field location and orientation. 
For neurons from each area we found that the modern ML methods predicted spike rates more accurately than the GLM and were less sensitive to the preprocessing 
of features. This overall performance shows that tuning curves built with GLMs are at times inaccurate and can be easily improved upon.

This demonstration naturally leads to a discussion of the uses of encoding models that are predictive but not transparent, i.e. models whose internal
parameters are not easily interpretable. Applications include determining which features are informative of spiking and estimating how much
neural activity is predictable from external factors at all (e.g. in order to isolate purely internal dynamics, or to set benchmarks for other methods).
This work, a collaborative effort of the labs of Konrad Kording and Lee Miller, is accompanied by example code that uses standard Python packages and
can be quickly applied to other datasets.
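As a toy illustration of the feature-preprocessing sensitivity described above (a sketch, not the accompanying package), the following fits a Poisson GLM by gradient ascent to a simulated cosine-tuned neuron; the tuning curve, learning rate, and sample size are invented for illustration. The raw angle feature fits poorly, while a cosine/sine basis fits well:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, 1000)          # e.g. reach direction per bin
rate = np.exp(0.5 + 1.5 * np.cos(theta))          # cosine-tuned firing rate
y = rng.poisson(rate)                             # observed spike counts

def fit_poisson_glm(X, y, lr=0.01, iters=3000):
    """Fit a Poisson GLM with exponential link by gradient ascent;
    returns the mean log-likelihood (up to an additive constant)."""
    X = np.column_stack([np.ones(len(y)), X])     # add intercept column
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = np.exp(X @ w)
        w += lr * X.T @ (y - mu) / len(y)         # gradient of Poisson log-lik.
    mu = np.exp(X @ w)
    return np.mean(y * np.log(mu) - mu)

ll_raw = fit_poisson_glm(theta[:, None], y)       # raw angle as the feature
ll_basis = fit_poisson_glm(np.column_stack([np.cos(theta), np.sin(theta)]), y)
```

With the raw angle the GLM can do little better than predicting the mean rate; supplying the right basis recovers the tuning, which is the kind of preprocessing burden the nonlinear methods above avoid.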


Extracting Stable Representations of Neural Population State from Unstable Neural Recordings

W.E. Bishop
Dept. of Machine Learning, Carnegie Mellon University
Center for the Neural Basis of Cognition, Pittsburgh, PA

Stable, long-term neural population recordings could enable basic science and more clinically relevant brain-computer interface (BCI) technologies. 
Unfortunately, many recording techniques are limited in their ability to stably record from the same population of neurons across days. 
Addressing this limitation would enable the study of many aspects of population coding, such as long-term learning and the encoding of many more stimuli 
than can be shown in single recording sessions, and remove the need for daily recalibration of BCI systems.  Here we develop theory and statistical
methods, which we refer to as “neural stitching,” for extracting a consistent, low-dimensional representation of neural population activity from
population recordings across many days, where the neurons being recorded may change from day to day.

We build on recent work establishing that a small number of latent variables can summarize the joint activity of hundreds of neurons.  The key
intuition behind our approach is that the low-dimensional representation of the activity of an entire population of neurons should be conserved
even though the particular neurons recorded may change from day to day.  We use latent variable models to extract the value of these latent
variables representing the state of a full population from different sets of neurons recorded from day to day.  This permits the activity of an entire 
population of neurons to be related across time, even when it is only possible to record from small portions of the population at once.  Formally, 
neural stitching is a special case of statistical matching, which has been studied for over forty years in the statistical community, and we develop
new theory establishing sufficient conditions under which the latent variable models underlying our method are identifiable.  
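The key intuition can be sketched in a toy simulation. For simplicity this assumes both "days" observe the same latent trajectory through different electrode subsets, and it uses PCA in place of the full latent variable models; the stitch is then just a linear alignment between the two per-day latent estimates (all sizes and noise levels are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
T, K, N = 600, 2, 40
Z = rng.standard_normal((T, K))                   # shared low-d population state
W = rng.standard_normal((N, K))                   # per-neuron loadings
X = Z @ W.T + 0.1 * rng.standard_normal((T, N))   # full-population activity
day_a, day_b = X[:, :25], X[:, 15:]               # two days, overlapping subsets

def latent_estimate(Xd, k=2):
    """PCA-style estimate of the low-d state from one day's neurons."""
    Xc = Xd - Xd.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * S[:k]

Za, Zb = latent_estimate(day_a), latent_estimate(day_b)
# "Stitch": the two per-day latent spaces differ by an unknown linear map;
# estimate it by least squares and express day B in day A's coordinates.
A, *_ = np.linalg.lstsq(Zb, Za, rcond=None)
Zb_in_a = Zb @ A
```

Although each day sees a different slice of the population, both latent estimates recover the same underlying state up to a linear transformation, which is exactly what makes a conserved low-dimensional representation recoverable.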

We validate our approach to neural stitching by developing a self-recalibrating BCI.  Most current intracortical BCI systems require frequent, 
manual recalibration to compensate for recording instabilities.  We develop a BCI which automatically self-recalibrates by applying neural stitching to 
extract a stable, low-dimensional representation of neural population activity across time and employing a standard BCI decoder with fixed parameters 
to extract user intent from this low-dimensional representation.  We tested the self-recalibrating BCI in closed-loop experiments with two rhesus macaques.  
To thoroughly test the algorithm, we introduced artificial recording instabilities during the experiments.  These artificial instabilities were
patterned on but often more severe than those typically seen in practice.  The algorithm significantly improved performance after the introduction 
of the instabilities in 41 of 42 experiments, demonstrating neural stitching’s ability to identify a stable representation of neural population activity.  
In addition to self-recalibrating BCI, the neural stitching methods we developed can enable a wide range of basic neuroscience studies, including
studying long-term learning, population coding across brain areas and the encoding of more stimuli than can be presented in single recording sessions. 

This is joint work with Alan D. Degenhart (U. Pitt), Emily R. Oby (U. Pitt), Elizabeth C. Tyler-Kabara (U. Pitt), Aaron P. Batista (U. Pitt),
Steven M. Chase (CMU) and Byron M. Yu (CMU). 


Mapping Neural Microcircuits: Design and Inference

Shizhe Chen

Recent advancements in the optical stimulation of sets of neurons now enable mapping the fine-scale synaptic properties of large portions of neural
circuits in a single animal. 
Specifically, we consider circuit mapping experiments where the subthreshold, postsynaptic responses of a small number of 
neurons are recorded using whole-cell patch clamp, and optogenetics is used to stimulate a set (1 to 10) of putative presynaptic neurons per trial 
within a target volume of roughly 1000 um x 600 um x 300 um. From these data, one can infer which presynaptic neurons are connected to the patched neurons.

However, two challenges remain to fully realize this type of experiment. First, the limitations of the spatial resolution of the optical
stimulation, the biological variability in the response of individual neurons to optical stimulations, and the variability in the postsynaptic 
features (e.g. amplitudes) of individual connections make confident inference of unitary monosynaptic inputs challenging. Second, the neural circuits 
must be learned from limited data, because the preparations are often short-lived and, in general, the amount of data one can collect is
paltry compared to the extent of neural circuits.

In this project, we propose two methods that address the challenges in data analysis and data collection, respectively:

i) We develop a novel model and an inference procedure that can reliably reconstruct the neural microcircuits. A major challenge in this problem 
is that the spiking of the potential presynaptic neurons is unobserved. Our model overcomes this challenge by exploiting the variability of 
postsynaptic features induced by different presynaptic cells and through a prior characterization of the presynaptic population's spiking response to
optical stimulation. In detail, we model the stimulus-induced spiking in each presynaptic cell using a combination of a leaky integrate-and-fire 
model and a generalized linear model (i.e., LIF-GLM). We further assume that each presynaptic cell induces postsynaptic events with a fixed 
amplitude distribution.
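The amplitude-based source separation this assumption enables can be sketched with a stripped-down two-source Gaussian mixture fit by expectation-maximization (all amplitudes and parameters here are invented for illustration; the full model additionally couples the LIF-GLM spiking term):

```python
import math, random

random.seed(0)
# synthetic event amplitudes from two connected presynaptic cells
true_means, true_sd = [1.0, 3.0], 0.4
amps = [random.gauss(true_means[i % 2], true_sd) for i in range(400)]

def em_two_gaussians(x, iters=50):
    """EM for a two-component Gaussian mixture over event amplitudes."""
    m = [min(x), max(x)]                      # crude initialization
    s, pi = [1.0, 1.0], [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each presynaptic source for each event
        r = []
        for xi in x:
            w = [pi[k] / s[k] * math.exp(-0.5 * ((xi - m[k]) / s[k]) ** 2)
                 for k in range(2)]
            tot = sum(w)
            r.append([wk / tot for wk in w])
        # M-step: re-estimate per-source mean, spread, and mixing weight
        for k in range(2):
            nk = sum(ri[k] for ri in r)
            m[k] = sum(ri[k] * xi for ri, xi in zip(r, x)) / nk
            s[k] = math.sqrt(sum(ri[k] * (xi - m[k]) ** 2
                                 for ri, xi in zip(r, x)) / nk) or 1e-6
            pi[k] = nk / len(x)
    return sorted(m)

est = em_two_gaussians(amps)
```

Even with unlabeled events, EM recovers the two amplitude distributions, which is the handle the method uses to attribute postsynaptic events to presynaptic sources whose spiking is unobserved.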

Using a novel expectation-maximization algorithm, we can estimate the parameters in the LIF-GLM model, the amplitude distribution and the synaptic
success rate for each connected presynaptic cell.

ii) We propose an optimal experimental design procedure that can provide instant guidance on which
locations to stimulate during the experiment to optimize the collected data.

To be specific, we identify future stimulation spots that lead to the maximal increase in the mutual information between the data and the parameters of
interest (e.g. synaptic features). Towards this end, we simplify the proposed model and exploit the sparsity of synaptic connections to reduce
the computational cost. Compared to earlier work by Shababo et al. (2013), our proposal better utilizes the variability in the postsynaptic features
to refine the estimation of the mutual information, and importantly we optimize stimulation locations over the full volume of tissue as opposed to 
just the locations of neurons.

We first illustrate the performance of our models and algorithms on realistic, simulated data. Then we show their application on real two-photon, 
multi-spot mapping data.

Joint work with Ben Shababo, Xinyi Deng, Hillel Adesnik, and Liam Paninski


Distance covariance analysis

Benjamin R. Cowley 
Machine Learning Dept.

Nonlinear interactions are prevalent in neuroscience, from the relationship between a stimulus parameter and neural responses to interactions 
between populations of neurons across different brain areas.  To detect these relationships, nonlinear methods are desired.  A critical limitation 
of nonlinear dimensionality reduction methods for studying neural population activity is that these methods do not provide an explicit mapping between  
population activity and the extracted latent variables.  As a result, it is difficult to use nonlinear dimensionality reduction methods to address 
basic scientific questions, such as whether the low-dimensional spaces recovered for different experimental conditions are similar or different.

To address this need, we developed distance covariance analysis (DCA), a linear dimensionality reduction method that can detect linear and nonlinear 
interactions between populations of neurons recorded in two or more brain areas, as well as other experimental variables.  DCA optimizes a correlational 
statistic called distance covariance, which can detect if linear or nonlinear interactions exist between two sets of variables. We extend distance 
covariance to detect interactions between multiple sets of variables, and then optimize the extended distance covariance with respect to linear 
projection vectors (i.e., dimensions) for each set of variables. Statistically, DCA is well-suited for neuroscientific datasets because it can be 
applied to continuous and categorical variables, orders identified dimensions based on the strength of interaction, and can relate neural activity to 
stimulus or behavioral parameters, outputs from computational models, and neural activity from other subjects.  Computationally, DCA is fast and 
scales to hundreds of neurons and tens of thousands of trials.
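The statistic at the core of DCA can be sketched directly. Below is a minimal sample distance covariance (after Székely et al.) between two one-dimensional variables, showing it detects a purely quadratic dependence that Pearson correlation misses; sample size and noise level are illustrative:

```python
import numpy as np

def dist_cov(x, y):
    """Sample distance covariance between two 1-d samples."""
    def centered(v):
        d = np.abs(v[:, None] - v[None, :])            # pairwise distances
        return d - d.mean(0) - d.mean(1)[:, None] + d.mean()
    A, B = centered(x), centered(y)
    return np.sqrt(max((A * B).mean(), 0.0))

rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y_nl = x ** 2 + 0.1 * rng.standard_normal(500)         # purely nonlinear relation
y_ind = rng.standard_normal(500)                       # independent variable

pearson = np.corrcoef(x, y_nl)[0, 1]                   # near zero for x vs x**2
```

DCA extends this statistic to multiple sets of variables and then optimizes it over linear projections of each set, so the detected dependence can be nonlinear while the extracted dimensions remain explicit linear readouts of the population.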

We showcase DCA's ability to detect nonlinear interactions across a broad range of neuroscience settings. The nonlinearities of these datasets 
include the cosine relationship between orientation angle of gratings and responses of V1 neurons, as well as the divisive normalization relationship 
between V1 population activity and the response of an MT neuron. We also show that DCA can align the firing rate spaces across different subjects by  
identifying dimensions common across subjects,  even when the common dimensions are nonlinearly related.  Finally, we apply DCA to population activity 
simultaneously recorded from V1 and V2, and find that the nonlinear interactions between brain areas lie within a subset of dimensions in which there 
are linear interactions.  Overall, DCA provides a deeper understanding of how different neural populations interact and has broad applicability 
to many neuroscientific settings.

Joint work with Joao Semedo, Douglas Ruff, Amin Zandvakili, Marlene Cohen, Matthew Smith, Adam Kohn, and Byron Yu.


Stability and dynamics of nonlinear Hawkes processes and PP-GLMs

Felipe Gerhard

This talk is based on work presented in the publication "On the stability and dynamics of stochastic spiking neuron models:
Nonlinear Hawkes process and point process GLMs" (PLOS Computational Biology, 2017).

Point process generalized linear models (PP-GLMs) provide an important statistical framework for modeling spiking activity in single neurons and
neuronal networks. Stochastic stability is essential when sampling from these models, as done in computational neuroscience to analyze statistical properties 
of neuronal dynamics and in neuro-engineering to implement closed-loop applications. Here we show, however, that despite passing common goodness-of-fit 
tests, PP-GLMs estimated from data are often unstable, leading to divergent firing rates. The inclusion of absolute refractory periods is not a 
satisfactory solution since the activity then typically settles into unphysiological rates.

To address these issues, we derive a framework for determining the existence and stability of fixed points of the expected conditional intensity function 
(CIF) for general PP-GLMs. Specifically, in nonlinear Hawkes PP-GLMs, the CIF is expressed as a function of the previous spike history and exogenous 
inputs. We use a mean-field quasi-renewal (QR) approximation that decomposes spike history effects into the contribution of the last spike and an average 
of the CIF over all spike histories prior to the last spike. Fixed points for stationary rates are derived as self-consistent solutions of integral 
equations. Bifurcation analysis and the number of fixed points predict that the original models can show stable, divergent, and metastable (fragile) 
dynamics. For fragile models, fluctuations of the single-neuron dynamics predict expected divergence times after which rates approach unphysiologically 
high values. This metric can be used to estimate the probability of rates to remain physiological for given time periods, e.g., for simulation purposes.
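The instability phenomenon itself is easy to reproduce in a toy discrete-time PP-GLM with an exponential nonlinearity and an excitatory history filter (all parameters below are illustrative, not fitted): weak self-excitation stays bounded, while strong self-excitation produces runaway rates.

```python
import numpy as np

def simulate_ppglm(w, b0=np.log(20.0), dt=1e-3, tau=0.02, steps=3000, seed=3):
    """Discrete-time PP-GLM: lambda_t = exp(b0 + w * h_t), where h_t is an
    exponentially filtered spike history. Returns the peak intensity."""
    rng = np.random.default_rng(seed)
    h, lam_max = 0.0, 0.0
    for _ in range(steps):
        lam = np.exp(min(b0 + w * h, 30.0))     # clip exponent to avoid overflow
        lam_max = max(lam_max, lam)
        if rng.random() < min(lam * dt, 1.0):   # Bernoulli spike in this bin
            h += 1.0
        h *= np.exp(-dt / tau)                  # history decays between bins
    return lam_max

stable_peak = simulate_ppglm(w=0.2)    # weak self-excitation: bounded rates
runaway_peak = simulate_ppglm(w=2.0)   # strong self-excitation: divergence
```

With the stronger coupling, each spike multiplies the intensity enough that a single spike's offspring exceed one on average, so a cascade eventually drives the rate to unphysiological values, exactly the fragile regime the QR analysis characterizes.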

We demonstrate the use of the stability framework using simulated single-neuron examples and neurophysiological recordings. Finally, we show how to 
adapt PP-GLM estimation procedures to guarantee model stability. Overall, our results provide a stability framework for data-driven PP-GLMs and shed new 
light on the stochastic dynamics of state-of-the-art statistical models of neuronal spiking activity.

Joint work with Moritz Deger and Wilson Truccolo


Estimating short-term synaptic plasticity from pre- and postsynaptic spiking

Abed Ghanbari

Short-term synaptic plasticity (STP) critically affects the processing of information in neuronal circuits by reversibly changing the effective 
strength of connections between neurons on timescales from milliseconds to a few seconds. STP is traditionally studied using intracellular
recordings of postsynaptic potentials or currents evoked by presynaptic spikes. However, STP also affects the statistics of postsynaptic spikes.

Here we present two model-based approaches for estimating synaptic weights and short-term plasticity from pre- and postsynaptic spike observations 
alone. In particular, we extend a generalized linear model (GLM) that predicts postsynaptic spiking as a function of the observed pre- and postsynaptic 
spikes and allow the efficacy of presynaptic inputs (coupling term in the GLM) to vary as a function of time. In a first model, we assume that 
STP follows a Tsodyks-Markram model of vesicle depletion and recovery. In a second model, we introduce a functional description of STP where we 
estimate how the synaptic weight is modified as a function of different presynaptic inter-spike intervals.
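The first model's per-spike dynamics can be sketched with one common discrete-update form of the Tsodyks-Markram model (parameter values below are illustrative, not fitted), where resources deplete with each release and recover between spikes, while utilization facilitates and decays:

```python
import math

def tm_amplitudes(isis, U, tau_rec, tau_fac):
    """Per-spike synaptic amplitudes under a Tsodyks-Markram model.
    x: available resources; u: utilization (facilitation variable)."""
    x, u, amps = 1.0, U, []
    for dt in isis:
        u = u + U * (1.0 - u)                  # facilitation jump at the spike
        amps.append(u * x)                     # released fraction = amplitude
        x = x * (1.0 - u)                      # resources depleted by release
        x = 1.0 - (1.0 - x) * math.exp(-dt / tau_rec)   # recovery toward 1
        u = U + (u - U) * math.exp(-dt / tau_fac)       # facilitation decay
    return amps

train = [0.02] * 8                             # regular 50 Hz spike train
dep = tm_amplitudes(train, U=0.5, tau_rec=0.50, tau_fac=0.01)  # depressing
fac = tm_amplitudes(train, U=0.1, tau_rec=0.05, tau_fac=0.50)  # facilitating
```

The same presynaptic train thus yields decreasing amplitudes for a depressing synapse and increasing amplitudes for a facilitating one; in the full method these time-varying amplitudes scale the GLM coupling term rather than being observed directly.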

To validate the models, we test the accuracy of STP estimation using the spiking of neurons with known synaptic dynamics, ranging from strong 
depression to strong facilitation. We generated an artificial current from a simulated population of presynaptic neurons, with different weights 
and types of STP, and recorded spike responses of layer 2/3 pyramidal neurons in slices from rat visual cortex to injection of this current. We find 
that, using only spike observations, both model-based methods can accurately reconstruct the time-varying synaptic weights of presynaptic inputs 
for different types of plasticity. Furthermore, our models are able to capture the short-term dynamics of in vivo recordings using pre- and 
postsynaptic spiking at two strong, identified monosynaptic connections: thalamocortical synapses in the awake rabbit and the large endbulb 
of Held synapse in the cochlear nucleus of the awake gerbil. In agreement with intra- and juxtacellular recordings, our results using
spikes alone find that thalamocortical connections tend to show short-term synaptic depression, while the endbulb of Held is facilitating.

Joint work with Aleksey Malyshev, Maxim Volgushev and Ian H. Stevenson


Identifying Theta and Alpha-band Traveling Waves in Human Neocortex with Spatial Circular Statistics

Joshua Jacobs

Electrical oscillations are present in nearly every brain region, but we lack a rich understanding of their functional role and mechanism of generation.  
Here we describe our finding that human neocortical oscillations at various frequencies are traveling waves, showing that brain oscillations reveal 
the movement of neural activity between regions.  We made this discovery by developing a new quantitative framework using spectral analysis and 
multivariate circular statistics, and using this methodology to examine direct brain recordings from neurosurgical patients. Using this approach,
we identified spatial clusters of traveling oscillations that propagate through time and space to organize neural activity across the cortex.  We assessed 
each traveling wave's instantaneous temporal frequency, movement speed, direction, and spatial consistency, which we then compared with simultaneous
behavior in a memory task.  Traveling theta and alpha oscillations were present in all neocortical regions, where they generally propagated in a 
posterior-to-anterior direction.  The spatial consistency of individual traveling waves correlated with task performance.  To understand mechanisms 
that underlie these waves, we examined the relation between movement speed and temporal frequency. We found that traveling-wave propagation could be 
modeled by a series of coupled Kuramoto oscillators in conjunction with interindividual differences in local cortical coupling.  Our findings demonstrate 
a new functional role for human brain oscillations by showing the prevalence and behavioral relevance of cortical traveling waves.  These results suggest 
that characterizing the spatial structure of neuronal oscillations can reveal dynamic interactions between brain regions that are important for behavior.
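A minimal version of the wave-detection step can be sketched by fitting a plane-wave phase model to an electrode grid and scoring it with the mean resultant length, a basic circular statistic; the grid geometry, wave parameters, and brute-force search below are illustrative simplifications of the full framework:

```python
import numpy as np

rng = np.random.default_rng(0)
xs, ys = np.meshgrid(np.arange(8.0), np.arange(8.0))
x, y = xs.ravel(), ys.ravel()                 # 8x8 electrode grid
true_dir, true_k = np.deg2rad(30.0), 0.6      # propagation direction, spatial freq.
phase = (true_k * (x * np.cos(true_dir) + y * np.sin(true_dir))
         + 0.2 * rng.standard_normal(x.size)) % (2 * np.pi)

def wave_fit(phase, x, y):
    """Grid-search a plane-wave phase model; goodness of fit is the mean
    resultant length of the circular phase residuals."""
    best = (-1.0, 0.0, 0.0)
    for theta in np.linspace(0.0, 2 * np.pi, 180, endpoint=False):
        for k in np.linspace(0.1, 1.5, 29):
            resid = phase - k * (x * np.cos(theta) + y * np.sin(theta))
            score = np.abs(np.exp(1j * resid).mean())
            if score > best[0]:
                best = (score, theta, k)
    return best

score, theta_hat, k_hat = wave_fit(phase, x, y)
```

The resultant length of the residuals plays the role of the spatial-consistency statistic: it is near one when the plane-wave model explains the phases and near zero for unstructured phase maps.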

Joint work with Honghui Zhang


Learning multi-variate point process models

Garvesh Raskutti

Multivariate Poisson autoregressive models are a common way of capturing self-exciting point processes, where cascading series of events from nodes 
in a network either stimulate or inhibit events from other nodes. These models can be used to learn the structure of biological neural networks where events 
may correspond to neurons firing at different voxels of the brain.

An important question associated with these multivariate network models is determining how different voxels influence each other. This problem presents 
a number of technical challenges since the number of voxels $M$ is typically large compared to the number of observed time points $T$, and the stability 
and learnability properties of non-linear multivariate point process models are difficult to characterize.

In this talk I address these challenges and provide learning rates for fairly general multivariate self-exciting Poisson auto-regressive models that 
include Poisson ARMA models, saturated or clipped models and a discretized version of the multivariate Hawkes process. Our learning guarantees depend 
on the sparsity of the network $s$, $M$ and $T$ as well as other network properties. Importantly, our rates apply in the high-dimensional setting 
where $s \ll T \ll M^2$. We also provide simulations and a real data example involving crime reports to support our methodology and main results.
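As a sketch of why clipping matters for stability in this model class, the following simulates a small clipped Poisson autoregression with a sparse influence matrix (network size, weights, and baseline rates are invented for illustration): clipping the history term bounds the intensity, so excitation cannot cascade to divergence.

```python
import numpy as np

def simulate_par(A, nu, T=2000, clip=2.0, seed=0):
    """Clipped multivariate Poisson autoregression:
    lambda_t = exp(nu + A @ min(y_{t-1}, clip)) (elementwise clip)."""
    rng = np.random.default_rng(seed)
    y = np.zeros(len(nu))
    counts = np.zeros((T, len(nu)))
    for t in range(T):
        lam = np.exp(nu + A @ np.minimum(y, clip))   # bounded intensity
        y = rng.poisson(lam)
        counts[t] = y
    return counts

nu = np.log(np.full(3, 0.5))                  # baseline rate per node
A = np.array([[0.0, 0.8, 0.0],                # sparse chain: node 2 -> 1 -> 0
              [0.0, 0.0, 0.8],
              [0.0, 0.0, 0.0]])
counts = simulate_par(A, nu)
```

The sparse influence structure is visible directly in the simulated counts: a node's activity is elevated in bins following events at its parent, which is the signal the estimators exploit to recover the network.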

Joint work with Benjamin Mark and Rebecca Willett 


The sensorimotor strategies mediating object recognition by active touch

Chris Rodgers

Humans and other animals can identify objects by active touch -- coordinated exploratory motion and tactile sensation. For example, we precisely scan
our fingertips over objects in order to identify them, integrating tactile and proprioceptive input from each finger into a holistic representation 
of shape. Similarly, mice adeptly recognize objects by scanning them with their array of facial whiskers. To identify the behavioral strategies 
and neural computations that mediate this ability, we have developed a behavioral task for head-fixed mice -- curvature discrimination -- that challenges
them to discriminate concave from convex shapes. We can identify the time and location of every whisker contact using high-speed videography.

We are statistically characterizing the behavioral strategies mice use to efficiently extract information about object curvature, in order to 
generate hypotheses about the underlying neural algorithms. Preliminary results suggest that the mice use a two-part "scan, then foveate" strategy: they 
first whisk broadly to coarsely localize the object, and then target their whisking more precisely to extract more detailed information about shape. 
Mice typically contact the stimuli in multi-whisker bouts lasting 25-50 ms, producing rich spatiotemporal patterns of contacts across the whisker array. 
Because the stimuli are presented over a range of positions, it is not possible to unambiguously determine curvature using only information from a 
single location in space. We have used a dictionary learning algorithm to reveal recurring multi-whisker contact patterns, and support vector machine 
(SVM) classifiers to decode the curvature of the shape from these patterns. This approach has identified candidate behavioral events that the mouse may 
detect in order to identify curvature.
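The decoding step can be sketched with a linear SVM trained by the Pegasos subgradient method on toy multi-whisker contact counts; the features, class structure, and trainer below are illustrative stand-ins for the dictionary-learning pipeline, not the actual analysis:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
# toy contact counts for 5 whiskers per trial: hypothetical contact profiles
concave = rng.poisson([3.0, 1.0, 0.5, 1.0, 3.0], size=(n, 5))  # edge whiskers
convex = rng.poisson([0.5, 2.0, 4.0, 2.0, 0.5], size=(n, 5))   # center whiskers
X = np.vstack([concave, convex]).astype(float)
X = np.hstack([X, np.ones((len(X), 1))])      # constant feature acts as bias
ylab = np.array([1.0] * n + [-1.0] * n)

def linear_svm(X, y, lam=0.01, epochs=20, seed=1):
    """Pegasos-style stochastic subgradient training of a linear SVM."""
    rng = np.random.default_rng(seed)
    w, t = np.zeros(X.shape[1]), 0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * (X[i] @ w)
            w *= (1.0 - eta * lam)             # shrink (regularizer step)
            if margin < 1.0:
                w += eta * y[i] * X[i]         # hinge-loss subgradient step
    return w

w = linear_svm(X, ylab)
acc = float(np.mean(np.sign(X @ w) == ylab))
```

With class-specific contact profiles like these, a linear readout of the multi-whisker pattern suffices; the interesting scientific question is which recurring contact patterns carry that information, which is where the dictionary-learning step comes in.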

Thus far, this work has generated a rich dataset of whisker motion and contact patterns during naturalistic, goal-directed object recognition. 
These statistical analyses will reveal the motor control and feature recognition strategies mice employ to infer shape from this complex tactile input. 
We are presently recording spiking activity from populations of neurons in somatosensory cortex and we next plan to identify how they encode and process 
these features in order to mediate this behavior.

Joint work with B Christina Pil, Philip Calafati, Akash Khanna and Randy M Bruno


Detecting Multivariate Cross-Correlation Between Brain Regions

Jordan Rodu

The problem of identifying functional connectivity from multiple time series data recorded in each of two or more brain areas arises in many 
neuroscientific investigations. For a single stationary time series in each of two brain areas statistical tools such as cross-correlation and 
Granger causality may be applied. On the other hand, to examine multivariate interactions at a single time point, canonical correlation, which 
finds the linear combinations of signals that maximize the correlation, may be used. In this talk, we discuss a new method that produces 
interpretations much like these standard techniques and, in addition, (a) extends the idea of canonical correlation to 3-way arrays (with
dimensions: number of signals by number of time points by number of trials), (b) allows for non-stationarity, (c) also allows for
nonlinearity, (d) scales well as the number of signals increases, and (e) captures predictive relationships, as is done with Granger causality. 

Joint work with Natalie Klein, Scott L. Brincat, Earl K. Miller and Robert E. Kass


Neural correlates during human decision-making: Integration of cognitive reasoning and internal biases

Pierre Sacre

Decision-making is a complex phenomenon in which cognitive reasoning (e.g., computation of reward expectation) and internal biases (e.g., preferences, 
emotions) influence the decisions we make, much as the outcome of our decisions influences the internal biases we acquire. Most decision-making studies
involving humans identify neural correlates from functional magnetic resonance imaging data that have poor temporal resolution, and yet humans often make
decisions on the order of milliseconds. To map the neural substrates of decision-making at a millisecond resolution, we exploited a unique opportunity 
to record from 10 humans (implanted with depth electrodes for clinical purposes) while they performed a gambling-based decision-making task. First, 
we constructed dynamical probabilistic models of betting decisions for each subject as a function of cognitive reasoning and an internal-bias state that is 
unobserved but estimated from measured data. The models suggest that the deviation from the optimal strategy of maximizing expected reward is explained 
by the influence of the estimated internal-bias state for some subjects. Then, the neural data was analyzed to identify brain regions whose activity modulates 
with the estimated internal-bias state. Regions including the inferior temporal gyrus, amygdala and entorhinal cortex correlate with the internal-bias state,
suggesting that activity in these regions may carry information about the patient's internal bias during the task. These preliminary data suggest
that internal-biases are likely a key component of utility functions that govern financial decisions, in addition to rational components including expected 
reward and variance of reward.

Joint work with Matthew S. D. Kerr, Sandya Subramanian,  Kevin Kahn, Jorge J. Gonzalez-Martinez, Matthew A. Johnson, Uri T. Eden, John T. Gale 
and Sridevi V. Sarma


Multiscale Modeling of Deep Brain Stimulation for Depression

Vineet Tiruvadi

Deep brain stimulation (DBS) of the white matter tracts passing through the subcallosal cingulate cortex (SCCwm) alleviates symptoms of depression. 
Here, I will present work characterizing and modeling network-level electrophysiologic changes using combined EEG and chronic LFP recordings in 
patients with treatment resistant depression (TRD) being treated with SCCwm-DBS. First, I will (a) identify chronic SCC-LFP frequency activity correlated 
with depression state and (b) characterize rapid, network-level oscillatory changes in key oscillatory bands. Then, I will outline mathematical 
modeling approaches being explored to determine SCCwm-DBS mechanisms of action in (a) long-term synchronization changes associated with depression 
recovery, and (b) short-term excitatory-inhibitory balance modulation of SCC. The results of this work include objective brain-based measures to 
standardize SCCwm-DBS for TRD and also provide a more complete understanding of network-level changes induced by SCCwm-DBS at rapid and chronic 
timescales. Finally, I will propose approaches to directly link electrical state changes and behavioral dynamics associated with depression.

Joint work with Robert Butera and Helen Mayberg.


State Estimation and Model Identification for Dynamical Processes with Censored Observations

Ali Yousefi

Censored data are a common occurrence in trial-structured behavioral experiments and many other forms of longitudinal data. They can lead to severe 
bias and reduction of statistical power in subsequent analyses. Principled approaches for dealing with censored data, such as data imputation and 
methods based on the complete data likelihood, work well for estimating fixed features of statistical models. They have not been extended to dynamic 
measures, such as serial estimates of an underlying latent variable over time. Here, we propose an approach to the censored data problem for dynamic 
behavioral signals. We develop a state-space modeling framework with a censored observation process at the trial timescale. We then develop a filter 
algorithm to compute the posterior distribution of the state process using the available data. We show that special cases of this framework can 
incorporate the three most common approaches to censored observations: ignoring trials with censored data, imputing the censored data values, or using 
the full information available in the data likelihood. We also derive a computationally efficient approximate Gaussian filter that is similar in 
structure to a Kalman filter, but which efficiently accounts for the censored data. Finally, we combine the censored data filter/smoother and EM 
algorithm to estimate model parameters. We show that the solution provides an accurate estimation of the model parameter even with a large percentage 
of censored data.
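The censored-trial update can be sketched for a one-dimensional random-walk state with right-censored Gaussian observations: when a trial is censored, the filter moment-matches the predictive distribution truncated at the threshold instead of discarding the trial. This is a minimal sketch with invented noise parameters, not the full framework:

```python
import math

def trunc_moments(m, s2, c):
    """Mean and variance of N(m, s2) truncated to y > c (a censored trial
    only reveals that the observation exceeded the threshold c)."""
    s = math.sqrt(s2)
    a = (c - m) / s
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2 * math.pi)
    tail = 0.5 * math.erfc(a / math.sqrt(2))   # P(y > c)
    lam = pdf / tail                           # inverse Mills ratio
    mean = m + s * lam
    var = s2 * (1.0 + a * lam - lam * lam)
    return mean, var

def censored_kalman_step(m, P, obs, c, R=0.25, Q=0.05):
    """One step of an approximate Gaussian filter for y_t = x_t + noise,
    right-censored at c. obs=None marks a censored trial."""
    P = P + Q                                  # predict (random-walk state)
    S = P + R                                  # predictive variance of y
    K = P / S
    if obs is None:                            # censored: only know y > c
        ybar, yvar = trunc_moments(m, S, c)
        m = m + K * (ybar - m)                 # moment-matched update
        P = P - K * K * (S - yvar)
    else:                                      # ordinary Kalman update
        m = m + K * (obs - m)
        P = (1.0 - K) * P
    return m, P
```

On a censored trial the posterior mean is pulled toward the truncated region and the variance still shrinks, so censored trials contribute information rather than being treated as missing, which is the distinction between the full-likelihood approach and simply ignoring those trials.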

We demonstrate the application of these methods to multiple classes of observation processes, including mixtures of continuous and discrete signals. 
As a specific example of this, we provide the closed form approximate solutions for continuous observation processes (for example related to reaction 
time) with normal or Gamma distributed data combined with Bernoulli data (for example related to binary decisions). We present a detailed simulation 
analysis of the estimation error for such data based on each approach for dealing with the censored data, as a function of the expected proportion 
of missing data. Using these results, we provide suggestions on how to deal with censored data and which methodology to pick given attributes 
of the dynamical model and percentage of the censored points.

Finally, we demonstrate the application of this methodology to two different datasets including fMRI and intra-operative behavioral tasks, where 
there is generally a substantial percentage of missed or censored data. We also discuss new directions for this framework to address problems 
related to censored neural activity.

Joint work with Alik S. Widge and Uri T. Eden


Mapping Population-based Structural Connectome

Zhengwu Zhang

Advances in understanding human brain structural connectomes require improved approaches for the construction, comparison and integration of 
high-dimensional whole-brain tractographic data from a large number of individuals.  This article develops a population-based structural connectome 
(PSC) mapping framework to address these challenges. Simultaneously characterizing huge numbers of white matter bundles within and across different 
subjects, PSC registers different individuals' brains, relying on a coarse construction, decomposes variation in the bundles, and extracts novel 
connection weights.  PSC can be used to extract binary networks, weighted networks and streamline-based connectivity representations of brain 
connectomes from many large-scale neuroimaging studies.  PSC facilitates analyses relating structural connectomes to demographic and behavioral 
measures.  A test-retest dataset is used to improve and validate the robustness and reproducibility of PSC.  We apply PSC to data from the Human
Connectome Project (HCP) to investigate normal variations in structural connectomes among healthy subjects, such as their heritability.
PSC facilitates the understanding of normal brain structure, the structural bases of neuropsychiatric disorders, and the effects of environmental 
and genetic factors on the structure of connections.

In addition to the development of a mapping framework, we also performed statistical analyses of the various outputs of PSC. About 900 subjects
from the HCP were processed. Based on the analysis of the new formats of brain connectome produced by PSC, we found that structural connectivity
is significantly related to phenotypes such as fluid intelligence, memory, education, language comprehension and taste.

Joint work with Hongtu Zhu, Maxime Descoteaux, Anuj Srivastava and David Dunson