TO BIN OR NOT TO BIN:

NOVEL APPROACHES TO OBTAIN AND DEBIAS ESTIMATES OF INFORMATION AND ENTROPY

 

Jonathan D. Victor

 

Department of Neurology and Neuroscience

Weill Medical College of Cornell University

 

 

The straightforward approach to the estimation of information in spike trains is to cut spike trains into "words" of length T, and to discretize these words into narrow time bins ("letters") of width ΔT. Information estimates are then calculated from these discretized spike trains, parametric in the bin width ΔT and the word length T; a minimal sketch of this procedure appears below. Ideally, sufficient data are available to have confidence in the asymptotic behavior of these estimates as ΔT becomes small and T becomes large. However, this is often not the case, especially when the ratio of the longest to the shortest time scale of interest is large. In essence, this ratio determines the dimension of the space of possible words, each of whose probabilities must be estimated in order to calculate an entropy. In visual cortex, for example, this ratio is at least 100:1. Analysis of multineuronal recordings compounds the problem, since the effective dimensionality is proportional to the number of neurons under consideration.
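As a concrete illustration, the sketch below (Python; function names such as bin_words and plugin_entropy are hypothetical, not taken from the paper) discretizes a spike train into binary letters of width ΔT, groups consecutive letters into words, and computes the naive plug-in entropy from the observed word frequencies:

```python
# Illustrative sketch of the binning approach described above. Spike times
# (in seconds) are discretized into 0/1 letters of width dt (assumed small
# enough that bins rarely contain more than one spike), grouped into words,
# and a plug-in ("naive") entropy is computed from word counts.
import numpy as np
from collections import Counter

def bin_words(spike_times, t_start, t_stop, dt, letters_per_word):
    """Discretize a spike train into binary letters of width dt,
    then group consecutive letters into words."""
    n_bins = int(np.floor((t_stop - t_start) / dt))
    counts = np.histogram(spike_times, bins=n_bins, range=(t_start, t_stop))[0]
    letters = (counts > 0).astype(int)            # one 0/1 letter per bin
    n_words = n_bins // letters_per_word
    return [tuple(letters[i * letters_per_word:(i + 1) * letters_per_word])
            for i in range(n_words)]

def plugin_entropy(words):
    """Naive (plug-in) entropy in bits from observed word frequencies."""
    counts = np.array(list(Counter(words).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))
```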

 

We consider several strategies to mitigate this problem. First, as is well known, naïve estimates of information from observed probabilities are biased. Some of this bias can be removed by the procedure of Treves and Panzeri. We show that a jackknife, but not a bootstrap, approach yields a further improvement in the bias correction. However, these debiasing strategies still leave much to be desired when the space of all conceivable spike trains is only sparsely sampled. This motivates consideration of strategies based on the geometry of the observed set of spike trains. One such approach is the "spike metric" method of Victor and Purpura, which is briefly reviewed. We then consider another, more general approach, also based on distances between spike trains, derived from a strategy for estimating the local density of the observed data within the configuration space. For sparse data, this method produces entropy estimates that are substantially less biased, though somewhat more variable, than methods based on binning. We also show how this approach is related to entropy estimates based on the Lempel-Ziv algorithm. Sketches of each of these ingredients follow.
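The debiasing idea can be sketched at first order: the plug-in entropy underestimates the true entropy by roughly (R − 1)/(2N ln 2) bits, where N is the number of observed words and R the number of words with nonzero probability. The sketch below crudely approximates R by the count of occupied bins, whereas the full Treves-Panzeri procedure estimates R more carefully; this is an illustration, not the paper's exact implementation:

```python
# First-order bias correction in the spirit of Treves and Panzeri. The
# naive entropy is biased downward, so the correction term is added back.
# Approximating R by the occupied-bin count is a simplification made here.
import numpy as np
from collections import Counter

def corrected_entropy(words):
    counts = np.array(list(Counter(words).values()), dtype=float)
    n = counts.sum()
    p = counts / n
    h_naive = -np.sum(p * np.log2(p))
    r = len(counts)                        # crude estimate of relevant words R
    bias = (r - 1) / (2.0 * n * np.log(2)) # downward bias, in bits
    return h_naive + bias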
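A leave-one-out jackknife, one of the two resampling schemes compared in the text, can be applied to the entropy estimate as follows (an illustrative sketch; it exploits the fact that removing any one of the c identical copies of a word yields the same leave-one-out estimate):

```python
# Jackknife debiasing of the plug-in entropy:
#   H_jack = N * H_full - ((N - 1) / N) * sum_i H_{-i},
# where H_{-i} is the plug-in entropy with observation i removed.
import numpy as np
from collections import Counter

def jackknife_entropy(words):
    n = len(words)
    counts = Counter(words)

    def h(cnt, total):
        p = np.array([c for c in cnt.values() if c > 0], dtype=float) / total
        return -np.sum(p * np.log2(p))

    h_full = h(counts, n)
    h_loo_sum = 0.0
    for w, c in counts.items():            # each distinct word contributes
        cnt = dict(counts)
        cnt[w] = c - 1                     # c identical leave-one-out samples
        h_loo_sum += c * h(cnt, n - 1)
    return n * h_full - (n - 1) / n * h_loo_sum
```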
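The spike metric of Victor and Purpura assigns to a pair of spike trains the minimal cost of transforming one into the other, where inserting or deleting a spike costs 1 and shifting a spike in time by Δt costs q|Δt|. It can be computed by dynamic programming, in the manner of an edit distance; variable names in this sketch are illustrative:

```python
# Victor-Purpura spike-time distance via edit-distance dynamic programming.
import numpy as np

def vp_distance(a, b, q):
    """Distance between sorted spike-time sequences a and b at cost q."""
    na, nb = len(a), len(b)
    d = np.zeros((na + 1, nb + 1))
    d[:, 0] = np.arange(na + 1)            # delete all spikes of a
    d[0, :] = np.arange(nb + 1)            # insert all spikes of b
    for i in range(1, na + 1):
        for j in range(1, nb + 1):
            d[i, j] = min(d[i - 1, j] + 1,                    # delete a[i-1]
                          d[i, j - 1] + 1,                    # insert b[j-1]
                          d[i - 1, j - 1] + q * abs(a[i - 1] - b[j - 1]))
    return d[na, nb]
```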
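The local-density idea can be illustrated, though not reproduced exactly, by a generic nearest-neighbor entropy estimator of the Kozachenko-Leonenko type, which infers entropy from each observation's distance to its nearest neighbor. The sketch below assumes the spike trains have already been embedded as points in a d-dimensional Euclidean space; that embedding is an assumption of this illustration, not a step of the paper's method:

```python
# Nearest-neighbor (Kozachenko-Leonenko type) entropy estimate, in nats,
# using the standard approximation psi(n) ~ log(n - 1).
import numpy as np
from scipy.special import gamma as gamma_fn
from scipy.spatial.distance import cdist

def knn_entropy(x):
    """Entropy (nats) of points x with shape (n, d), n >= 2."""
    n, d = x.shape
    dist = cdist(x, x)
    np.fill_diagonal(dist, np.inf)         # exclude self-distances
    r = dist.min(axis=1)                   # nearest-neighbor distances
    vol_unit_ball = np.pi ** (d / 2) / gamma_fn(d / 2 + 1)
    euler_gamma = 0.5772156649015329
    return (d * np.mean(np.log(r)) + np.log(vol_unit_ball)
            + np.log(n - 1) + euler_gamma)
```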
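Finally, the Lempel-Ziv connection: an entropy-rate estimate can be read off from the number of phrases produced by Lempel-Ziv incremental (LZ78) parsing of the binarized spike train. The estimator below is a standard textbook form, and may differ in detail from the one discussed in the text:

```python
# Entropy-rate estimate (bits/symbol) from LZ78 incremental parsing: the
# phrase count c satisfies c * log2(c) / n -> H for stationary ergodic
# sources; any incomplete final phrase is ignored.
import numpy as np

def lz_entropy_rate(symbols):
    phrases = set()
    current = ()
    for s in symbols:
        current = current + (s,)
        if current not in phrases:         # new phrase: record and restart
            phrases.add(current)
            current = ()
    c, n = len(phrases), len(symbols)
    return c * np.log2(c) / n if c > 1 else 0.0
```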