Thanks to your awesome help with #12 (closed), I've been comparing some of our simulation runs with the 33fb5955558ba8bb15a3fdce49dfd914682ef3ea dataset and am having some issues, especially with the populations with low firing rates. To try and narrow down where this is coming from, I tried re-using the analysis code from the repository on the FEF spike data from the 33fb5955558ba8bb15a3fdce49dfd914682ef3ea dataset as follows:
```python
from correlation_toolbox import helper as ch
import numpy as np

# **NOTE** this is heavily based off the analysis code from the paper
def load(filename, duration_s):
    tmin = 500.
    subsample = 2000
    resolution = 1.
    spikes = np.load(filename)
    ids = np.unique(spikes[:, 0])
    dat = ch.sort_gdf_by_id(spikes, idmin=ids[0], idmax=ids[0] + subsample + 1000)
    bins, hist = ch.instantaneous_spike_count(dat[1], resolution, tmin=tmin,
                                              tmax=duration_s * 1000.0)
    rates = ch.strip_binned_spiketrains(hist)[:subsample]
    cc = np.corrcoef(rates)
    cc = np.extract(1 - np.eye(cc[0].size), cc)
    cc[np.where(np.isnan(cc))] = 0.
    return np.mean(cc)

# Values from the json file on gnode
dataset_fef_6e_corr = 0.0004061593782540619
dataset_fef_5i_corr = 0.0020706629333541817

duration_s = 10.5

# Population sizes
num_fef_5i = 3721
num_fef_6e = 16128

# Load data
nest_fef_5i_corr = load("33fb5955558ba8bb15a3fdce49dfd914682ef3ea-spikes-FEF-5I.npy", duration_s)
nest_fef_6e_corr = load("33fb5955558ba8bb15a3fdce49dfd914682ef3ea-spikes-FEF-6E.npy", duration_s)

print("FEF 5I corr coeff - NEST:%f, Dataset:%f" % (nest_fef_5i_corr, dataset_fef_5i_corr))
print("FEF 6E corr coeff - NEST:%f, Dataset:%f" % (nest_fef_6e_corr, dataset_fef_6e_corr))
```
The output shows fairly significant differences from the values in the json file in the same directory:
Could it be the condition that neurons must have spiked at least once where this goes wrong, so that you are still including some silent neurons in the calculation?
Well spotted! rates ends up with fewer than 2000 neurons that spike after calling strip_binned_spiketrains. However, I don't quite understand how this ever worked, as the paper says that the ground-state simulations were run for 10.5 seconds, and the 10.5-second spike trains in the repository don't have enough data.
No, I don't think so. The paper states that for chi=1, the simulation duration was 10.5 s. I think the calculation started with 2000 neurons, from which the ones that spiked at least once were taken, leaving fewer than 2000 neurons in some cases.
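For reference, the surviving count can be checked directly; a minimal diagnostic sketch, reusing the helper calls and the FEF 6E file name from the snippet above:

```python
from correlation_toolbox import helper as ch
import numpy as np

subsample = 2000
spikes = np.load("33fb5955558ba8bb15a3fdce49dfd914682ef3ea-spikes-FEF-6E.npy")
ids = np.unique(spikes[:, 0])
dat = ch.sort_gdf_by_id(spikes, idmin=ids[0], idmax=ids[0] + subsample + 1000)
bins, hist = ch.instantaneous_spike_count(dat[1], 1., tmin=500., tmax=10.5 * 1000.)
# strip_binned_spiketrains drops the silent neurons, so comparing the lengths
# shows how many of the candidate neurons actually fired in the analysis window
print("spiking neurons:", len(ch.strip_binned_spiketrains(hist)), "of", len(hist))
```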
Ahh, OK. So how come our recalculation of the stats using the same code and (presumably) the same data doesn't match? I don't see any non-determinism in that code.
I think that @albada is right in saying that for some populations, there were probably fewer than 2000 neurons entering the calculation, simply because these populations are small in the first place and then have low rates.
For what it's worth, I wouldn't call these deviations significant. Keep in mind that these values are very low, 10^-3 - 10^-4, on a scale of 0 to 1. If your simulations do not produce exactly the same spikes, these deviations can easily occur, but they're not significant, in my opinion.
I think, unless you're using the exact same configuration for your compute system (MPI processes, threads) and the same NEST version (+ some other dependencies that influence the random numbers), it's unlikely that you can produce the same spikes.
Hey @mschmidt87 - thanks for looking at this. My concern is that, as a test, we're calculating these metrics from the published spiking data using the published code, and we don't get the published correlation coefficients.
I can reproduce your finding for FEF, 6E. I am getting the same value as you for the cross-correlation coefficient. I can also see a deviation for the LvR value (0.2683 from recalculating vs. 0.4178 in the json file). However, I can reproduce the population rates from the json files, which makes me conclude that the spike data is the correct one and I didn't use other data for the calculation of the json files.
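For what it's worth, a minimal rate cross-check could look like the following; this is a hypothetical sketch assuming the same file naming, the 500 ms warm-up, and the FEF 6E population size used in the snippets above (the published analysis may compute the rate differently):

```python
import numpy as np

# Mean population rate in spikes/s: spike count after the warm-up,
# divided by population size and analysis duration
spikes = np.load("33fb5955558ba8bb15a3fdce49dfd914682ef3ea-spikes-FEF-6E.npy")
num_fef_6e = 16128
tmin, duration_s = 500., 10.5
n_spikes = np.sum(spikes[:, 1] >= tmin)
rate = n_spikes / (num_fef_6e * (duration_s - tmin / 1000.))
print("FEF 6E mean rate: %.4f spikes/s" % rate)
```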
I couldn't find the problem in the analysis code (I tried to change the part where we subsample etc., but no success). Since I produced the json files with the code that is in the repo and the file hasn't been modified, I suspect that this might be a version problem with the dependencies.
I am now using Python 3.8 and numpy 1.18.1 as well as the latest master of correlation_toolbox.
Unfortunately, I didn't record the exact dependencies at the time I produced the data in the data repo, so I can't investigate this in an easy manner.
Glad to hear you can reproduce it. Wouldn't it be possible for the spike data to have the same mean rates but different correlation and irregularity (in the extreme case you could generate spike trains from populations of Poisson sources with the same mean rates; see the toy sketch below)?
Nonetheless, I can try and investigate older versions today. correlation_toolbox doesn't look to have changed a huge amount, at least. The data was pushed on 8/1/2018 so, assuming you created it around then, I can try to bisect numpy versions.
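As a toy illustration of the extreme case mentioned above (identical mean rates but very different pairwise correlations), here is a purely illustrative sketch that is not part of the repository code:

```python
import numpy as np

# Two populations with the same mean rate but different pairwise correlations
rng = np.random.default_rng(1)
n_neurons, n_bins, p = 200, 10000, 0.005  # 1 ms bins, ~5 spikes/s

# Independent Poisson-like neurons: same rate, pairwise correlation ~ 0
indep = rng.random((n_neurons, n_bins)) < p

# Correlated population: in each bin a neuron either copies a shared "mother"
# train or draws independently, so the per-bin spike probability stays at p
mother = rng.random(n_bins) < p
copy_mask = rng.random((n_neurons, n_bins)) < 0.5
corr = np.where(copy_mask, mother, rng.random((n_neurons, n_bins)) < p)

for name, trains in (("independent", indep), ("correlated", corr)):
    cc = np.corrcoef(trains.astype(float))
    off_diag = np.extract(1 - np.eye(n_neurons), cc)
    print(name,
          "rate [spikes/s]: %.2f" % (trains.mean() * 1000.),
          "mean cc: %.4f" % np.nanmean(off_diag))
```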
Thanks for your tests. Of course, it is possible to produce the exact same rates with different second-order statistics, but achieving that with two different runs of the same simulation (with different RNG seeds), which is what I would have suspected, is extremely unlikely, i.e. it can be excluded.
I've done a little bit of code archaeology and found a change in the LvR calculation. If I calculate the LvR with and without this change:
```python
from correlation_toolbox import helper as ch
import numpy as np

# **NOTE** this is heavily based off the analysis code from the paper
def load(filename, duration_s, num, check):
    tmin = 500.
    subsample = 2000
    resolution = 1.
    tref = 2.0
    spikes = np.load(filename)

    # calc lvr
    i_min = np.searchsorted(spikes[:, 1], tmin)
    i_max = np.searchsorted(spikes[:, 1], duration_s * 1000.0)
    LvR = np.array([])
    data_array = spikes[i_min:i_max]
    for i in np.unique(data_array[:, 0]):
        intervals = np.diff(data_array[np.where(data_array[:, 0] == i)[0], 1])
        if intervals.size > 1:
            val = np.sum((1. - 4 * intervals[0:-1] * intervals[1:] /
                          (intervals[0:-1] + intervals[1:]) ** 2) *
                         (1 + 4 * tref / (intervals[0:-1] + intervals[1:])))
            LvR = np.append(LvR, val * 3 / (intervals.size - 1.))
        else:
            LvR = np.append(LvR, 0.0)

    # CHANGE HERE
    if check and len(LvR) < num:
        LvR = np.append(LvR, np.zeros(num - len(LvR)))
    return np.mean(LvR)

# Values from the json file on gnode
dataset_fef_6e_lvr = 0.4178813296671444
dataset_fef_5i_lvr = 0.9737456740769203

duration_s = 10.5

# Population sizes
num_fef_5i = 3721
num_fef_6e = 16128

# Load data
nest_fef_5i_lvr = load("33fb5955558ba8bb15a3fdce49dfd914682ef3ea-spikes-FEF-5I.npy", duration_s, num_fef_5i, False)
nest_fef_6e_lvr = load("33fb5955558ba8bb15a3fdce49dfd914682ef3ea-spikes-FEF-6E.npy", duration_s, num_fef_6e, False)

print("FEF 5I LvR - NEST:%f, Dataset:%f" % (nest_fef_5i_lvr, dataset_fef_5i_lvr))
print("FEF 6E LvR - NEST:%f, Dataset:%f" % (nest_fef_6e_lvr, dataset_fef_6e_lvr))
```
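For context, my reading of the loop above: per neuron it computes the revised local variation LvR (reference [82] of the paper, quoted further down) over the neuron's interspike intervals $I_i$ (with $n$ intervals in total) and a refractoriness constant $R =$ tref = 2 ms:

$$
\mathrm{LvR} = \frac{3}{n-1} \sum_{i=1}^{n-1} \left(1 - \frac{4\, I_i I_{i+1}}{(I_i + I_{i+1})^2}\right) \left(1 + \frac{4R}{I_i + I_{i+1}}\right)
$$

The `# CHANGE HERE` block then, if check is True, pads the per-neuron LvR values with zeros up to num before averaging.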
Running this with check=False gives values significantly closer to those in the published JSON. As this change was made after the original submission date, might the published LvR data have been calculated prior to this change?
I think you are probably right that this padding of the results with 0.0 values for silent neurons had not been used for the publication and the published data.
The manuscript says in the methods:
" Spike-train irregularity is quantified for each population by the revised local variation LvR [82] averaged over a subsample of 2000 neurons. The cross-correlation coefficient is
computed with bin width 1 ms on single-cell spike histograms of a subsample of 2000 neurons
per population with at least one emitted spike per neuron. Both measures are computed on the
entire population if it contains fewer than 2000 neurons."
This does not mention the padding, and your calculations indicate that at least the new results are closer to the published ones with check=False (in your code). Unfortunately, I can't recall why I added the padding at some point.
I guess it's a question of definition how silent neurons should be accounted for in the calculations, but I would now say that assigning them a value of 0 does not make much sense.
I think that @akorgor applied your code to more populations and can confirm your findings.
Thanks for looking into this some more @mschmidt87! It's still weird that the value for FEF 5I fits exactly but the one for FEF 6E doesn't, though - again, that seems unlikely to be caused by two different runs of the same simulation.
I found that if, in the code snippet above on the LvR calculation, check is chosen to be True and num is chosen to be len(np.unique(spikes[:, 0])), the results from the code and from gnode match up to a difference of ~1e-15 to 1e-16. Thus, for computing the average LvR value, only the neurons that spiked at least once during the simulation contribute a 0, not the completely silent neurons as the definition of num (the full population size) in the code snippet indicates. I could verify this for the areas FEF, V1, and V2 for the populations in all layers in the chi=1.0, 10.5 s simulation.
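In other words, something along these lines reproduces the gnode value (a hypothetical check reusing load() and the imports from the LvR snippet above):

```python
# Pad with zeros only up to the number of neurons that spiked at least once
# during the whole simulation, not up to the full population size
fname = "33fb5955558ba8bb15a3fdce49dfd914682ef3ea-spikes-FEF-6E.npy"
num_spiking = len(np.unique(np.load(fname)[:, 0]))
nest_fef_6e_lvr = load(fname, 10.5, num_spiking, True)
print("recalculated: %.16f, gnode: %.16f" % (nest_fef_6e_lvr, 0.4178813296671444))
```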
For the correlation coefficient calculation, however, I have not been able to find the reason for the deviations so far. The gnode values are larger than the recalculated ones for the majority of the populations (15/24). The differences range from -0.00025 to +0.00025 and do not seem to depend on the total number of neurons in the population or on the number of spiking neurons. Except for FEF 6E, there are always more than subsample=2000 neurons left after ch.strip_binned_spiketrains(hist).
Since I found that this calculation of the correlation coefficients is in general sensitive to subsampling in terms of neurons and time, I tried setting tmin=0, tmax=max(np.unique(spikes[:, 1])), and subsample=3000, but none of these three variations yielded the gnode values. @albada @mschmidt87, I would be happy to run further tests if you have ideas where the deviations could come from.
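For anyone who wants to repeat these sensitivity tests, a parameterized variant of the correlation snippet from the top of the thread might look like this (a sketch under my own naming; corrcoef_sensitivity and the chosen variations are not from the repository):

```python
from correlation_toolbox import helper as ch
import numpy as np

def corrcoef_sensitivity(filename, tmin=500., tmax=10.5 * 1000.,
                         subsample=2000, resolution=1.):
    # Same steps as the original snippet, with the analysis window and
    # subsample size exposed as arguments
    spikes = np.load(filename)
    ids = np.unique(spikes[:, 0])
    dat = ch.sort_gdf_by_id(spikes, idmin=ids[0], idmax=ids[0] + subsample + 1000)
    bins, hist = ch.instantaneous_spike_count(dat[1], resolution, tmin=tmin, tmax=tmax)
    rates = ch.strip_binned_spiketrains(hist)[:subsample]
    cc = np.corrcoef(rates)
    cc = np.extract(1 - np.eye(cc[0].size), cc)
    cc[np.where(np.isnan(cc))] = 0.
    return np.mean(cc)

fname = "33fb5955558ba8bb15a3fdce49dfd914682ef3ea-spikes-FEF-6E.npy"
spikes = np.load(fname)
for kwargs in ({}, {"tmin": 0.}, {"tmax": np.max(spikes[:, 1])}, {"subsample": 3000}):
    print(kwargs, corrcoef_sensitivity(fname, **kwargs))
```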
Thank you @akorgor for that detective work (and obviously sorry for the misleading code and the confusion it created). I think that these choices are definitely debatable. However, since the deviations from the data shown in the paper are not too large, I would suggest simply editing the code in the repo such that the results match, and sticking to the algorithm used for the paper.