[Mne_analysis] Source Space Decoding Classification Timecourse

Cushing, Cody CCUSHING1 at mgh.harvard.edu
Mon Aug 7 17:38:31 EDT 2017

Hi all,

Thanks for all the helpful suggestions.  Everyone brought up imbalanced datasets as a possible source of the problem, but trial counts are equalized both in the example and in my own dataset, where the problem is a bit worse (baseline accuracies average around 65%, even with the code changes JR suggested and with more samples and fewer features).  I also know I shouldn't jump to any conclusions without doing actual statistics, and indeed I don't really expect these baseline periods to come out significantly above chance.  Still, I figure a reviewer would nail me if I reported classification timecourses with a baseline accuracy that high, even if it were statistically meaningless.

JR, thanks for those bits of code, that definitely cleans things up a lot.  At Alex's and your suggestion, I'm trying the 'roc_auc' scoring metric to see if it calms the baseline down a bit, but I'm getting some inconsistent behavior out of cross_val_multiscore with that metric.  Attached is the same source space decoding tutorial from my original message, now modified to use the functions from the master branch that JR suggested.  The sensor decoding tutorial you linked runs just fine for me, but when I run the modified source space tutorial (attached), I get the following error:

ValueError: roc_auc scoring can only be computed for two-class problems

It doesn't seem to like the label variable y.  Strangely enough, if I define y as it is defined in the sensor space tutorial,

y = epochs.events[:, 2]

then cross_val_multiscore does not return the error (the scores are obviously bad since the labeling is wrong, though).  In both cases y is just a plain numpy array with identical shape (112,) and the same number of unique values, just in different orders.  So I'm not really sure what's happening there, but hopefully others can replicate the problem.
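
In case it helps anyone reproduce or work around this, here is what I'll try next: explicitly recoding the event codes into a 0/1 label vector before scoring.  (This is just a sketch, not a diagnosis; epochs is the Epochs object from the attached script, and the LabelEncoder step is not in the original code.)

import numpy as np
from sklearn.preprocessing import LabelEncoder

y = LabelEncoder().fit_transform(epochs.events[:, 2])  # map the two event codes to 0/1
assert np.unique(y).size == 2  # roc_auc needs exactly two classes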

Cheers,
Cody





________________________________
From: mne_analysis-bounces at nmr.mgh.harvard.edu [mne_analysis-bounces at nmr.mgh.harvard.edu] on behalf of alexandre.barachant at gmail.com [alexandre.barachant at gmail.com]
Sent: Saturday, August 05, 2017 4:08 PM
To: Discussion and support forum for the users of MNE Software
Subject: Re: [Mne_analysis] Source Space Decoding Classification Timecourse

Hi Cody,

Depending on your number of trials, the number of features, and the cross-validation procedure, you can get fairly high decoding results just by chance.
You should never interpret a result without running a statistical test. One good way to get the chance level of your classification pipeline is to run a permutation test: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.permutation_test_score.html
The idea is to shuffle the labels and retrain the model to see what score you get 'by chance'. It is sometimes surprising how high it can be.
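
Something along these lines (just a sketch; clf, X and y stand in for your own pipeline, data and labels):

from sklearn.model_selection import permutation_test_score

# score with shuffled labels n_permutations times to estimate the chance distribution
score, perm_scores, pvalue = permutation_test_score(
    clf, X, y, cv=5, n_permutations=1000, scoring='roc_auc')
print('observed score: %.3f, permutation p-value: %.4f' % (score, pvalue))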

If you have an unbalanced number of trials per class, I would also suggest using the AUC as a metric instead of accuracy.

Alex

On Sat, Aug 5, 2017 at 9:11 AM JR KING <jeanremi.king at gmail.com> wrote:
Hi Cody,

Overall, your baseline doesn't look too bad - you would need to do a statistical test to check whether it is just noise variation or above-chance decoding scores.

Still, there could be multiple reasons behind a significant accuracy before t0 here:
- accuracy is biased for imbalanced datasets. You can either use epochs.equalize_event_counts before your cross-validation or, better, use the 'roc_auc' scoring metric (see the short sketch after this list)
- filtering the data can spread information over time. Try changing your filtering parameters
- IIRC, the 'sample' protocol is actually not randomized, so it may be possible to predict the stimulus category in advance.
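
For the first point, something like this (a sketch, assuming epochs and event_id come from your script):

epochs.equalize_event_counts(list(event_id.keys()))  # drop trials so both classes have equal counts
y = epochs.events[:, 2]  # labels are now balanced
# (rebuild X from the equalized epochs before cross-validating)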

If you're using the MNE master branch, then I would recommend simply using this instead of your big loop (see https://martinos.org/mne/dev/auto_tutorials/plot_sensors_decoding.html#temporal-decoding for more details):

from mne.decoding import SlidingEstimator, cross_val_multiscore

clf = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=500), SVC(kernel='linear'))
time_decod = SlidingEstimator(clf, scoring='roc_auc')
scores = cross_val_multiscore(time_decod, X, y, cv=5)  # cross-validate the sliding estimator at every time point
plt.plot(times, scores.mean(0))

(Note that I would personally recommend clf = make_pipeline(StandardScaler(), LogisticRegression(C=1)), which should work better.)

Otherwise, I believe we will be releasing the next version of MNE this month, so you'll just have to update MNE then.

Hope that helps,

Jean-Rémi





On 4 August 2017 at 17:19, Ghuman, Avniel <ghumana at upmc.edu> wrote:
Hi Cody,

Do you have the same number of trials in each condition after any trial rejection you do? If not, the issue might be that 50% is not the correct chance level to think about; rather, the correct chance level is the proportion of trials in your more frequent condition (eyeballing it, maybe around 55%?). There are unbiased classifiers you can use, but I am not sure whether they are built into MNE-Python.
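
For example, something like this gives that majority-class chance level (a rough sketch; epochs is your Epochs object after rejection):

import numpy as np

vals, counts = np.unique(epochs.events[:, 2], return_counts=True)  # trials per condition
print('majority-class chance level: %.1f%%' % (100.0 * counts.max() / counts.sum()))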

Best wishes,
Avniel

________________________________
From: mne_analysis-bounces at nmr.mgh.harvard.edu [mne_analysis-bounces at nmr.mgh.harvard.edu] on behalf of Cushing, Cody [CCUSHING1 at mgh.harvard.edu]
Sent: Friday, August 04, 2017 5:11 PM
To: mne_analysis at nmr.mgh.harvard.edu
Subject: [Mne_analysis] Source Space Decoding Classification Timecourse

Hi,

I've been trying to modify the following example:

http://martinos.org/mne/dev/auto_examples/decoding/plot_decoding_spatio_temporal_source.html

to yield a time-resolved classification accuracy.  I'm new to decoding, so I've gone about it in a fairly brute-force way (just iterating the script over every time point), which yields a fairly convincing classification accuracy timecourse.  However, I'm a bit concerned by how high the accuracy is during the baseline, pre-stimulus period.  See attached for the modified script using the sample data and an example of the output.  The best explanation I've been able to find for abnormally high pre-stimulus accuracy is a failure to cross-validate, but that shouldn't be the case here, since cross-validation is being performed (though perhaps I'm doing it wrong).  Is there something improper about my strategy?  Thanks for any input.
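
Roughly, the approach looks like this (a simplified sketch, not the attached script itself; X is an (n_epochs, n_features, n_times) array of source data and y the condition labels):

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

clf = make_pipeline(StandardScaler(), SVC(kernel='linear'))
scores = np.array([cross_val_score(clf, X[:, :, t], y, cv=5).mean()
                   for t in range(X.shape[2])])  # one cross-validated accuracy per time point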

Cheers,
Cody

_______________________________________________
Mne_analysis mailing list
Mne_analysis at nmr.mgh.harvard.edu
https://mail.nmr.mgh.harvard.edu/mailman/listinfo/mne_analysis


-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot_decoding_spatio_temporal_source_time.py
Type: text/x-python
Size: 6487 bytes
Desc: plot_decoding_spatio_temporal_source_time.py
Url : http://mail.nmr.mgh.harvard.edu/pipermail/mne_analysis/attachments/20170807/eebf0c34/attachment.py 

