[Mne_analysis] GeneralizingEstimator with incremental learning / .partial_fit

Giulia Gennari giulia.gennari1991 at gmail.com
Fri Aug 7 08:32:38 EDT 2020

        External Email - Use Caution        

Dear Jean-Rémi and dear Alex,

*Thank you!*

A solution based on this:
from sklearn.linear_model import SGDClassifier

class MyModel(SGDClassifier):
    def fit(self, X, y):
        # delegate to partial_fit so successive fit() calls keep learning
        super().partial_fit(X, y)
        return self

...works fine!
Except for the crucial fact that parallel processing (n_jobs > 1) does not
seem feasible.
This is what I get when I try to score the slider (apologies for the
ugliness; I copy-paste everything since it might help to catch what is
wrong):
---------------------------------------------------------------------------
_RemoteTraceback                          Traceback (most recent call last)
_RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/anaconda3/lib/python3.7/site-packages/joblib/externals/loky/backend/queues.py", line 150, in _feed
    obj_ = dumps(obj, reducers=reducers)
  File "/usr/local/anaconda3/lib/python3.7/site-packages/joblib/externals/loky/backend/reduction.py", line 243, in dumps
    dump(obj, buf, reducers=reducers, protocol=protocol)
  File "/usr/local/anaconda3/lib/python3.7/site-packages/joblib/externals/loky/backend/reduction.py", line 236, in dump
    _LokyPickler(file, reducers=reducers, protocol=protocol).dump(obj)
  File "/usr/local/anaconda3/lib/python3.7/site-packages/joblib/externals/cloudpickle/cloudpickle.py", line 267, in dump
    return Pickler.dump(self, obj)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 437, in dump
    self.save(obj)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 890, in _batch_setitems
    save(v)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 819, in save_list
    self._batch_appends(obj)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 846, in _batch_appends
    save(tmp[0])
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 774, in save_tuple
    save(element)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 789, in save_tuple
    save(element)
  File "/usr/local/anaconda3/lib/python3.7/pickle.py", line 510, in save
    rv = reduce(obj)
  File "/usr/local/anaconda3/lib/python3.7/site-packages/joblib/_memmapping_reducer.py", line 361, in __call__
    return (loads, (dumps(a, protocol=HIGHEST_PROTOCOL),))
_pickle.PicklingError: Can't pickle <class '__main__.MyModel'>: it's not
the same object as __main__.MyModel
"""

The above exception was the direct cause of the following exception:

PicklingError                             Traceback (most recent call last)
/neurospin/grip/protocols/EEG/Giulia_NUM_MUSIK/NUM_MUSIK_DECODING_incremental_learning_test_on_VISUAL_DRAFT.py in <module>
    278         y_test = test_epochs.events[:,2]
    279
--> 280         scores = time_gen.score(X_test, y_test)
    281         all_scores_D.append(scores)
    282

<decorator-gen-375> in score(self, X, y)

~/.local/lib/python3.7/site-packages/mne/decoding/search_light.py in score(self, X, y)
    583                              for pb_idx, x in array_split_idx(
    584                                  X, n_jobs, axis=-1,
--> 585                                  n_per_split=len(self.estimators_)))
    586
    587         score = np.concatenate(score, axis=1)

~/.local/lib/python3.7/site-packages/mne/parallel.py in run(*args, **kwargs)
    126     def run(*args, **kwargs):
    127         try:
--> 128             return fun(*args, **kwargs)
    129         except RuntimeError as err:
    130             msg = str(err.args[0]) if err.args else ''

/usr/local/anaconda3/lib/python3.7/site-packages/joblib/parallel.py in __call__(self, iterable)
    932
    933             with self._backend.retrieval_context():
--> 934                 self.retrieve()
    935             # Make sure that we get a last message telling us we are done
    936             elapsed_time = time.time() - self._start_time

/usr/local/anaconda3/lib/python3.7/site-packages/joblib/parallel.py in retrieve(self)
    831             try:
    832                 if getattr(self._backend, 'supports_timeout', False):
--> 833                     self._output.extend(job.get(timeout=self.timeout))
    834                 else:
    835                     self._output.extend(job.get())

/usr/local/anaconda3/lib/python3.7/site-packages/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
    519         AsyncResults.get from multiprocessing."""
    520         try:
--> 521             return future.result(timeout=timeout)
    522         except LokyTimeoutError:
    523             raise TimeoutError()

/usr/local/anaconda3/lib/python3.7/concurrent/futures/_base.py in result(self, timeout)
    433                 raise CancelledError()
    434             elif self._state == FINISHED:
--> 435                 return self.__get_result()
    436             else:
    437                 raise TimeoutError()

/usr/local/anaconda3/lib/python3.7/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

PicklingError: Could not pickle the task to send it to the workers.

Would you know how to solve it?
Without parallel processing I don't think I can get to the end of the
analysis before Christmas 😌
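
[Editor's note: the "it's not the same object as __main__.MyModel" error usually means the class was defined (or redefined) in __main__, e.g. in a notebook or an interactively re-run script, so the loky worker processes cannot match the pickled reference against an importable definition. A common workaround, sketched here under that assumption, is to move the subclass into its own module; the module name my_models.py and the explicit classes argument are illustrative additions, not from this thread.]

```python
# my_models.py -- hypothetical module name; defining the subclass in an
# importable file (rather than in __main__) lets joblib/loky workers
# re-import the class when unpickling tasks, avoiding the PicklingError.
import numpy as np
from sklearn.linear_model import SGDClassifier

class MyModel(SGDClassifier):
    """SGDClassifier whose fit() delegates to partial_fit()."""

    def fit(self, X, y):
        # partial_fit keeps the existing coefficients, so repeated calls
        # continue learning instead of refitting from scratch.
        # Passing classes explicitly guards the first call, which requires
        # the full class list; np.unique(y) assumes every batch contains
        # all classes.
        super().partial_fit(X, y, classes=np.unique(y))
        return self
```

The analysis script would then start with `from my_models import MyModel` before building the SlidingEstimator, so that every worker can re-import the class.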

Thank you very much again!!!!

Giulia

On Thu, Aug 6, 2020 at 4:12 PM Jean-Rémi KING <jeanremi.king at gmail.com>
wrote:

>
> Hi Giulia,
>
> good catch, I had forgotten that we're cloning the estimator for each time
> sample; you'll thus need to do this:
>
> class MyModel(SGDClassifier):
>     def fit(self, X, y):
>         super().partial_fit(X, y)
>         return self
>
> model = MyModel(loss='log', class_weight='balanced')
> slider = SlidingEstimator(model, scoring='roc_auc')
>
> Hope that helps
>
> JR
>
>
> On Thu, 6 Aug 2020 at 15:56, Giulia Gennari <giulia.gennari1991 at gmail.com>
> wrote:
>
>>
>> Dear Jean-Rémi,
>>
>> Thank you for the nice suggestion!
>>
>> Just to make sure that this is working (I apologize for my ignorance):
>>
>> When I run:
>> model = SGDClassifier(loss='log', class_weight='balanced')
>> model.fit = model.partial_fit
>> slider1 = SlidingEstimator(model, scoring='roc_auc')
>> slider1.fit(X_train, y_train)
>>
>> or
>>
>> clf = make_pipeline(Vectorizer(), StandardScaler(), model)
>> slider2 = SlidingEstimator(clf, scoring='roc_auc')
>> slider2.fit(X_train, y_train)
>>
>> I do not get any error, while I would expect:
>>
>> ValueError: class_weight 'balanced' is not supported for partial_fit. In order to use 'balanced' weights, use compute_class_weight('balanced', classes, y). Pass the resulting weights as the class_weight parameter.
>>
>>
>> Since this is what I get with:
>> model.fit(X_train[:,:,single_time_point], y_train)
>>
>> Is there a good reason for that? E.g. class weights are computed
>> internally beforehand by SlidingEstimator?
>>
>> Thank you again!
>>
>> Giulia
>>
>> On Wed, Aug 5, 2020 at 7:18 PM Jean-Rémi KING <jeanremi.king at gmail.com>
>> wrote:
>>
>>>
>>> Hi Giulia,
>>>
>>> I think you should be able to change the method:
>>>
>>> model = sklearn.linear_model.SGDClassifier()
>>> model.fit = model.partial_fit
>>> slider = mne.decoding.SlidingEstimator(model)
>>> for X, y in train_batches:
>>>     slider.fit(X, y)
>>>
>>> Best
>>>
>>> JR
>>>
>>> On Wed, 5 Aug 2020 at 18:40, Giulia Gennari <
>>> giulia.gennari1991 at gmail.com> wrote:
>>>
>>>>
>>>> Hi!
>>>>
>>>> I would need to try decoding with incremental learning (EEG data).
>>>> I was planning to use logistic regression by means of the SGDClassifier
>>>> <https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html>.
>>>> I would then need to call .partial_fit to make my estimator learn on
>>>> each of my training sets.
>>>> However:
>>>>
>>>> 'GeneralizingEstimator' object has no attribute 'partial_fit'
>>>>
>>>> Same issue for SlidingEstimator.
>>>> Is there a way to work around this limitation?
>>>>
>>>> Thank you so so much in advance!
>>>>
>>>> Giulia Gennari
>>>> _______________________________________________
>>>> Mne_analysis mailing list
>>>> Mne_analysis at nmr.mgh.harvard.edu
>>>> https://mail.nmr.mgh.harvard.edu/mailman/listinfo/mne_analysis
>>>