[Mne_analysis] Computing regression on sensor data then transforming to source space

Fri Feb 21 07:26:42 EST 2014

Hi everyone,

First, with regard to the continuing discussion of regression models that include both scaled and categorical variables: there are general-purpose open-source statistical packages, e.g. R, that implement ANOVA, ANCOVA, MANCOVA, etc. in the general linear model format, i.e. the one we are discussing.  It might therefore be worth evaluating one or more of them as a backend for Python wrappers that perform the regressions we are discussing.

Second, with regard to both the equal-variance discussion and the assertion that computing the Betas in sensor space or source space is equivalent: perhaps I am misunderstanding.  I had assumed that the regression on an independent variable was to be done for one sensor at a time.  In the statistical framework the problem could then be handled either as a MANOVA, with the sensor measures as multiple dependent variables, or as a repeated measures ANOVA, with the sensor measures as repeated measures.  In neither case, though, would regression on the independent variable against a source, i.e. against a linear combination of the sensors, be conceptually equivalent or give the same answer.

For the MANOVA approach, the regression against each sensor is handled separately.  So I think that at least the differences in variances across the sensors don't matter.  What do you think?  I don't write with the voice of authority here and would like to understand this.
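
To make the per-sensor point concrete, here is a minimal NumPy sketch on simulated data. All names and numbers in it (X, Y, noise_sd, the sizes) are made up for illustration. It only checks the least-squares point estimates: fitting all sensors in one multivariate fit gives the same Betas as fitting each sensor alone, even when the sensors have very different noise variances. It says nothing about multivariate test statistics or about projection to source space.

```python
import numpy as np

rng = np.random.default_rng(2)
n_trials, n_sensors = 150, 4

# Hypothetical design: intercept plus one scaled independent variable.
X = np.column_stack([np.ones(n_trials), rng.standard_normal(n_trials)])

# Simulated sensors with deliberately unequal noise levels.
noise_sd = np.array([1.0, 5.0, 0.2, 10.0])
true_betas = rng.standard_normal((2, n_sensors))
Y = X @ true_betas + rng.standard_normal((n_trials, n_sensors)) * noise_sd

# One multivariate least-squares fit over all sensors at once.
joint, *_ = np.linalg.lstsq(X, Y, rcond=None)

# One univariate least-squares fit per sensor.
separate = np.column_stack(
    [np.linalg.lstsq(X, Y[:, k], rcond=None)[0] for k in range(n_sensors)]
)

# The two sets of Betas agree up to floating-point error.
matches = np.allclose(joint, separate)
print(matches)
```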

Don

Don Krieger, Ph.D.
Department of Neurological Surgery
University of Pittsburgh
(412)648-9654 Office
(412)521-4431 Cell/Text

________________________________________
From: mne_analysis-bounces at nmr.mgh.harvard.edu [mne_analysis-bounces at nmr.mgh.harvard.edu] on behalf of Hari Bharadwaj [hari at nmr.mgh.harvard.edu]
Sent: Thursday, February 20, 2014 10:57 AM
To: Denis-Alexander Engemann
Cc: mne_analysis at nmr.mgh.harvard.edu
Subject: Re: [Mne_analysis] Computing regression on sensor data then transforming to source space

Hi Denis et al.,
    It appears to me that there are two separate issues being confused
here and perhaps there will be some clarity if we talk through it:

(A) Whether to compute "betas" in sensor-space or source-space:
    This is not really a difficult question within the MNE or any other
linear inverse solution framework: for a given inverse operator
(let's call it M), computing betas in sensor space or in source space
should lead to identical results, unless the statistical model being
fitted to compute the betas is somehow non-linear.
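
A minimal NumPy sketch of point (A), under stated assumptions: M here is just a random matrix standing in for a fixed linear inverse operator, X is a made-up design matrix, and Y is simulated single-trial sensor data. None of this uses the MNE API. It only illustrates that ordinary least squares commutes with a fixed linear projection; any non-linear step (for example taking the magnitude of the source estimate) would break the equality.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_sensors, n_sources, n_times = 200, 30, 50, 10

# Hypothetical stand-ins, not real MNE objects.
M = rng.standard_normal((n_sources, n_sensors))          # linear inverse operator
X = np.column_stack([np.ones(n_trials),                  # intercept
                     rng.standard_normal(n_trials)])     # one predictor
Y = rng.standard_normal((n_trials, n_sensors, n_times))  # single-trial sensor data


def ols_betas(design, data):
    """Least-squares betas; data has shape (n_trials, n_channels, n_times)."""
    flat = data.reshape(len(data), -1)
    betas, *_ = np.linalg.lstsq(design, flat, rcond=None)
    return betas.reshape((design.shape[1],) + data.shape[1:])


# Route 1: regress in sensor space, then project the beta maps with M.
betas_sensor = ols_betas(X, Y)
route1 = np.einsum('ij,pjt->pit', M, betas_sensor)

# Route 2: project every trial with M, then regress in source space.
Y_source = np.einsum('ij,njt->nit', M, Y)
route2 = ols_betas(X, Y_source)

# Both routes give the same source-space betas up to floating-point error.
same = np.allclose(route1, route2)
print(same)
```

Route 1 only ever holds one beta map per predictor in source space, whereas route 2 has to hold every projected trial, which is the memory concern raised in the original question.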

(B) Choice of the noise model used to compute the operator M:
    This issue is more subtle and important, and it does not depend on
the order in which you do things in (A). One has to consider this
question regardless of whether one is doing stats in sensor space
or source space. Projecting to source space first does not make one
immune to an inappropriate choice of noise model.
   If the noise covariance employed corresponds to that of single trials
(e.g., if you choose to scale it with nave = 1 because you are
projecting single-trial data), the map will be unnecessarily smooth
even if you later combine hundreds of trials to show your final "map"
of interest. The prior (the minimum-norm part) will be given much more
weight, and the data fit much less weight, than is appropriate. Thus,
one should choose the noise-covariance scaling *with the foresight* of
what analysis is planned, especially in the context of regressions.
   I think there is one special case in the context of regressions/ANOVA
that occurs frequently and is easy to handle. This is the case where
the "betas" are computed with normalized weights (these weights are
sometimes called "contrasts" in stats parlance).
 That is, if beta = sum_over_i{w_i * x_i}, where x_i is the mean of the
i^th group of trials, if sum_over_i{w_i**2} == 1, and if the x_i's
come from the same number of trials (with independent noise across
groups), then the beta values have the same noise variance as the
x_i's. To take the simplest example of this case, if x_1 and x_2 are
evoked responses for conditions 1 and 2 (each with nave = 100 trials),
and you compute your beta values as the difference divided by sqrt(2),
then scaling the noise covariance by nave = 100 is appropriate when
computing M. In practice, if all the conditions that go into the stats
have the same number of trials, and if the stats employ orthonormal
contrasts, then the default MNE scaling of nave = number of trials (per
condition) works well.
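
A quick simulated check of the special case above, as a sketch only: it assumes independent Gaussian noise with the same variance in both conditions, and the variable names are made up for illustration. With weights normalized so that sum(w_i**2) == 1, the beta should have the same noise variance as a single condition mean (sigma**2 / nave), while the raw, unnormalized difference should have twice that.

```python
import numpy as np

rng = np.random.default_rng(1)
nave, n_sim, sigma = 100, 20000, 1.0

# Noise-only condition means: each x_i is the average of nave trials.
x1 = rng.normal(0.0, sigma, (n_sim, nave)).mean(axis=1)
x2 = rng.normal(0.0, sigma, (n_sim, nave)).mean(axis=1)

# Normalized contrast: the squared weights sum to 1.
w = np.array([1.0, -1.0]) / np.sqrt(2)
beta = w[0] * x1 + w[1] * x2

# Unnormalized difference, for comparison.
raw_diff = x1 - x2

# Noise variance of one condition mean under these assumptions.
expected = sigma ** 2 / nave

print("var of x1        / expected:", x1.var() / expected)
print("var of beta      / expected:", beta.var() / expected)
print("var of raw diff  / expected:", raw_diff.var() / expected)
```

The first two ratios should come out close to 1 and the third close to 2, which is why nave = 100 matches the normalized contrast but not the raw difference.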

Hope this helps.

Hari

On Thu, February 20, 2014 5:48 am, Denis-Alexander Engemann wrote:
> Hi everyone,
>
> this is really an interesting and productive discussion which I enjoy
> following very much. And it's especially timely and relevant, since we
> plan to support high-level functions for single trial regression in
> MNE-Python in the future. Tal Linzen has recently started drafting
> examples and first functions, cf.
> https://github.com/mne-tools/mne-python/pull/1034.
>
> Since there were a few open questions for Teon's use case, we
> decided to postpone adding direct support for projecting beta-maps.
> However, if we could use the collective knowledge and experience
> distributed over this mailing list, it might be possible to clarify
> those questions and add support for this use case directly in
> MNE-Python rather soon.
>
> My main concern with the beta-map projection approach is that the
> noise structure (and the units) as reflected in the noise covariance
> computed on raw M/EEG signals don't necessarily match with the noise
> structure present in the beta maps. This would mean a model mismatch.
> To tackle this issue, different options might be possible, such as
> using an identity matrix as noise covariance or estimating the noise
> covariance on whitened single trials.
>
> Input on both the work-in-progress on MNE-Python and the conceptual
> issues is highly appreciated.
>
> Cheers,
> Denis
>
> On Thu, Feb 20, 2014 at 7:57 AM, Alexandre Gramfort
> <alexandre.gramfort at telecom-paristech.fr> wrote:
>> hi Don,
>>
>> thanks a lot for sharing these insights.
>>
>> Although I get the idea of what you suggest, I am not sure
>> I would be able to perfectly replicate this analysis.
>> Do you happen to have a script you could share?
>>
>> Also can you give the full ref from which the figure is extracted?
>>
>> thanks again
>>
>> Best,
>> Alex
>>
>>
>> On Wed, Feb 19, 2014 at 8:30 PM, Krieger, Donald N. <kriegerd at upmc.edu>
>> wrote:
>>> Dear Teon,
>>>
>>>
>>>
>>> You have raised several interesting questions on which I would like
>>> to expand.
>>>
>>> Hari responded to several technical issues, viz. (1) constraints on
>>> what you do to retain the validity of your subsequent projection
>>> into source space and (2) weighting the regression to compensate
>>> for unequal numbers of trials for different levels of the
>>> independent variable.
>>>
>>> Here are some points about the meaning of what you are doing and
>>> about the technical issues.
>>>
>>> (1)    If the independent variable is scaled rather than a 0/1
>>> dummy, i.e. has multiple numeric levels, then your regression is
>>> asking a specific quantitative question about the amplitude of the
>>> magnetic field/source, i.e. is the amplitude a linear function of
>>> the independent variable?  If for example the variable takes values
>>> n and 2n, you are asking: "Is the amplitude for the '2n' trials
>>> double what it is for the 'n' trials?"
>>>
>>> (2)    I think it's reasonable to assume that many of the sources
>>> contributing to the magnetic field have nothing to do with the
>>> task.  Although you are working with single trial data, your
>>> regression across the trials is collapsing the data in a
>>> generalized version of averaging.  That helps attenuate the
>>> contributions to the field of unrelated sources.  But if (a) there
>>> was a way up front to define regions of interest within the brain
>>> which you think are involved in the task, and (b) the linear
>>> hypothesis you are testing is true, you should do better by doing
>>> the projection first and then doing the regression on the vertices
>>> within one ROI at a time.  In that way you take advantage of the
>>> signal space separation capabilities of your projection operation
>>> to isolate the sources you think are involved.  If you want to get
>>> formal statistics from your regression, you must find a way to
>>> adjust the degrees of freedom, since presumably the source
>>> estimates from nearby vertices lack independence.
>>>
>>> (3)    Multidimensional regression: I presume that you are doing
>>> your regression for a single time point, tau.  Or perhaps you are
>>> averaging the amplitude values centered on the peak.  In either
>>> case you get a single number for each magnetic field sensor for
>>> each trial.  Instead you could use multiple points about the center
>>> of a peak and use a low order polynomial of tau multiplied by your
>>> original independent variable.  Note that averaging is equivalent
>>> to using a zero-order polynomial.  If you use, say, 21 data points
>>> centered on the peak, you increase your degrees of freedom by quite
>>> a lot.  Of course your 21 data points lack independence, but you
>>> still are using more information to do the regression.
>>>
>>> (4)    The more important additional variable is along the time
>>> axis for the sequence of trials.  If you use a polynomial function
>>> for that, any non-zero Beta other than the zero-order one
>>> represents a nonstationarity in your measurements.  This is rarely
>>> assessed, but with humans doing a task it is always a concern, and
>>> it's interesting too.  The attached figure illustrates ideas (3)
>>> and (4) with evoked potential data.
>>>
>>>
>>>
>>> I hope I'm understanding you correctly and that this is helpful.
>>>
>>>
>>>
>>> Regards,
>>>
>>>
>>>
>>> Don
>>>
>>>
>>>
>>> Don Krieger, Ph.D.
>>>
>>> Department of Neurological Surgery
>>>
>>> University of Pittsburgh
>>>
>>> (412)648-9654 Office
>>>
>>> (412)521-4431 Cell/Text
>>>
>>>
>>>
>>> From: mne_analysis-bounces at nmr.mgh.harvard.edu
>>> [mailto:mne_analysis-bounces at nmr.mgh.harvard.edu] On Behalf Of Teon
>>> Brooks
>>> Sent: Wednesday, February 19, 2014 12:34 AM
>>> To: mne_analysis at nmr.mgh.harvard.edu
>>> Subject: [Mne_analysis] Computing regression on sensor data then
>>> transforming to source space
>>>
>>>
>>>
>>> Hi MNE listserv,
>>>
>>>
>>>
>>> I have single-trial data that I would like to regress on a
>>> predictor (let's say word frequency) and then compute a source
>>> estimate. I'm planning to use mne-python to do this computation. I
>>> was wondering if I could do the regression over single-trial sensor
>>> data first, get the beta values for each sensor over time, and then
>>> compute the source estimate as if it were an evoked object.
>>>
>>>
>>>
>>> My presumption is that it should be fine if the source
>>> transformation is linear. The other option would be to source
>>> transform the data and then do the regression, but the problem with
>>> doing the transformation first is that computing the source
>>> estimates is more demanding on memory (say about 1000 trials with
>>> around 5000 sources over 600-800 ms of time). It would be more
>>> efficient if the regression could be done first, as long as doing
>>> so is not computationally unsound.
>>>
>>>
>>>
>>> What are your thoughts?
>>>
>>>
>>>
>>> Best,
>>>
>>> --
>>>
>>> teon
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Mne_analysis mailing list
>>> Mne_analysis at nmr.mgh.harvard.edu
>>> https://mail.nmr.mgh.harvard.edu/mailman/listinfo/mne_analysis
>>>
>>>
>>>
>
>
>

--
Hari Bharadwaj
PhD Candidate, Biomedical Engineering,
Boston University
677 Beacon St.,
Boston, MA 02215

Martinos Center for Biomedical Imaging,
Massachusetts General Hospital
149 Thirteenth Street,
Charlestown, MA 02129

hari at nmr.mgh.harvard.edu
Ph: 734-883-5954
