Hi, I have done an analysis involving three groups, so there are three pairwise comparisons across two hemispheres = 6 p-maps. I want to adjust for multiple comparisons (across the vertices), so I use FDR. But since FDR determines the threshold based on the actual p-values, I get 6 different thresholds:
comparison 1: lh and rh, 0.016 and 0.028 (I can choose .01)
comparison 2: lh and rh, 0.01 and 0.001 (I can choose .001)
comparison 3: lh and rh, 0.001 and 0.0001 (I can choose .0001)
There are lots of significant vertices in comparison 1 and nothing significant, after correction, in comparison 3. Is there anything wrong with using different thresholds here, and concluding that in comparison 1 there were extensive differences between the groups, whereas in comparison 3 there were none? I'm not sure if this is a problem, but I'm afraid some reviewers might have an issue with it. Across the hemispheres, I can choose a conservative threshold that covers both hemispheres, i.e. one lower than the FDR-adjusted thresholds for both lh and rh. But between the comparisons the thresholds differ even more, by a factor of 10 and 100. And if I choose the most conservative of all the adjusted thresholds, I'm afraid that I'll make a type II error in comparison 1.
From what I understand, the adjusted threshold for comparison 3 is more conservative because of the actual empirical data (the distribution of p-values), so that's an empirical argument for using a more conservative threshold there.
And: What if I pooled all three p-maps (sig.mgh) and did an FDR on the whole thing, would that be a better approach? And does Freesurfer use the Benjamini algorithm, and if so, can I use Tom Nichols' matlab function for FDR (http://www.sph.umich.edu/~nichols/FDR/FDR.m) for pooling all three p-maps?
Thank you!
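For readers following the thread, the data dependence of the threshold mentioned above can be written out. This is a sketch of the Benjamini & Hochberg step-up rule as it is usually stated, assuming N vertex-wise p-values in one map and a chosen FDR level q; the symbols k^* and p_thr are notation introduced here, not taken from the messages, and nothing in it says how FreeSurfer or FDR.m implement the rule.

```latex
% Sort the N vertex-wise p-values of one map in ascending order
p_{(1)} \le p_{(2)} \le \dots \le p_{(N)}

% Find the largest rank whose p-value lies under the line (k/N) q
k^{*} = \max \left\{ k : p_{(k)} \le \frac{k}{N}\, q \right\}

% Every vertex with p <= p_thr is declared significant
p_{\mathrm{thr}} = p_{(k^{*})}
```

Because k^* depends on how many small p-values a map contains, two maps tested at the same q can end up with very different p_thr, and a map with no k satisfying the inequality has no surviving vertices at all.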
Some links that may be helpful:
http://surfer.nmr.mgh.harvard.edu/fswiki/FsTutorial/QdecMultipleComparisons
http://surfer.nmr.mgh.harvard.edu/fswiki/FsTutorial/GroupAnalysis
http://surfer.nmr.mgh.harvard.edu/fswiki/MultipleComparisons
Hope it helps.
PPJ
Pedro Paulo de M. Oliveira Junior, Diretor de Operações, Netfilter & SpeedComm Telecom
On Tue, Mar 24, 2009 at 12:38, Lars M. Rimol larilin@gmail.com wrote:
Freesurfer mailing list Freesurfer@nmr.mgh.harvard.edu https://mail.nmr.mgh.harvard.edu/mailman/listinfo/freesurfer
Well, I believe there is a problem in principle here. FDR deals with multiple comparisons across the surface (or brain volume), but how do you deal with a series of such analyses? Of course, if you use a different method of correction you avoid this problem but that's not the point.
LMR
Date: Tue, 24 Mar 2009 14:03:46 -0300 Subject: Re: [Freesurfer] FDR correction From: ppj@netfilter.com.br To: larilin@gmail.com CC: freesurfer@nmr.mgh.harvard.edu
Dear Lars, Pedro and all,
There should be no problem in using different thresholds for each of these pairwise comparisons.
Let's consider the following extreme case. Suppose that in one of these comparisons you have a strong effect (say, "activation") that covers large brain regions, and that for the 5 other comparisons there is no effect at all. If you pool all six comparisons to calculate a single threshold, then, because the Benjamini & Hochberg procedure is adaptive, the resulting threshold will be more liberal (i.e. a higher p-threshold) for the 5 comparisons with no experimental effect than the one obtained without pooling, so some vertices where the null is true will be (falsely) declared positive in these comparisons. On the other hand, the vertices where there is no effect also contribute to this threshold, but they pull it in the opposite direction: it becomes more conservative (i.e. a lower p-threshold) for the comparison where "activation" is present than it would be without pooling, so some "active" vertices remain undetected (false negatives).
Both are clearly undesirable, as the number of errors (both type I and II) is increased by bleeding the effect, or the absence of effect, from one comparison into another.
In other words, when different sets of comparisons are included in the same analysis (comparisons which possibly involve different experimental hypotheses, which would definitely preclude pooling), one loses one of the nicest features of the B&H procedure: weak control of FWE (i.e. when the null is true everywhere, you are controlling FWE even though you are using an FDR procedure) for any individual comparison in which the null is true everywhere.
This also means that, although FDR would still be controlled globally, one cannot make inferences about each comparison individually, which I believe was the whole point of making the comparisons in the first place.
Hope this helps!
Kind regards,
Anderson
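The extreme case described in this reply can be tried out on simulated p-values. The following is only a minimal sketch under stated assumptions: p-values are independent, null vertices are uniform on [0, 1], "active" vertices are drawn as a uniform number raised to the 4th power (an arbitrary choice to get small p-values), and the names `bh_threshold`, `simulate` and `compare` are hypothetical helpers written for this illustration. It is not FreeSurfer's implementation and not a port of FDR.m.

```python
import random


def bh_threshold(pvals, q=0.05):
    """Benjamini-Hochberg step-up threshold; 0.0 means nothing survives."""
    n = len(pvals)
    threshold = 0.0
    for k, p in enumerate(sorted(pvals), start=1):
        if p <= q * k / n:
            threshold = p  # ends up as the largest p under the line (k/n) * q
    return threshold


def simulate(n_vertices=10000, n_null_maps=5, frac_active=0.3, seed=0):
    """One map with a real effect plus several maps where the null holds everywhere."""
    rng = random.Random(seed)
    active = [rng.random() < frac_active for _ in range(n_vertices)]
    # Assumption: active vertices get u**4 (skewed towards 0), the rest are uniform.
    effect_map = [rng.random() ** 4 if a else rng.random() for a in active]
    null_maps = [[rng.random() for _ in range(n_vertices)]
                 for _ in range(n_null_maps)]
    return active, effect_map, null_maps


def compare(q=0.05):
    """Compare per-map thresholds with one threshold computed on the pooled maps."""
    active, effect_map, null_maps = simulate()
    separate_effect = bh_threshold(effect_map, q)
    separate_null = [bh_threshold(m, q) for m in null_maps]
    pooled = bh_threshold(effect_map + [p for m in null_maps for p in m], q)

    def true_hits(t):
        # Truly active vertices of the effect map detected at threshold t
        return sum(1 for a, p in zip(active, effect_map) if a and p <= t)

    return {
        "separate_effect": separate_effect,
        "separate_null_max": max(separate_null),
        "pooled": pooled,
        "true_hits_separate": true_hits(separate_effect),
        "true_hits_pooled": true_hits(pooled),
        # Vertices in the all-null maps that pass the pooled threshold
        "null_hits_pooled": sum(1 for m in null_maps for p in m if p <= pooled),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(name, value)
```

In this setup the pooled threshold lands between the two kinds of per-map threshold: lower than the one for the map with an effect, so fewer truly active vertices are detected, and higher than those for the all-null maps, so some null vertices pass. The exact numbers depend on the seed and on the assumed effect distribution.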
Dear Anderson, Thank you for that very thoughtful reply!
I think your argument is convincing and I've decided to use FDR for each comparison individually.
Lars
On Thu, Mar 26, 2009 at 5:37 PM, Anderson Winkler relkniw@bol.com.br wrote: