Hi Doug and Nick,
Thanks for your quick answer. I think the assumption of linearity can often be made in genetics, and in large GWAS the coding is done as Nick suggested. However, Nick's solution does not suit this particular case, since one of the homozygous groups has fewer than 10 subjects, which, in my opinion, might skew our results. We would therefore like to lump these homozygotes and the carriers together and have 2 groups (instead of 3).
Unfortunately, Stefan Brauns and I do not understand your reply, Doug. We were suggesting including the binary genotype variable as a "variable" (covariate) instead of a "class" (factor) in the FSGD. This would let us get around the problem of small cell sizes. Why do you say this is not possible with an FSGD?
We would just specify
Variables age SNP
and SNP would be coded as 0 and 1 instead of the 0, 1 and 2 coding suggested by Nick.
This is what the header would look like:
GroupDescriptorFile 1
Title rs8216888_status
MeasurementName thickness
Class SCZMALEMGH
Class SCZFEMALEMGH
Class HCMALEMGH
Class HCFEMALEMGH
Class SCZMALEIA
Class SCZFEMALEIA
Class HCMALEIA
Class HCFEMALEIA
Class SCZMALEUMN
Class SCZFEMALEUMN
Class HCMALEUMN
Class HCFEMALEUMN
Class SCZMALEUNM
Class SCZFEMALEUNM
Class HCMALEUNM
Class HCFEMALEUNM
Variables age SNP
In contrast, if we included it as a class, we would have 32 instead of 16 classes, followed by "Variables age".
The question is whether specifying a binary "variable" (covariate) violates any assumptions.
Many thanks, Stefan
Message: 6
Date: Wed, 02 Dec 2009 13:18:20 -0500
From: Douglas N Greve greve@nmr.mgh.harvard.edu
Subject: Re: [Freesurfer] "dummy variable" in mri-glmfit
To: Stefan Brauns stefan.brauns@googlemail.com
Cc: freesurfer freesurfer@nmr.mgh.harvard.edu
Message-ID: 4B16AF6C.9080103@nmr.mgh.harvard.edu
Content-Type: text/plain; charset=UTF-8; format=flowed
Do you mean having just another column in your design matrix with 0s and 1s? You can do this, but not with an FSGD. You'll have to supply your own matrix. An easy way to do this would be to run mri_glmfit with an FSGD without the genotype. This will create a matrix Xg.dat in the output dir; then just modify that matrix and pass it to a new call to mri_glmfit.
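A minimal sketch of the matrix-editing step described above, in Python with NumPy. It rests on several assumptions that are not stated in this thread: that Xg.dat is a plain-text, whitespace-delimited matrix with one row per subject in FSGD input order, that the genotype codes are available as a 0/1 vector in the same subject order, and that the helper names (append_binary_covariate, last_column_contrast) are hypothetical. The modified matrix would then be handed to mri_glmfit through its option for an explicit design matrix (believed to be --X; worth confirming against mri_glmfit --help).

```python
import numpy as np


def append_binary_covariate(X, snp):
    """Return a copy of design matrix X with one extra 0/1 column appended."""
    X = np.asarray(X, dtype=float)
    snp = np.asarray(snp, dtype=float)
    if X.ndim != 2:
        raise ValueError("design matrix must be 2-D (subjects x regressors)")
    if snp.shape != (X.shape[0],):
        raise ValueError("need exactly one genotype code per subject (row)")
    if not np.isin(snp, (0.0, 1.0)).all():
        raise ValueError("genotype codes must be 0 or 1")
    return np.column_stack([X, snp])


def last_column_contrast(n_cols):
    """Contrast row that tests only the last regressor (the appended column)."""
    c = np.zeros((1, n_cols))
    c[0, -1] = 1.0
    return c


if __name__ == "__main__":
    # Toy stand-in for Xg.dat: 2 classes, each with its own offset and age slope.
    # With a real file one would use something like: X = np.loadtxt("Xg.dat")
    X = np.array([
        [1, 0, 30, 0],
        [1, 0, 40, 0],
        [0, 1, 0, 35],
        [0, 1, 0, 45],
    ], dtype=float)
    snp = [0, 1, 1, 0]  # hypothetical carrier status, same subject order as X

    X_new = append_binary_covariate(X, snp)
    C = last_column_contrast(X_new.shape[1])
    print(X_new.shape, C.shape)
    # To write the results as text: np.savetxt("X_with_snp.dat", X_new, fmt="%g")
```

Appending a single column models one genotype effect shared by all classes, which is what keeps the cell sizes from shrinking; the contrast row then tests that one coefficient against zero.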
doug
Stefan Brauns wrote:
Hi there,
We would like to test the effect of a binary variable (genotype = carrier vs. homozygous) on cortical thickness in mri_glmfit. Since we are also controlling for gender and acquisition site (4 sites), in addition to diagnosis, we already have 16 groups. In order to control for age as a covariate we need at least 2 subjects per group to be able to estimate an age slope.
If we include the aforementioned binary variable (genotype) as a factor (two different "groups"), we would have 32 groups and unfortunately not enough subjects per group.
Is it possible to include binary variables ("dummy variable" coded as 0 and 1) such as genotype or gender as covariates (slope), in order to reduce the number of groups and examine the effect on thickness? In simple regression this would not affect the results - what would we expect here?
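The simple-regression case mentioned above can be checked numerically. The sketch below uses simulated data (all numbers are made up for illustration) and ordinary least squares via NumPy; it compares coding genotype as a 0/1 covariate against coding it as two groups with separate intercepts and a shared age slope. Under that shared-slope assumption the two design matrices span the same column space, so the fitted values match and the covariate's coefficient equals the difference between the two group intercepts. If each group were also given its own age slope, the group version would carry an extra interaction term and the two models would no longer be identical.

```python
import numpy as np


def ols(X, y):
    """Ordinary least-squares coefficients for y ~ X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


rng = np.random.default_rng(0)
n = 40
age = rng.uniform(20, 60, n)
snp = np.tile([0.0, 1.0], n // 2)  # simulated carrier status (0/1)
y = 2.5 - 0.01 * age + 0.2 * snp + rng.normal(0, 0.1, n)  # simulated thickness

# Coding A: intercept, age, genotype as a 0/1 covariate.
XA = np.column_stack([np.ones(n), age, snp])
# Coding B: one intercept per genotype group, plus a shared age slope.
XB = np.column_stack([1.0 - snp, snp, age])

bA = ols(XA, y)
bB = ols(XB, y)

effect_as_covariate = bA[2]       # coefficient of the 0/1 column
effect_as_groups = bB[1] - bB[0]  # difference of the two group intercepts

print(np.isclose(effect_as_covariate, effect_as_groups))
print(np.allclose(XA @ bA, XB @ bB))
```

Nothing in the least-squares fit requires a regressor to be continuous; the assumptions concern the residuals, not the distribution of the columns of the design matrix.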
Many thanks,
Stefan