Hi Doug and Nick,
Thanks for your quick answers. I agree that linearity can often be assumed in genetics, and large GWAS studies do code genotype as Nick suggested. However, Nick's solution does not suit this particular case, because one of the homozygous groups has fewer than 10 subjects, which in my opinion might skew our results. We would therefore like to lump the homozygotes and the carriers together and have 2 groups (instead of 3).
Unfortunately, Stefan Brauns and I do not understand your reply, Doug. We were suggesting including the binary genotype variable as a "variable" (covariate) instead of a "class" (factor) in the FSGD. This would let us get around the problem of small cell sizes. Why do you say this is not possible with an FSGD?
We would just specify:
Variables age SNP
where SNP would be coded 0 or 1 instead of 0, 1 or 2 (as suggested by Nick).
This is what the header would look like:
GroupDescriptorFile 1
Title rs8216888_status
MeasurementName thickness
Class SCZMALEMGH
Class SCZFEMALEMGH
Class HCMALEMGH
Class HCFEMALEMGH
Class SCZMALEIA
Class SCZFEMALEIA
Class HCMALEIA
Class HCFEMALEIA
Class SCZMALEUMN
Class SCZFEMALEUMN
Class HCMALEUMN
Class HCFEMALEUMN
Class SCZMALEUNM
Class SCZFEMALEUNM
Class HCMALEUNM
Class HCFEMALEUNM
Variables age SNP
In contrast, if we included genotype as a class, we would have 32 classes instead of 16, followed by "Variables age".
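To make the class-versus-variable comparison concrete, here is a minimal Python sketch that rebuilds the 16 class names from the header above and counts the design-matrix columns each option would produce. It assumes the usual FSGD conventions as I understand them: DODS (different offset, different slope) gives every class its own intercept plus its own slope for each variable, while DOSS (different offset, same slope) gives one intercept per class plus a single shared slope per variable. The helper `n_columns` is hypothetical and not part of FreeSurfer.

```python
# Rebuild the class names used in the FSGD header (site x diagnosis x sex).
sites = ["MGH", "IA", "UMN", "UNM"]
diagnoses = ["SCZ", "HC"]
sexes = ["MALE", "FEMALE"]

classes = [f"{dx}{sex}{site}" for site in sites for dx in diagnoses for sex in sexes]


def n_columns(n_classes, n_variables, model="dods"):
    """Number of design-matrix columns under the assumed DODS/DOSS conventions."""
    if model == "dods":
        # one intercept and one slope per variable, separately for every class
        return n_classes * (n_variables + 1)
    if model == "doss":
        # one intercept per class, one shared slope per variable
        return n_classes + n_variables
    raise ValueError(f"unknown model: {model}")


# Option A: 16 classes, "Variables age SNP"
cols_a_dods = n_columns(len(classes), 2, "dods")
cols_a_doss = n_columns(len(classes), 2, "doss")

# Option B: genotype as a class (32 classes), "Variables age"
cols_b_dods = n_columns(2 * len(classes), 1, "dods")
cols_b_doss = n_columns(2 * len(classes), 1, "doss")
```

Under these assumptions, option A gives 48 columns with DODS and 18 with DOSS, and option B gives 64 and 33. Note that with DODS, SNP listed as a variable would get a separate slope in each of the 16 classes rather than one overall genotype effect.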
The question is whether specifying a binary variable under "Variables" (as a covariate) violates any assumptions.
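For the plain regression case, the point can be checked numerically. The sketch below uses synthetic, made-up thickness values (not our data) and fits an intercept plus a 0/1 SNP column by ordinary least squares. With only two coded values, a straight line passes through both group means exactly, so the slope is just the difference between the group means. This says nothing about how mri_glmfit treats the column in an FSGD; it only illustrates the simple-regression case.

```python
import numpy as np

# Synthetic example data: 12 subjects coded 0, 8 subjects coded 1 (values are made up).
rng = np.random.default_rng(0)
snp = np.array([0] * 12 + [1] * 8)
thickness = 2.5 + 0.1 * snp + rng.normal(0.0, 0.05, snp.size)

# Design matrix: intercept column plus the 0/1 dummy variable.
X = np.column_stack([np.ones(snp.size), snp])
beta, *_ = np.linalg.lstsq(X, thickness, rcond=None)

# The intercept equals the mean of group 0; the slope equals the difference in means.
mean_group0 = thickness[snp == 0].mean()
mean_diff = thickness[snp == 1].mean() - mean_group0
```

Here `beta[0]` matches `mean_group0` and `beta[1]` matches `mean_diff` up to floating-point error, which is the same answer a two-group comparison would give.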
Many thanks, Stefan
Message: 6
Date: Wed, 02 Dec 2009 13:18:20 -0500
From: Douglas N Greve <greve(a)nmr.mgh.harvard.edu>
Subject: Re: [Freesurfer] "dummy variable" in mri-glmfit
To: Stefan Brauns <stefan.brauns(a)googlemail.com>
Cc: freesurfer <freesurfer(a)nmr.mgh.harvard.edu>
Message-ID: <4B16AF6C.9080103(a)nmr.mgh.harvard.edu>
Content-Type: text/plain; charset=UTF-8; format=flowed
Do you mean having just another column in your design matrix with 0s and
1s? You can do this, but not with an FSGD. You'll have to supply your
own matrix. An easy way to do this would be to run mri_glmfit with an
FSGD without the genotype. This will create a matrix Xg.dat in the
output dir, then just modify that matrix and pass it to a new call to
mri_glmfit.
doug
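
One way the "modify that matrix" step could be done is sketched below in Python. This is only an illustration under stated assumptions: that Xg.dat is a plain whitespace-separated text matrix with one row per subject, and that the SNP values are supplied in the same subject order as the rows. The function names and the file names in the comments are hypothetical.

```python
import numpy as np


def append_covariate(xg_path, covariate, out_path):
    """Append one covariate column to a text design matrix and save the result."""
    X = np.loadtxt(xg_path, ndmin=2)
    cov = np.asarray(covariate, dtype=float)
    # The covariate must have exactly one value per row (subject) of the matrix.
    if cov.shape != (X.shape[0],):
        raise ValueError(f"expected {X.shape[0]} values, got shape {cov.shape}")
    X_new = np.column_stack([X, cov])
    np.savetxt(out_path, X_new, fmt="%.10g")
    return X_new


def last_column_contrast(n_columns, out_path):
    """Write a one-row contrast that tests only the last (appended) column."""
    c = np.zeros((1, n_columns))
    c[0, -1] = 1.0
    np.savetxt(out_path, c, fmt="%g")
    return c


# Hypothetical usage (paths are placeholders):
#   snp = np.loadtxt("snp.txt")  # 0/1 per subject, same order as the FSGD inputs
#   X_new = append_covariate("glm_no_snp/Xg.dat", snp, "X_with_snp.dat")
#   last_column_contrast(X_new.shape[1], "snp.mtx")
```

If I read the help text correctly, the new matrix is then passed to mri_glmfit with --X in place of --fsgd, and the contrast with --C; please check mri_glmfit --help for your version before relying on this.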
Stefan Brauns wrote:
> Hi there,
>
> we would like to test the effect of a binary variable (genotype
> = carrier vs. homozygous) on cortical thickness in mri_glmfit. Since we
> are also controlling for gender and acquisition site (4 sites) we
> already have 16 groups. In order to control for age as a covariate we
> need at least 2 subjects per group to be able to estimate an age slope.
>
> If we include the aforementioned binary variable (genotype) as a
> factor (two different "groups"), we would have 32 groups and
> unfortunately not enough subjects per group.
>
> Is it possible to include binary variables ("dummy variable" coded as
> 0 and 1) such as genotype or gender as covariates (slope), in order to
> reduce the number of groups and examine the effect on thickness? In
> simple regression this would not affect the results - what would we
> expect here?
>
> Many thanks,
>
> Stefan
>