New subject: Freesurfer and Grid computing

7 Dec 2006


Nick and Uri: thank you for your replies.
I'm not really into parallel processing, so forgive me if I talk
nonsense :). From your accounts I sketch the following possible
scenarios:
1. We use a sort of batch system. For instance, a Linux bash script with
10 recon-all commands or with 10 FSL feat commands. The system assigns
each individual command to the node with the least load, and the output
data are automatically stored on a central (e.g. RAID) server. I guess
this is how the UCLA FSL/Mac grid works, since I see no need to
alter the software itself as long as the applications can read from
and write to an external computer. I'll contact the group to ask exactly
how they work. I guess we need very fast connections between the individual
servers and the storage host to do this. I don't know PBS, but I
think this is exactly how the Sun Grid Engine works as well, so maybe they
are comparable. Batches are submitted to the host and are processed one
at a time. In fact, this is how I sometimes work myself right now when
I'm in a hurry: I copy all my FSL source data to different computers and
let each computer run a subset of the feat batch. Afterwards I group the
data by hand and perform only the group averaging on a single
computer.
2. We allow for more sophisticated parallelization. You give the example
of running the reconstruction of the left and right hemispheres
separately. This is the most elegant solution but implies rewriting the
code.
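A minimal sketch of scenario 1 in Python, assuming that "load" can be
approximated by the number of jobs already queued on a node. The node
names and subject IDs are hypothetical, and a real batch system (PBS,
Sun Grid Engine) measures load and dispatches jobs itself; this only
illustrates the assignment idea.

```python
import heapq


def assign_jobs(commands, nodes):
    """Greedily give each command to the node with the fewest queued jobs."""
    # Heap of (queued_job_count, node_name); the least-loaded node is on top.
    heap = [(0, node) for node in nodes]
    heapq.heapify(heap)
    assignment = {node: [] for node in nodes}
    for command in commands:
        load, node = heapq.heappop(heap)
        assignment[node].append(command)
        heapq.heappush(heap, (load + 1, node))
    return assignment


# Hypothetical subject IDs and node names, for illustration only.
commands = [f"recon-all -s subj{i:02d} -all" for i in range(1, 11)]
plan = assign_jobs(commands, ["node1", "node2", "node3"])
for node, jobs in plan.items():
    print(node, len(jobs))
```

Nothing in the application itself changes here: each command runs as a
whole on one node, which is why this scenario needs no code rewriting.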
I think the second scenario would eventually be the most elegant, but the
first would be less difficult to implement. In fact, I guess
most applications (like FSL, Freesurfer, SPM) work with batches, so if
we have software that manages these batches we have already won a lot,
and it saves a lot of manual work... ;(
What do you think?
Thank you,
Andries van der Leij
PS: here's a presentation with screenshots of the Sun grid; it looks
fairly straightforward:
http://www.sun.com/products-n-solutions/edu/whitepapers/pdf/bioinformatics_supercomputer.pdf
PS2: Nick, are you Dutch?
-----Original Message-----
From: uhasson@gmail.com [mailto:uhasson@gmail.com] On Behalf Of U. Hasson
Sent: Wednesday, December 06, 2006 9:50 PM
To: Nick Schmansky
Cc: Andries van der Leij; Freesurfer Mailing List
Subject: Re: [Freesurfer] Freesurfer and Grid computing
At the Univ. of Chicago, we've been playing around with parallelizing
Freesurfer on a 128-node grid (256 processors). Developers have
parallelized procedures (scripts) by unpacking "for" loops that
iterate over the left and right hemispheres [i.e., they fork the
independent processing of left and right hemispheres to different
nodes running in parallel, whenever possible].
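The loop unpacking described here can be sketched in Python as follows.
This is only an illustration under stated assumptions: the function
process_hemisphere is a hypothetical stand-in for the per-hemisphere
steps, and the sketch uses threads on one machine rather than separate
grid nodes or the VDS/VDL wrapping used in the Chicago setup.

```python
from concurrent.futures import ThreadPoolExecutor


def process_hemisphere(subject, hemi):
    """Hypothetical stand-in for one hemisphere's processing steps."""
    # A real version would launch the per-hemisphere commands here.
    return f"{subject}:{hemi}:done"


def process_subject(subject):
    hemis = ["lh", "rh"]
    # Serial form: for hemi in hemis: process_hemisphere(subject, hemi)
    # Parallel form: the two independent iterations run concurrently.
    with ThreadPoolExecutor(max_workers=len(hemis)) as pool:
        results = list(pool.map(lambda h: process_hemisphere(subject, h), hemis))
    return results


print(process_subject("subj01"))
```

This only works because the two iterations do not depend on each other;
any step that needs both hemispheres has to wait until both are finished.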
The main point of this work is to acquire provenance records, and
therefore Freesurfer scripts are "wrapped" or expressed using a
virtual data system language (VDS/VDL). The Freesurfer implementation,
AFAIK, is in its baby steps, but the general workflow model is pretty
well established:
http://www.ci.uchicago.edu/wiki/bin/view/VDS/VDSWeb/WebMain
Best,
Uri
On 12/6/06, Nick Schmansky nicks@nmr.mgh.harvard.edu wrote:
Andries,
I am not aware of usage of Freesurfer in a (Sun) Grid Engine
environment (such as that used by the Cohen group at UCLA).
However, here at the MGH/MIT/HMS Martinos Center we use a cluster of
some 100+ nodes configured with Linux CentOS 4, and governed by PBS
(Portable Batch System). Researchers here often conduct studies with
dozens to hundreds of brains, and for each subject, an instance of
Freesurfer's 'recon-all -s <subject> -all' script is submitted to the
batch system, which, under the hood, gets submitted to one computing
node. Thus, several dozen brains can be processed in a day (and a
half).
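The one-job-per-subject pattern described above can be sketched in
Python as a generator of PBS job scripts. This is a hedged illustration,
not the Martinos Center's actual setup: the subject IDs and job names
are hypothetical, and the resource request line is an assumption whose
syntax differs between PBS variants. Only the recon-all command line is
taken from the text.

```python
def pbs_job_script(subject):
    """Return the text of a PBS job script that runs recon-all for one subject."""
    return "\n".join([
        "#!/bin/bash",
        f"#PBS -N recon_{subject}",   # job name (hypothetical naming scheme)
        "#PBS -l nodes=1",            # assumed resource request: one node per brain
        f"recon-all -s {subject} -all",
        "",
    ])


# Hypothetical subject IDs, for illustration only.
subjects = ["subj01", "subj02", "subj03"]
scripts = {s: pbs_job_script(s) for s in subjects}
print(scripts["subj01"])
```

Each generated script would then be handed to the batch system's submit
command (qsub in PBS), which decides on which node it runs.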
Freesurfer does not currently support fine-grain parallelism. Some
coarse-grain parallelism, whereby each brain hemisphere is processed
independently (benefiting multiprocessor nodes), is possible, but not
currently implemented in our 'recon-all' script, as the error handling
and logging for doing so is somewhat tricky (and so this feature is
in-the-works-but-not-anytime-soon).
In short, if you plan on using Freesurfer in studies with large numbers
of subjects, I would recommend some kind of computing cluster, and some
fairly simple batch software (like PBS) should be sufficient. For
instance, I know of one group that has successfully run Freesurfer on
their Altix Itanium Linux cluster.
Groetjes,
Nick
On Wed, 2006-12-06 at 18:44 +0100, Andries van der Leij wrote:
From: Andries van der Leij
Sent: Wednesday, December 06, 2006 5:59 PM
To: 'freesurfer@nmr.mgh.harvard.edu'
Subject: Freesurfer and Grid computing
Dear Freesurfer community,
I'm a PhD student at the University of Amsterdam and I'm currently
investigating the possibilities to streamline our MRI data processing
stream. Next summer we'll obtain a research-only scanner. I'm trying
to push the group to also invest in computing power and am currently
investigating the applications that researchers will most probably
use.
I came across a project of the group of Cohen at UCLA. They have
configured an Apple (Unix) grid and have proposed a more or less
standard setup specially designed for MRI analyses:
http://airto.bmap.ucla.edu/mt-static/NICluster/archives/2005/06/welcome.html
It is my understanding that one of the members has rewritten the FSL
code to allow distributed parallel processing in a grid. See the
benchmarks here:
http://airto.bmap.ucla.edu/bmcweb/bmc_bios/MarkCohen/Apple/Benchmarks.htm
My question is fairly simple: are similar steps being taken in the
Freesurfer community? I have no experience with this app myself, but
it is my understanding that Freesurfer consumes a lot of resources.
Thank you very much in advance,
Andries van der Leij

Freesurfer mailing list
Freesurfer@nmr.mgh.harvard.edu
https://mail.nmr.mgh.harvard.edu/mailman/listinfo/freesurfer

