Is there a standard benchmark for FreeSurfer? I've been using the data under subjects (Bert?/Ernie?) and running recon-all:
recon-all -s ernie -i ./sample-001.mgz -i ./sample-002.mgz -all
On our hardware, using the 5.1 distributed binary (freesurfer-Linux-centos4_x86_64-stable-pub-v5.1.0.tar.gz), it takes about 12 hours.
I was surprised that 5.1 was running so much faster than 5.0. With 5.0 (freesurfer-Linux-centos5_x86_64-stable-pub-v5.0.0.tar.gz) it was taking about 18 hours. Did anyone else notice a big speed-up from 5.0 to 5.1? Maybe it's a difference between centos5 vs. centos4? If so, wouldn't you expect the former to be faster?
If I back-port the changes Nick made to configure.in for the dev branch to the stable release of 5.1 and build from source on our systems, I'm able to run in ~10 hours. I'm guessing this is mostly due to the difference in the versions of gcc used on our system (4.1.2) vs. those used for the centos4 distributed binary?
For the dev release, it's taking about 11 hours. I'm guessing the dev branch is mostly focused on features/bug-fixes and performance is only looked at before a release?
Besides GPUs, what else are people doing to increase performance?
Cheers, Malcolm
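The timings reported above can be turned into relative speed-ups with a few lines of arithmetic. A minimal sketch in Python: the dictionary labels are informal shorthand for the four builds described in the message (not official build names), and the hours are the approximate figures quoted there.

```python
# Approximate wall-clock hours for one recon-all run, as quoted in the message.
# The keys are informal labels, not official build names.
hours = {
    "5.0 centos5 binary": 18.0,
    "5.1 centos4 binary": 12.0,
    "5.1 built from source": 10.0,
    "dev branch": 11.0,
}

def speedup(baseline_hours, new_hours):
    """Return how many times faster the new run is than the baseline."""
    return baseline_hours / new_hours

baseline = hours["5.0 centos5 binary"]
for label, h in hours.items():
    saved = 100.0 * (baseline - h) / baseline
    print(f"{label}: {speedup(baseline, h):.2f}x vs 5.0, {saved:.0f}% less time")
```

On these figures the 5.1 binary is 1.5x faster than 5.0, and the source build is a further 1.2x faster than the 5.1 binary.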
Hi Malcolm
in collaboration with IBM we are also looking at MPI and pthreads.
cheers Bruce
On Fri, 13 Jan 2012, Malcolm Tobias wrote: [...]
Malcolm,
actually, they (IBM) are looking at openmp (to allow multiple threads to process for-loops) and SSE3 instructions (better vectorization).
recon-all --help contains some timings for an AMD processor. centos4 vs. centos5 itself should not account for any speed differences, but it is true that our centos5 build was built with gcc 4.1 while our centos4 build uses gcc 3.4.7, so those compiler differences likely account for the speed differences.
another major factor that affects runtime is whether your system uses the Intel Nehalem architecture. its memory controller is much better at handling the wide memory layout of freesurfer structures (minimizing cache-line hits).
Nick
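Since SSE3 support and CPU architecture both come up here, it may help to check what a benchmark host actually reports. A rough sketch, assuming a Linux host where /proc/cpuinfo is readable; the helper names are made up for illustration. Linux lists SSE3 under the flag name "pni".

```python
def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags

def has_sse3(flags):
    # Linux reports SSE3 under the name "pni" (Prescott New Instructions).
    return "pni" in flags

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("SSE3 available:", has_sse3(cpu_flags(f.read())))
    except OSError:
        print("/proc/cpuinfo not readable on this system")
```

This only says whether the CPU offers the instructions; whether a given FreeSurfer binary uses them depends on how it was compiled.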
On Fri, 2012-01-13 at 09:13 -0500, Bruce Fischl wrote: [...]
Freesurfer mailing list Freesurfer@nmr.mgh.harvard.edu https://mail.nmr.mgh.harvard.edu/mailman/listinfo/freesurfer
I think IBM has a better compiler: better than gcc and slightly slower than the Intel compiler.
Sent from my iPad
On Jan 18, 2012, at 16:09, Nick Schmansky nicks@nmr.mgh.harvard.edu wrote: [...]
The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline . If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail.
Pedro,
I've got a run going with the Intel compilers now (I'm assuming that's what you meant?). Besides producing faster code, it will be interesting to see whether the compilers have a noticeable effect on the results.
Malcolm
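One way to check whether a different compiler changes the results is to compare the numeric outputs of two runs within a tolerance. A minimal sketch, assuming each run's measures have already been collected into a dictionary; the measure names, the values, and the tolerance are made-up placeholders, not real FreeSurfer output.

```python
import math

def compare_results(reference, candidate, rel_tol=1e-3):
    """Return the measures whose values differ by more than rel_tol."""
    differing = {}
    for name, ref_value in reference.items():
        new_value = candidate.get(name)
        if new_value is None or not math.isclose(ref_value, new_value, rel_tol=rel_tol):
            differing[name] = (ref_value, new_value)
    return differing

# Made-up placeholder numbers, for illustration only.
gcc_build = {"measure_a": 1000.0, "measure_b": 250.0}
icc_build = {"measure_a": 1000.2, "measure_b": 262.5}
print(compare_results(gcc_build, icc_build))
```

Small floating-point differences between compilers are expected, so the tolerance matters more than exact equality.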
On Wednesday 18 January 2012 12:14:42 Pedro Paulo de Magalhães Oliveira Junior wrote: [...]
Malcolm,
I think it will be at least 10% faster. I messed with the Intel compiler a couple of years ago, but they wanted to charge a yearly fee for its usage (mgh doesn't qualify as an academic user), so we nixed using that compiler. also, fyi, I was unable to get the AMD compiler to build the code tree.
if/when we get the openmp/sse3 stuff working, i'm thinking we'll provide a centos6 build, which uses gcc 4.4.5, in addition to our 'lowest-common-denominator' build of centos4 (which seems to work on quite a large range of systems). from what i understand of the openmp build, we would offer a flag to recon-all where you specify the number of threads (cores) you want to use. even a single core system would get a little speed-up because of intel's hyper-threading thing.
N.
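How much a thread-count option could help depends on how much of the runtime can actually run in parallel. A back-of-the-envelope sketch using Amdahl's law, which is brought in here for illustration and is not something stated in the thread. The parallel fraction of 0.5 is a placeholder, since the thread gives no figure for recon-all; the 12-hour baseline is the 5.1 binary timing quoted earlier.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Ideal speed-up when only part of the runtime scales with thread count."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / threads)

# 0.5 is a placeholder; the parallelizable share of recon-all is not given here.
for n in (1, 2, 4, 8):
    est_hours = 12.0 / amdahl_speedup(0.5, n)
    print(f"{n} threads: about {est_hours:.1f} h")
```

With half the work serial, no number of threads can do better than 2x, which is why the parallelizable share is the number worth measuring first.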
On Wed, 2012-01-18 at 12:24 -0600, Malcolm Tobias wrote: [...]
OpenMP (depending on how efficiently FreeSurfer can be parallelized) would be a great benefit here.
I have no idea what's causing the centos4 vs. 5 results, but it might be interesting to start collecting some performance data if enough people are interested.
All the benchmark numbers I've mentioned have been run on Nehalem-based X5550 CPUs. I only run 1 job/node, so Turbo mode is probably kicking in.
Cheers, Malcolm
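If people do start collecting performance data, recording each timing together with basic host details would make the numbers easier to compare. A minimal sketch using only the Python standard library; the function names and the record fields are assumptions for illustration, and the command run here is a harmless stand-in rather than recon-all.

```python
import platform
import subprocess
import sys
import time

def time_command(argv):
    """Run a command and return (exit_code, wall_clock_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv)
    return completed.returncode, time.perf_counter() - start

def benchmark_record(label, argv):
    """Bundle a timing with basic host details so runs can be compared."""
    code, seconds = time_command(argv)
    return {
        "label": label,
        "command": " ".join(argv),
        "exit_code": code,
        "hours": seconds / 3600.0,
        "machine": platform.machine(),
        "system": platform.platform(),
    }

if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; swap in the recon-all
    # invocation from the first message to time a real run.
    print(benchmark_record("smoke test", [sys.executable, "-c", "pass"]))
```

Details that matter for this thread, such as the compiler version used for the build and the CPU model, would need to be added to the record by hand.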
On Wednesday 18 January 2012 12:09:32 Nick Schmansky wrote: [...]
Hi Nick-
I was explicitly told by you that there was no stable centos5 release of 5.1. And unless it is located elsewhere, it is not located in:
ftp://surfer.nmr.mgh.harvard.edu/pub/dist/freesurfer/5.1.0/
Can you please clarify this?
Thanks, Wil
|-----Original Message-----
|From: Nick Schmansky [mailto:nicks@nmr.mgh.harvard.edu]
|Sent: Wednesday, January 18, 2012 10:10 AM
|To: Bruce Fischl
|Cc: freesurfer@nmr.mgh.harvard.edu; Malcolm Tobias
|Subject: Re: [Freesurfer] Performance questions
|
|[...]
Wil,
this is correct: we do not officially support a centos5 build, because the centos4 build works on centos5, and we want to keep public distribution and maintenance as simple as possible.
in the future, we will start releasing a centos6 build, because we will begin using openmp in our builds. openmp is only supported in gcc 4.2 and later, which centos4 and centos5 do not have, but centos6 does.
N.
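The gcc 4.2 threshold mentioned above is easy to check against a version string. A small sketch; the function names are hypothetical, and the threshold is taken from the message rather than verified against any particular distribution's gcc package.

```python
import re

OPENMP_MIN_GCC = (4, 2)  # threshold as stated in the message above

def parse_gcc_version(text):
    """Extract (major, minor) from a version string such as '4.4.5'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))

def supports_openmp(version_text):
    """True if the version meets the OpenMP threshold quoted in the thread."""
    return parse_gcc_version(version_text) >= OPENMP_MIN_GCC

# gcc versions mentioned in this thread
for version in ("3.4.7", "4.1.2", "4.4.5"):
    print(version, supports_openmp(version))
```

Comparing (major, minor) tuples avoids the pitfall of comparing version strings as text, where "4.10" would sort before "4.2".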
On Thu, 2012-01-19 at 23:49 +0000, Irwin, William wrote: [...]