Re: [Freesurfer] Notes on CUDA Acceleration

27 Aug 2010


      On Thu, 2010-08-26 at 23:23 +0200, Georg Homola wrote:
...
> allow me one additional remark that may be crucial for those considering
> investing in new cards. Although the Fermi class cards make use of the same
> architecture (GeForce GTX 480 and Tesla C2050 for example), for consumer
> products (GTX 400 series), double precision performance has been limited to
> a quarter of that of the "full" Fermi architecture (Tesla C20xx). Error
> checking and correcting memory (ECC) is also disabled on consumer cards. I
> don't really know how important double precision is for the CUDA enabled
> Freesurfer tools, but this could mean you have to buy four GTX cards to
> catch up with the performance of one Tesla card.

This is correct. At the moment, I don't think that I use double
precision anywhere, hence we're experimenting with CUDA Compute
Capability 1.1 devices, which have no double precision support.
I may have to start using double precision, given the problems which
have just been found with mri_em_register_cuda. However, I'm not sure
what impact the reduced double precision throughput of the GeForce
cards would have. I'm reasonably certain that most of the code is
bandwidth bound, so if anything a GeForce will outpace a Tesla, even if
the code uses double precision.
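
The bandwidth-bound argument can be sketched with a simple roofline
estimate: a kernel's run time is bounded below by the larger of its
memory-traffic time and its arithmetic time. The peak figures below are
approximate published specifications rather than measurements, the function
names are hypothetical, and none of this is taken from the FreeSurfer code.

```python
# Roofline-style lower bound on kernel time. All hardware numbers are
# approximate published peaks and should be treated as assumptions.
CARDS = {
    "GeForce GTX 480": {"bandwidth_gbs": 177.4, "dp_gflops": 168.0},
    "Tesla C2050": {"bandwidth_gbs": 144.0, "dp_gflops": 515.0},
}


def kernel_time_s(card, gbytes_moved, gflop_done):
    """Lower bound on run time: the slower of memory traffic and arithmetic."""
    spec = CARDS[card]
    memory_time = gbytes_moved / spec["bandwidth_gbs"]
    compute_time = gflop_done / spec["dp_gflops"]
    return max(memory_time, compute_time)


def crossover_intensity(card):
    """FLOP per byte above which the card becomes compute bound."""
    spec = CARDS[card]
    return spec["dp_gflops"] / spec["bandwidth_gbs"]


if __name__ == "__main__":
    # Hypothetical kernel: 1 GB of traffic at 0.25 double precision FLOP/byte.
    for name in CARDS:
        t = kernel_time_s(name, gbytes_moved=1.0, gflop_done=0.25)
        print(f"{name}: {t * 1e3:.2f} ms, "
              f"compute bound above {crossover_intensity(name):.2f} FLOP/byte")
```

Under these assumed peaks, a kernel doing less than roughly 0.9 double
precision FLOP per byte is limited by memory bandwidth on both cards, which
is the regime where the GeForce comes out ahead.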
Of greater concern is the amount of memory available. The Tesla cards
have quite a bit more RAM. This is likely to become important in the
near future, as I work to get the rest of the mri_ca_register pipeline
onto the GPU - the GCA structure is quite sparse, but for the initial
port, I'll burn RAM instead of coming up with a cunning packing method.
There will be enough to debug without worrying about optimisation.
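
The trade-off between burning RAM and packing can be illustrated with a toy
example. This is only a sketch: the per-node lists below are hypothetical
stand-ins and do not reflect the actual GCA layout in FreeSurfer.

```python
# Toy comparison of a padded (dense) layout against an offset-based packed
# layout for nodes holding a variable number of entries. The data is
# hypothetical and is not the real GCA structure.


def pad_dense(nodes, fill=0.0):
    """Pad every node to the longest length: simple indexing, wasted RAM."""
    width = max((len(n) for n in nodes), default=0)
    return [list(n) + [fill] * (width - len(n)) for n in nodes], width


def pack(nodes):
    """Concatenate all entries and record where each node starts."""
    values, offsets = [], [0]
    for n in nodes:
        values.extend(n)
        offsets.append(len(values))
    return values, offsets


def packed_node(values, offsets, i):
    """Recover the entries of node i from the packed layout."""
    return values[offsets[i]:offsets[i + 1]]


if __name__ == "__main__":
    nodes = [[1.0], [], [2.0, 3.0, 4.0], [5.0]]
    dense, width = pad_dense(nodes)
    values, offsets = pack(nodes)
    print("dense slots: ", len(dense) * width)
    print("packed slots:", len(values), "plus", len(offsets), "offsets")
```

The padded layout keeps indexing trivial, which helps while debugging a
first port; the packed layout saves memory at the cost of an extra offset
lookup per node.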
Regards,
Richard
