On 05/04/11 16:55, Richard G. Edgar wrote:
On Tue, 2011-04-05 at 16:30 +0100, Ian Malone wrote:
Richard G. Edgar wrote:
On the standard test case we use here, a full recon-all run takes 8 hours on a 3.2 GHz Nehalem core, and about 4 hours 20 mins when using the Tesla C2050.
Would I be right in concluding that a 4-core Nehalem (e.g. i7) has more throughput than the C2050 then?
Yes, but less than running three CPU jobs plus one GPU job. I did test this once, and there isn't much penalty to running one recon-all job per core on a Nehalem system.
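As a rough check of that comparison, here is a back-of-the-envelope sketch using only the two timings quoted above (8 hours per run on one core, about 4 hours 20 minutes with the C2050). It assumes jobs running side by side do not slow each other down at all, which is an idealisation of "not much penalty", and the function and variable names are made up for illustration.

```python
# Timings quoted in the thread, in hours per recon-all run.
CPU_HOURS = 8.0                 # one run on one 3.2 GHz Nehalem core
GPU_HOURS = 4.0 + 20.0 / 60.0   # one run using the Tesla C2050

def jobs_per_hour(cpu_jobs, gpu_jobs):
    """Aggregate throughput, assuming concurrent jobs do not interfere (idealised)."""
    return cpu_jobs / CPU_HOURS + gpu_jobs / GPU_HOURS

four_cpu = jobs_per_hour(4, 0)  # 4-core machine, CPU-only jobs
gpu_only = jobs_per_hour(0, 1)  # a single GPU-assisted job
mixed = jobs_per_hour(3, 1)     # 3 CPU jobs plus 1 GPU-assisted job

print(f"4 CPU jobs:      {four_cpu:.3f} runs/hour")
print(f"1 GPU job:       {gpu_only:.3f} runs/hour")
print(f"3 CPU + 1 GPU:   {mixed:.3f} runs/hour")
```

Under those assumptions the ordering comes out as one GPU job, then four CPU jobs, then the mixed setup, which matches the answer above.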
Right now, the CPU still does most of the work in the recon-all stream - it's something of a game of Amdahl's Law Whac-A-Mole. You could always try starting 4 GPU jobs at once... I've not done the testing, but a C2050 would probably have enough RAM, and in any given recon-all run, the GPU spends a lot of time idle. Hence, it would end up being shared between the four jobs.
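To put a number on the Amdahl's Law remark, the sketch below applies the standard formula, speedup = 1 / ((1 - p) + p / s), to the two timings quoted earlier in the thread. The fraction p of the original runtime that was GPU-accelerated and the speedup s of that part are not given in the thread, so the code only derives the lower bound on p that the timings imply (the limiting case where the accelerated part takes no time at all). Function names are illustrative.

```python
def overall_speedup(p, s):
    """Amdahl's law: p is the accelerated fraction of runtime, s its speedup."""
    return 1.0 / ((1.0 - p) + p / s)

def min_accelerated_fraction(t_before, t_after):
    """Smallest p consistent with the timings (limit of s going to infinity)."""
    return 1.0 - t_after / t_before

t_cpu = 8.0                  # hours, CPU-only run (from the thread)
t_gpu = 4.0 + 20.0 / 60.0    # hours, run using the C2050 (from the thread)

measured = t_cpu / t_gpu
p_min = min_accelerated_fraction(t_cpu, t_gpu)

print(f"measured overall speedup: {measured:.2f}x")
print(f"accelerated fraction of original runtime: at least {p_min:.0%}")
```

The remaining serial part caps the overall gain however fast the GPU portions become, which is why each further speedup means hunting down the next hot spot.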
Thanks, that's interesting to know.