OpenMP parallelization


Akio Yamamoto

25 Jun 2012

10:39 a.m.

FreeSurfer experts,

I need to make the processing of each brain 10-20 times faster somehow; one way should be a parallelization approach.

Currently I'm trying to add OpenMP parallelization to the time-consuming parts of the source code, especially mri_ca_register and mri_em_register.

It is not completed yet, but at this point I do not see a speed-up in proportion to the number of CPU cores; it's just a 2.5x speed-up using 8 or 16 cores.

I'm afraid there might be fundamental limitations in the algorithm and/or the implementation of the code. Should I proceed with this work?

Any advice, help or comment would be appreciated.

Akio


Bruce Fischl

25 Jun 2012

10:40 a.m.

Hi Akio

we have made some progress on this, but it is different for different algorithms. If you get the current dev codebase you'll find some examples of OpenMP pragmas.

cheers
Bruce

Freesurfer mailing list Freesurfer@nmr.mgh.harvard.edu https://mail.nmr.mgh.harvard.edu/mailman/listinfo/freesurfer

Nick Schmansky

10:56 a.m.

Akio,

in particular, grep for HAVE_OPENMP in the C files in dev/utils, e.g. gcamorph.c, and also in dev/mri_ca_register/mri_ca_register.c. we haven't done anything with em reg, but welcome any improvements you can make, following how ca_reg was done.

this is the pattern of speed improvements we've seen with openmp:

https://surfer.nmr.mgh.harvard.edu/fswiki/CaRegTimings

note that the nehalem/sandybridge/(newest) architecture is essential for this improvement, as it accesses scattered memory structures much more efficiently.

nick
