Greetings,
I've been running more extensive tests on mri_em_register_cuda, trying
to determine why it sometimes gives substantially different results from
mri_em_register. The problem appears to stem from the construction of
the transform matrix, which is done slightly differently on the CPU and
on the GPU. This leads to inevitable small differences in the matrices,
and these can be large enough for the search to be 'captured' by a
different transform.
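To illustrate how two mathematically equivalent computations can disagree, here is a minimal sketch in C. It assumes the cause is floating-point operation order in single precision, which is one common source of CPU/GPU differences; I have not confirmed that this is the exact mechanism in mri_em_register_cuda. The function names sum_left and sum_right are made up for this example and do not appear in FreeSurfer.

```c
#include <stdio.h>

/* Hypothetical helpers, for illustration only (not FreeSurfer code).
 * The volatile temporaries force each intermediate result to be
 * rounded to single precision before it is used again. */

/* Adds three floats as (a + b) + c. */
static float sum_left(float a, float b, float c)
{
    volatile float t = a + b;
    return t + c;
}

/* Adds the same three floats as a + (b + c). */
static float sum_right(float a, float b, float c)
{
    volatile float t = b + c;
    return a + t;
}

/* Prints both results so that the two orderings can be compared. */
static void show_order_dependence(void)
{
    float a = 1e8f, b = -1e8f, c = 1.0f;
    printf("(a + b) + c = %g\n", sum_left(a, b, c));
    printf("a + (b + c) = %g\n", sum_right(a, b, c));
}
```

With IEEE-754 single precision, 1.0f is lost when it is added to -1e8f, so the two orderings return different values for the same inputs. A difference of this kind in one matrix entry would be enough to change the energy slightly at each step of a search.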
If you are encountering problems with mri_em_register_cuda, I would
suggest making the following changes to mri_em_register.c:
At the start of the file (around line 48), there are two lines which
read
#define FAST_TRANSLATION 1
#define FAST_TRANSFORM 1
Change these to
#define FAST_TRANSLATION 0
#define FAST_TRANSFORM 0
and recompile.
This will cause the program to compute all of the transforms on the CPU,
and to evaluate only the energy function on the GPU. Runtime will
increase from around 30 seconds to around 4 minutes.
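For anyone unfamiliar with this kind of switch, the sketch below shows the general pattern of a compile-time flag selecting a code path. It is only an assumption about the pattern, not a copy of what mri_em_register.c does; transform_path is a hypothetical helper, and only the two flag names come from the file.

```c
/* Flag names as in the message, set to 0 as suggested there. */
#define FAST_TRANSLATION 0
#define FAST_TRANSFORM 0

/* Hypothetical helper (not FreeSurfer code): reports which branch
 * the preprocessor kept when this file was compiled. */
static const char *transform_path(void)
{
#if FAST_TRANSFORM
    return "gpu";  /* compiled only when the flag is nonzero */
#else
    return "cpu";  /* compiled when the flag is 0 */
#endif
}
```

Because the choice is made by the preprocessor, changing the value has no effect until the file is recompiled.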
Regards,
Richard