Hi,

We have encountered a serious problem running recon-all.
FREESURFER_HOME: /opt/freesurfer-530
Build stamp: freesurfer-Linux-centos6_x86_64-stable-pub-v5.3.0
RedHat release: CentOS release 6.5 (Final)
Kernel info: Linux 2.6.32-431.el6.x86_64 x86_64
The entire command line executed:
/opt/freesurfer-530/bin/recon-all -use-gpu -all -s surface_1

The problem occurred when it reached this step:
mri_ca_register_cuda -nobigventricles -T transforms/talairach.lta -align-after -mask brainmask.mgz norm.mgz /opt/freesurfer-530/average/RB_all_2008-03-26.gca transforms/talairach.m3z
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2012 NVIDIA Corporation
Built on Fri_Sep_21_17:28:58_PDT_2012
Cuda compilation tools, release 5.0, V0.2.1221
Driver : 6.0 Runtime : 5.50
Acquiring CUDA device
Using default device
CUDA device: Tesla K20m
not handling expanded ventricles...
using previously computed transform transforms/talairach.lta
renormalizing sequences with structure alignment, equivalent to:
 -renormalize
 -regularize_mean 0.500
 -regularize 0.500
using MR volume brainmask.mgz to mask input volume...
reading 1 input volumes...
logging results to talairach.log

======= NUMBER OF OPENMP THREADS = 1 =======
reading input volume 'norm.mgz'...
reading GCA '/opt/freesurfer-530/average/RB_all_2008-03-26.gca'...
label assignment complete, 0 changed (0.00%)
det(m_affine) = 1.22 (predicted orig area = 6.5)
freeing gibbs priors...done.
average std[0] = 5.0
AllocateHost: Warning - not thread safe!
RecvAll: Did not reset gca in dst
GCAMcopyNodePositions: On GPU
RecvAll: Did not reset gca in dst
GCAMcopyNodePositions: On GPU
RecvAll: Did not reset gca in dst

The same messages kept repeating, and then it printed:

=============================================
GPU MRI Label timers
--------------------
MarkLabelBorderVoxels
Total : -nan ms (avg) 0 ms (tot)
VoxInLabelWithPartialVolume
Compute : -nan ms (avg) 0 ms (tot)
Total : -nan ms (avg) 0 ms (tot)
=============================================
CUDA Error in file 'mriconvolve_cuda.cu' on line 945 : unload of CUDA runtime failed.
Abort (core dumped)
ERROR: mri_ca_register with non-zero status 134
but continuing despite the error
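
For reference, the status 134 reported above is consistent with the common shell convention of 128 plus a signal number, which would give signal 6 (SIGABRT) and agree with the "Abort (core dumped)" line. Below is a minimal Python sketch of that arithmetic. The helper name signal_from_status is hypothetical, and it is an assumption that recon-all passes the shell's exit status through unchanged.

```python
import signal


def signal_from_status(status):
    """Return the signal number encoded in a shell-style exit status, or None.

    Assumption: the status follows the usual shell convention of reporting
    128 + N when a child process is terminated by signal N.
    """
    if status > 128:
        return status - 128
    return None


# Decode the status from the log above and look up the signal's name.
sig = signal_from_status(134)
print(sig, signal.Signals(sig).name)
```

If this reading is right, the non-zero status is only the abort raised after the CUDA error, not a separate failure.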
We have tried the same data with two different GPUs (Tesla K20m and Quadro K4000). They both had the same problem.
Any suggestions?
Thanks,
--
Kevin Tran, Contractor
UNIX System Administrator
The National Institutes of Health
NIMH, Laboratory of Brain and Cognition
10 Center Drive, Room 4C215
Bethesda MD 20892-1366