Re: [Freesurfer] CUDA Error - all CUDA-capable devices are busy or unavailable

19 Aug 2010

Actually, I believe it's related to the CUDA architecture itself.
You cannot run multiple CUDA processes at the same time on the same GPU; in a
multi-core environment this may happen if you start multiple recon-all runs.
---------------------------------------------------------------------
Pedro Paulo de Magalhães Oliveira Junior
Director of Operations
Netfilter & SpeedComm Telecom
-- www.netfilter.com.br
-- For mobile: http://www.netfilter.com.br/mobile
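If concurrent recon-all runs do turn out to be the trigger, as suggested in the reply above, one possible workaround is to serialize GPU jobs behind a lock file. The following is only a minimal sketch in Python, assuming a Linux host (it relies on `fcntl`); the lock path and the `run_exclusive` helper are hypothetical names, not part of FreeSurfer or CUDA.

```python
import fcntl
import os
import subprocess
import tempfile

# Hypothetical lock location; any path shared by all jobs on the host works.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "freesurfer_gpu.lock")


def run_exclusive(cmd, lock_path=LOCK_PATH):
    """Run cmd (a list of arguments) while holding an exclusive file lock.

    A second caller using the same lock_path blocks until the first
    command has finished. Returns the command's exit code.
    """
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # waits until the lock is free
        try:
            return subprocess.call(cmd)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Each job would call `run_exclusive` with the command list it would normally launch, so only one GPU command runs at a time on that machine. This does not address a driver-side problem; it only rules out simultaneous launches as a cause.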
On Thu, Aug 19, 2010 at 13:57, Nick Schmansky nicks@nmr.mgh.harvard.edu wrote:
...
hello cuda beta users!  this problem, 'all CUDA-capable
devices are busy or unavailable', seems to fall into the category of
'post-release curse', because i am seeing this problem locally as well,
but haven't seen it in the months we've been using the gpu code.  we have
found that rebooting the machine seems to work, but that's not a real
solution.  i suspect our detection scheme is tripping a flag in the
driver that's not getting untripped or cleared the next time around.
when we find a better solution, we'll post new _cuda libs on our site,
which i'm expecting will be a regular occurrence over the next few
months.  glad to see so many willing gpu users though!
n.
On Thu, 2010-08-19 at 17:08 +0200, Daniel Guellmar wrote:
...
Hi folks,
I'm trying to use the new CUDA binaries that come with FreeSurfer
version 5.0.0. However, when I try to execute a CUDA binary (e.g.
mri_ca_register_cuda), I get the following output:
Acquiring CUDA device
 Using default device
 CUDA Error in file 'devicemanagement.cu' on line 46 : all CUDA-capable devices are busy or unavailable.
This error occurs on two different CUDA-capable systems. Both
systems run Ubuntu 9.10, and both have the latest developer driver for
Linux (256.40) and the latest CUDA toolkit (3.1) installed. The GPU
Computing SDK code samples compile and work fine. The device query
works fine on both hosts; see the following output:
Host 1:
CUDA Device Query (Runtime API) version (CUDART static linking)
There are 2 devices supporting CUDA
Device 0: "Tesla C2050"
  CUDA Driver Version:                           3.10
  CUDA Runtime Version:                          3.10
  CUDA Capability Major revision number:         2
  CUDA Capability Minor revision number:         0
  Total amount of global memory:                 2817720320 bytes
  Number of multiprocessors:                     14
  Number of cores:                               448
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Clock rate:                                    1.15 GHz
  Concurrent copy and execution:                 Yes
  Run time limit on kernels:                     Yes
  Integrated:                                    No
  Support host page-locked memory mapping:       Yes
  Compute mode:                                  Default (multiple host threads can use this device simultaneously)
  Concurrent kernel execution:                   Yes
  Device has ECC support enabled:                Yes
Device 1: "Tesla C2050"
  CUDA Driver Version:                           3.10
  CUDA Runtime Version:                          3.10
  CUDA Capability Major revision number:         2
  CUDA Capability Minor revision number:         0
  Total amount of global memory:                 2817982464 bytes
  Number of multiprocessors:                     14
  Number of cores:                               448
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Clock rate:                                    1.15 GHz
  Concurrent copy and execution:                 Yes
  Run time limit on kernels:                     Yes
  Integrated:                                    No
  Support host page-locked memory mapping:       Yes
  Compute mode:                                  Default (multiple host threads can use this device simultaneously)
  Concurrent kernel execution:                   Yes
  Device has ECC support enabled:                Yes
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.10, CUDA Runtime Version = 3.10, NumDevs = 2, Device = Tesla C2050, Device = Tesla C2050
Host 2:
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA
Device 0: "Tesla C1060"
  CUDA Driver Version:                           3.10
  CUDA Runtime Version:                          3.10
  CUDA Capability Major revision number:         1
  CUDA Capability Minor revision number:         3
  Total amount of global memory:                 4294770688 bytes
  Number of multiprocessors:                     30
  Number of cores:                               240
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 16384
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    1.30 GHz
  Concurrent copy and execution:                 Yes
  Run time limit on kernels:                     No
  Integrated:                                    No
  Support host page-locked memory mapping:       Yes
  Compute mode:                                  Default (multiple host threads can use this device simultaneously)
  Concurrent kernel execution:                   No
  Device has ECC support enabled:                No
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.10, CUDA Runtime Version = 3.10, NumDevs = 1, Device = Tesla C1060
Any comments on that?
Regards and thanks in advance,
Daniel
--
Dr.-Ing. Daniel Güllmar
Medical Physics Group / IDIR I
Jena University Hospital
MRT-Gebäude am Steiger
Philosophenweg 3
07743 Jena
Tel: +49-3641-9-35373
Fax: +49-3641-9-35081
www: http://ww.mrt.uni-jena.de
_______________________________________________
Freesurfer mailing list
Freesurfer@nmr.mgh.harvard.edu
https://mail.nmr.mgh.harvard.edu/mailman/listinfo/freesurfer
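The deviceQuery listings in the original report include a "Compute mode" field for each device. An exclusive or prohibited compute mode is one commonly cited cause of the "busy or unavailable" error, so it is worth checking; in this report all devices show Default, which suggests the cause lies elsewhere. As a rough sketch, the field can be pulled out of saved deviceQuery output like this (the `compute_modes` helper is a hypothetical name, and the parsing assumes the output layout shown in this thread):

```python
import re


def compute_modes(device_query_text):
    """Map device index -> (device name, first word of its compute mode)."""
    modes = {}
    current = None  # (index, name) of the device block being read
    for line in device_query_text.splitlines():
        dev = re.match(r'\s*Device (\d+): "(.*)"', line)
        if dev:
            current = (int(dev.group(1)), dev.group(2))
            continue
        mode = re.match(r'\s*Compute mode:\s*(\S+)', line)
        if mode and current is not None:
            modes[current[0]] = (current[1], mode.group(1))
    return modes


# Excerpt copied from the Host 2 listing in this thread.
sample = '''Device 0: "Tesla C1060"
  Compute mode:                                  Default (multiple host
threads can use this device simultaneously)
'''
print(compute_modes(sample))  # {0: ('Tesla C1060', 'Default')}
```

Any value other than `Default` in the result would be a reason to look at the device's compute-mode setting before suspecting the FreeSurfer binaries.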

