[Mne_analysis] MNE-C script "mne_organize_dicom" and character encoding

Thu Aug 31 00:31:07 EDT 2017

Hi,

The "mne_organize_dicom" script has a problem with non-ASCII characters
and character encoding. DICOM file contents (including the MRI series
name, which in some cases contains non-ASCII characters such as '°') are
usually encoded in Latin-1 in Western countries. On the other hand,
most modern Unix-like operating systems use UTF-8 nowadays, and this is
especially true of system functions that create folders: they expect
the folder name passed as an argument to be encoded in UTF-8. The
"mne_organize_dicom" script does not re-encode the series name to
UTF-8; in fact, it doesn't address encoding at all. As a result, folder
creation fails because the folder name is illegal.

As a quick fix, I suggest changing line 64 of "mne_organize_dicom" from:

mne_dicom_essentials  --in $filename >>$$.one

to:

mne_dicom_essentials  --in $filename | iconv -f LATIN1 -t UTF-8 >>$$.one
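
To illustrate what the added iconv step does, here is a minimal sketch
that is independent of MNE. It assumes a POSIX shell with iconv, od and
tr available; the sample string "30°" is made up for the example and
does not come from a real DICOM file.

```shell
# Byte 0xB0 (octal 260) is the degree sign in Latin-1. On its own it is
# not a valid UTF-8 sequence.
# After conversion it becomes the two-byte UTF-8 sequence 0xC2 0xB0.
utf8_hex=$(printf '30\260' | iconv -f LATIN1 -t UTF-8 | od -An -tx1 | tr -d ' \n')
echo "$utf8_hex"   # prints 3330c2b0
```

The ASCII characters '3' and '0' pass through unchanged; only the
non-ASCII byte is rewritten.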

In the long term, it would be better to fix "mne_dicom_essentials"
directly and:

1. have "mne_dicom_essentials" detect the encoding used in DICOM files
(tag Specific Character Set (0008,0005)) or at least assume Latin-1
which should work in most cases,

2. have "mne_dicom_essentials" detect the encoding expected by system
functions or at least assume UTF-8 which should work in most cases,

3. convert between encodings.
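
The three steps above could look roughly like the following POSIX shell
sketch. This is only an illustration, not how "mne_dicom_essentials"
works: the function names are hypothetical, the value of Specific
Character Set (0008,0005) is assumed to have been read from the file by
some other means, and only a few of the DICOM defined terms are mapped.

```shell
# Step 1: map a DICOM Specific Character Set value to an iconv name.
# Anything unknown or empty falls back to Latin-1 (assumption, see item 1).
dicom_charset_to_iconv() {
    case "$1" in
        "ISO_IR 192") echo "UTF-8" ;;
        "ISO_IR 100") echo "ISO-8859-1" ;;
        "ISO_IR 101") echo "ISO-8859-2" ;;
        *)            echo "ISO-8859-1" ;;
    esac
}

# Step 2: ask the locale which encoding the system expects.
# If nothing is reported, or only plain ASCII (as in the C locale),
# assume UTF-8 (assumption, see item 2).
system_charset() {
    cs=$(locale charmap 2>/dev/null)
    case "$cs" in
        ""|ANSI_X3.4-1968|US-ASCII) echo "UTF-8" ;;
        *)                          echo "$cs" ;;
    esac
}

# Step 3: convert stdin from the DICOM encoding to the system encoding.
# $1 is the Specific Character Set value found in the DICOM file.
reencode() {
    iconv -f "$(dicom_charset_to_iconv "$1")" -t "$(system_charset)"
}
```

With these helpers, the quick fix would pipe the output of
"mne_dicom_essentials" through something like: reencode "ISO_IR 100"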

Best,
Dimitri Papadopoulos