hi andrew,
i haven't done many systematic comparisons, but there are a few practical considerations to take into account. in recon-all, i believe only a few steps are affected by the openmp option, and that leads to resource underutilization: the extra processors sit idle whenever those steps are not running.
in my ad hoc analysis i have found i can process a single subject in about 5 hours using openmp 8, but that holds up 8 processors for that subject. the same subject can be processed in about 12 hours on 1 processor. say i have 16 processors: i can process 16 subjects in about 12 hours using 1 processor per recon. using 8 per recon, the same 16 subjects would take about 40 hours, since only 2 subjects finish every 5 hours.
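
here is a rough sketch of that arithmetic in python, in case you want to plug in your own numbers. the function name total_hours and its parameters are made up for illustration, and the 5 and 12 hour figures are just my ad hoc timings, not benchmarks. it assumes every recon takes the same time and that subjects run in fixed batches.

```python
import math

def total_hours(n_subjects, n_procs, procs_per_recon, hours_per_recon):
    """Estimate wall-clock hours to process all subjects in fixed batches."""
    # how many recons fit on the machine at once
    concurrent = n_procs // procs_per_recon
    # number of back-to-back batches needed to cover every subject
    batches = math.ceil(n_subjects / concurrent)
    return batches * hours_per_recon

# 16 subjects on 16 processors, using the rough timings from above
one_proc_each = total_hours(16, 16, 1, 12)    # 16 recons at once, 1 batch
eight_procs_each = total_hours(16, 16, 8, 5)  # 2 recons at once, 8 batches

print(one_proc_each, eight_procs_each)
```

so with these timings openmp only wins if you care about how fast one particular subject finishes, not about total throughput.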