BATCH: brun_fastsurfer.sh

Usage

$ ./brun_fastsurfer.sh --help
Script to run FastSurfer on multiple subjects in parallel/series.

Usage:
brun_fastsurfer.sh --subject_list <file> [other options]
OR
brun_fastsurfer.sh --subjects <subject_id>=<file> [<subject_id>=<file> [...]] [other options]
OR
brun_fastsurfer.sh [other options]

Other options:
brun_fastsurfer.sh [...] [--batch "<i>/<n>"] [--parallel_subjects [surf=][<N>]]
    [--run_fastsurfer <script to run fastsurfer>] [--statusfile <filename>] [--debug] [--help]
    [<additional run_fastsurfer.sh options>]

Author:   David Kügler, david.kuegler@dzne.de
Date:     Nov 6, 2023
Version:  1.0
License:  Apache License, Version 2.0

Documentation of Options:
Generally, brun_fastsurfer works similar to run_fastsurfer, but loops over multiple subjects from
i. a list passed through stdin of the format (one subject per line)
---
<subject_id>=<path to t1 image>[ <subject-specific parameters>[ ...]]
...
---
ii. a subject_list file using the same format (use Ctrl-D to end the input), or
iii. a list of subjects directly passed (this does not support subject-specific parameters)

--batch "<i>/<n>": run the i-th of n batches (starting at 1) of the full list of subjects
  (default: 1/1, == run all). "slurm_task_id" is a valid option for "<i>".
  Note, brun_fastsurfer.sh will also automatically detect being run in a SLURM JOBARRAY and split
  according to $SLURM_ARRAY_TASK_ID and $SLURM_ARRAY_TASK_COUNT (unless values are specifically
  assigned with the --batch argument).
--parallel_subjects [surf=|seg=][<n>]: parallel execution multiple subjects.
  * --parallel_subjects [<n>]: a total of n parallel processes (or num_subjects, if <n> is not
    provided). The default is serial execution, or '--parallel_subjects 1'.
  * --parallel_subjects seg=[<m>]: activate parallel execution of segmentation.
  * --parallel_subjects surf=[<n>]: activate parallel execution of surface reconstruction.
    Note, to avoid out-of-memory errors when parallelizing the segmentation using
    --parallel_subjects on gpus, calculate the "safe" number of concurrent segmentations and
    specify "<m>" with the flag '--parallel_subjects seg=<m>' in addition to
    '--parallel_subjects surf=<n>' (together max. m+n processes will run, n/m may be "max").
  Logfiles unchanged, console output for individual subjects is interleaved with subject_id prepended.
--run_fastsurfer <path/command>: This option enables the startup of fastsurfer in a more controlled
  manner, for example to delegate the fastsurfer run to container:
  --run_fastsurfer "singularity exec --nv --no-home -B <dir>:/data /fastsurfer/run_fastsurfer.sh"
  Note, paths to files and --sd have to be defined in the container file system in this case.
--statusfile <filename>: a file to document which subject ran successfully. Also used to skip
  surface recon, if the previous segmentation failed.
--threads <n>|seg=<n>|surf=<n>: specify number of threads for each parallel "processes", e.g.
  --parallel_subjects seg=2 --threads seg=2 will start 2 parallel processes with 2 threads each.
--debug: Additional debug output.
--help: print this help.

With the exception of --t1 and --sid, all run_fastsurfer.sh options are supported, see
'run_fastsurfer.sh --help'.

This tool requires functions in stools.sh (expected in same folder as this script).

Subject Lists

The input files and options may be specified in three ways:

  1. By writing them into the console (or by piping them in) (default) (one case per line),

  2. by passing a subject list file --subject_list <filename> (one case per line), or

  3. by passing them on the command line --subjects "<subject_id>=<t1 file>" [more cases] (no additional options supported).

These files/input options will usually be in the format <subject_id>=<t1 file> [additional options], where additional options are optional and enable passing options different to the “general options” given on the command line to brun_fastsurfer.sh. One example for such a case-specific option is an optional T2w image (e.g. for the HypVINN). An example subject list file might look like this:

001=/data/study/raw/T1w-001.nii.gz --t2 /data/study/raw/T2w-001.nii.gz
002=/data/study/raw/T1w-002.nii.gz --t2 /data/study/raw/T2w-002.nii.gz
002=/data/study/raw/T1w-003-alt.nii.gz --t2 /data/study/raw/T2w-003.nii.gz
... 

Parallelization with brun_fastsurfer.sh

brun_fastsurfer.sh has powerful builtin parallel processing capabilities. These are hidden underneath the --parallel_subjects [seg=|surf=][<n>] and the --device <device> as well as --viewagg_device <device> flags. One of the core properties of FastSurfer is the split into the segmentation (which uses Deep Learning and therefore benefits from GPUs) and the surface pipeline (which does not benefit from GPUs). For ideal batch processing, we want different resource scheduling.

--parallel_subjects allows three parallel batch processing modes: serial, single parallel pipeline and dual parallel pipeline.

Serial processing (default)

Each case/image is processed after the other fully, i.e. surface reconstruction of case 1 is fully finished before segmentation of case 2 is started. This setting is the default and represents the manual flags --parallel_subjects 1.

Single parallel pipeline

This mode is ideal for CPU-based processing for segmentation. It will process segmentations and surfaces in series in the same process, but multiple cases are processed at the same time.

$FASTSURFER_HOME/brun_fastsurfer.sh --parallel_subjects 4 --threads 2 --parallel

will start 4 segmentations (and surface reconstructions) at the same time, and will start a fifth, when the surface processing of one of the four first cases is finished (--parallel_subjects 4). It will try to use 2 threads per case (--threads 2) and perform reconstruction of left and right hemispheres in parallel (--parallel). --parallel_subjects max or --parallel_subjects will remove the limit and start all cases at the same time (each with the target number of threads given in --threads).

Dual parallel pipeline

This is ideal for GPU-based processing for segmentation. It will process segmentations and surfaces in separate pipelines, which is useful for optimized GPU loading. Multiple cases may be processed at the same time.

$FASTSURFER_HOME/brun_fastsurfer.sh --device cuda:0-1 --parallel_subjects seg=2 --parallel_subjects surf=max \
  --threads seg=8 --threads surf=2 --parallel

will start 2 parallel segmentations (--parallel_subjects seg=2) using GPU 0 for case 1 and GPU for case 2 (--device cuda:0-1 – same as --device cuda:0,1). After one of these segmentations 1 is finished, segmentation of case 3 will start on that same device as well as the surface reconstruction (without putting a limit on parallel surface reconstructions, --parallel_subjects surf=max). Each segmentation process will aim to use 8 threads/cores (--threads seg=8) and each surface reconstruction process will aim to use 2 threads (--threads surf=2) with both hemispheres processed in parallel (--parallel).

Note, if your GPU has sufficient video memory, two parallel segmentations can run on the same GPU, but the script cannot schedule more than one process per GPU for multiple GPUs, i.e. --device cuda:0,1 --parallel_subjects seg=4 is not supported.

Questions

Can I disable the progress bars in the output?

You can disable the progress bars by setting the TQDM_DISABLE environment variable to 1, if you have tqdm>=4.66.

For docker, this can be done with the flag -e, e.g. docker run -e TQDM_DISABLE=1 ..., for singularity with the flag --env, e.g. singularity exec --env TQDM_DISABLE=1 ... and for native installations by prepending, e.g. TQDM_DISABLE=1 ./run_fastsurfer.sh ....