Open reviews

May 19, 2018 — Open review of preprint by Esteban and colleagues

Esteban, O., Markiewicz, C. J., Blair, R. W., Moodie, C. A., Isik, A. I., Erramuzpe, A., Kent, J. D., Goncalves, M., DuPre, E., Snyder, M., Oya, H., Ghosh, S. S., Wright, J., Durnez, J., Poldrack, R. A., Gorgolewski, K. J. (2018). FMRIPrep: a robust preprocessing pipeline for functional MRI. bioRxiv, 306951. doi: 10.1101/306951


Esteban and colleagues introduce fMRIPrep, a robust, state-of-the-art preprocessing pipeline for BIDS-formatted fMRI data. fMRIPrep uses Nipype to construct a workflow adapted to the structure of your dataset (as specified in the BIDS metadata), drawing largely on existing analysis tools at each step (e.g., FSL, ANTs, FreeSurfer, AFNI). I think fMRIPrep is a solution to an underappreciated problem: idiosyncratic lab-specific (and often study-specific) in-house pipelines, which tend toward overfitting, require considerable manual intervention, and impede reproducibility. The manuscript is well-written and clearly describes both the software and the motivations for using it. I have a few comments—largely non-critical—related more to features of the software and documentation than to the manuscript itself, as well as some language suggestions.

Major comments:

In Figure 5, the histogram on the right was a bit confusing to me at first. If I understand correctly, the histogram relates to the vertical color bar, so the heavy tail for feat indicates more yellow (high-variability) voxels in the brain mask(?). I assume that voxels are what's being counted and binned in the histogram? Also, why does the main peak of the distribution for fMRIPrep sit at somewhat higher variability than the main peak for feat? An additional explanatory sentence in the caption could help clear this up.

I imagine the majority of researchers will be running fMRIPrep on a server at their institution (not locally), and therefore using Singularity. I'm not sure about the authors' preferred method for creating an fMRIPrep Singularity image, but the "installation" documentation for this seems a bit arcane (or outdated), as something simple like "singularity build fmriprep.sqsh docker://poldracklab/fmriprep" does the trick.
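For instance, the full build-and-run cycle might look something like the following (a rough sketch only: the data paths and participant label are placeholders, bind mounts may be needed depending on the cluster, and the exact flags should be checked against fmriprep --help):

    # Build a Singularity image directly from the Docker Hub image
    singularity build fmriprep.sqsh docker://poldracklab/fmriprep

    # Run a single participant (BIDS-Apps convention: bids_dir output_dir analysis_level)
    singularity run --cleanenv fmriprep.sqsh \
        /path/to/bids /path/to/output participant \
        --participant-label 01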

Also related to using fMRIPrep on a remote server: I think for some people it's not immediately obvious how to efficiently view files living on a server in a local browser. For example, one way is to use the macOS Finder's Go > "Connect to Server…" functionality, but there are several ways—might be worthwhile to provide a pointer to help new users past this potential hurdle.
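One OS-agnostic route (a sketch; the hostname, port, and paths are placeholders, and this assumes Python 3 on the server) is to serve the output directory over an SSH tunnel and browse the HTML reports locally:

    # On the server: serve the fMRIPrep output directory over HTTP
    cd /path/to/fmriprep/output && python -m http.server 8080

    # On your local machine: forward the port, then open http://localhost:8080 in a browser
    ssh -N -L 8080:localhost:8080 user@server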

What are reasons for NOT using fMRIPrep? (Aside from the issues you mention at line 328.) I genuinely tried to think hard about this, but all I could come up with is: 1. if you really want "infinite" flexibility (which is obviously a double-edged sword); 2. if you want students to suffer through implementing each step for didactic purposes, or to learn shell scripting or Python along the way; 3. if you're trying to reproduce some in-house lab pipeline. I'm not sure any of these are particularly good reasons, but it's a question I've been asked, so it seems useful to consider.

It'd be nice if fMRIPrep could estimate the prospective memory footprint (and/or runtime) when it initializes the workflow. I'm not sure whether this is feasible (or whether it's already happening and I just didn't realize), but it seems possible given how stereotyped the pipeline is and how much is already specified in the BIDS metadata. This would be useful when submitting jobs that require you to request these resources ahead of time (e.g., via Slurm).
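To make that concrete, here is the kind of guesswork such an estimate would replace (a hypothetical Slurm sketch; the resource requests are arbitrary guesses and the fMRIPrep flags should be checked against fmriprep --help):

    #!/bin/bash
    # Resource requests below are guesses; an upfront estimate from fMRIPrep would slot in here
    #SBATCH --time=08:00:00
    #SBATCH --mem=16G
    #SBATCH --cpus-per-task=8

    singularity run --cleanenv fmriprep.sqsh \
        /path/to/bids /path/to/output participant \
        --participant-label 01 --nthreads 8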

Any plans to incorporate options for study-specific anatomical template construction or normalization to an EPI template (cf. Calhoun et al., 2017)? I've never tried these personally, but they may be of interest to some users.

This is more of a BIDS comment, but it seems like the automatically generated methods section (plus references) could also essentially write the "Image acquisition" section from the BIDS metadata.
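Most of the ingredients are already sitting in the JSON sidecars; for example (a toy sketch using jq; the filename is just illustrative):

    # Pull a few acquisition parameters from a BIDS JSON sidecar
    jq -r '"TR = \(.RepetitionTime) s, TE = \(.EchoTime) s, flip angle = \(.FlipAngle) deg"' \
        sub-01/func/sub-01_task-rest_bold.json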

Maybe I missed this, but is the code for benchmarking fMRIPrep against FSL’s feat (plus flame) available online somewhere? (i.e., relating to Figures 5 and 6)

Minor comments:

Line 74: “fMRI for analysis” > “fMRI data for analysis” reads better to me

Line 79: “composed by” > “composed of”

Table 1: “State-of-art” > “State-of-the-art”

Line 123: I think you want the word “template” here instead of “atlas”

Line 143: “Such cortical mask” > “Such a cortical mask” or “This cortical mask”

Line 180: Not sure it's fair to say afni_proc.py only runs on volumetric data. It can also operate on surfaces (see example 8 in the afni_proc.py documentation), but you have to provide existing FreeSurfer surfaces—it doesn't invoke FreeSurfer to construct them for you.

Line 194: “implements “fieldmap-less” SDC to the BOLD images” sounds strange to me. Maybe “applies … to the …” instead

Line 209: “proposed as a mean” > “proposed as a means”

Line 285: “without smoothing step” > “without the smoothing step”

Figure 3: “robuster” > “more robust” in bottom example text

Figure 6 caption: “fMRIPrep allows the researcher for a finer control” > “fMRIPrep affords the researcher finer control”

Thanks to Matteo Visconti di Oleggio Castello and Yarik Halchenko for help using fMRIPrep, as well as Mark Pinsk and members of the Hasson and Norman Labs at Princeton for helpful discussion. Full disclosure: I’m historically an AFNI user and less familiar with the FSL and ANTs functionality.

References:

Calhoun, V. D., Wager, T. D., Krishnan, A., Rosch, K. S., Seymour, K. E., Nebel, M. B., … & Kiehl, K. (2017). The impact of T1 versus EPI spatial normalization templates for fMRI data analyses. Human Brain Mapping, 38(11), 5331-5342.