Open data


Narratives: fMRI data for evaluating models of naturalistic language comprehension

Ethel Franklin Betts (1908), from The Orphant Annie Book, by James Whitcomb Riley (wikimedia)

Available on OpenNeuro (accession ds002345) and via DataLad.

The “Narratives” collection aggregates fMRI datasets acquired over seven years (2011–2018) while participants listened to spoken stories. Across the collection, participants listened to 27 diverse stories ranging from ~3 to ~56 minutes, for a total of ~4.6 hours of unique audio stimuli. The collection currently includes 345 unique subjects participating in a total of 891 functional scans with accompanying anatomical data. In all, it amounts to over 350,000 functional volumes of story-listening fMRI data, totaling 6.4 days of scan time, together with the accompanying stimuli.

Data are organized in a machine-readable format according to the BIDS standard, with exhaustive metadata derived from the original DICOMs. Anonymized subject labels are linked across sessions and are accompanied by demographic and behavioral variables: age, gender, condition, and comprehension score. Auditory stimuli are included in the dataset for non-commercial scholarly research (principally feature extraction) under fair use or fair dealing provisions.

The scripts used to collate and process these data are available at the GitHub repository. Slides for a presentation of this dataset at SfN 2019 are available on Google Slides. The public data release is accompanied by a data descriptor paper currently in preparation. If you find this dataset useful, please cite the following:

Nastase, S. A., Liu, Y.-F., Hillman, H., Zadbood, A., Hasenfratz, L., Keshavarzian, N., Chen, J., Honey, C. J., Yeshurun, Y., Regev, M., Nguyen, M., Chang, C. H. C., Baldassano, C., Lositsky, O., Simony, E., Chow, M. A., Leong, Y. C., Brooks, P. P., Micciche, E., Choe, G., Goldstein, A., Vanderwal, T., Halchenko, Y. O., Norman, K. A., & Hasson, U. (2020). Narratives: fMRI data for evaluating models of naturalistic language comprehension. bioRxiv. DOI PDF

This dataset has been re-analyzed in the following publications:

Nastase, S. A., Liu, Y.-F., Hillman, H., Norman, K. A., & Hasson, U. (2020). Leveraging shared connectivity to aggregate heterogeneous datasets into a common response space. NeuroImage, 116865. DOI

Chien, H.-Y. S., & Honey, C. J. (2020). Constructing and forgetting temporal context in the human cerebral cortex. Neuron, 106. DOI

Nastase, S. A., Gazzola, V., Hasson, U., & Keysers, C. (2019). Measuring shared responses across subjects using intersubject correlation. Social Cognitive and Affective Neuroscience, 14(6), 667–685. DOI

Lin, X., Sur, I., Nastase, S. A., Divakaran, A., Hasson, U., & Amer, M. R. (2019). Data-efficient mutual information neural estimator. arXiv, arXiv:1905.03319. DOI
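
As a rough cross-check on the collection size quoted in the description (over 350,000 functional volumes, totaling 6.4 days), the sketch below converts between volume counts and scan time. The repetition time (TR) is not stated in the description, so the value of 1.5 s used here is an assumption made purely for illustration, and the function names are hypothetical helpers rather than part of the dataset's scripts.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def volumes_to_days(n_volumes: int, tr_seconds: float) -> float:
    """Scan time in days for a given number of functional volumes."""
    return n_volumes * tr_seconds / SECONDS_PER_DAY


def days_to_volumes(days: float, tr_seconds: float) -> int:
    """Number of volumes (rounded) acquired in a given number of days."""
    return round(days * SECONDS_PER_DAY / tr_seconds)


# ASSUMPTION: a TR of 1.5 s; this value does not appear in the description.
ASSUMED_TR_SECONDS = 1.5

# Volumes implied by 6.4 days of scan time at the assumed TR.
implied_volumes = days_to_volumes(6.4, ASSUMED_TR_SECONDS)

# Scan time implied by the "over 350,000 volumes" lower bound at the assumed TR.
lower_bound_days = volumes_to_days(350_000, ASSUMED_TR_SECONDS)

print(implied_volumes)
print(round(lower_bound_days, 2))
```

Under that assumed TR, the two quoted figures are mutually consistent: 6.4 days corresponds to somewhat more than 350,000 volumes. A different TR, or a mix of TRs across the constituent datasets, would change the numbers accordingly.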