October 29, 2020 — Open review of preprint by Di and Biswal:
Di, X., & Biswal, B. B. (2020). Dissecting individual differences in responses to naturalistic stimuli in functional MRI: effects of development and gender. bioRxiv.
Di and Biswal use PCA to decompose responses to a naturalistic movie into multiple shared components, and relate these components to demographic and behavioral variables. This method can be used to “discover” multiple groups or trends in the subject sample that might be obscured in leave-one-out (LOO) intersubject-correlation (ISC) analysis. From the perspective of naturalistic neuroimaging and ISC analysis, I think this is a useful piece, and the point about limitations of LOO ISC is well-made. I think the piece could benefit from expanding on the relation to pairwise ISC, examining how the number of subjects contributes to a component, and clarifying a couple other elements of the analysis pipeline. I try to offer some suggestions where the figures could be improved and where the writing needs some work.
Although the benefit of this method over LOO ISC is convincing, I’m having a more difficult time seeing the benefit over the approach discussed by Finn et al., 2020. In that approach, they’re comparing the full pairwise ISC matrix to a pairwise behavioral (or demographic) similarity matrix (in a style similar to RSA). The behavioral similarity matrix can accommodate all sorts of multidimensional effects, and you can compare (or combine) multiple behavioral similarity matrices (e.g. using regression). The PCA approach discussed here is closely related (eigendecomposition of the covariance matrix), but provides a more data-driven way to “discover” components of the pairwise ISC matrix that may not be directly coded into the behavioral or demographic similarity matrices. On the other hand, the way each of these PCs has a corresponding time series we can visualize is unique with respect to Finn’s method. I think it would be worth expanding a bit on the relationship between these two methods either in the Introduction or Discussion.
At line 285 you mention that for a given PC “the differences in variance explained may reflect the number of subjects represented in each group.” This gets at one of my core questions. In pairwise ISC, one or a handful of oddball subjects will pop out as stripes in the pairwise ISC matrix. In your PCA method, an oddball subject will get roughly their own PC with a low eigenvalue and will probably be ignored. How many subjects should an effect span before it floats to the top as a “significant” PC? I wonder if bootstrap resampling subjects could give us an idea which PCs are stable, even if they rely on fewer subjects (but this would require matching PCs across bootstrap samples). Alternatively, it could be interesting to simulate a simple dataset and vary the number of subjects with a given developmental effect. PCA is a fairly simple transformation, so this question probably has a simple answer—but I think it would be worth spending a bit more timing giving readers a better intuition on this.
I’m not very familiar with the group ICA implementation in GIFT, so I’m not entirely sure how to interpret the subject-wise ICs you feed into the PCA method (the X matrix in equation 3). To get subject-wise ICs, the “time series was back reconstructed to each subject for each IC.” So, the IC for each subject is that subject’s data projected onto an IC that was defined at the group level—is that correct? And the variability across subjects in your X matrix for a given IC reflects both spatial (topographic) and temporal variability in how each subject’s data projects on the group-level IC? With this in mind, could some variability in the subject-specific ICs simply be due to spatial misalignment when normalizing children and adults to the same template? (Presumably these ICs capturing this kind of uninteresting variance are among the 5 manually discarded?) Probably best to explain this in a little more in depth for readers (like me) who aren’t very familiar with the fMRI ICA literature.
After reading the Introduction and seeing Figure 1, I was expecting to see some explicit comparison of your PCA method with LOO ISC in the movie-watching dataset. The average time series across subjects (or N – 1 subjects) used in the LOO ISC computation will be very similar to PC1 in ICs with only one significant PC; and a combination of the top PCs in e.g. IC17. Presumably it also correlates with age. I don’t think this analysis/visualization is strictly necessary, but it could be a nice way to reinforce the point made in Figure 1 (at the authors’ discretion).
I think Figure 1 makes a compelling case for the utility of this method. However, in some cases, the axes (and labels) are a bit confusing. For example, should we think of the y-axis in column one (hypothetical function) as the magnitude of your “a” parameter on the consistent term “c”, such that each subject (depending on their age) will have a different weight “a”? In panel C, do the two colors effectively correspond to two subgroups in your sample? You guide us a bit in the figure caption, but I think some of these things could be made more explicit. I would note that these plots (e.g. the pairwise ISC matrices in panels D–F) correspond to a single neural variable (e.g. voxel, ROI, or IC). I would also consider increasing the size of the tick labels a bit so they’re more visible.
When statistically evaluating the variance explained by PCs, you generate a permutation distribution by randomly shuffling the time series. Shuffling the time series like this will probably abolish certain properties of the original time series (e.g. temporal autocorrelation), making your permutation distribution qualitatively different from the actual time series. Permuting the time series using either circular time-shift randomization (e.g. Kauppi et al., 2014) or phase randomization (e.g. Lerner et al., 2011) will better preserve properties of the original time series and may result in a more conservative distribution.
In section 3.4 and Figure 8, you seem to use “correlation” and “variance explained” interchangeably. Does the y-axis in Figure 8 denote R-squared for each model?
I’m not sure I fully agree with the framing in the second paragraph of the Introduction (starting line 55). The first two sentences create the impression that cortical areas with high ISCs (e.g. perceptual areas) will have low individual differences, and individual differences may live in cortical areas with low ISCs. I would be a little more careful in writing this. Perceptual areas with high ISCs may also have very reliable individual differences (e.g. Feilong et al., 2018), and other areas with low ISC may not reflect individual differences; i.e. they may simply not be engaged by the stimulus.
Future work might also look more closely at these PC time series in hopes of understanding what they encode. For example, you could use the “reverse correlation” approach to qualitatively explore whether peaks in the PC time series correspond to particular events of the movie. Or, more formally, you could fit an encoding model to these PC time series. Not suggesting the authors perform these analyses, but it might be worth pointing people toward this direction (e.g. in the Discussion).
The title of the paper seems to focus on the question of age and gender differences during naturalistic movie-watching. However, (at least in my reading) the main purpose of the paper is to develop a new method for disentangling multiple shared response components that capture individual differences, where age and gender are really just convenient variables we can use for demo-ing this new method.
In the manuscript, you mention “supplementary materials” (line 309) and supplementary figures S3 (line 338) and S4 (line 379); but I don’t see any supplementary files from the journal. I found a supplementary document on bioRxiv, but some of the supplementary figures appear to now be reported in the main text (probably just a versioning issue).
Line 20: I would probably word “projecting the relations into a 1-D space” as “projecting the relations onto a single dimension”
Line 21: “whether there are single or multiple” > “whether there are one or more” or just “whether there are multiple”
Line 31: “And the PCA-based” > “This PCA-based”
Line 41: I’m not sure about this “separated by age” phrasing; maybe just “related to age”
Line 80: Am I correct in thinking the two formulations are (roughly) the same because of the proportional relationship between “c” and “id”?
Line 83: “developments” > “development”
Line 88: “the age effects highly nonlinear” > “the nonlinearity of age effects”
Line 115: “carrying” > “carrying out”
Line 121: I think with two experimentally-defined groups, the group comparison is relatively straightforward in typical ISC analysis: you can compute LOO ISCs separately for the two groups, or use the pairwise ISCs according to Chen et al., 2016.
Line 147: When you say “3 subjects with no slice gap and 26 subjects with a 10% slice gap,” should this account for all subjects? Or is this only referring to adult subjects? What gap did the other 53 subjects have?
Line 166: Does your high-pass filter (cutoff: 1/128 Hz) also include linear and quadratic trends?
Lines 169–171: This is redundant with the same sentences in the previous paragraph.
Line 205: “noises” > “noise”
Line 244: remove “than random” (also line 247)
Line 279: I don’t understand this sentence; do you mean “In the current case, 10 out of 15 networks yielded a 2nd PC [or more than one PC] that explained greater than chance-level variance”?
Figure 3: I would use a color map with 15+ different colors (or shades of the same color) so that we don’t see repeated colors for different IC networks (for example, in seaborn, I would use “hls” or “cubehelix”; not sure what the analogous colormaps are in Matlab). It would also be nice if these colors matched the colors used to plot the ICs on the brain in Figure 2; or group/order them according to the groups in Figure 2.
Line 297: “order” > “older”
Line 301: “two PC scores” > “two PCs”
Line 324: “kinds” > “kind”
Line 325: Remind readers that by “fitted well” you mean the model with the lowest AIC.
Line 327: I’m confused about what you mean by “90% of the total response”; do you mean 90% of the maximum PC loading?
Figure 7: Indicate in the figure caption that this “significant” difference for IC2 does not withstand correction for multiple tests.
Line 360: remove “much”
Line 362: reword this sentence
Figure 9: Is the bottom overlap map thresholded?
Chen, G., Shin, Y. W., Taylor, P. A., Glen, D. R., Reynolds, R. C., Israel, R. B., & Cox, R. W. (2016). Untangling the relatedness among correlations, part I: nonparametric approaches to inter-subject correlation analysis at the group level. NeuroImage, 142, 248-259.
Feilong, M., Nastase, S. A., Guntupalli, J. S., & Haxby, J. V. (2018). Reliable individual differences in fine-grained cortical functional architecture. NeuroImage, 183, 375-386.
Finn, E. S., Glerean, E., Khojandi, A. Y., Nielson, D., Molfese, P. J., Handwerker, D. A., & Bandettini, P. A. (2020). Idiosynchrony: from shared responses to individual differences during naturalistic neuroimaging. NeuroImage, 116828.
Kauppi, J. P., Pajula, J., & Tohka, J. (2014). A versatile software package for inter-subject correlation based analyses of fMRI. Frontiers in Neuroinformatics, 8, 2.
Lerner, Y., Honey, C. J., Silbert, L. J., & Hasson, U. (2011). Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. Journal of Neuroscience, 31(8), 2906-2915.