Figure 1A shows the reconstructed anatomical locations of the arrays (Montreal Neurological Institute (MNI) coordinates in Table 1) and the average normalized net responses of all visually-responsive channels to the intact versus scrambled stimuli (classic LOC stimuli and naturalistic LOC images). The significantly stronger responses to intact compared to scrambled images of objects demonstrate that all arrays were located in shape-sensitive cortex, in agreement with Decramer et al.25 However, we observed considerable diversity across the four arrays. Whereas most arrays responded more strongly to intact than to scrambled images for both localizers, for array 3 this held only for the classic localizer, and for array 1 the selectivity was weak. One possible reason for this variability is that the localizer stimuli were not optimal for each array: the stimuli presented during the localizer task may not have fully captured the preferred shapes or specific categories of each array. Had each array been presented with intact and scrambled stimuli tailored to its specific preferences, the differences in selectivity among the arrays might have been more pronounced.
Table 1
MNI coordinates of Utah arrays
| Array | X | Y | Z |
|---|---|---|---|
| 1 | 42 | -76 | -1 |
| 2 | -35 | -89 | -8 |
| 3 | -41 | -83 | 9 |
| 4 | -38 | -84 | -5 |
Single-channel responses reveal tuning complexity
We recorded from 237 visually responsive MUA sites (array 1: 51, array 2: 94, array 3: 27, array 4: 65) and 332 visually responsive LFP sites (high-gamma, 60–120 Hz; array 1: 85, array 2: 96, array 3: 86, array 4: 65). First, we determined the selectivity for shape, for category, and for any shape-category interaction (Fig. 1B) using a 2-way ANOVA on the net MUA and LFP responses (see Methods). Figure 2 shows the MUA (Fig. 2A, B, and C) and LFP (Fig. 2D, E, and F) responses for six example channels (three MUA sites and three LFP sites). The first example channel (recorded in array 2, Fig. 2A) responded strongly to several shape types (e.g. shape types 5, 6, and 8), but much less to others (e.g. shape types 7 and 9; main effect of shape, p_shape = 0.0001). The different categories within each shape type evoked similar responses in this MUA site (p_category = 0.52, p_interaction = 0.65; see Supplementary Table 1 for details on the statistics). The robust shape selectivity and lack of category selectivity were also evident in the average responses of the LFP example site (recorded in array 2) (Fig. 2D). In contrast, the example site in Fig. 2B (recorded in array 3) responded strongly to certain exemplars of the category ‘animals’ (those from shape types 5 and 6), which represents a significant shape x category interaction (p = 0.0007) with a weak main effect of category (p = 0.026) and no significant main effect of shape (p = 0.06, Supplementary Table 1). The shape x category interaction effect was even more pronounced in the high-gamma example site than in the MUA example site (eta2_MUA = 0.07, eta2_LFP = 0.19; Fig. 2E and Supplementary Table 1). Finally, the example site shown in Fig. 2C (from array 2) displayed stronger neural responses to certain members of a particular shape type (e.g. 
‘Fruits’ for shape type 6), which constituted another type of interaction between shape and category (p < 0.001), combined with a main effect of shape (p = 0.00002) but no significant effect of category (p = 0.46, Supplementary Table 1). These interactions could be due to selectivity for the specific exemplar (e.g., the fruit for shape type 6 is a bunch of grapes), to subtle differences between members of the same shape type or category in their shape and category properties, or to variations in other dimensions such as contour or texture. Overall, these results suggest that while shape selectivity is a dominant feature of the visual responses in the sites of human occipitotemporal cortex that we sampled, interactions between shape and category were also observed in a subset of neural sites.
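The per-channel statistics above amount to a balanced two-way ANOVA (factors shape type and category) on net responses, with eta2 computed as the proportion of total variance explained by each effect. The sketch below is a minimal numpy/scipy version; the 9 shape types x 6 categories x 10 repetitions design and the synthetic tuning profile are illustrative assumptions, not the paper's actual data.

```python
import numpy as np
from scipy import stats

def two_way_anova(data):
    """Balanced two-way ANOVA with interaction.

    data: array of shape (n_shape_types, n_categories, n_repetitions)
    Returns F, p, and eta2 (= SS_effect / SS_total) per effect.
    """
    a, b, r = data.shape
    grand = data.mean()
    mean_a = data.mean(axis=(1, 2))          # shape-type marginal means
    mean_b = data.mean(axis=(0, 2))          # category marginal means
    cell = data.mean(axis=2)                 # cell means

    ss_a = b * r * np.sum((mean_a - grand) ** 2)
    ss_b = a * r * np.sum((mean_b - grand) ** 2)
    ss_cells = r * np.sum((cell - grand) ** 2)
    ss_ab = ss_cells - ss_a - ss_b           # interaction sum of squares
    ss_total = np.sum((data - grand) ** 2)
    ss_err = ss_total - ss_cells

    df_a, df_b = a - 1, b - 1
    df_ab, df_err = df_a * df_b, a * b * (r - 1)
    ms_err = ss_err / df_err

    out = {}
    for name, ss, df in [("shape", ss_a, df_a),
                         ("category", ss_b, df_b),
                         ("interaction", ss_ab, df_ab)]:
        F = (ss / df) / ms_err
        out[name] = {"F": F,
                     "p": stats.f.sf(F, df, df_err),
                     "eta2": ss / ss_total}
    return out

# Synthetic channel: strong shape-type tuning, no category effect.
rng = np.random.default_rng(0)
shape_tuning = np.linspace(0, 8, 9)                          # 9 shape types
resp = shape_tuning[:, None, None] + rng.normal(0, 0.5, (9, 6, 10))
res = two_way_anova(resp)
```

For a channel like the one in Fig. 2A, this would yield a very small p for the shape factor and non-significant category and interaction terms.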
To illustrate the shape and category responses of all visually-responsive channels, Figs. 3A and B show an overview of the z-scored responses (see Methods) per array at the MUA and LFP level, respectively. We ordered the channels from top to bottom based on their selectivity as determined in the 2-way ANOVA with factors shape type and category: channels indicated by the blue bracket showed a main effect of shape type only, channels indicated by the yellow bracket showed a main effect of category only, and channels indicated by the green bracket showed a significant shape type x category interaction (sometimes in combination with a main effect of shape type and/or category). The channels below the green bracket were visually-responsive but did not show any significant effect in the 2-way ANOVA. The order of the columns (from left to right) was determined based on the average response of all visually-responsive channels across each array separately. The plots ordered according to shape type (left panels in Fig. 3A and B) clearly illustrate that our stimulus set evoked strong MUA and LFP responses on a large number of recording channels. Additionally, the stimulus selectivity was relatively broad for all arrays (Fig. S1) (median S_width, MUA: array 1 = 0.69, array 2 = 0.62, array 3 = 0.86, array 4 = 0.7; median S_width, LFP: array 1 = 0.5, array 2 = 0.52, array 3 = 0.69, array 4 = 0.52).
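The exact definition of the breadth measure S_width is given in the paper's Methods; as an assumed stand-in, one common way to quantify tuning breadth is the fraction of stimuli that evoke more than half of the maximum baseline-corrected response, so that values near 1 indicate broad tuning and values near 0 sharp tuning:

```python
import numpy as np

def selectivity_width(net_responses):
    """Breadth of tuning for one channel: fraction of stimuli whose
    mean net response exceeds half of the maximum response after
    shifting so the weakest stimulus is at zero.
    (Assumed definition -- a stand-in for the paper's S_width.)"""
    r = np.asarray(net_responses, dtype=float)
    r = r - r.min()                      # shift weakest response to 0
    if r.max() == 0:
        return 1.0                       # flat tuning is maximally broad
    return float(np.mean(r > 0.5 * r.max()))

broad = selectivity_width([9, 8, 10, 9, 7, 8])   # similar responses everywhere
sharp = selectivity_width([1, 0, 10, 1, 0, 1])   # one dominant stimulus
```

A channel with the broad profile returns a larger S_width than the sharply tuned one, matching the interpretation of the median values reported above.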
Visual inspection does not suggest a clear preference for specific shape types in any of the arrays. When plotting the responses according to category (right panels in Fig. 3A and B), the results were qualitatively similar, except for the category ‘animals’ in array 3, for which a subset of shape types clearly evoked strong responses, as illustrated by the example channels in Fig. 2B and 2D. To investigate the overall shape type or category preference of each array more quantitatively, we averaged the MUA and high-gamma responses across all visually-responsive channels (Fig. 4A). Arrays 1, 2, and 4 responded significantly less to shape types 7, 8, and 9 (which were characterized by a lower surface area and a high aspect ratio), whereas for array 3, the MUA response to the category ‘animals’ was significantly higher compared to the other categories (Fig. 4A). The high-gamma responses ranked according to shape type (Fig. 3B, left panel) appeared very similar to the MUA responses, which was supported by significant correlations between MUA and high-gamma responses for all arrays (Fig. 4B). When plotted according to category, the high-gamma responses of array 3 showed an even more pronounced preference for the category ‘animals’ than the MUA responses (Fig. 3B and eta2 values in Fig. S2B).
Further analysis of all individual visually-responsive electrodes (using a two-way ANOVA with factors shape type and category) confirmed the high diversity of neural tuning for shape type and category. At the MUA level, the highest number of channels showed a significant interaction between shape type and category for all arrays (Fig. 3C). More specifically, out of the 237 visually responsive MUA sites, 39 sites (16%) were significantly selective for the shape type dimension alone and only 8 sites (3%) showed a significant main effect of category alone, compared to 114 sites (48%) with interactions between shape type and category (chi2 = 143, p < 0.0001). At the LFP level, we also observed mainly shape type selectivity and shape-category interactions, although arrays 1 and 2 showed more channels with a significant main effect of shape type (chi2 = 6.8, p < 0.0001). In two arrays, the proportion of significant shape type x category interactions was significantly higher in the MUA (27 and 63% for arrays 1 and 2, respectively) than in the LFP responses (12 and 22% for arrays 1 and 2, respectively); array 3 had a similar proportion of interactions in MUA and LFP, and for array 4 the LFP signal was of low quality.
To test the effect sizes for shape type and category, we compared the eta2 values of all sites with significant effects (Fig. S2). Overall, the eta2 values for shape type were higher than those for category in arrays 1, 2, and 4, and this difference was more pronounced for sites displaying a main effect of shape. Interestingly, in arrays 1, 2, and 4, even for channels with only a significant interaction or with both significant shape and category main effects, eta2 was significantly larger for shape type than for category. However, this was not the case for the shape type x category interaction channels of array 3, where the shape and category effect sizes were similarly strong.
Dissimilarity analysis suggests that shape type is the dominant representation in all arrays
The average response across individual channels can exhibit weak category selectivity, but the categorical structure of the stimulus set may also appear in the pattern of activity distributed across the entire neuronal population.20 Therefore, we investigated how information about shape type and category was represented in the multichannel activity patterns. For each pair of stimuli, we correlated the spatial multichannel response patterns for each microarray (see Methods). The resulting dissimilarity matrices (1 − correlation, Fig. 5A) were correlated with behavioral dissimilarity matrices for the shape type and category dimensions, as well as with the physical dissimilarity matrix based on the silhouettes (Fig. 5B), by means of Representational Similarity Analysis (RSA).26 For all microarrays, the multichannel analysis revealed significant shape-based and silhouette-based representations in the MUA responses, but no significant correlation with the category matrix (Fig. 5C and Table 2). At the LFP level, we observed similar results for arrays 3 and 4 (Fig. S3), but array 1 correlated significantly only with the silhouette dissimilarity matrix and array 2 only with the shape dissimilarity matrix (Table S2 and Fig. S3). Thus, the multichannel response pattern of all four arrays in LOC was predominantly shape-based. Moreover, the neural (MUA) dissimilarity matrices correlated significantly with both the perceptual and the physical dissimilarities. Interestingly, these population-level analyses suggest no contribution of category similarity, whereas the aforementioned single-channel analyses revealed many sites with an interaction between shape and category tuning.
Table 2
Results of the Representational Similarity Analysis (RSA) conducted on the MUA neural dissimilarity matrices. Rho denotes the Pearson correlation coefficient quantifying the similarity between the neural dissimilarity matrix and each model (behavioral or physical) dissimilarity matrix; p is the p-value associated with the correlation coefficient, indicating its statistical significance.
| Array | Category | Shape | Silhouette |
|---|---|---|---|
| 1 | Rho = 0.02, p = 0.27 | Rho = 0.1, p = 0.00 | Rho = 0.15, p = 0.00 |
| 2 | Rho = 0.02, p = 0.27 | Rho = 0.11, p = 0.00 | Rho = 0.10, p = 0.00 |
| 3 | Rho = 0.002, p = 0.45 | Rho = 0.2, p = 0.00 | Rho = 0.18, p = 0.00 |
| 4 | Rho = 0.03, p = 0.16 | Rho = 0.18, p = 0.00 | Rho = 0.17, p = 0.00 |
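The RSA pipeline described above (1 − correlation dissimilarities between multichannel patterns, Pearson correlation of matrix upper triangles, and a permutation-based p-value) can be sketched in a few lines of numpy. The matrix sizes and the block-structured toy model below are illustrative assumptions, not the actual stimulus set.

```python
import numpy as np

def dissimilarity_matrix(responses):
    """responses: (n_stimuli, n_channels) mean net responses.
    Returns 1 - Pearson correlation between the multichannel
    patterns of every stimulus pair."""
    return 1.0 - np.corrcoef(responses)

def rsa(neural_rdm, model_rdm, n_perm=1000, seed=0):
    """Pearson-correlate the upper triangles of two dissimilarity
    matrices; p is estimated by permuting the stimulus labels of
    the model matrix."""
    iu = np.triu_indices_from(neural_rdm, k=1)
    rho = np.corrcoef(neural_rdm[iu], model_rdm[iu])[0, 1]
    rng = np.random.default_rng(seed)
    n = neural_rdm.shape[0]
    null = np.empty(n_perm)
    for i in range(n_perm):
        p = rng.permutation(n)
        null[i] = np.corrcoef(neural_rdm[iu],
                              model_rdm[np.ix_(p, p)][iu])[0, 1]
    return rho, float(np.mean(null >= rho))

# Toy check: a neural RDM that mirrors a block-structured model RDM
# (3 hypothetical "shape types" with 4 stimuli each).
model = np.kron(1 - np.eye(3), np.ones((4, 4)))
rng = np.random.default_rng(1)
neural = model + rng.normal(0, 0.05, model.shape)
neural = (neural + neural.T) / 2          # keep the RDM symmetric
np.fill_diagonal(neural, 0)
rho, pval = rsa(neural, model)
```

A model dimension that is strongly expressed in the neural RDM yields a high rho and a small permutation p, as for the shape and silhouette models in Table 2.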
Next, we visualized the representation of the stimuli in the neural space of each array using multidimensional scaling (MDS) on the dissimilarity values. The 2D solutions of the MDS are shown in Fig. 6. To evaluate the presence of clustering along each dimension, the stimuli were color-coded according to shape type (top row of Fig. 6) and semantic category (bottom row of Fig. 6). As an additional step to verify the existence of shape and/or category clusters within each array, we applied agglomerative hierarchical cluster analysis (Fig. S5). Shape clustering was evident with both methods in arrays 1, 2, and 4, with aspect ratio as an important factor mainly in arrays 1 and 2, while the MDS solution color-coded by category did not exhibit clear clustering. Array 3, on the other hand, did not exhibit strong clustering along the shape dimension, but when color-coded according to category, three exemplars of the category "animals" (rabbit, owl, and fish) were clearly separated from the other stimuli (see Fig. S4 for the LFP results, where a similar observation holds). The hierarchical cluster analysis corroborated this observation, since a subset of animal exemplars clustered together in the neural space of array 3. Overall, these findings are consistent with the shape-based representations we found in the multivariate correlation analysis, but they also suggest the presence of some additional category information in array 3.
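The two visualization steps above can be sketched with classical (Torgerson) MDS, implemented here directly with numpy (the paper's exact MDS variant is specified in its Methods), plus scipy's agglomerative clustering. The two-cluster toy dissimilarities are an illustrative assumption.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def classical_mds(dissim, n_dims=2):
    """Classical (Torgerson) MDS: double-center the squared
    dissimilarities and keep the top eigenvectors as coordinates."""
    d2 = np.asarray(dissim, dtype=float) ** 2
    n = d2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ d2 @ J
    w, v = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:n_dims]           # largest eigenvalues
    return v[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# Toy dissimilarities: two well-separated stimulus clusters in a
# hypothetical 4-dimensional response space.
pts = np.vstack([np.random.default_rng(2).normal(0, 0.1, (5, 4)),
                 np.random.default_rng(3).normal(3, 0.1, (5, 4))])
dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)

coords = classical_mds(dist)                     # 2D embedding, as in Fig. 6
labels = fcluster(linkage(squareform(dist, checks=False), "average"),
                  2, "maxclust")                 # agglomerative clustering
```

Both methods should agree when the clustering is real: the embedding separates the two groups and the dendrogram cut assigns them to distinct clusters.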
Linear decoders reliably detect both shape and category information
The MDS analysis offers a representation of the stimuli in a limited number of dimensions of the neural space of the recorded population, but a decoder can utilize all the multidimensional information in a population. Moreover, decoding can be performed over time, which also gives insight into the temporal dynamics of the neural responses. Therefore, we trained linear Support Vector Machines on the neural responses per array in 100 ms bins (sliding window, 50 ms steps), and tested on each time bin of individual trials whether we could correctly classify either the shape type or the category. Figure 7A illustrates the temporal evolution of the normalized decoding accuracy at the MUA level (as described in the Methods section) for the two decoders (shape type and category); the accuracy was normalized by subtracting the chance-level accuracy. In all four arrays, we could reliably decode shape type, starting as early as 75 ms after stimulus onset for array 1, compared to 100 ms for array 2 and 200 ms for arrays 3 and 4 (Fig. 7A). Furthermore, and in line with the previous analyses, array 3 also showed significant classification of category information, which was predominantly restricted to the "animals" category (see the confusion matrix in Fig. 7B). Remarkably, however, despite the predominance of shape type representations on the other arrays, we also obtained significant classification of category on arrays 1, 2, and 4, which emerged almost simultaneously with the shape type classification. Thus, although neither individual channels nor the multichannel response pattern appeared to furnish much category information, a population of shape-selective neurons in human visual cortex contained reliable information about object category (see Fig. S6 for LFP decoding).
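The sliding-window decoding scheme can be sketched with scikit-learn's linear SVM and cross-validation. Here we subtract chance from the cross-validated accuracy per window, as in Fig. 7A; the trial counts, channel counts, sampling rate, and the synthetic two-class signal are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

def sliding_window_decoding(responses, labels, fs=1000, win=100, step=50):
    """responses: (n_trials, n_channels, n_timepoints) activity sampled
    at fs Hz. Trains a linear SVM on the mean activity in each 100 ms
    window (50 ms steps) and returns cross-validated accuracy minus
    chance per window."""
    n_trials, n_chan, n_t = responses.shape
    w = int(win * fs / 1000)
    s = int(step * fs / 1000)
    chance = 1.0 / len(np.unique(labels))
    accs = []
    for start in range(0, n_t - w + 1, s):
        X = responses[:, :, start:start + w].mean(axis=2)  # rate per window
        clf = LinearSVC(max_iter=5000, random_state=0)
        accs.append(cross_val_score(clf, X, labels, cv=5).mean() - chance)
    return np.array(accs)

# Toy data: two "shape types"; the class signal appears 100 ms after
# stimulus onset on a subset of channels.
rng = np.random.default_rng(4)
labels = np.repeat([0, 1], 30)
resp = rng.normal(0, 1, (60, 20, 400))     # 60 trials, 20 channels, 400 ms
resp[labels == 1, :5, 100:] += 2.0         # signal on 5 channels after 100 ms
acc = sliding_window_decoding(resp, labels)
```

With this construction, normalized accuracy sits near zero in the pre-signal windows and rises well above chance once the windows cover the signal period, mimicking the decoding latencies described above.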
To further investigate the predominant association of category information with the "animals" category, we removed the "animals" category and performed the decoding again (Fig. S7). The decoding accuracy for arrays 1 and 2 at both the MUA and LFP levels remained unaffected. However, a noticeable decline in both accuracy and significance was observed for array 3 at both the MUA and LFP levels. These findings were consistent with the observations from the confusion matrices (Fig. 7B, S6B), emphasizing that for array 3 the category information was predominantly restricted to the "animals" category.
Lastly, we assessed the generalization of the decoders over time (Fig. 7C). The shape and category decoders were trained using 100 ms time windows and then tested on every 100 ms window that preceded or followed the training bin, with each window shifted in steps of 50 ms. The decoding accuracy of array 2 generalized over the entire stimulus duration for both shape type and category, suggesting a very stationary population representation emerging early after stimulus onset, while arrays 1, 3, and 4 exhibited a more transient generalization of the classifier. In the high-gamma frequency range (Fig. S6), we observed on average highly similar decoding performance, albeit with lower accuracy.
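The temporal-generalization procedure (train on one time window, test on all others) can be sketched as follows. To keep the example dependency-light, a nearest-centroid classifier stands in for the linear SVM of the main analysis, and the stationary class signal is an illustrative assumption that yields the array-2-like pattern of full generalization.

```python
import numpy as np

def temporal_generalization(train_X, test_X, y_train, y_test):
    """Train at one time window, test at every window.
    train_X/test_X: (n_windows, n_trials, n_channels) window-averaged
    activity. Returns an (n_windows, n_windows) accuracy matrix,
    rows = training window, columns = testing window."""
    n_win = train_X.shape[0]
    classes = np.unique(y_train)
    acc = np.zeros((n_win, n_win))
    for i in range(n_win):                           # training window
        centroids = np.stack([train_X[i][y_train == c].mean(0)
                              for c in classes])
        for j in range(n_win):                       # testing window
            d = np.linalg.norm(test_X[j][:, None] - centroids[None],
                               axis=-1)
            acc[i, j] = np.mean(classes[d.argmin(1)] == y_test)
    return acc

y = np.repeat([0, 1], 20)

def make_session(seed):
    """Synthetic windowed activity with a persistent class signal
    (6 windows, 40 trials, 15 channels)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(0, 1, (6, 40, 15))
    X[:, y == 1, :3] += 2.0          # stationary signal on 3 channels
    return X

# Independent train and test sessions to avoid testing on training trials.
gen = temporal_generalization(make_session(5), make_session(6), y, y)
```

Because the simulated signal is identical in every window, accuracy stays high across all train/test pairs; a transient representation would instead show high accuracy only near the diagonal of the matrix.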