The Anterior Temporal Cortex in Action

Intelligent manipulation of handheld tools marks a major discontinuity between humans and our closest ancestors. Here we identified neural representations of how tools are typically manipulated within left anterior temporal cortex, by shifting a searchlight classifier through whole-brain real action fMRI data collected while participants grasped 3D-printed tools in ways considered typical for use (i.e., by their handle). These neural representations were evoked automatically, as task performance did not require semantic processing. Indeed, findings from a behavioural motion-capture experiment confirmed that actions with tools (relative to non-tools) incurred additional processing costs, as would be expected if semantic areas are engaged automatically. These results substantiate theories of semantic cognition that claim the anterior temporal cortex combines sensorimotor and semantic content for advanced behaviours such as tool manipulation.


Introduction
The human ability to use tools (like using a knife for cutting) symbolises a great step in our evolutionary lineage (Ambrose, 2001), but the brain mechanisms underpinning this behaviour remain debated. Over the past decades, theoretical models (Lewis, 2006; Osiurak & Badets, 2016; Buxbaum, 2017) and neuroimaging evidence have converged to propose that intelligent tool-use is the result of functionally interacting neural systems (for recent summaries see Garcea et al., 2014; Reynaud et al., 2016). One such neural system is the posterior parietal sensorimotor circuit, which cognitive embodiment theories propose performs conceptual processing about objects during sensing and handling (Allport, 1985; Mahon & Caramazza, 2008; Martin, 2016). The classic dual visual stream theory (Milner & Goodale, 1995) further incorporates ventrally located visual brain areas (e.g., Lateral Occipital Temporal Cortex) for perceiving tool properties (e.g., visual form, shape; Lingnau & Downing, 2015). Additional dual stream models describe the Inferior Parietal Lobule (IPL) and posterior Middle Temporal Gyrus (MTG) as neural sites that integrate information from sensorimotor and perceptual brain regions into a visuo-kinesthetic format relevant for tool manipulation (Rizzolatti & Matelli, 2003; Osiurak & Badets, 2016; Buxbaum, 2017). Most recently, focus has shifted toward the role of the anterior temporal cortex in tool-use (e.g., Lesourd et al., 2021), based on claims from semantic models that this area constitutes an amodal hub which weaves abstract conceptual representations (Lambon Ralph et al., 2017; Jefferies et al., 2020).
Each of these 'tool-use' brain regions has been identified by seminal picture-viewing neuroimaging studies (e.g., Chao et al., 1999). The involvement of posterior/inferior parietal and lateral occipital cortices in coding tool-related information, initially suggested by picture-viewing studies, has since been replicated by a small number of functional MRI (fMRI) experiments involving real tool manipulation (Valyear et al., 2012; Gallivan et al., 2013; Brandi et al., 2014; Styrkowiec et al., 2019; Knights et al., 2021; see Valyear et al., 2016 for a review). The anterior temporal cortex, however, has yet to be identified with real action tasks during which participants are asked to manipulate tools with their hands. This is at odds with traditional neuropsychology evidence showing that anterior temporal lobe degeneration in semantic dementia patients causes the loss of conceptual knowledge about everyday objects, despite retained shape processing and praxis (Hodges et al., 1992; Mummery et al., 2000). In fact, converging neuroimaging evidence shows that anterior temporal cortex represents conceptual information about tools, like the usual locations or functions associated with a tool, but these findings are restricted to high-level cognitive tasks thought to rely on mechanisms distinct from real hand-tool manipulation (Goldenberg, 2017; Snow & Culham, 2021), such as picture recognition, language or pantomime (Kalénine et al., 2009; Peelen & Caramazza, 2012; Ishibashi et al., 2018; Ishibashi et al., 2011; Marstaller et al., 2018; Chen et al., 2016).
Here, we applied whole-brain searchlights to an fMRI dataset in which humans performed real right-handed actions with 3D-printed tools, grasping them in ways considered typical for tool-use (by the handle) or not (by the tool-head; Figure 1A/1C; re-analysis of Knights et al., 2021). By including biomechanically matched actions with control non-tools (grasping right vs. left) we could test which brain regions specifically contained multivariate representations about the typicality of tool actions (like grasping a spoon by its handle), independently of kinematic differences between typical and atypical actions. In addition, in a control behavioural experiment, we examined whether these tool and non-tool actions were appropriately matched for biomechanics by recording hand kinematics with high-resolution motion-capture during the same paradigm outside the MR environment (Figure 1B).

Real Action fMRI Experiment
Whole-brain searchlight Multivoxel Pattern Analysis (MVPA) (Figure 2A) (Kriegeskorte et al., 2006; Smith & Goodale, 2015) was used to identify the brain regions that represented how to appropriately grasp tools for use (i.e., by the handle rather than the tool-head). Specifically, a stringent typicality difference map (Figure 2) was generated using a searchlight subtraction analysis that controlled for low-level hand kinematics: the multivariate decoding map of right vs. left grasps of control non-tools was subtracted from the decoding map of typical (right) vs. atypical (left) grasps of tools (see Methods). This difference map thus reveals which brain areas contain information about how to grasp tools correctly for subsequent use, independently of low-level differences between right- vs. leftward grasping movements.
As presented in Figure 2, significantly higher decoding accuracy for tools than non-tools was observed in a large cluster (see Table 1 for cluster sizes) comprising an anterior portion of the left Superior and Middle Temporal Gyri (STG; MTG) that extended into the Parahippocampal Gyrus (PHG). Other clusters surviving correction for multiple comparisons included those within the right Fusiform Gyrus (FG) and anterior Superior Parieto-Occipital Cortex (aSPOC). No cluster demonstrated higher decoding accuracy in the reverse direction, that is, higher for non-tools than for tools.

Behavioural Motion-Capture Experiment
To better understand action processing speed for tools vs. non-tools, we measured hand kinematics with high-resolution motion-capture while participants performed the same task outside the MRI. As presented in Figure 3, analysis of reaction time (RT) and movement time (MT) revealed slower responses when grasping tools relative to non-tools (see also Figure S1). No other significant main effects or interactions between object category and typicality were found (all p's > 0.15). Importantly, this lack of interaction indicates that timing did not differ specifically for grasping tools typically vs. atypically when compared to the matched movements with control non-tools.

Discussion
Our real action searchlight analysis presents the first fMRI evidence that left anterior temporal cortex is sensitive to action information about tool movements during real 3D object manipulation (Figure 2B). These results are in line with recent tool-use models (e.g., Lesourd et al., 2021) that include claims from semantic cognitive theories about the role of anterior temporal cortex in constructing abstract object representations (Schwartz et al., 2011; Lambon Ralph et al., 2017; Jefferies et al., 2020). According to these leading models, the anterior temporal cortex processes conceptual knowledge that is feature invariant (i.e., generalises across exemplar identities), like the typical way tools are handled for use (i.e., grasping a tool by its handle), as demonstrated here.
Anatomically, the neural peaks reported here are near anterior temporal lobe clusters known to code semantics during tool pantomimes (Chen et al., 2016) or tool manufacturing (Putt et al., 2017). The peak location of these regions is further along the posterior axis than reported during object knowledge association tasks (e.g., Peelen & Caramazza, 2012), but effects for general standardised neuropsychological tests of associative knowledge have been reported at comparable locations (e.g., Visser et al., 2012). Implementing specialised distortion correction (Embleton et al., 2010) or an increased field of view (Visser et al., 2010) will be useful for future fMRI studies to address whether areas further along the temporal pole also code information about tool manipulation. As for the left lateralisation of this effect, it is consistent with a popular model of left-hemisphere tool-processing networks (Lewis, 2006) and with the fact that all movements were performed with the right hand during our study. Moreover, left-lateralised anterior temporal responses have been reported for semantic language processing (Visser & Lambon Ralph, 2011) which, when considered alongside our results, resonates with the prevalent view across philosophy (Montagu, 1976), and more recently neuroscience (e.g., Stout & Chaminade, 2012; Thibault et al., 2021), that language and motor skills are tightly linked.
Remarkably, this tool-related semantic content was detectable even though task performance was independent of tool conceptual processing. That is, unlike prior tasks that have asked participants to explicitly attend to different tool associations [e.g., pantomiming actions related to scissors vs. pliers (Chen et al., 2016) or recalling whether a tool is typically found in the kitchen vs. garage (Peelen & Caramazza, 2012)], our participants were simply instructed to grasp the 'left' or 'right' side of the stimuli and, throughout all aspects of experimentation (see Methods), the stimuli were purposefully referred to as 'objects' (rather than 'tools'). Since participants were not required to form intentions of using these tools, or even to process their identities, our results demonstrate that tool representations are automatically triggered. Like similar findings (e.g., Rizzolatti et al., 1988; Tucker & Ellis, 1998; Valyear et al., 2012), this automaticity supports influential affordance theories (Gibson, 1979; Cisek, 2007; Bach et al., 2014; Buxbaum, 2017) which predict that merely viewing objects potentiates action. Our results provide evidence of this phenomenon in humans at a fine spatial resolution (e.g., compared to the Bereitschaftspotential; Shibasaki & Hallett, 2006) and during realistic object manipulation (i.e., tool-use actions that are directly viewed without the use of mirrors).
Representations about actions with tools also extended into the fusiform and medial parieto-occipital cortex (Figure 2B), consistent with previous views that these areas code semantics, due to either showing crossmodal responses (e.g., reading tool words and viewing tool pictures; Devlin et al., 2005; Binder et al., 2009; Fairhall & Caramazza, 2013) or representing learnt object-associations (Liuzzi et al., 2020). In fact, our results are in line with the hub-and-spoke theory (Patterson et al., 2007) suggesting that these two domain-general systems (e.g., for perception or action; Milner & Goodale, 2006) may act as spokes to a left anterior temporal cortex 'hub' when automatically processing learnt tool movements. Indeed, fusiform cortex is well known for processing perceptual information about object form (e.g., Kourtzi et al., 2005) whilst visuomotor computations are attributed to SPOC (e.g., Prado et al., 2005), and both regions are sensitive to prior experience, such as for processing typical action routines (Scholz et al., 2009; Rossit et al., 2013) or object functions (Weisberg et al., 2006). Alternatively, these regions could be implicated in networks supporting inference about object properties and their relationship to the laws of physics (e.g., Osiurak & Badets, 2016; Fischer et al., 2016; Schwettmann et al., 2019), though this account does not necessarily preclude a role of the anterior temporal cortex in the semantic aspects of tool-use.
Consistent with the neural differences observed by contrasting actions with tool and non-tool objects (Figure 2B), our behavioural motion-capture results similarly demonstrated slower overall responses for grasping tools than non-tools (Figure 3). From an experimental perspective, the finding of a general object category main effect, independent of reach direction, indicates that the biomechanics for actions involving the handle and head of the tools were appropriately matched. In other words, basic kinematic differences between different actions cannot simply explain the tool-specific decoding. Considered theoretically, the observed faster non-tool responses are consistent with many accounts describing how tool-related actions are achieved via psychological (e.g., Arbib, 1981; Rumiati & Humphreys, 1998; Christensen et al., 2019) and neural (e.g., Jeannerod et al., 1995; Milner & Goodale, 1995; Johnson-Frey, 2003; Young, 2006; Bub, Masson & Cree, 2008; Buxbaum, 2017) mechanisms that are distinct from those used for basic motor control. Similar slowing for tools has been observed in simple button-press RT experiments comparing pictures of tools with simple shapes (Vingerhoets et al., 2009) or other object categories (e.g., natural objects; Borghi et al., 2007). As with our findings, these simple RT effects are thought to be caused by interference from the additional processing of competing (yet task-irrelevant) functional associations that are automatically triggered by viewing tools (e.g., Cisek & Kalaska, 2010; Jax & Buxbaum, 2010).
By virtue of the grasping paradigm used here, our results cannot capture which brain regions represent real tool-use (like scooping with a spoon). Our grasping paradigm ensured that biomechanical properties of the movements were tightly controlled across conditions (e.g., by specifying grip points), but ongoing work in our laboratory is extending these paradigms to real tool-use with more variable degrees of freedom. Further, additional functional connectivity approaches utilising Dynamic Causal Modelling (DCM) (e.g., Tak et al., 2021) will be suited to deepen our understanding of the relationship between the anterior temporal cortex and other systems proposed to support tool-use. For example, DCM could be used to determine whether, as predicted by hub-and-spoke theory (Lambon Ralph et al., 2017), left anterior temporal cortex influences ventral visual stream activity in a bidirectional manner.
Altogether, neural representations were detected for the first time in anterior temporal areas that leading theories of semantic cognition claim to build rich amodal relationships about objects and their uses. By observing the automaticity of these task-irrelevant effects across both behaviour and the brain, our results begin to uncover which, as well as how, specific brain regions have evolved to support efficient tool-use, a defining feature of our species.

Methods

fMRI Participants
Nineteen healthy participants (10 male; mean age = 23 +/- 4.2 years; age range 18-34 years), described in Knights et al. (2021), performed the fMRI real action experiment, with each providing written consent in line with procedures approved by the School of Psychology Ethics Committee at the University of East Anglia.

Apparatus & Stimuli
The 3D-printed kitchen tool and biomechanically matched non-tool bar objects were adapted from Brandi et al. (2014) (Figure 1A). As in Knights et al. (2021), the dimensions of each non-tool were matched to one of the tools, such that variability was minimised and kinematic requirements were as similar as possible between different grasps (i.e., left vs. right and small vs. large), including controlling for low-level shape features that can confound tool effects, like elongation (Sakuraba et al., 2012). The MR-compatible turntable apparatus used to present the 3D objects (Figure 1A) and its setup are described in Knights et al. (2021), including the use of an upper-arm restraint and industry-standard cushioning to minimise the risk of motion artefacts.

Experimental design
A powerful block-design fMRI paradigm maximised the contrast-to-noise ratio, to generate reliable estimates of average voxel response patterns (Fig. 1C), while also improving the detection of blood oxygenation level-dependent (BOLD) signal changes without significant interference from artefacts during overt movement (Birn et al., 2004). Briefly, a block began with an auditory instruction ('Left' or 'Right') and, during a 10-second ON-block when the object was briefly illuminated, participants grasped it using a right-handed precision grip (i.e., index finger and thumb) along the vertical axis. Throughout experimentation (i.e., consent materials, training instructions) the stimuli were referred to as 'objects' such that participants were naïve to the study's purpose of examining typical versus atypical tool actions.

Acquisition
The BOLD fMRI measurements were acquired using a 3T wide-bore GE-750 Discovery MR scanner. To achieve a good signal-to-noise ratio during the real action fMRI experiment, the posterior half of a 21-channel receive-only coil was tilted and a 16-channel receive-only flex coil was suspended over the anterior-superior part of the skull (see Fig. 1B). A T1-weighted (T1w) anatomical image was acquired with BRAVO sequences, followed by T2*-weighted single-shot gradient Echo-Planar Imaging (EPI) sequences for each block of the real action experiment, using standard parameters for whole-brain coverage (see Knights et al., 2021).

Data Preprocessing
Preprocessing of the raw functional datasets and ROI definitions were performed using BrainVoyager QX [version 2.8.2] (Brain Innovation, Maastricht, The Netherlands). Anatomical data were transformed to Talairach space and fMRI time series were preprocessed using standard parameters (no smoothing) before being coaligned to an anatomical dataset (see Knights et al., 2021). For each run independently, the time series were subjected to a general linear model with predictors per condition of interest, to estimate single-block activity patterns for searchlight MVPA (6 tool and 6 non-tool blocks per run). A small number of runs with movement or eye errors were removed from further analysis (see Knights et al., 2021).
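The single-block beta estimation described above can be illustrated with a minimal Python sketch. This is not the BrainVoyager pipeline used in the study; the function names, the simplified double-gamma HRF and the ordinary-least-squares fit are illustrative assumptions.

```python
import numpy as np
from scipy.stats import gamma

def hrf(t):
    """Simplified double-gamma haemodynamic response function."""
    return gamma.pdf(t, 6.0) - gamma.pdf(t, 16.0) / 6.0

def block_design_betas(timeseries, onsets, duration, tr):
    """Estimate one beta per block via ordinary least squares.

    timeseries : (n_vols, n_voxels) preprocessed BOLD data
    onsets     : block onset times in seconds (one predictor per block)
    """
    n_vols = timeseries.shape[0]
    t = np.arange(n_vols) * tr
    h = hrf(np.arange(0, 32, tr))              # HRF sampled at the TR
    X = np.ones((n_vols, len(onsets) + 1))     # last column = constant term
    for j, onset in enumerate(onsets):
        boxcar = ((t >= onset) & (t < onset + duration)).astype(float)
        X[:, j] = np.convolve(boxcar, h)[:n_vols]   # convolve with the HRF
    betas, *_ = np.linalg.lstsq(X, timeseries, rcond=None)
    return betas[:-1]                          # one row of betas per block
```

The returned matrix (one beta pattern per 10-second ON-block) is the kind of input the searchlight classifiers operate on.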

Searchlight Pattern Classi cation
Searchlight MVPA (Kriegeskorte et al., 2006) was performed independently, per participant, for tool and non-tool trial types using separate linear pattern classifiers (linear support vector machines) that were trained to learn the mapping between a set of brain-activity patterns (β values computed from single blocks of activity) and the type of grasp being performed with the tools (typical vs atypical) or non-tools (right vs left). A cube mask (5 x 5 x 5 voxel length, equal to 125 voxels) was shifted through the entire brain volume, applying the classification procedure at each centre voxel (Smith & Goodale, 2015) to measure the accuracy with which a given cluster of activity patterns could be used to discriminate between the different tool, or non-tool, actions.
To test the performance of our classifiers, decoding accuracy was assessed using an n-fold leave-one-run-out cross-validation procedure; thus, our models were built from n - 1 runs and were tested on the independent nth run (repeated for the n different possible partitions of runs in this scheme; Duda et al., 2001; Smith & Muckli, 2010; Smith & Goodale, 2015; Gallivan et al., 2016) before averaging across n iterations to produce a representative decoding accuracy measure per participant and per voxel. Searchlight analysis space was restricted to a common group mask within Talairach space, defined by voxels with a mean BOLD signal > 100 for every participant's fMRI runs, to ensure that all voxels included in searchlight MVPA contained suitable activation. Beta estimates for each voxel were normalised (separately for training and test data) within a range of -1 to 1 before input to the SVM (Chang & Lin, 2011), and the linear SVM algorithm was implemented using the default parameters provided in the LibSVM toolbox (C = 1). Pattern classification was performed with a combination of in-house scripts (Smith & Muckli, 2010; Smith & Goodale, 2015) implemented in Matlab using the SearchMight toolbox (Pereira & Botvinick, 2011).
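A minimal Python sketch of the leave-one-run-out searchlight decoding described above follows. The study used LibSVM/SearchMight in Matlab; here scikit-learn's linear `SVC` (C = 1) stands in, and the function name, the clipping of the cube at volume edges and the data layout are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def searchlight_accuracy(betas, labels, runs, centre, radius=2):
    """Leave-one-run-out decoding accuracy for one searchlight centre.

    betas  : (n_blocks, x, y, z) single-block beta estimates
    labels : (n_blocks,) grasp labels (e.g., 1 = typical, 0 = atypical)
    runs   : (n_blocks,) run index per block, used as cross-validation folds
    centre : (x, y, z) voxel coordinate of the cube's centre
    """
    x, y, z = centre
    cube = betas[:, max(x - radius, 0):x + radius + 1,
                    max(y - radius, 0):y + radius + 1,
                    max(z - radius, 0):z + radius + 1]
    X = cube.reshape(len(labels), -1)          # 5x5x5 cube -> 125 features
    accs = []
    for run in np.unique(runs):                # n-fold leave-one-run-out
        train, test = runs != run, runs == run
        # normalise each feature to [-1, 1] using the training data only
        lo, hi = X[train].min(0), X[train].max(0)
        scale = np.where(hi > lo, hi - lo, 1.0)
        Xtr = 2 * (X[train] - lo) / scale - 1
        Xte = 2 * (X[test] - lo) / scale - 1
        clf = SVC(kernel="linear", C=1.0).fit(Xtr, labels[train])
        accs.append(clf.score(Xte, labels[test]))
    return float(np.mean(accs))                # chance level = 0.5
```

Shifting `centre` over every voxel in the group mask yields the per-voxel accuracy maps that enter the group analysis.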

Statistical Analysis
Voxel accuracies from searchlight MVPA for each participant were converted to unsmoothed statistical maps. To assess which brain areas coded information about typicality, we used a paired-samples t-test approach: non-tool accuracy maps were subtracted from tool accuracy maps, producing single-participant typicality difference maps (i.e., tool > non-tool), and it was then tested, at the group level, whether the difference in decoding accuracies for tools vs. non-tools was greater than zero at each voxel. The BrainVoyager cluster-level statistical threshold estimator (Forman et al., 1995; Goebel et al., 2006) was used for cluster correction (voxelwise thresholds were set to p = 0.01 and cluster-wise thresholds to p < .05 using a Monte Carlo simulation of 1000 iterations), before projecting results onto a standard surface (Xia et al., 2013).
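The voxelwise group test can be sketched as follows (cluster correction omitted). The paired tool vs. non-tool comparison is implemented as the equivalent one-sample t-test on per-participant difference maps; the function name and data layout are assumptions, not the BrainVoyager implementation.

```python
import numpy as np
from scipy import stats

def typicality_difference_map(tool_acc, nontool_acc):
    """Group-level test of tool > non-tool decoding at each voxel.

    tool_acc, nontool_acc : (n_subjects, n_voxels) searchlight accuracy maps
    Returns per-voxel t values and one-tailed p values for the paired
    contrast (difference in accuracies greater than zero).
    """
    diff = tool_acc - nontool_acc                # per-subject difference maps
    t, p_two = stats.ttest_1samp(diff, 0.0, axis=0)
    p_one = np.where(t > 0, p_two / 2, 1 - p_two / 2)  # one-tailed: diff > 0
    return t, p_one
```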

Behavioural Control Experiment
Participants

Twenty-two right-handed (Edinburgh Handedness Questionnaire; Oldfield, 1971) healthy volunteers completed the motion-capture experiment (6 males, 19-29 years of age, mean age = 22.3, SD = 2.4). Ten participants had completed the previous fMRI experiment. All had normal or corrected-to-normal vision, no history of motor, psychiatric or neurological disorders, and gave informed consent in accordance with the ethical committee at the University of East Anglia.

Apparatus & Stimuli
Stimuli were the same 3D-printed objects used in the fMRI experiment. A Qualisys Oqus system (Qualisys AB, Gothenburg, Sweden), sampling at 179 Hz, measured the position of small passive markers affixed to the participants' right wrist and the nails of the right index finger and thumb (Figure 1B). The MR-compatible turntable apparatus was set up in the motion-capture laboratory identically to the fMRI experiment. This included using the same distance between the resting hand and object centre (43 cm) and the same centrally aligned red fixation LED (subtending a mean visual angle of ~20° from the centre of stimuli), as well as requiring a comparable head tilt (~20°). The two minor differences between the MR and motion-capture environments were that, for motion-capture, there was no arm-strap or eye-monitoring cameras (though participants completed the same pre-experiment training and received verbal reminders between experimental blocks to maintain fixation and to minimise upper-arm movements), and that noise-cancelling headphones (Bose Corporation, USA) were used to ensure that the sound of stimulus placement did not provide cues about an upcoming trial.

Experimental Design
Experimental designs were almost identical across the fMRI and behavioural control experiments. The first difference was that elements critical for modelling the haemodynamic response during fMRI (baseline periods between trials) were omitted in this behavioural experiment. Second, an additional block was collected given the risk of excluding trials due to marker occlusion. On average participants completed seven runs (minimum six, maximum seven), totalling 84 experimental trials and 21 repetitions per condition per participant.
Kinematic data were obtained by localising the x, y and z positions of the markers attached to the index finger, thumb and wrist of the participants' right hand (Figure 1B). These 3D positions were filtered using a low-pass Butterworth filter (10 Hz cut-off, 2nd order). Wrist marker position determined movement onset and offset (velocity-based criterion = 50 mm/s) and, in cases where this value was never crossed, the local minimum of the velocity trace was used as the offset of the outward reach (Quinlan & Culham, 2015).
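The filtering and onset/offset detection might be sketched as follows, assuming a (samples x 3) wrist trajectory in mm; scipy's zero-phase `filtfilt` stands in for whatever filter implementation was actually used, and the function name is hypothetical.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 179.0       # motion-capture sampling rate (Hz)
CUTOFF = 10.0    # low-pass cut-off (Hz)
THRESH = 50.0    # movement on/offset velocity criterion (mm/s)

def movement_onset_offset(wrist_xyz):
    """Return (onset, offset) sample indices from wrist marker positions.

    wrist_xyz : (n_samples, 3) marker positions in mm. Positions are
    low-pass filtered (2nd-order Butterworth, 10 Hz), differentiated to
    a speed trace, and thresholded at 50 mm/s; if the trace never falls
    back below threshold, the local minimum of the velocity trace after
    onset is taken as the offset.
    """
    b, a = butter(2, CUTOFF / (FS / 2), btype="low")
    pos = filtfilt(b, a, wrist_xyz, axis=0)      # zero-phase filtering
    speed = np.linalg.norm(np.diff(pos, axis=0), axis=1) * FS
    onset = int(np.flatnonzero(speed > THRESH)[0])
    below = np.flatnonzero(speed[onset:] < THRESH)
    if below.size:
        offset = onset + int(below[0])
    else:                                        # fall back to local minimum
        offset = onset + int(np.argmin(speed[onset:]))
    return onset, offset
```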
Trial-level reach kinematic dependent variables (Reaction Time, RT; Movement Time, MT; Peak Velocity, PV; and time to Peak Velocity, tPV) were computed for the five grasping repetitions and subsequently collapsed. The grand mean per participant for each of the four conditions was retained after removing problematic trials (2.62%) for the following reasons: marker occlusion (2.09%), incorrect object presentation (0.04%) and participant responses that were extremely slow (0.11%; i.e., >1000 ms) or in the wrong direction (0.38%).

Statistical Analysis
Repeated measures ANOVAs were used to compare behavioural performance across conditions in a 2 (object category: tools vs. non-tools) x 2 (typicality: typical vs. atypical) factorial design.
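For a fully within-subject 2 x 2 design such as this one, each ANOVA effect reduces to a one-sample t-test on a per-participant contrast score, with F(1, n-1) = t(n-1)^2. The following sketch uses that equivalence; the data layout and function name are assumptions, not the analysis software actually used.

```python
import numpy as np
from scipy import stats

def rm_anova_2x2(data):
    """2 x 2 repeated-measures ANOVA via per-participant contrasts.

    data : (n_subjects, 2, 2) array, dimensions ordered as
           (participant, object category, typicality).
    Each effect is a one-sample t-test on a contrast score, with
    F(1, n-1) = t(n-1)**2. Returns {effect: (F, p)}.
    """
    contrasts = {
        "category":    data[:, 0, :].mean(1) - data[:, 1, :].mean(1),
        "typicality":  data[:, :, 0].mean(1) - data[:, :, 1].mean(1),
        "interaction": (data[:, 0, 0] - data[:, 0, 1])
                     - (data[:, 1, 0] - data[:, 1, 1]),
    }
    out = {}
    for name, c in contrasts.items():
        t, p = stats.ttest_1samp(c, 0.0)
        out[name] = (float(t) ** 2, float(p))   # F(1, n-1) = t**2
    return out
```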

Code Availability
Computer code for running the experiments and analyses of the fMRI (https://osf.io/zxnpv) and behavioural datasets is accessible from the Open Science Framework (https://osf.io/etyqs/).

Author Contributions
S.R., E.K. and F.W.S. conceptualised the study; E.K. and S.R. collected data; E.K., F.W.S. and S.R. analysed data; E.K., F.W.S. and S.R. wrote the manuscript; S.R. and F.W.S. acquired funding.

Figure 1
(A) fMRI Experiment. Participants lay under a custom-built MR-compatible turntable where 3D-printed tool and non-tool stimuli were presented within reaching distance in a block-design. (B) Motion-Capture Experiment. As a behavioural control experiment, participants performed this paradigm in a motion-capture laboratory to measure kinematics with infrared-reflective (IRED) markers affixed to the hand. (A & B) During the experiments, the rooms were completely dark, objects were visible only when illuminated, all actions were performed with the right hand only and participants were naïve to study goals (i.e., they were asked to grasp the right or left side of objects without mention that we were investigating tools or typicality of manipulation).

Figure 2
(A) Whole-brain searchlight classification. For each participant, brain activation patterns were extracted from a mask (single blue cube) that was shifted through the entire fMRI volume. Decoding accuracy was measured with independent linear pattern classifiers for tool (top row) and non-tool actions (bottom row) that were trained to map between brain-activity patterns and the type of grasp being performed with the tools (typical vs. atypical) or non-tools (right vs. left). Typicality difference maps were produced by subtracting the decoding accuracy maps for tools and non-tools, as well as chance-level accuracy (50%). (B) Searchlight Results. The group typicality difference map demonstrated clusters in the left anterior temporal cortex, as well as right medial parietal and fusiform areas, where decoding accuracies were significantly higher for actions with tools (typical vs. atypical grasps) than non-tools (biomechanically equivalent right vs. left grasps).

Figure 3
Behavioural Results. Hand kinematics differed between object categories: participant's RTs and MTs were slower when grasping tools, relative to non-tools. Error bars represent standard error of the mean.

Supplementary Files
This is a list of supplementary files associated with this preprint.