Details on the provenience, sex and age of the specimens in our samples, and on the micro-CT systems used to scan them, are provided in Table S1. Among the 25 southern African early hominins, one juvenile (DNH 34) and one adult (DNH 60) specimen from Drimolen were previously unpublished, and one adult (KW 9900) and three juvenile (KW 9600, KW 9700, KW 10840) *P. robustus* specimens from Kromdraai Unit P were described only for their cochlear shape4. The juvenile status of DNH 34 is indicated by the degree of opening of its subarcuate fossa (5.6%), which is equivalent to that measured in the KW 9600 and KW 10840 specimens4. Here, we describe these six new specimens from Drimolen and Kromdraai Unit P by focusing on their taxonomically diagnostic features. All the other fossil hominins in our sample have already been published. However, the BM of some of these published specimens is newly reconstructed and investigated in the present study. This is the case for one *P. robustus* (SK 83) and two *A. africanus* (Taung child, MLD 31) specimens. As demonstrated both previously4 and in the present study, there are no differences in BM size and shape between adult and juvenile specimens among hominid or hominin species. We therefore combine juvenile and adult specimens in our samples.

The majority of the sample is represented by right BMs. Where the right side is damaged, the left BM is reconstructed and mirrored, with no loss of size and shape information during this process. Mesh surfaces of each BM were obtained in Avizo Standard 8.1.1 (www.thermofisher.com) from micro-CT data resliced in a plane that best fitted the horizontal SCC. The measurement protocol of the SCC began with the computation of the centerlines of the SCCs from the surface models with the ‘Skeletonization’ pack and the ‘Autoskeleton’ module. A set of five landmarks was digitized on each centerline skeleton (Fig. 1a) and the placement error was estimated (see below). Landmarks 1 and 2 (Ld1 and Ld2, respectively) are located on the horizontal SCC (HSC) and correspond to the intersections between the plane that best fits the anterior SCC (ASC) and, respectively, the lateral and medial ends of the HSC’s centerline. Landmark 3 (Ld3) is located on the center of the ampulla of the posterior SCC (PSC). Landmark 4 (Ld4) is the bifurcation point of the common crus and landmark 5 (Ld5) is the center of the ampulla of the ASC. We also placed one landmark at the apex of the cochlear curve (Landmark 6, Ld6).

We then assess the size and shape of each BM either by a geometric morphometric method (GMM) (see below), by a deformation-based method from computational anatomy (see below), or by using 35 variables that include one area, six arc lengths, six linear distances (line segments), 12 indices and 10 angles detailed in Table S3. Among these 35 variables, five have been defined in previous studies (Table S2). All these 35 variables but one (OWA) are measured directly on each centerline skeleton by using the best-fitting planes of the HSC (HSCP) and ASC (ASCP), and the surface mesh. All these measurements are provided in Supplementary Material (Data S1.xls).

We measure the external cochlear length (ECL) and the oval window area (OWA) by following the measurement protocol described in references 16, 26 and 27. We also measure the transverse labyrinthine index (TLI), and the inclination of the ampullar line and of the cochlear basal turn relative to the orientation of the horizontal (or lateral) semi-circular canal (APA < LSCm and COs < LSCm, respectively), as defined in reference 13. Three curves are digitized on the HSC (between Ld1-Ld2), the PSC (between Ld3-Ld4) and the ASC (between Ld4-Ld5) in order to measure the three corresponding arc lengths (HSCL, PSCL and ASCL, respectively). On the PSC, we measure the arc lengths below and above the plane that best fits the HSC (PSCL.B and PSCL.A, respectively). We also measure six line segments (linear distances) between the following landmarks: (i) Ld3-Ld6 (HELPAM), (ii) Ld4-Ld6 (HELCRS), (iii) Ld5-Ld6 (HELAAM), (iv) Ld3-Ld5 (or “ampullar line”) (PAMAAM), (v) Ld4-Ld3 (joining the two extremities of the PSC) (CRSPAM), (vi) Ld4-Ld5 (CRSAAM).
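As an illustration of the arc-length measurements, the following minimal sketch computes the length of a digitized curve and splits it into the parts lying below and above a reference plane, as for PSCL.B and PSCL.A. It assumes that each curve is stored as an ordered list of 3D points and that the plane is given by a point and a normal vector. The function names, and the convention that the normal points towards the "above" side, are illustrative assumptions and not part of the original protocol.

```python
import math

def polyline_length(points):
    """Arc length of an ordered 3D polyline (sum of consecutive distances)."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def signed_distance(point, plane_point, plane_normal):
    """Signed distance from a point to a plane; positive on the normal side."""
    norm = math.sqrt(sum(c * c for c in plane_normal))
    return sum((a - b) * n for a, b, n in zip(point, plane_point, plane_normal)) / norm

def split_length_by_plane(points, plane_point, plane_normal):
    """Return (length_below, length_above) of a polyline relative to a plane.

    Assumption: 'above' is the side towards which plane_normal points.
    """
    below = above = 0.0
    for p, q in zip(points, points[1:]):
        dp = signed_distance(p, plane_point, plane_normal)
        dq = signed_distance(q, plane_point, plane_normal)
        seg = math.dist(p, q)
        if dp >= 0 and dq >= 0:
            above += seg
        elif dp <= 0 and dq <= 0:
            below += seg
        else:
            # The segment crosses the plane: split it at the crossing point.
            t = dp / (dp - dq)  # fraction of the segment from p to the crossing
            if dp > 0:
                above += t * seg
                below += (1 - t) * seg
            else:
                below += t * seg
                above += (1 - t) * seg
    return below, above

# Hypothetical curve crossing the plane z = 0.
curve = [(0.0, 0.0, -1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 3.0)]
length_below, length_above = split_length_by_plane(curve, (0, 0, 0), (0, 0, 1))
```

The two partial lengths always sum to the total arc length, which gives a simple consistency check on digitized curves.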

We compute the following 11 indices: (i) the ratio dividing the arc length of the PSC (arc length between Ld3-Ld4) situated below and above the HSC (or Posterior Semi-circular Canal index 1, PSCI1), (ii) the ratio dividing the line segment (linear distance) between Ld3 and the plane best-fitting HSC (HSCP) by the Ld3-Ld4 line segment (PSCI2), (iii) the ECL/HSCL, ECL/PSCL, ECL/ASCL, HSCL/PSCL, HSCL/ASCL, PSCL/ASCL indices, (iv) the HEL1 ratio between HELPAM (Ld3-Ld6) and HELCRS (Ld4-Ld6), (v) the HEL2 ratio between HELPAM (Ld3-Ld6) and HELAAM (Ld5-Ld6), (vi) the CRS2 ratio between CRSPAM (Ld3-Ld4) and PAMAAM (Ld3-Ld5). Finally, we measure the following eight angles: (i) between the Ld2-Ld4 and CRSPAM (Ld4-Ld3) line segments (CRS1), (ii) between the PAMAAM and CRSPAM line segments (CRS3), (iii) between the Ld1-Ld2 and Ld2-Ld4 line segments (inclination of the common crus) (CRS4), (iv) between the Ld3-Ld2 and Ld2-Ld4 line segments (inclination of the common crus) (CRS5), (v) between the Ld2-Ld1 and Ld1-Ld5 line segments (inclination of the anterior ampulla) (AAM1), (vi) between the Ld4-Ld1 and Ld1-Ld5 line segments (inclination of the anterior ampulla) (AAM2), (vii) between the Ld1-Ld2 and Ld1-Ld6 line segments (inclination of the cochlea) (CO1), (viii) between the Ld1-Ld5 and the Ld1-Ld6 line segments (inclination of the cochlea) (CO2).
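To make the landmark-based ratios and angles concrete, the sketch below computes a few of them (HEL1, HEL2, CRS2, CRS3 and CRS4) from the six landmarks. It assumes that each angle is the interior angle at the landmark shared by the two line segments, which is our reading of the definitions above and not a stated convention. The landmark coordinates and function names are hypothetical.

```python
import math

def angle_deg(a, vertex, b):
    """Angle (degrees) at `vertex` between segments vertex-a and vertex-b."""
    u = [x - v for x, v in zip(a, vertex)]
    w = [x - v for x, v in zip(b, vertex)]
    cos_angle = sum(p * q for p, q in zip(u, w)) / (math.hypot(*u) * math.hypot(*w))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def landmark_variables(ld):
    """Selected ratios and angles from a dict mapping landmark number to (x, y, z)."""
    helpam = math.dist(ld[3], ld[6])  # Ld3-Ld6
    helcrs = math.dist(ld[4], ld[6])  # Ld4-Ld6
    helaam = math.dist(ld[5], ld[6])  # Ld5-Ld6
    pamaam = math.dist(ld[3], ld[5])  # Ld3-Ld5, ampullar line
    crspam = math.dist(ld[4], ld[3])  # Ld4-Ld3
    return {
        "HEL1": helpam / helcrs,
        "HEL2": helpam / helaam,
        "CRS2": crspam / pamaam,
        "CRS3": angle_deg(ld[5], ld[3], ld[4]),  # PAMAAM vs CRSPAM, vertex Ld3
        "CRS4": angle_deg(ld[1], ld[2], ld[4]),  # Ld1-Ld2 vs Ld2-Ld4, vertex Ld2
    }

# Hypothetical landmark coordinates (arbitrary units), for illustration only.
landmarks = {
    1: (1.0, 1.0, 0.0), 2: (0.0, 1.0, 0.0), 3: (0.0, 0.0, 0.0),
    4: (0.0, 3.0, 0.0), 5: (4.0, 0.0, 0.0), 6: (0.0, 0.0, 2.0),
}
variables = landmark_variables(landmarks)
```

The remaining angles (CRS1, CRS5, AAM1, AAM2, CO1, CO2) follow the same pattern with the appropriate landmark triplets.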

In order to assess measurement errors, each variable was measured twice on all the fossil specimens, with an interval of more than one day between trials. For each BM, the repeats cluster closely together: the repeats of the same BM are always significantly closer to one another than they are to any other specimen. We also observe a high reproducibility between the repeats (Wilcoxon signed-rank tests, p < 0.01), including for OWA, which requires the most careful consideration (Fig. S4).

For GMM, we used the R packages Morpho 4.0.5 (see https://cran.r-project.org/web/packages/Morpho/index.html) and Geomorph version 4.0.0 (see https://cran.r-project.org/package=geomorph). We considered four varying sets of semilandmarks (1, 7, 18 or 48) to represent each SCC. Therefore, we conducted four different analyses by using a total of either 8, 26, 59 or 149 (semi)-landmarks (Fig. S1).

We first performed a Generalized Procrustes analysis (GPA) (with scaling). Following superimposition, we summarized variation in shape space by using a Principal Components Analysis (PCA) and a between-group PCA (bgPCA). We use a permutation test (or Monte-Carlo test, or randomization test) in the R ade4 package (see https://cran.r-project.org/web/packages/ade4/index.html) in order to assess the statistical significance of the bgPCA. The statistical significance is evaluated with the ‘randtest.between’ function by simulating 999 permutations.
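The superimposition and ordination steps can be sketched as follows. This is a minimal NumPy illustration of a Generalized Procrustes analysis with scaling followed by an ordinary PCA, not the Morpho or Geomorph implementation used in the study. It assumes that all configurations share the same number of (semi)landmarks, and it omits semilandmark sliding, the bgPCA and the permutation test. The function names are hypothetical.

```python
import numpy as np

def align(shape, ref):
    """Rotate a centred configuration onto `ref` (orthogonal Procrustes fit)."""
    u, _, vt = np.linalg.svd(shape.T @ ref)
    rotation = u @ vt
    if np.linalg.det(rotation) < 0:  # exclude reflections
        u[:, -1] *= -1
        rotation = u @ vt
    return shape @ rotation

def gpa(configs, n_iter=10, tol=1e-10):
    """Generalized Procrustes analysis with scaling to unit centroid size."""
    x = np.asarray(configs, dtype=float)                   # (n, k, 3)
    x = x - x.mean(axis=1, keepdims=True)                  # remove translation
    x = x / np.linalg.norm(x, axis=(1, 2), keepdims=True)  # remove size
    mean = x[0]
    for _ in range(n_iter):
        x = np.stack([align(s, mean) for s in x])          # remove rotation
        new_mean = x.mean(axis=0)
        new_mean /= np.linalg.norm(new_mean)
        converged = np.linalg.norm(new_mean - mean) < tol
        mean = new_mean
        if converged:
            break
    return x, mean

def pca_scores(aligned):
    """PCA of aligned coordinates; returns PC scores and PC variances."""
    flat = aligned.reshape(len(aligned), -1)
    centred = flat - flat.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u * s, s ** 2 / (len(flat) - 1)
```

Applied to copies of one configuration that differ only in position, size and orientation, this procedure returns identical aligned shapes, so all PC variances are zero up to rounding error.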

To complete our analysis, we additionally analyze our sample using a deformation-based method from computational anatomy. Instead of relying on sparse corresponding features, as GMM does, this method estimates and analyzes dense deformations (diffeomorphisms) between non-homologous curves and shapes. For pairwise alignment of the SCCs, we employ deformetrica 4.0 (www.deformetrica.org)35. The alignment is driven by the metric of currents, which enables a comparison of non-homologous curves. The difference between the SCCs of two specimens is then modelled as the amount of diffeomorphic (smooth and invertible) deformation needed to align them. A more detailed description of the method and its application to the shape analysis of fossil hominins can be found in reference 11. Each diffeomorphism is modelled as a vector field which describes the displacements of a regular grid of control points, deforming the underlying space. The displacements of each control point represent the estimated amount of deformation between two curves. For our analysis, we compute a symmetric distance matrix from the control point displacements by calculating each pairwise distance as the average L2-norm of the displacement vectors. We then performed a non-metric Multidimensional Scaling (MDS)36 with an embedding dimension of two, in order to embed the high-dimensional data in a low-dimensional space, using the scikit-learn library (version 0.24.2) in Python 3.8.10.
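The construction of the distance matrix and the embedding can be sketched as follows. The sketch assumes that the control point displacements of each pairwise registration are available as an array with one row per control point, and that both registration directions are provided for every pair. Averaging the two directions to obtain a symmetric matrix is an assumption made for this illustration, since the symmetrization procedure is not detailed above. The `MDS` arguments follow the scikit-learn 0.24 interface and may need adjusting in later versions. The function names are hypothetical.

```python
import numpy as np

def mean_displacement_norm(displacements):
    """Average L2-norm of the control point displacement vectors (m x 3)."""
    return float(np.linalg.norm(np.asarray(displacements, dtype=float), axis=1).mean())

def distance_matrix(pairwise):
    """Build a symmetric distance matrix from pairwise registrations.

    `pairwise[(i, j)]` holds the displacements obtained when registering
    specimen i onto specimen j; both (i, j) and (j, i) must be present.
    """
    n = 1 + max(max(key) for key in pairwise)
    d = np.zeros((n, n))
    for (i, j), displacements in pairwise.items():
        d[i, j] = mean_displacement_norm(displacements)
    # Assumption: symmetrize by averaging the two registration directions.
    return 0.5 * (d + d.T)

def embed_2d(dist, seed=0):
    """Non-metric MDS in two dimensions on a precomputed distance matrix."""
    from sklearn.manifold import MDS  # imported here so the rest runs without it
    mds = MDS(n_components=2, metric=False, dissimilarity="precomputed",
              random_state=seed)
    return mds.fit_transform(dist)
```

Because non-metric MDS preserves only the rank order of the distances, the axes of the resulting embedding have no intrinsic scale, and the random seed should be fixed for reproducibility.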

Since we investigate multiple species across a wide range of sizes, the first modes of variation in the PCA and bgPCA likely represent a combination of size-correlated shape differences and shape differences among taxa unrelated to size10,22. We therefore investigate the relationships between SCC shape and size changes (allometry). The common allometric component (CAC) represents the pooled within-group direction of covariation (after removing inter-group variation) between the shape variables and the log centroid size10,22. The RSC scores describe the residuals of the pooled regression analysis, i.e., the non-allometric component. The first principal component of this analysis is referred to as residual shape component 1 (RSC1)10. We explore variation in the common allometric component (CAC) and residual shape components (RSCs)10. Finally, we employ the first few PC axes in a canonical variate analysis (CVA), which maximizes the between-group variance relative to the within-group variance in order to differentiate a priori defined groups. In all the statistical analyses, the StW 53 and StW 151 specimens are considered as indeterminate and projected onto the (bg)PC1/(bg)PC2 representations to identify their closest neighbours. This is also the case for the DNH 34 and DNH 60 specimens when we first investigate their taxonomic affinities (i.e., before their attribution to *P. robustus*) (Fig. 1). We differentiate nine a priori defined groups of fossils (Table S1) in the statistical analyses. We define four *P. robustus* groups: (i) one from Kromdraai Unit P (KW 9600, KW 9700, KW 9900, KW 10840), (ii) one specimen from Kromdraai Unit Q-R (KB 6067), (iii) three specimens from Drimolen (DNH 22, DNH 34, DNH 60) and (iv) three from Swartkrans (SK 83, SK 879, SKW 18/SK52). We also define four *A. africanus* groups: (i) the holotype from Taung, (ii) one specimen from Makapansgat (MLD 31), (iii) and (iv) two samples from Sterkfontein, in order to distinguish the specimens attributed to this species on a consensual basis (Sts 5, Sts 19, StW 98, StW 329) from those (StW 252/255/259, StW 498, StW 504/505, StW 573 and StW 578) that have been referred by one worker7 to a second species with purported robust australopith affinities.
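The allometric decomposition described above can be illustrated with the following minimal NumPy sketch. It assumes that the CAC is estimated by a pooled within-group regression of the Procrustes shape coordinates on log centroid size, with both centred on their group means, and that the RSCs are the principal components of the regression residuals. This follows our understanding of the approach in the cited references10,22 and is not the code used in the study. The function names are hypothetical.

```python
import numpy as np

def centre_within_groups(values, groups):
    """Subtract each group's mean from its members (removes inter-group variation)."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = values.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= values[mask].mean(axis=0)
    return out

def common_allometric_component(shapes, log_cs, groups):
    """Return the CAC vector, CAC scores and RSC scores.

    shapes: (n, p) flattened Procrustes coordinates
    log_cs: (n,) log centroid sizes
    groups: (n,) a priori group labels
    """
    x = centre_within_groups(shapes, groups)
    s = centre_within_groups(log_cs, groups)
    coefficients = x.T @ s / (s @ s)          # pooled within-group regression slopes
    cac = coefficients / np.linalg.norm(coefficients)
    cac_scores = x @ cac
    residuals = x - np.outer(cac_scores, cac)  # non-allometric shape variation
    _, _, vt = np.linalg.svd(residuals, full_matrices=False)
    rsc_scores = residuals @ vt.T              # first column corresponds to RSC1
    return cac, cac_scores, rsc_scores
```

When the shape differences within groups are purely allometric, the CAC recovers the direction of the size-related shape change and all RSC scores are zero up to rounding error.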