Molecular Evolution of APP family proteins: Rapid Evolution of the Mammalian APLP1 as a Synaptic Adhesion Molecule


 Amyloid precursor protein (APP) family members are involved in essential neuronal activities, and dysfunction of APP family members leads to neurodegenerative diseases such as Alzheimer’s disease. Among the APP gene family members, amyloid precursor-like protein 1 (APLP1) is selectively expressed in neurons and has specialized functions during synaptogenesis. Although a potential role for APLP1 in neuronal evolution has been indicated, its precise evolutionary and functional contributions are unknown. This study shows the molecular evolution of the vertebrate APP family based on phylogenetic analysis, while contrasting the evolutionary differences within the APP family. Phylogenetic analysis showed 15 times higher substitution rate that is driven by positive selection at the stem branch of the mammalian APLP1, resulting in dissimilar protein sequences compared to APP/APLP2. Docking simulation identified one positively selected site in APLP1 that alters the heparin-binding site, which could affect its function, and dimerization rate. Furthermore, the evolutionary rate covariation between the mammalian APP family and synaptic adhesion molecules (SAMs) was confirmed, indicating that only APLP1 has evolved to gain synaptic adhesion property. Overall, our results suggest that the enhanced synaptogenesis property of APLP1 as one of the SAMs may have played a role in mammalian brain evolution.

Introduction APP (amyloid precursor protein) family comprises of type-I single-pass transmembrane proteins and is composed of APP, amyloid precursor-like protein (APLP) 1, and APLP2. APP and its truncated peptide, amyloid-β protein, has been extensively studied in neurodegenerative diseases, including Alzheimer's disease [1]. Although the physiological functions of APP family, such as synaptic plasticity, synaptogenesis, and neuronal protection have been reported, a clear association of these functions with disease onset remains elusive [2][3][4]. Importantly, there exists redundancy as well as differentiation of APP, APLP1, and APLP2 functions. A complementary role was shown when contrasting brain development was observed in triple knockout (KO) mice compared to single or double KO mice. In the triple KO, neuronal cells migrated away from their normal position to the marginal zone, while single or double KO showed no abnormalities in cell organization during brain development in any of the mutants, thus, showing overlapping APP family functions [5,6]. However, distinct roles for APP family were suggested based on aged-single KO mice that retained different electrophysiological responses related to synaptic dysfunction: long term potentiation (LTP) was reduced in APP −/− mice; spine density was reduced in APP −/− and APLP1 −/− mice; and ber volley/ excitatory post-synaptic potential (EPSP) slope was reduced in APLP1 −/− mice [2,7,8]. Furthermore, although double KO, APP −/− APLP2 −/− , and APLP1 −/− APLP2 −/− were more lethal to mice due to de cits in the neuromuscular junction, APP −/− APLP1 −/− mice were viable, clearly showing unique roles for each family member [6].
All APP family members contain large ectodomains, each comprising E1 and E2 domains, and a relatively smaller intracellular domain (Fig. 3A). E1 domain is further subdivided into two subdomains: heparinbinding domain (HBD) and copper-binding domain (CuBD), which bind to extracellular factors such as heparin, copper, or zinc ions, and secreted glycoproteins, including bulin1 [1]. Two heparin binding sites are present in HBD of E1 and E2, which the prior is thought to be lost in APLP1. The intracellular domain contains the YENPTY motif that is conserved from Drosophila APP (APPL) to vertebrate APP family, retaining several binding partners, including X11, FE65, and clathrin for endocytosis [1,9]. There is a Kunitz-type protease inhibitor domain (KPI) between E1 and E2 domains, which is absent from APLP1. APLP1 speci city for central nervous system (CNS) is one of the differentiated functions of APP family members [10,11]. For example, APLP1 is expressed selectively in neurons, whereas APP and APLP2 are ubiquitously expressed [2,10]. More precisely, within the developing cortex where all APP family members are expressed, APLP1 is expressed exclusively in the cortical plate (CP), which later becomes layer II-VI in the mature brain [12]. The physiological function of APLP1 is also modi ed; compared to APP and APLP2, APLP1 has an augmented synaptogenesis function. Thus, the treatment of non-neuronal HEK293T cells with APLP1 induces e cient pre-synapse formation for axon contacts [2,3,13]. Based on this function, the APP family is classi ed as a synaptic adhesion molecule (SAM), thus ful lling synaptic localization/function and cell adhesion property observed in proteins such as neuroligin and neurexin. After branching from the common ancestor of the APP family, the expression and function of APLP1 is primarily limited to neurons; however, the role of APLP1 in neuronal or brain evolution remains unclear.
The rate of trans-dimerization describes the synaptogenesis e ciency among APP family members [2,13]. Trans-dimer of E1 domain of APP family bridges pre-synapse and post-synapse, thus providing a site for the assembly of subsequent additional synaptic protein (e.g., X11) for synapse maturation [14]. E cient cell surface expression and surface property of APLP1 are the critical factors for dimerization, and a precise phenotype-genotype correlation is not undertaken yet [2,4]. Moreover, extracellular heparin and metal ions are factors known to accelerate dimerization rate [1,[15][16][17]. By revealing the genotypephenotype relationship of APP family dimerization, it is possible to identify factors facilitating the SAM function of APLP1 during vertebrate brain evolution.
This study focuses on the functional divergence of APLP1 from vertebrate APP/APLP2. We hypothesize that APLP1 has evolved during vertebrate evolution since exclusive neuronal expression of APLP1 is observed from zebra sh to humans (expression data for zebra sh is from https://z n.org/ZDB-FIG-050630-2590). Based on the phylogenetic reconstruction and positive selection analysis, we showed that APLP1 evolves faster than APP/APLP2. Notably, signi cant changes in APLP1 take place at the stem of the mammalian APLP1 changing its protein properties, including heparin-binding for dimerization, distinct from the vertebrate APP family. Finally, a signi cant rise in evolutionary rate covariation was observed between mammalian APLP1 and SAMs, indicating the evolutionary direction of mammalian APLP1 as SAMs. The function of APLP1 as SAMs perhaps played a role in mammalian brain evolution where a six-layered neocortex appeared.

Results
APLP1 distinct functions may have evolved from the stem branch of mammals As APLP1 has specialized molecular functions and expression patterns after duplication from the APP family, we hypothesized that it might have undergone accelerated evolution [18]. Based on the nucleotide sequences, a phylogenetic tree was constructed for 590 vertebrate APP family sequences (APP: 212, APLP1: 177, APLP2: 201) using neighbor-joining method [19]. Human APP family sequences were used as query sequences, and all the redundant sequences were cut after alignment (Supplementary Table 1). As reported, APP family was detected in class of mammal, bird, reptile, amphibian, and sh, while APLP1 was not found in birds [9]. Interestingly, the stem branch of mammalian APLP1 exhibited about 15 times longer branch (APLP1 = 0.260, APP = 0.011, APLP2 = 0.017) compared to that of APP or APLP2, indicating rapid evolution of ancestral mammalian APLP1 (Fig. 1a). In addition, the branching order of mammal, bird, reptile, amphibian, and sh is irregular for APLP1, though APP and APLP2 branching were based on the species tree.
To evaluate the selective pressure on the stem of mammalian APLP1, the ratio of dN (substitution altering amino acid) and dS (substitution not altering amino acid or neutral substitution) was calculated [20,21]. First, we prepared three multiple sequence alignments (MSAs) of vertebrate APP family containing sh, sh + amphibian, sh + amphibian + reptile + bird, and sh + amphibian + reptile + bird + mammal, respectively. Then dN/dS between the two groups were compared statistically. As a result, unlike APP or APLP2, APLP1 showed a signi cant increase in dN/dS when including mammals in MSA (Fig. 1b). Furthermore, MSAs that are composed of only mammals were created for calculating intramammalian dN/dS. Intra-mammalian dN/dS of APLP1 did not exhibit the highest dN/dS value, suggesting APLP1 has accelerated substitution rate only at the stem of mammals ( Supplementary  Fig. 1). Next, to examine branch-speci c evolution at the stem of mammalian APLP1, a branch-site model based on maximum likelihood, dN/dS estimation was applied [21]. As a result, tness to positive selection model (Model A) was signi cantly larger than its nested-null hypothesis (Model A null) at the stem branch of mammalian APLP1, supporting branch-speci c positive selection, in contrast to APP or APLP2 (Table 1). We also assessed the impact of genomic but not protein evolution effect on the rapid evolution of the stem of mammalian APLP1. This study estimated the GC-content (GC%) and the RSCU, which can be under natural selection as it positively correlates with the expression of a gene for the e ciency of transcription or translation [22,23]. As a result, GC% of the vertebrate APP family varies from class to class ( Supplementary Fig. 2). GC content at third codon position (GC3%), which re ects GC-biased selection, was relatively high in some classes (APP-sh, APLP1-mammal, APLP1-reptile, APLP2-mammal, and APLP2-sh). However, high GC3% was not restricted only to APLP-mammal. PCA was performed based on RSCU for the vertebrate APP family. APP, APLP1, and APLP2 were isolated to some extent, showing divergence on RSCU not only for APLP1, but also for the other APP family members (Fig. 3).
Altogether, GC-content and RSCU vary among vertebrate APP families, supporting the notion that the protein sequence of APLP1 has extensively evolved at the stem of mammal. Mammalian APLP1 has a discrete protein sequence compared to other vertebrate APP families To precisely differentiate the mammalian APLP1 protein sequence from the APP family, three distinguished clustering methods were applied to the same vertebrate APP family dataset. Figure 2a shows the phylogenetic tree of APP family protein sequences based on the neighbor-joining method, which clusters sequences from the closest pair branches [19]. It clearly showed distant mammalian APLP1 clade in contrast to mammalian APP and APLP2. This result is consistent with nucleotide sequence phylogenetic tree in Fig. 1a, suggesting nonsynonymous substitution substantially occurred at the stem branch of mammalian APLP1.
Subsequently, we conducted a PSA-based clustering of the APP family sequences using the Max-cut algorism to complement the above-mentioned phylogenetic analysis. In previous studies, it was observed that the interior branches close to the tree stem tend to reconstruct better when the evolutionary distance matrix is derived from the PSA, not from the MSA [24,25]. Additionally, from the same group, accurate sequence clusters were reported when using the graph-cut algorism, as opposed to the neighbor-joining method [24]. Hence, Max-cut, a graph-cut algorism, was applied to each of the PSA matrix of the APP family paralogs, which were comprised of 20-24 sequences each (Fig. 2b, c). The algorism attempts to maximize the total distance between the two clusters (i.e. obtain two clusters that discriminates between a set of sequences). Thus, we were able to obtain results consistent with that of the neighbor-joining method in Fig. 2a. Amniotes (mammal-bird-reptile) and other vertebrates (amphibian-sh) were clustered with regards to APP and APLP2, showing a phylogenetically reasonable grouping. However, APLP1 exhibited clusters between mammals-sh and birds-reptiles-amphibians, suggesting a dissimilarity in the APLP1 sequences among the amniotes, as illustrated in Fig. 2a.
To determine if the protein sequences of mammalian APLP1 have speci c properties distinct from other sequences, the sequences were clustered based on their amino acid composition grouped by attributes. Seven attributes (hydrophobicity, normalized Vander Waals volume, polarity, polarizability, charge, secondary structure, and solvent accessibility) as descriptors were calculated for each sequence [26,27] and clustered using PCA. The top panel in Fig. 2d shows PCA of the vertebrate APP family written by simple amino acid composition. The lower panel in Fig. 2d shows the PCA of clustering sequences based on seven attributes as descriptors. As a result, the mammalian APLP1 was distinguished solely from other members of the vertebrate families, suggesting it has a unique amino acid sequence (Fig. 2d). Factor loadings in PCA revealed amino acids with intermediate polarity (P, A, T, G, S) and strand inducing amino acids (V, I, Y, C, W, F, T), which negatively and positively isolated mammalian APLP1 clusters (Supplementary Table 2). Altogether, the mammalian protein APLP1 evolution might have been selected for a speci c evolutionary axis.
The heparin-binding domain of mammalian APLP1 undergoes a higher degree of positive selection Site-speci c dN/dS is useful for identifying which function of mammalian APLP1 has evolved. First, to extract domain that may have experienced functional evolution, pairwise dN/dS was estimated for each domain of the vertebrate APLP1. The comparison showed a signi cant increase in dN/dS in HBD (Fig. 3b). This domain is a subdivision of the E1 domain and is one of the binding sites for proteins, metal ions, and heparin, thus, regulating cis-or trans-dimerization of the APP family [13,28,29]. One of the positively selected sites (position 122 in human APLP1 sequence) in HBD is present in heparinbinding loop structures in APLP1 as detected by the branch-site model (Table 1, Fig. 3c). Additionally, another site in the loop (position 125 in human APLP1 sequence) that is required for heparin-binding is substituted by uncharged residues [16,29].
It is speculated that the positively charged loop of APP/APLP2 regulates dimerization through electrostatic interactions with heparin, a small extracellular molecule [15,16]. In contrast, mammalian APLP1 lacks the positively charged residues compared to other vertebrate APLP1 and considered to have lost the heparin-binding ability (Fig. 3c) [30,31]. In humans, four out of the six positions in APP and APLP2 HBD loop are positively charged residues, while in APLP1, only one out of the seven positions is a positively charged residue. Hence, we propose that positive selection leads to the loss of heparin-binding to the APLP1 loop. However, earlier studies showed that E1 domain heparin-a nity is directed only for APP, but not for APLP1, and does not show direct evidence for loss of heparin-binding at APLP1-E1 domain [16,31]. Furthermore, APLP1 localizes mostly at the cell surface, enhancing its ability to bind extracellular heparin compared to APP or APLP2 [2,29]. Docking simulations were used to identify whether APLP1 has the heparin-binding ability.
Human APP family HBD structures were modeled based on homology modeling implemented in Phyre2 and subsequently docked with heparin using the ClusPro web server (template and input sequences are shown in Supplementary Table 3). Figure. 4a shows (upper row) heparin (colored in yellow) bound to the positively charged loop (colored in red) in mammalian APP and APLP2, and non-mammalian APLP1. In contrast, mammalian APLP1, bound heparin at other positively charged surfaces (Fig. 4a, lower row). ClusPro quanti ed the results by counting the number of times heparin is bound to an arbitrary residue, where the contact is de ned as distance < 4.0 Å. The quanti cation represented the emergence of novel heparin-binding residues (R80, R82, R83, and R86 in human sequence) conserved solely for mammalian APLP1 (Fig. 4b, 4c). Here, the known binding loop is highlighted in red. Residues, R80, R82, and R83 are located in non-secondary structures, whereas R86 is present in a strand. Notably, R80 exhibited signs of signi cant positive selection, using the branch-site model (p-value = 0.002). Mammalian APLP1 has experienced loss of heparin-binding function at the loop and has subsequently acquired an alternative binding surface for heparin.
The heparin-binding region in the E2 domain was tested using a similar docking simulation procedure to identify if a mammalian APLP1 speci c binding mode can be observed, as in HBD. We did not detect extensive positional alterations of heparin for E2 docking ( Supplementary Fig. 4, Supplementary Table 3).
Coevolutionary partners of mammalian APLP1 could be distinct from APP or APLP2 As shown above, the mammalian APLP1 evolutionary direction is indicated by the phylogenetic analysis.
To further re ne the mammalian APLP1 functional evolutionary direction, we conducted a coevolutionary analysis. The coevolutionary analysis is based on phylogenetic tree comparison, where trees tend to have similar topology when they experience similar evolutionary pressure for related functionality [32,33].
Here, the similarity of the phylogenetic trees is represented as a correlation coe cient and is de ned as ERC. ERC is derived from the comparison of each of the corresponding branch lengths after optimization based on species tree (details provided in Materials and Methods) [34].
We focused speci cally on SAM proteins, as APLP1 has enhanced cell-cell adhesion function at synapse compared to APP or APLP2, based on previous reports [2]. Moreover, our phylogenetic analysis and docking simulation support the evolution of mammalian APLP1 towards cell-cell adhesion ( Fig. 3 and Fig. 4). ERC with mammalian APP family was calculated for reported 35 SAM and SAM-like proteins (e.g., neurexin, neuroligin, and calsyntenin3), and other neuronal-system related pathways including synapse pruning, neuronal migration, neuromuscular junction, neural precursor protein (NPC) proliferation and epithelial cell adhesion retrieved from the Reactome pathway or Gene Ontology [34][35][36]. In addition, we introduced set of proteins including all neuronal system related proteins (R-HSA-112316: 329 genes) from the Reactome pathway. As it is the parental node that involves most of neuronal system related proteins, the ERC value between this gene set and APLP1 was used as basal ERC value. Subsequently, ERC values derived from each dataset were tested whether it shows statistically higher ERC values compared to the basal ERC value. As a result, the ERC value between SAM proteins and mammalian APLP1 was signi cantly higher compared to the basal ERC value. ERC values of APLP1 with the other pathway related proteins did not statistically exceed the basal ERC (Fig. 5). This trend was absent for APP-SAM or APLP2-SAM ERCs (Supplementary Table 4, APP-SAMs = -0.009, APLP1-SAMs = 0.276, APLP2-SAMs = -0.132). Our ERC analysis revealed that the evolutionary pro le of mammalian APLP1 is relatively close to SAM proteins.

Discussion
Phylogenetic analysis, including phylogenetic tree reconstruction, positive selection analysis, and coevolution partner estimation, were applied to the vertebrate APP family. Vertebrate APP family tree displayed different branch lengths at the stem branch of the mammalian APLP1 compared to corresponding branches of APP and APLP2. It was determined to be the positively selected branch based on the branch-site model of dN/dS calculation; and considered subsequently to result in unique protein sequences among vertebrate APP family, as revealed by PCA of amino acid composition grouped by attributes. To identify the functional divergence of mammalian APLP1, in silico heparin docking simulation was conducted by focusing on HBD that had the highest pairwise dN/dS value. As a result, two of the positively selected sites in mammalian APLP1 HBD might lead to loss of heparin-binding; however, complemented in distinct binding regions. Finally, we used ERC values to re ne the evolutionary direction of mammalian APLP1 as SAM.
The occurrence of the APLP1 branching event before APP/APLP2 branching was suggested previously by similar phylogenetic analysis [37]. Another phylogenetic analysis suggested a distant function of APLP2 [9]. We used all available vertebrate sequences for the analysis, while earlier reports used only a limited number of sequences. Moreover, the differences in the results could be due to the use of different analytical parameters, since we performed dN/dS analysis to identify site-speci c positive selection based on codon sequences, while previous research focused on protein sequence comparisons.
Importantly, our analysis shows evolution acceleration at the stem of the mammalian APLP1, driven presumably by positive selection. PCA results show that (Fig. 2d) the evolution of mammalian APLP1 resulted in protein sequences with speci c protein properties, different from not only mammalian APP/APLP2, but also non-mammalian APLP1. The directional evolution of mammalian APLP1 could perhaps result in specialization towards mammalian neurons or brain since the APLP1 expression location has not changed from sh. It is worth comparing the mammalian APLP1 with nonmammalian APLP1 on the basis of cell biology experiments, including tests for cell-cell adhesion, dimerization ratio, rate of endocytosis, and e ciency of synaptogenesis, the reported functional differences of APLP1 compared to mammalian APP or APLP2 [2,38].
Branch-site models implemented in PAML enabled us to further analyze the speci c site under positive selection at the stem branch of mammalian APLP1 (Table 1, Fig. 3, 4). Substitutions at the heparinbinding loop in HBD in particular caused loss of heparin-binding function in the mammalian APLP1 lineage, which was complemented with another unique positively charged region at 80-86 position (Fig. 4). The positively selected sites of human APP are likely to bind heparin, thus, promoting dimerization of APP, as shown by site-directed mutations. In addition, the rigid binding mode of heparin to the loop is shown in the cocrystal structure of E1-APP and heparin complex [16]. As such, the binding region of heparin may have changed in the mammalian APLP1, a likelihood that alterations may have affected the e cient dimerization of the mammalian APLP1. However, dimerization could also occur without heparin [2]. Thus rapid evolution of the mammalian APLP1 protein surface property may be the basis for a faster dimerization rate.
Based on in vitro ndings that APLP1 exhibits SAM properties, we evaluated whether the evolutionary pro les are also directed towards SAM, using coevolutionary information of known SAM proteins [2]. As a result, the phylogenetic tree of the mammalian APLP1 showed a signi cant correlation with other SAM proteins, suggesting global evolutionary pressure on mammalian APLP1 to develop into SAM speci c sequence. Interestingly, the other neural pathways did not show high ERC value, indicating APLP1's particular evolution towards SAM proteins. ERC analysis tends to detect not only binding partners, but also proteins in same pathway or similar function, requiring further experiments to determine how biologically related the coevolving proteins are [32,33]. Notably, binding partners of each APP family members in mouse brain exhibits unique set of proteins, respectively [39]. This raises the possibility of acquisition of APLP1's speci c binding partners at stem of mammal, that may adjust evolutionary rate of APLP1 towards SAM proteins. Moreover, human APLP1 shows stronger localization at the cell surface and CNS compared to APP/APLP2, having considerable chance of interaction with binding partners which the other family members rarely coexpress [2,40]. Further investigations are needed to identify to what evolutionary demand mammalian APLP1 had responded. It is interesting to see how the accelerated evolution of APLP1 led the mammalian brain to achieve a unique six-layered cortex and expand the neocortex region among amniotes [41]. Since APP and APLP2 are expressed ubiquitously, evolutionary constraints may not be realized since their general function must be conserved.
A 1:1 orthologous sequence of human APLP1 and APLP2 was chosen using the same procedure as described earlier, veri ed through the OMA browser for a list of gene-level orthologues [45,46]. In contrast, sh APP linage seemed to independently experience duplication events resulting in duplicates named APPa and APPb [9]. As duplication may accelerate the heterogeneity of dN/dS value [47], we compared dN/dS of APPa/b and concluded that their insigni cant dN/dS values would not affect the results of the dN/dS analysis. The APPa/b sequences were collected as mentioned above using zebra sh APPa/b (APPa: NM_131564, APPb: NM_152886) as a query (total of 29 species each).

Phylogenetic Tree Reconstruction
Phylogenetic trees were constructed using the neighbor-joining method implemented in MEGA7 [19,38]. The evolutionary distance of the nucleotide sequence was calculated based on the maximum composite likelihood method [48]. Dayhoff matrix was used to calculate the evolutionary distance of the amino acid sequence [49]. Sites with gaps were not included in the reconstruction. To con rm the assurance of branching events, bootstrap value (resampling = 100 times) was inferred, and branches retaining bootstrap value > 50 are considered in the analysis [50].
Positive Selection Estimation Using Dn/ds Calculation A pairwise dN/dS calculation was performed in MEGA7 by Li-Wu-Luo method with 100 bootstrap replications for each APP family divided into either clade level ( sh, sh + amphibian, sh + amphibian + reptile + bird, and sh + amphibian + reptile + bird + mammal) or domain level (HBD, CuBD, E2 domain) [51]. Sites with gaps were excluded from the calculation. Bootstrap resampling was conducted, as shown above.
Maximum likelihood dN/dS calculation under branch-site model (Model A vs. Model A null) implemented in PAML was performed as described earlier, using EasyCodeML that contains GUI-based format conversion and CodeML in one platform [52,53]. Here, MSA and phylogenetic tree data were prepared separately for each vertebrate APP family using a similar procedure as described above with three inputs. This method differs from pairwise dN/dS calculation due to its direct estimation of positive selection on a branch of interest (foreground branch) with statistical testing.
Clustering Method Using Digital Annealer Protein sequences of the APP family paralogs were clustered using the Fujitsu's Digital Annealer (DA), a hardware specialized for solving optimization problems [54]. Brie y, given a sequence distance matrix and cost function, DA searches for optimal bipartition of the dataset to maximize the distance between the two clusters. The distance matrix (termed the similarity graph) was rst described by the evolutionary distance using pairwise sequence alignment (PSA). PSA and evolutionary distance (here, Poissoncorrected distance) calculation was performed using the Data Analysis in Molecular Biology and Evolution (DAMBE) software [25]. Thereafter, we assigned each sequence into an appropriate cluster and maximized the total distance between the two clusters; cost function was used to describe the total distance. As DA accepts binomial values for the variables, the total distance between two clusters containing n number of sequences was calculated as follows:

Principle Component Analysis (pca) And Dataset Preparation
Principle component analysis (PCA) was conducted for the vertebrate APP family sequence based on relative synonymous codon usage (RSCU), amino acid composition, and amino acid composition grouped by attributes. RSCU and GC-content were calculated in MEGA7 [19,38]. Seven attributes, hydrophobicity, normalized Vander Waals volume, polarity, polarizability, charge, secondary structure, and solvent accessibility and their composition for each sequence, were calculated using the ProtrWeb server [55]. Then, PCA using prcomp function in stats supplied as a default package in R was applied for the dataset. The results for ProtrWeb and PCA, including factor loadings, are shown in the supplementary table 2.

Tertiary Structure Modeling And Docking Simulation On Heparin
Tertiary structures of HBD of human APP family (APP: NM_000484, APLP1: NM_001024807, APLP2: NM_001642, reptile APLP1: XM_030212233.1, amphibian APLP1: XM_026702387, and sh APLP1: XM_012862627) were modeled based on homology modeling as implemented in Phyre2 server [56]. Brie y, each HBD of the APP family was truncated manually and subsequently submitted to the webserver. Alanine substitution model for human APLP1 was manually created by changing to R80A, R82A, R83A, and R86A. The default mode was used for tertiary structure prediction. Protein models with the con dence of 100% and percent identity > 40% were chosen for further prediction, which are preferred index values for reliable models. The template crystal structure for homology modeling was HBD of APP (PDB: chain A of 1MWP). Detailed inputs and output scores for modeling are presented in supplementary  table 3.
Subsequently, the predicted models were transferred to the heparin docking simulation implemented in ClusPro 2.0 server [57]. Brie y, the ligand (heparin) rst determines 70000 rotation poses and parallelly translates around xed HBD protein structure by 1.0 Å 3D grid. Then, for each grid point, the pose with the lowest interaction energy (scored by van der Waals attractive, van der Waals repulsive, and electrostatic energy) is selected. Finally, among the selected poses, neighboring poses de ned as RMSD within 9 Å are clustered to determine the most preferred binding mode chosen by its clustering size. Default parameters were used for heparin docking simulation.
The surface electrostatics of protein models were calculated using the PDB2PQR server using a pH of 7.4 [58]. The visualization, including models and surface electrostatics, was conducted in PyMOL.

Coevolutionary Analysis Using Evolutionary Rate Covariation And Coexpression
The coevolutionary analysis was undertaken using evolutionary rate covariation (ERC) value. ERC value is the degree of similarity in a pair of proteins based on phylogenetic tree similarities, here predicted in webserver [33]. The dataset is generated from all available mammalian protein sequences in the UCSC server. Brie y, the correlation coe cient derived from covariation of evolutionary distance between corresponding branches of pair proteins is calculated. Notably, this server normalizes the evolutionary distance by species tree before the calculation of the correlation coe cient.
The tness of model A and model A null were compared by likelihood ratio test (LRT) assuming test statistic 2Δℓ following χ 2 distribution: positive selection model or Model A (allowing dN/dS or ω > 1 for foreground branches) and purifying selection model or Model A null (not allowing dN/dS or ω = 1 for foreground branches). Subsequently, Bayes Empirical Bayes (BEB) analysis were applied for sites in foreground branch with ω > 1 to test whether its value signi cantly have ω > 1 (positive selection) judged by posterior probability (Posterior probability: * > 0.90, ** > 0.95, *** > 0.99).

Declarations Data Availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.