Analysis of physicochemical properties of yak KLK protein
Analysis of the basic physicochemical properties of the yak KLK proteins showed that their molecular weights ranged from 17276.46 to 31884.23 Da, their lengths from 160 to 293 amino acids, and their instability indexes from 33.64 to 56.01. The instability indexes of KLK1, KLK4, KLK5, and KLK6 were below 40, indicating stable proteins, whereas those of KLK7 and KLK15 were above 40, indicating poor stability. The aliphatic index ranged from 68.94 to 86.00. The grand average of hydropathicity of all six KLK proteins was below 0, so all are hydrophilic. KLK1, KLK4, KLK5, KLK6, and KLK15 have signal peptides, whereas KLK7 does not (Table 2).
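The stability and hydropathy calls above rest on two simple cutoffs: an instability index below 40 is read as stable, and a negative grand average of hydropathicity (GRAVY) is read as hydrophilic. The following minimal Python sketch applies those cutoffs to the values reported in Table 2. The dictionary layout and the function name `classify` are illustrative assumptions, not part of the original analysis.

```python
# Instability index and GRAVY values copied from Table 2.
TABLE2 = {
    "KLK1": (36.55, -0.175),
    "KLK4": (36.26, -0.03),
    "KLK5": (38.42, -0.29),
    "KLK6": (33.64, -0.244),
    "KLK7": (56.01, -0.259),
    "KLK15": (44.84, -0.187),
}


def classify(instability, gravy, cutoff=40.0):
    """Return (stability, hydropathy) labels using the cutoffs in the text."""
    stability = "stable" if instability < cutoff else "unstable"
    hydropathy = "hydrophilic" if gravy < 0 else "hydrophobic"
    return stability, hydropathy


# One (stability, hydropathy) call per protein.
calls = {name: classify(*values) for name, values in TABLE2.items()}
for name, (stability, hydropathy) in calls.items():
    print(f"{name}: {stability}, {hydropathy}")
```

Run on the Table 2 values, the sketch labels KLK7 and KLK15 as unstable and all six proteins as hydrophilic, matching the text.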
Table 2 Analysis of physicochemical properties of encoded proteins
Name | Formula | Isoelectric point | Molecular weight (Da) | Number of amino acids | Instability index | Aliphatic index | Grand average of hydropathicity
KLK1 | C1274H1928N332O381S16 | 4.81 | 28504.28 | 259 | 36.55 | 79.77 | -0.175
KLK4 | C770H1167N207O231S8 | 4.99 | 17276.46 | 160 | 36.26 | 86.00 | -0.03
KLK5 | C1392H2180N406O417S19 | 8.64 | 31884.23 | 293 | 38.42 | 73.62 | -0.29
KLK6 | C1182H1857N347O346S17 | 6.93 | 27009.88 | 246 | 33.64 | 84.84 | -0.244
KLK7 | C1325H2089N369O405S20 | 9.02 | 30309.59 | 284 | 56.01 | 68.94 | -0.259
KLK15 | C1222H1954N362O362S18 | 8.23 | 28086.25 | 257 | 44.84 | 83.42 | -0.187
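The molecular formulas and molecular weights in Table 2 can be cross-checked against each other, because the weight is the sum of the atomic masses in the formula. The sketch below does this in Python with rounded standard atomic masses. The mass constants and the helper name `formula_mass` are assumptions introduced for illustration, and small deviations from the reported values are expected because the tool behind Table 2 may use slightly different constants.

```python
import re

# Rounded standard average atomic masses (g/mol); an assumption of this sketch.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

# Formula and reported molecular weight (Da), copied from Table 2.
TABLE2 = {
    "KLK1": ("C1274H1928N332O381S16", 28504.28),
    "KLK4": ("C770H1167N207O231S8", 17276.46),
    "KLK5": ("C1392H2180N406O417S19", 31884.23),
    "KLK6": ("C1182H1857N347O346S17", 27009.88),
    "KLK7": ("C1325H2089N369O405S20", 30309.59),
    "KLK15": ("C1222H1954N362O362S18", 28086.25),
}


def formula_mass(formula):
    """Sum atomic masses for a formula such as 'C770H1167N207O231S8'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing count means a single atom of that element.
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


# Difference between the recomputed and the reported weight, per protein.
deviations = {
    name: formula_mass(formula) - reported
    for name, (formula, reported) in TABLE2.items()
}
for name, delta in deviations.items():
    print(f"{name}: deviation {delta:+.3f} Da")
```

With these constants, the recomputed weights agree with the reported ones to within a fraction of a dalton for all six proteins, which suggests the formula and weight columns are internally consistent.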
Prediction of phosphorylation sites of yak KLK protein
The yak KLK proteins have predicted serine, threonine, and tyrosine phosphorylation sites (Table 3), with serine sites the most numerous, followed by threonine and then tyrosine. The total number of phosphorylation sites per family member ranges from 15 to 53: KLK7 has the most (53) and KLK6 the fewest (15).
Table 3 Predicted phosphorylation sites of yak KLK protein
Name | Serine | Threonine | Tyrosine
KLK1 | 8 | 5 | 3
KLK4 | 13 | 2 | 2
KLK5 | 20 | 10 | 5
KLK6 | 8 | 4 | 3
KLK7 | 39 | 11 | 3
KLK15 | 15 | 8 | 2
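The totals quoted in the text follow directly from Table 3. As a quick check, the sketch below sums the serine, threonine, and tyrosine counts for each protein and for the family as a whole. The variable names are illustrative.

```python
# (serine, threonine, tyrosine) predicted sites, copied from Table 3.
TABLE3 = {
    "KLK1": (8, 5, 3),
    "KLK4": (13, 2, 2),
    "KLK5": (20, 10, 5),
    "KLK6": (8, 4, 3),
    "KLK7": (39, 11, 3),
    "KLK15": (15, 8, 2),
}

# Total predicted sites per protein.
totals = {name: sum(counts) for name, counts in TABLE3.items()}

# Family-wide totals per residue type, in (Ser, Thr, Tyr) order.
by_residue = [sum(column) for column in zip(*TABLE3.values())]

most = max(totals, key=totals.get)
fewest = min(totals, key=totals.get)
print(totals)
print(dict(zip(("Ser", "Thr", "Tyr"), by_residue)))
print(f"most: {most} ({totals[most]}), fewest: {fewest} ({totals[fewest]})")
```

The per-protein totals run from 15 (KLK6) to 53 (KLK7). Summed over the family, serine (103) exceeds threonine (40), which exceeds tyrosine (18). The ordering is a family-level statement: in KLK4 the threonine and tyrosine counts are equal.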
Yak KLK protein sequence alignment and phylogenetic tree construction
The amino acid sequences of the yak KLK family were aligned with the ClustalW online tool (Fig. 1a). Sequence similarity across the family is low and the alignment scores are correspondingly low, although the scores among KLK1, KLK5, and KLK15 are markedly higher than the overall level. MEGA11 was used to construct a phylogenetic tree of KLK protein sequences from yak, human, Rattus norvegicus, sheep, pig, and Bos indicus (Fig. 1b). The tree shows that sequences of the same family member are highly similar for KLK5 and KLK1, and that among different family members KLK5 and KLK4 are the most closely related and may share a common ancestor. In contrast, the KLK7 sequences of different species are markedly separated. Interestingly, yak KLK7 and Rattus norvegicus KLK7 lie on the same branch and are highly similar.
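To make the notion of pairwise similarity concrete, the sketch below computes percent identity between two already aligned sequences. It is an illustration only: the two sequences are hypothetical toy strings, not yak KLK sequences, and identity is defined here over columns in which neither sequence has a gap, which may differ from the scoring used by ClustalW.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments, for illustration only.
toy_a = "MKT-LLA"
toy_b = "MRTALLA"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

For the toy pair, six columns are compared and five match, giving about 83.3% identity.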
Prediction of conserved regions of yak KLK protein
Six different motifs were predicted in the yak KLK proteins by MEME. The structures of the motifs are shown in Fig. 2a. The motifs are 50, 41, 21, 44, 15, and 16 amino acids long, respectively. KLK4 and KLK7 have fewer motifs than KLK1, KLK5, KLK6, and KLK15 (Fig. 2b). Leucine, serine, and alanine are the most frequent residues, at 9.16%, 7.37%, and 7.31%, respectively. The lowest frequencies, including those of cysteine and tryptophan, were 2.24%, 1.81%, and 1.33%.
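Residue frequencies such as those quoted above are obtained by counting each amino acid and dividing by the total number of residues. The sketch below shows the calculation on two hypothetical toy sequences. Whether the original percentages were pooled over all six proteins or computed differently is not stated, so pooling is an assumption of this example.

```python
from collections import Counter


def residue_frequencies(sequences):
    """Return the percentage of each residue, pooled over all sequences."""
    counts = Counter()
    for sequence in sequences:
        counts.update(sequence.upper())
    total = sum(counts.values())
    return {residue: 100.0 * n / total for residue, n in counts.items()}


# Hypothetical toy sequences, for illustration only.
toy_sequences = ["MLLSA", "LSAAW"]
frequencies = residue_frequencies(toy_sequences)

# Print residues from most to least frequent.
for residue, percent in sorted(frequencies.items(), key=lambda item: -item[1]):
    print(f"{residue}: {percent:.1f}%")
```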
Yak KLK protein interaction network analysis and enrichment analysis
The interaction network of the six KLK proteins was obtained with the STRING online tool (Fig. 3). KLK5, KLK6, and KLK7 all interact with serine peptidase inhibitor Kazal type 5 (SPINK5). The regulatory networks of KLK5 and KLK7 are similar, and both proteins interact with KLK8 and SPINK5. KLK7 and KLK6 both interact with corneodesmosin (CDSN). KLK15, in contrast, has an independent regulatory network.
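The overlaps described above can be expressed as set intersections over each protein's interaction partners. The sketch below encodes only the interactions named in the text; the full networks in Fig. 3 contain more partners, and the partners of KLK1, KLK4, and KLK15 are not listed here, so the result is a partial illustration rather than a reconstruction of the network.

```python
from itertools import combinations

# Interaction partners named in the text; a partial view of Fig. 3.
PARTNERS = {
    "KLK5": {"SPINK5", "KLK8"},
    "KLK6": {"SPINK5", "CDSN"},
    "KLK7": {"SPINK5", "KLK8", "CDSN"},
}

# Partners shared by each pair of KLK proteins.
shared = {
    (a, b): PARTNERS[a] & PARTNERS[b]
    for a, b in combinations(sorted(PARTNERS), 2)
}

# Partners shared by all listed KLK proteins.
common_to_all = set.intersection(*PARTNERS.values())

for pair, partners in shared.items():
    print(pair, sorted(partners))
print("shared by all:", sorted(common_to_all))
```

On this partial data, SPINK5 is the only partner shared by KLK5, KLK6, and KLK7, while KLK5 and KLK7 additionally share KLK8, and KLK6 and KLK7 additionally share CDSN.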
After the interaction analysis, the genes interacting with each KLK protein were used for GO enrichment analysis (Fig. 4). For the KLK1-interacting proteins, biological process (BP) terms were mainly enriched in platelet degranulation, negative regulation of endopeptidase activity, negative regulation of peptidase activity, and negative regulation of proteolysis; cellular component (CC) terms were mainly enriched in platelet alpha granule, secretory granule lumen, cytoplasmic vesicle lumen, and vesicle lumen; and molecular function (MF) terms were mainly enriched in endopeptidase inhibitor activity, endopeptidase regulator activity, and peptidase inhibitor activity. For the KLK4-interacting proteins, BP terms were mainly enriched in pathways related to tooth development, such as odontogenesis of dentin-containing tooth, biomineralized tissue development, and regulation of tooth mineralization; CC terms were mainly enriched in endoplasmic reticulum lumen, collagen-containing extracellular matrix, and platelet alpha granule lumen; and MF terms were mainly enriched in extracellular matrix structural constituent, endopeptidase inhibitor activity, peptidase inhibitor activity, and endopeptidase regulator activity. For the KLK5-interacting proteins, BP terms were mainly enriched in pathways related to skin development, such as epidermis development, keratinization, keratinocyte differentiation, and epidermal cell differentiation; CC was enriched in only one term, lamellar body; and MF terms were mainly enriched in endopeptidase activity, serine-type endopeptidase inhibitor activity, and serine-type endopeptidase activity.
For the KLK6-interacting proteins, BP terms were mainly enriched in negative regulation of endopeptidase activity; no CC terms were enriched; and MF was enriched in only six terms related to enzyme function: serine-type endopeptidase inhibitor activity, enzyme inhibitor activity, peptidase regulator activity, endopeptidase regulator activity, peptidase inhibitor activity, and endopeptidase inhibitor activity. For the KLK7-interacting proteins, BP terms were mainly enriched in epidermis development, extracellular matrix organization, regulation of defense response to bacterium, and keratinocyte differentiation; CC terms were mainly enriched in cell-cell junction, lamellar body, desmosome, and cornified envelope; and MF terms were mainly enriched in endopeptidase activity, serine hydrolase activity, serine-type peptidase activity, and serine-type endopeptidase activity. Finally, for the KLK15-interacting proteins, BP terms were mainly enriched in metabolic pathways such as organic acid catabolic process, carboxylic acid catabolic process, and small molecule catabolic process, and MF terms were mainly enriched in signaling-related functions such as G-protein β-subunit binding, vitamin binding, and G-protein α-subunit binding. Among the family members, KLK5 and KLK7 are the genes most closely related to the development of the yak epidermis.
KEGG enrichment analysis of the KLK-interacting genes was performed in R and visualized with GraphPad Prism (Fig. 5). KLK4, KLK5, and KLK15 were each enriched in only one pathway: complement and coagulation cascades, Staphylococcus aureus infection, and carbon metabolism, respectively. KLK1, KLK6, and KLK7 were enriched in multiple pathways. The KLK1-interacting proteins were mainly enriched in complement and coagulation cascades, inflammatory mediator regulation of TRP channels, and the renin-angiotensin system; the top three pathways for KLK6 were histidine metabolism, Alzheimer's disease, and β-alanine metabolism; and the three pathways with the highest enrichment for KLK7 were thyroid cancer, bacterial invasion of epithelial cells, and the apelin signaling pathway.
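The text does not state which statistical test underlies the GO and KEGG enrichment results. Over-representation analyses of this kind commonly use a one-sided hypergeometric test, so the sketch below shows that calculation in Python as an assumption rather than as the method actually used. The function name `hypergeom_pvalue` and all gene counts in the example are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of drawing at least k annotated genes.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / comb(N, n)


# Hypothetical counts: 10 interacting genes, 3 of them annotated to a term
# that covers 100 of 20000 background genes.
p_value = hypergeom_pvalue(k=3, n=10, K=100, N=20000)
print(f"p = {p_value:.3g}")
```

A small p-value indicates that the term is annotated to more genes in the query list than expected by chance. In practice the p-values for all tested terms would also be corrected for multiple testing.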
Analysis of the expression of some KLK genes in yak, mouse and sheep skin tissue
After t-SNE cluster analysis of scRNA-seq data downloaded from NCBI, the expression of KLK7 and other KLK genes in the skin tissues of yak, mouse, and sheep was examined (Fig. 6). Expression of KLK genes in mouse was significantly lower than in yak and sheep. In yak, KLK7 was highly expressed mainly in cluster1 and cluster5, and its expression was significantly higher than that of KLK5, KLK6, and KLK10. In mouse back skin, KLK family members were concentrated mainly in cluster7, and the expression of KLK7 was significantly higher than that of KLK11. In Liaoning cashmere goat skin, KLK7 was highly expressed mainly in cluster1 and cluster3, whereas KLK11 was more widely distributed, appearing in cluster3, cluster4, cluster8, and cluster12.