General characteristics of Kunitz domain protease inhibitors
InterProScan and Motif scan identified 19 and 23 genes encoding KDPIs in the E. multilocularis and E. granulosus genomes, respectively (Table 1). Members of the KDPI family share a typical Kunitz domain of about 50 amino acids (Fig. 1) whose characteristic secondary structure is formed by 3 disulfide bonds (Additional file 3: Fig. S1). The echinococcal Kunitz domains contain an average of 52.85 aa (range 47–55 aa), with the majority comprising 53 aa (Fig. 1).
Table 1
Summary of physiological and biological characteristics of Kunitz protease inhibitors in E. multilocularis and E. granulosus
Species | E. multilocularis | E. granulosus
KDPIs/single KDPIs | 19/16 | 23/21
Number of amino acids | 333.47 | 269.96
Molecular weight (Da) | 37153.2 | 30117.02
Isoelectric points | 7.44 | 7.84
No. of trans-domain (%) | 4 (21.05) | 5 (21.74)
No. of cysteine in Kunitz domain | 5.63 | 5.65
No. of cysteine in the protein | 24.89 | 18.13
Instability index | 45.17 | 47.36
Stable protein (No/yes) | 12/7 | 15/8
Aliphatic index | 72.35 | 71.37
Grand average of hydropathicity | -0.22 | -0.22
Signal peptides (%) | 17 (89.47) * | 14 (60.87)
No. of Kunitz Motifs | 1.79 | 1.17
En-t-In (T/C) | 9/2 | 9/4
Note: KDPIs, Kunitz domain protease inhibitors; No. of trans-domain (%), number (percentage) of KDPIs containing transmembrane domains; No. of cysteine, average number of cysteines per sequence; Grand average of hydropathicity, GRAVY hydropathic index; En-t-In (T/C), enzyme-targeting inhibitors: trypsin inhibitors (T) or chymotrypsin inhibitors (C).
* The proportion of KDPIs with signal peptides differs significantly between E. multilocularis and E. granulosus (P < 0.05).
Among these KDPIs, 16 E. multilocularis proteins contain a single Kunitz domain and range from 75 to 534 aa in length. Three further proteins contain multiple Kunitz domains, with a maximum of 8 (EmuJ_001181950), and range from 610 to 2425 aa. E. granulosus has 21 single Kunitz domain KDPIs of 75–976 aa and 2 multiple-domain KDPIs ranging from 878 to 1540 aa. The molecular weights of these KDPIs range from 8.34 kDa to 266.77 kDa, with isoelectric points from 4.52 to 10.52 (Additional file 1: Table S1). The majority of single Kunitz domain proteins comprise fewer than 100 amino acids (Additional file 1: Table S1).
We used the instability index to estimate the stability of the KDPIs; a protein with an index value > 40 is predicted to be unstable. The index, which reflects the rigidity or flexibility of each peptide, varied from 26.51 to 94.41. The average value was 45.17 for E. multilocularis and 47.37 for E. granulosus, suggesting that the peptides may be flexible. By this criterion, E. multilocularis has 12 unstable and 7 stable KDPIs and E. granulosus has 15 unstable and 8 stable KDPIs (Additional file 1: Table S1).
Sequence analysis showed that these echinococcal KDPI sequences contain a high percentage of hydrophobic residues, including alanine (A), valine (V), leucine (L) and isoleucine (I). The hydropathic index (grand average of hydropathicity, GRAVY) is −0.223 ± 0.333 for E. granulosus and −0.222 ± 0.312 for E. multilocularis. The aliphatic index (AI) is 71.37 ± 14.08 and 72.36 ± 13.75 for E. granulosus and E. multilocularis, respectively (Table 1 and Additional file 1: Table S1).
The average numbers of negatively charged residues (Asp + Glu) are 38.42 and 28.91, accounting for 8.84% and 9.13% of the residues in the E. multilocularis and E. granulosus KDPIs, respectively. There are on average 35.16 and 30.70 positively charged residues (Arg + Lys) in the E. multilocularis and E. granulosus KDPIs, accounting for 12.73% and 12.23% of the total amino acids. Neutral amino acid residues number 259.89 and 210.35 on average and account for 78.42% and 78.63% of the KDPIs in E. multilocularis and E. granulosus, respectively (Table 1 and Additional file 1: Table S1).
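The residue classes used above can be tallied for any single sequence with a few lines of Python. This is a minimal sketch, not the authors' pipeline: the function name and the example peptide are hypothetical, and histidine is counted as neutral because the text defines the charged classes as Asp + Glu and Arg + Lys only.

```python
# Residue classes as defined in the text.
NEGATIVE = frozenset("DE")  # Asp + Glu
POSITIVE = frozenset("RK")  # Arg + Lys


def charge_composition(seq):
    """Return {class: (count, percentage)} for one protein sequence."""
    seq = seq.upper()
    total = len(seq)
    if total == 0:
        raise ValueError("empty sequence")
    counts = {
        "negative": sum(aa in NEGATIVE for aa in seq),
        "positive": sum(aa in POSITIVE for aa in seq),
    }
    # Everything that is not D, E, R or K is treated as neutral.
    counts["neutral"] = total - counts["negative"] - counts["positive"]
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


# Hypothetical 20-residue peptide, for illustration only.
example = "MKRDECLLPAEKGRCYAFDE"
for label, (n, pct) in charge_composition(example).items():
    print(f"{label}: {n} residues ({pct:.1f}%)")
```

Averaging the per-protein counts and percentages over all KDPIs of a species would give summary values of the kind reported in the paragraph above.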
The average aliphatic indexes are 72.36 (range 51.54–89.78) and 71.37 (range 49.44–100.73) for the E. multilocularis and E. granulosus KDPIs, respectively. The hydropathicity indexes of the E. multilocularis and E. granulosus KDPIs are −0.222 (range −0.978 to 0.340) and −0.223 (range −0.996 to 0.371), respectively. The negative average hydropathicity values indicate that the KDPIs in both parasites are likely hydrophilic proteins (Table 1 and Additional file 1: Table S1).
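For readers who want to see how the two indexes are obtained, the sketch below computes them from a raw sequence. It assumes the standard definitions, which the text does not spell out: GRAVY as the mean Kyte-Doolittle hydropathy value per residue, and the aliphatic index as X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percentage of each residue. The function names are illustrative, and the tool actually used to produce Table 1 is not shown here.

```python
# Kyte-Doolittle hydropathy scale (one value per standard amino acid).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    seq = seq.upper()
    total = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / total

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical peptide, for illustration only.
peptide = "MKAVLLDEGRCIA"
print(f"GRAVY = {gravy(peptide):.3f}, AI = {aliphatic_index(peptide):.2f}")
```

A negative GRAVY value points to a hydrophilic protein overall, which is the reading applied to both species above.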
E. multilocularis and E. granulosus have 4 and 5 KDPIs containing transmembrane regions, respectively, and 78.94% and 78.26% of the E. multilocularis and E. granulosus KDPIs are extracellular (Table 1). This matches the GO analysis (Table 2 and Additional file 2: Table S2) and indicates that most KDPIs may be involved in responses at the host-parasite interface. The TopPred program indicated that 4 E. multilocularis and 5 E. granulosus KDPIs are located in the cytoplasm (Additional file 1: Table S1), with the others, 15 Em-KDPIs and 18 Eg-KDPIs, being extracellular.
Table 2
The expression of Kunitz-type domain protease inhibitors in four stages of E. granulosus showing Hi-seq reads
Gene ID | Adt | Onc | PSC | CM | Seq. Description | Seq. length | #GOs | GOs
EG_07244 | 217 | 0 | 1 | 10 | serine protease inhibitor | 106 | 1 | F: serine-type endopeptidase inhibitor activity
EG_08716 | 155 | 0 | 0 | 0 | kunitz-type protease inhibitor 3-like | 84 | 3 | F: peptidase inhibitor activity; F: protein binding; P: transforming growth factor beta receptor signaling pathway
EG_10096 | 70 | 0 | 3 | 4 | kunitz domain-containing | 98 | 42 | C: cytoplasmic vesicle; P: apoptosis; P: neuromuscular process controlling balance; P: ionotropic glutamate receptor signaling pathway; P: regulation of epidermal growth factor receptor activity; F: receptor binding; P: regulation o
EG_07242 | 46 | 0 | 1 | 5 | serine protease inhibitor | 83 | 7 | F: peptidase inhibitor activity; C: nematocyst; F: ion channel inhibitor activity; F: potassium channel inhibitor activity; F: serine-type endopeptidase inhibitor activity; C: extracellular region; P: pathogenesis
EG_07243 | 35 | 0 | 2 | 5 | wap four-disulfide core domain 6b | 75 | 2 | C: cytoplasm; F: serine-type endopeptidase inhibitor activity
EG_03480 | 24 | 0 | 1 | 5 | four-domain proteases inhibitor | 88 | 1 | F: peptidase inhibitor activity
EG_09490 | 17 | 2 | 14 | 51 | spon-1 protein | 976 | 3 | F: serine-type endopeptidase inhibitor activity; C: proteinaceous extracellular matrix; C: extracellular region
EG_03481 | 16 | 0 | 8 | 2 | trypsin inhibitor | 242 | 4 | F: peptidase inhibitor activity; P: multicellular organismal process; F: extracellular matrix structural constituent; C: proteinaceous extracellular matrix
EG_08720 | 10 | 0 | 0 | 0 | kunitz bovine pancreatic trypsin inhibitor domain protein | 84 | 4 | F: serine-type endopeptidase inhibitor activity; F: peptidase activity; F: peptidase inhibitor activity; C: extracellular region
EG_07944 | 9 | 0 | 22 | 14 | kunitz bovine pancreatic trypsin inhibitor domain protein | 539 | 2 | C: extracellular region; F: hydrolase activity
EG_09269 | 8 | 0 | 0 | 1 | tissue factor pathway inhibitor 2-like | 92 | 2 | F: extracellular matrix structural constituent; C: proteinaceous extracellular matrix
EG_05317 | 7 | 0 | 19 | 16 | Kunitz-like protease inhibitor precur | 1540 | |
EG_08721 | 6 | 108 | 0 | 0 | serine protease inhibitor- with kunitz and wap domains 1 | 79 | 1 | C: acrosomal vesicle
EG_04958 | 5 | 0 | 1 | 2 | elegans protein partially confirmed by transcript evidence | 135 | 2 | F: serine-type endopeptidase inhibitor activity; P: epidermis development
EG_05483 | 5 | 12 | 0 | 2 | secreted protein with kunitz | 191 | 2 | C: extracellular region; F: hydrolase activity
EG_09007 | 4 | 0 | 0 | 0 | mechanosensory abnormality family member (mec-1) | 86 | 2 | P: extracellular structure organization; P: mechanosensory behavior
EG_09008 | 3 | 0 | 0 | 0 | acp24a4 | 102 | 2 | C: extracellular region; F: peptidase activity
EG_01779 | 1 | 0 | 13 | 3 | isoform g | 878 | 1 | F: hydrolase activity
EG_05482 | 1 | 2 | 0 | 0 | kunitz bovine pancreatic trypsin inhibitor domain containing protein | 130 | 1 | C: extracellular region
EG_07266 | 1 | 0 | 0 | 3 | serine protease inhibitor | 129 | 3 | F: binding; F: serine-type endopeptidase inhibitor activity; C: extracellular region
EG_05316 | 0 | 0 | 3 | 2 | kunitz-type protease inhibitor 3-like | 239 | 1 | F: peptidase inhibitor activity
EG_08718 | 0 | 2 | 0 | 0 | kunitz domain-containing | 144 | 4 | F: peptidase activity; F: serine-type endopeptidase inhibitor activity; F: peptidase inhibitor activity; C: extracellular region
EG_09006 | 0 | 0 | 0 | 0 | single kunitz protease inhibitor | 89 | 5 | F: peptidase activity; F: serine-type endopeptidase inhibitor activity; F: peptidase inhibitor activity; C: extracellular region; P: multicellular organismal development
Note: Adt, adult worms; Onc, oncospheres; PSC, protoscoleces; CM, cyst membrane.
Signal peptide analysis showed that 17/19 (89.47%) E. multilocularis KDPIs have a signal peptide of 18–26 amino acids (aa) and 2/19 (10.53%) have none. In contrast, 14/23 (60.87%) E. granulosus KDPIs contain a signal peptide sequence and 9/23 (39.13%) do not (Table 1). The proportion of KDPIs with a signal peptide differs significantly between E. multilocularis and E. granulosus (P < 0.05; Table 1).
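The comparison of signal peptide frequencies can be reproduced from the counts alone. The text does not say which statistical test was applied, so the sketch below assumes a Pearson chi-square test on the 2 x 2 table without continuity correction; the function name is illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: E. multilocularis, E. granulosus.
# Columns: KDPIs with a signal peptide, KDPIs without.
chi2, p_value = chi_square_2x2(17, 2, 14, 9)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

For these counts the statistic is about 4.40, which exceeds the 3.84 critical value for one degree of freedom at the 0.05 level. Other tests, for example an exact test or a continuity-corrected statistic, can give different p-values for counts this small.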
Cluster and phylogenetic analysis of Kunitz protease inhibitors
Multiple sequence alignment and phylogenetic analysis of the amino acid sequences were used to infer the evolutionary relationships between the E. multilocularis and E. granulosus KDPIs and to compare them with those of other species. Figure 2 shows the evolutionary distances among the single Kunitz domain KDPI genes, inferred using the neighbor-joining method. The analysis indicated that the E. granulosus and E. multilocularis Kunitz domain peptides fall into three branches containing 9 clusters.
Comparison of KDPI genes predicted from the E. granulosus and E. multilocularis genomes
We compared the KDPI genes predicted from the genomes of E. granulosus and E. multilocularis and found that some genes are species-specific. E. multilocularis does not have homologues of E. granulosus sequences EG_07242, EG_07266, EG_07243, EG_09006 and EG_09008; whereas EmuJ_001136700.1 and EmuJ_001137100.1 are specific to E. multilocularis.
The specificity of a protease inhibitor for a protease is mainly determined by the amino acid residue at position P1 of its active site. It has been shown that Lys(K) and Arg(R) mutants of bovine pancreatic trypsin inhibitor (BPTI) bind to bovine trypsin about 105-fold more strongly than BPTI with P1 Tyr(T) [15]. In addition, typical trypsin inhibitors have Arg (R) or Lys (K) at P1, and chymotrypsin inhibitors have Leu (L) or Met (M) at the P1 position [16]. By this criterion, sequence analysis shows that the Em-KDPIs have 8 sequences containing R and 1 sequence containing K at P1, whereas the Eg-KDPIs have 8 sequences containing R at P1; these are typical trypsin inhibitors. Furthermore, the two tapeworms have 3 or 4 sequences containing L at P1, which are chymotrypsin inhibitors (Fig. 1 and Table 1).
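The P1 rule described above amounts to a simple lookup. The sketch below is an illustration of that rule only: the function name and the example list of P1 residues are hypothetical, and residues outside the two sets are left unclassified because the text gives no rule for them.

```python
from collections import Counter

# P1 residues associated with each inhibitor class, as stated in the text.
TRYPSIN_P1 = frozenset("RK")       # Arg or Lys
CHYMOTRYPSIN_P1 = frozenset("LM")  # Leu or Met


def classify_p1(residue):
    """Classify an inhibitor by the one-letter code of its P1 residue."""
    residue = residue.upper()
    if residue in TRYPSIN_P1:
        return "trypsin inhibitor"
    if residue in CHYMOTRYPSIN_P1:
        return "chymotrypsin inhibitor"
    return "unclassified"


# Hypothetical P1 residues read off an alignment, for illustration only.
p1_residues = ["R", "R", "K", "L", "M", "G"]
tally = Counter(classify_p1(r) for r in p1_residues)
for label, count in tally.items():
    print(f"{label}: {count}")
```

Applied to the P1 column of an alignment such as Fig. 1, a tally of this kind yields the trypsin/chymotrypsin (T/C) counts summarized in Table 1.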
Two-dimensional and three-dimensional structures of Kunitz domain protease inhibitors
The majority of the single-domain E. multilocularis and E. granulosus KDPIs are small proteins of 16 kDa and contain a relatively high percentage of Lys and Arg residues at the C-terminus. Like most Kunitz domain protease inhibitors, the Em- and Eg-KDPIs contain a conserved Kunitz-type sequence with 6 cysteine residues forming 3 disulfide bridges (C1-C6, C2-C4 and C3-C5) (Additional file 3: Fig. S1), and these bridges play a key role in the formation of the 2D and 3D structures of the KDPIs. For the single Kunitz domain sequences, secondary structure prediction showed that α-helix and random coil structures account for 19.01–52.71% and 18.6–60.35% of the Eg-KDPI sequences, respectively, followed by extended strands and β-turns, accounting for 13.1–26.67% and 1.89–10.84%. In the Em-KDPIs, α-helix and random coil structures account for 19–40.45% and 32.5–55.99% of the protein sequence, respectively, followed by extended strands and β-turns, accounting for 8–36.25% and 0–10.71% (Table 3).
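The C1-C6, C2-C4, C3-C5 pairing can be illustrated by mapping the cysteines of a Kunitz domain, in order of appearance, onto residue positions. The sketch below uses the mature bovine pancreatic trypsin inhibitor (BPTI) sequence as a reference example; that sequence is an assumption brought in for illustration and is not one of the echinococcal KDPIs, and the function name is hypothetical.

```python
# Disulfide pairing of a canonical Kunitz domain, by cysteine order.
PAIRING = ((1, 6), (2, 4), (3, 5))

# Mature BPTI (58 aa), used here only as a well-known reference domain.
BPTI = "RPDFCLEPPYTGPCKARIIRYFYNAKAGLCQTFVYGGCRAKRNNFKSAEDCMRTCGGA"


def disulfide_pairs(domain):
    """Return 1-based residue positions of the three predicted bridges.

    Returns None when the domain does not contain exactly 6 cysteines,
    for example when the second cysteine (C2) is missing.
    """
    cys_positions = [i + 1 for i, aa in enumerate(domain.upper()) if aa == "C"]
    if len(cys_positions) != 6:
        return None
    return [(cys_positions[a - 1], cys_positions[b - 1]) for a, b in PAIRING]


print(disulfide_pairs(BPTI))  # [(5, 55), (14, 38), (30, 51)]
```

A domain that has lost one cysteine returns None here, which flags sequences that cannot form all 3 bridges.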
Table 3
The secondary structure prediction of the single Eg-KDPIs and Em-KDPIs
Accession number | Alpha helix (aa/%) | Extended strand (aa/%) | Beta turn (aa/%) | Random coil (aa/%)
EG_03480 | 32/36.36 | 19/21.59 | 3/3.41 | 34/38.64
EG_03481 | 46/19.01 | 57/23.55 | 19/7.85 | 120/49.59
EG_04958 | 33/24.44 | 28/20.74 | 11/8.15 | 63/46.67
EG_05316 | 92/38.17 | 54/22.41 | 17/7.05 | 78/32.37
EG_05482 | 54/41.54 | 18/13.85 | 8/6.15 | 50/38.46
EG_05483 | 66/34.55 | 41/21.47 | 10/5.24 | 74/38.74
EG_07242 | 25/30.12 | 13/15.66 | 9/10.84 | 36/43.37
EG_07243 | 20/26.67 | 20/26.67 | 5/6.67 | 30/40
EG_07244 | 41/38.68 | 23/21.7 | 2/1.89 | 40/37.74
EG_07266 | 68/52.71 | 29/22.48 | 8/6.2 | 24/18.6
EG_07944.1 | 138/25.6 | 91/16.88 | 34/6.31 | 276/51.21
EG_08716 | 31/36.9 | 11/13.1 | 3/3.57 | 39/46.43
EG_08718 | 41/28.47 | 34/23.61 | 8/5.56 | 61/42.36
EG_08720 | 25/29.76 | 18/21.43 | 4/4.76 | 37/44.05
EG_08721 | 35/44.3 | 13/16.46 | 3/3.8 | 28/35.44
EG_09006 | 28/31.46 | 22/24.72 | 9/10.11 | 30/33.71
EG_09007 | 23/26.74 | 13/15.12 | 7/8.14 | 43/50
EG_09008 | 43/42.16 | 22/21.57 | 7/6.86 | 30/29.41
EG_09269 | 37/40.22 | 15/16.3 | 4/4.35 | 36/39.13
EG_09490 | 203/20.8 | 145/14.86 | 39/4.0 | 589/60.35
EG_10096 | 20/20.41 | 23/23.47 | 3/3.06 | 52/53.06
EmuJ_000077700.1 | 65/30.95 | 42/20 | 17/8.1 | 86/40.95
EmuJ_000077800.1 | 33/24.26 | 39/28.68 | 10/7.35 | 54/39.71
EmuJ_000302900.1 | 128/23.97 | 76/14.23 | 31/5.81 | 299/55.99
EmuJ_000419200.1 | 37/40.22 | 15/16.30 | 0/0.00 | 40/43.48
EmuJ_000534800.1 | 24/32.00 | 6/8.00 | 5/6.67 | 40/53.33
EmuJ_000548800.1 | 23/23.23 | 21/21.21 | 4/4.04 | 51/51.52
EmuJ_000549400.1 | 19/19.00 | 27/27.00 | 8/8.00 | 46/46.00
EmuJ_001136500.1 | 36/40.45 | 16/17.98 | 2/2.25 | 35/39.33
EmuJ_001136600.1 | 31/36.90 | 13/15.48 | 6/7.14 | 34/40.48
EmuJ_001136700.1 | 20/25.64 | 18/23.08 | 8/10.26 | 32/41.03
EmuJ_001136800.1 | 36/40.45 | 16/17.98 | 2/2.25 | 35/39.33
EmuJ_001136900.1 | 21/23.33 | 17/18.89 | 4/4.44 | 48/53.33
EmuJ_001137000.1 | 26/30.95 | 16/19.05 | 4/4.76 | 38/45.24
EmuJ_001137100.1 | 22/26.19 | 20/23.81 | 9/10.71 | 33/39.29
EmuJ_001137300.1 | 20/25 | 29/36.25 | 5/6.25 | 26/32.5
EmuJ_001137400.1 | 26/30.59 | 16/18.82 | 4/4.71 | 39/45.88
Note: aa/%, number/percentage of amino acids in each secondary structure
It is accepted that there is a close relationship between the structure and function of a protein. Therefore, we used SWISS-MODEL to predict 3D structures by homology modeling on KDPI templates from the PDB (Protein Data Bank), including single and multiple Kunitz domain proteins.
Three-dimensional structure analysis showed that single Kunitz domain sequences with 3 disulfide bonds share a similar structure comprising an α-helix and random coils (Fig. 3). Some single Kunitz domain sequences lack the second cysteine (C2), and their structure differs from that of the domains with 3 disulfide bonds (Fig. 1 and Table 1).
Expression of E. granulosus KDPIs in different developmental stages
To estimate the expression of the KDPIs, Hi-seq techniques were employed to obtain the transcript reads of these genes from total RNA of each of 4 developmental stages of E. granulosus. The transcript read information was published in our previous paper [17].
The transcriptome analysis showed that these Kunitz peptides were differentially expressed across the developmental stages of E. granulosus (Table 2). All the inhibitors except EG_09006 were expressed in at least one of the 4 stages, with some being highly and differentially expressed in one or two stages. EG_03480 (extra), EG_03481 (intra), EG_07242 (extra), EG_07243 (intra), EG_07244 (intra), EG_08716 (extra), EG_08720 (extra), EG_09490 (extra) and EG_10096 (extra) were significantly highly expressed in the adult worm stage (extra, extracellular; intra, intracellular; Table 2 and Additional file 1: Table S1). EG_08716 is an extracellular protease inhibitor and has 42 predicted GOs, including cytoplasmic vesicle for neuromuscular process controlling balance, ionotropic glutamate receptor signaling pathway, regulation of the activity of epidermal growth factor receptor and synapse, regulation of mitotic cell cycle and translation, and cellular copper and calcium ion homeostasis (Additional file 1: Table S1). The expression analysis indicated that this gene may play an important role in adult worm development and in defense against host protease attack. EG_07244 is also an endopeptidase, indicating that the protein has two functions in adult worms, as a peptidase and as a protease inhibitor.
EG_08721 is an extracellular inhibitor that was expressed far more highly in the oncosphere than in the other stages, indicating that this protease inhibitor plays an important role in oncosphere biology. Because the oncosphere is the only stage responsible for primary infection, EG_08721 may help the oncosphere resist host protease attack and may be a candidate for vaccine development.
Although we activated PSC with pepsin, only three KDPIs (EG_01779, EG_05317 and EG_07944) were slightly elevated in this stage. Importantly, we found that EG_09490, EG_09268 and EG_09490 were highly expressed in the cyst membrane, and the proteins encoded by these genes may be potential targets for drug development.
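The stage-specific patterns discussed in this section can be read directly from the read counts in Table 2. The sketch below picks, for a handful of genes copied from that table, the stage with the most reads and its share of the gene's total. It is a simplification under stated assumptions: raw read counts are compared without any normalization for library size or gene length, and the function name is illustrative.

```python
STAGES = ("Adt", "Onc", "PSC", "CM")

# Read counts (Adt, Onc, PSC, CM) copied from Table 2 for a few genes.
READS = {
    "EG_07244": (217, 0, 1, 10),
    "EG_08721": (6, 108, 0, 0),
    "EG_07944": (9, 0, 22, 14),
    "EG_09490": (17, 2, 14, 51),
    "EG_09006": (0, 0, 0, 0),
}


def dominant_stage(counts):
    """Return (stage, fraction of reads) for the stage with the most reads.

    Returns (None, 0.0) when a gene has no reads in any stage.
    """
    total = sum(counts)
    if total == 0:
        return None, 0.0
    top = max(range(len(counts)), key=counts.__getitem__)
    return STAGES[top], counts[top] / total


for gene, counts in READS.items():
    stage, share = dominant_stage(counts)
    if stage is None:
        print(f"{gene}: no reads detected")
    else:
        print(f"{gene}: {stage} ({share:.0%} of reads)")
```

On these rows the sketch reports the adult stage for EG_07244, the oncosphere for EG_08721, protoscoleces for EG_07944 and the cyst membrane for EG_09490, while EG_09006 has no reads in any stage.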