Chameleon sequences are short sequences that adopt different secondary structures under different conditions. The mechanism by which these sequences change their structure has not been clarified. These sequences complicate secondary structure prediction. On the other hand, understanding the mechanism of their structural change brings us closer to solving the protein folding problem. These sequences can also be the starting point of partial changes in protein folding and can lead the protein to aggregation/fibrillation. For these reasons, studying these sequences is important.
In this study, chameleon sequences were investigated from the perspective of physicochemical characteristics, such as hydropathy, charge, and charge-hydropathy (CH) plots, and structural characteristics, namely the dihedral angles (phi and psi). These features were then compared with those of ordered and disordered sequences.
The results show that chameleon sequences behave similarly to ordered sequences in terms of charge and hydropathy. The hydropathy values of chameleon and ordered sequences show that these sequences are almost neutral in terms of hydrophobicity.
From the point of view of structural features, chameleon sequences that have a helix or sheet structure are located in the Ramachandran diagram with a slight deviation from the helix and sheet regions.
Physicochemical characteristics such as charge and hydropathy are very important in understanding protein structure and flexible regions. It has been shown that charged and/or hydrophilic residues contribute more to flexible regions. Uversky created a two-dimensional predictor from the combination of average charge and average hydropathy that separated intrinsically disordered proteins from ordered proteins. In that work, high charge combined with low hydropathy was associated with flexibility 14. In another work, the distribution and pattern of charge were investigated. Intrinsically disordered proteins contain opposite charges within a sequence, so their net charge is almost neutral; nevertheless, these sequences are highly charged, and about 75 percent of them have a charge fraction above 0.35 24. It has been shown that hydrophobicity and surface charge play a role in the aggregation of IgG antibodies 25. The combination of these two features, i.e. charge and hydropathy, can be a simple and somewhat reliable tool to identify the flexibility or rigidity of a protein structure 26. We used charge and hydrophobicity to compare chameleon sequences with ordered and intrinsically disordered sequences of similar sizes.
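As an illustration of how a sequence can be placed on a charge-hydropathy plot, the following minimal sketch computes a mean hydropathy and an absolute mean net charge for a peptide. It rests on several assumptions that are not taken from this study: the Kyte-Doolittle scale rescaled to the range 0 to 1, charges of +1 for K and R and -1 for D and E (histidine treated as neutral), and the boundary line <R> = 2.785<H> - 1.151 commonly quoted for Uversky's plot. That boundary was derived for whole proteins, so its use on short fragments is only indicative. All function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the study may use another).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Assumed side-chain charges near neutral pH (histidine taken as neutral).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def mean_hydropathy(seq):
    """Mean Kyte-Doolittle hydropathy, rescaled so that 0 <= value <= 1."""
    return sum((KD[r] + 4.5) / 9.0 for r in seq) / len(seq)


def abs_mean_net_charge(seq):
    """Absolute value of the net charge divided by the sequence length."""
    return abs(sum(CHARGE.get(r, 0) for r in seq)) / len(seq)


def above_ch_boundary(seq):
    """True if the point lies on the high-charge, low-hydropathy side
    of the assumed boundary <R> = 2.785 <H> - 1.151."""
    h = mean_hydropathy(seq)
    r = abs_mean_net_charge(seq)
    return r > 2.785 * h - 1.151


if __name__ == "__main__":
    for peptide in ("KKKKK", "IIIII", "KAEAL"):
        print(peptide, round(mean_hydropathy(peptide), 3),
              round(abs_mean_net_charge(peptide), 3),
              above_ch_boundary(peptide))
```

A sequence for which `above_ch_boundary` returns `True` falls in the region usually associated with disorder; ordered and chameleon sequences would be expected on the other side of the line.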
In Fig. 1, the average value and the distribution of ACPR for the three sub-groups, Chamseq, ORDs, and IDRs, are shown. The distribution is multimodal: three peaks are observed at 0.2, 0.4, and 0.6, and a very small peak at 0.8. This shows that ACPR values occur most frequently at these points. Between these points, there are regions in which no data are observed. The number of such regions could have increased if longer sequences had been selected. This behavior is also seen in ORDs and is a result of how ACPR is calculated. The average value and distribution pattern of ACPR for chameleon sequences are very similar to those of ordered sequences and distinct from those of intrinsically disordered sequences.
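The gaps between peaks can be understood from the arithmetic of a per-residue average over a short sequence. The sketch below assumes that ACPR is an absolute charge averaged over the residues of the sequence, with integer charges per residue; the exact definition is not restated here, so this is an assumption, and the length of 5 is a hypothetical value chosen only because it reproduces a spacing of 0.2. Under this assumption, a sequence of length n can only take the values k/n.

```python
from fractions import Fraction

# Assumed integer side-chain charges (histidine treated as neutral).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def abs_charge_per_residue(seq):
    """Absolute net charge divided by length, kept as an exact fraction."""
    return Fraction(abs(sum(CHARGE.get(r, 0) for r in seq)), len(seq))


def possible_values(n):
    """All values a per-residue average of integer charges can take
    for a sequence of length n: 0/n, 1/n, ..., n/n."""
    return [Fraction(k, n) for k in range(n + 1)]


if __name__ == "__main__":
    # Hypothetical length 5: values are spaced 0.2 apart, with nothing between.
    print([float(v) for v in possible_values(5)])
    # A longer sequence gives more allowed values, hence more (narrower) gaps.
    print(len(possible_values(10)))
    print(abs_charge_per_residue("KAKAA"))
```

Because no value can fall between consecutive multiples of 1/n, a histogram of such averages shows isolated peaks separated by empty regions, and longer sequences produce more of these regions.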
The second feature studied was hydropathy, one of the most important properties of amino acids, which describes their hydrophilicity or hydrophobicity. Hydrophobic residues tend to be buried within the protein, while hydrophilic residues tend to be at the protein surface, exposed to the solvent. As shown in Fig. 2, chameleon sequences are mostly hydrophobic, because they have positive hydropathy, and their hydropathy distribution is similar to that of ORDs. However, the hydropathy distribution of these sequences deviates slightly from that of the ordered sequences.
IDR sequences often have low hydropathy, i.e. they are hydrophilic, so these sequences do not meet the conditions for folding 27. It is therefore expected that most chameleon sequences will be located in the inner parts of the protein. However, considering the hydropathy value of about 0.5 observed for these sequences, it may be better to regard them as neutral sequences that change their structure under special conditions. Thus, when the helix or sheet structure is transformed into a turn structure, which is classified as a coil, hairpin structures are created. The newly modified conformation can be responsible for changes in protein folding 28, changes in protein function, changes in molecular recognition, and the promotion of protein aggregation leading to various diseases 29. When the hairpin structure is located on the surface of the protein, it can change protein-protein interactions and lead to protein aggregation. The strong tendency of hairpin structures to form aggregates causes them to initiate the formation of oligomers and fibrils, resulting in aggregates. Several studies have shown that hairpin structures play a critical role in the misfolding of proteins that have a strong tendency for fibril formation 29,30. On the other hand, some hairpin structures can prevent the formation of fibrils 31,32. Hairpin structures therefore play a dual role. More research is needed into the role of chameleon sequences that adopt the hairpin structure and either create fibrils or prevent them.
The third feature studied in this section was the charge-hydropathy diagram. We compared chameleon sequences with intrinsically disordered and ordered sequences in terms of absolute charge and hydropathy. It is clear that the charge-hydropathy distribution of chameleon sequences and ordered sequences occupies a different position from that of IDRs.
Chameleon sequences have charge-hydropathy distribution centers close to those of ordered sequences. In fact, the charge-hydropathy ratio and the values of these properties are generally similar to those of ordered sequences, whereas intrinsically disordered sequences are found in a region of high net charge and low hydrophobicity. This is the third line of evidence that chameleon sequences have properties close to those of ordered sequences, yet are able to change structure under certain conditions.
The study of backbone dihedral angles is another way to investigate the nature and characteristics of protein and peptide structure. Dihedral angles provide a description of local conformations.
Local structural information constrains the possible conformations that can be generated from a sequence. It therefore narrows the conformational space of a polypeptide chain 33. Secondary structures are regular structures in proteins. Secondary structure changes through dihedral angle fluctuations and, at a deeper level, through the formation and breaking of hydrogen bonds. These changes are dynamic phenomena that have to be addressed through simulations and laboratory studies such as spectroscopic methods.
Dihedral angle information is also used to compare two conformations of a protein. This is done by dihedral transition analysis, with which stable fragments and transition fragments are identified and separated 36. Dihedral angles create distinct regions in a 2D diagram. The classic Ramachandran diagram has three separate regions: α-helix, β-sheet, and left-handed α-helix. However, the Ramachandran diagram obtained from experimental data has more regions. Each group of researchers has divided the diagram into different regions according to its own logic, and these divisions are sometimes not compatible. For experimental Ramachandran diagrams there are three main regions, the alpha-helix region, the beta region, and the polyproline II (PII) region, together with the PII′, γ, γ′, ζ, δ, and δ′ regions and one low-density region called epsilon (ε) introduced by Karplus 23. To study the structure of chameleon sequences and compare them with ordered and disordered sequences, we also used phi and psi angles and the Ramachandran diagram. The phi and psi angles of each residue were extracted from the DSSP database, and the Ramachandran diagram of the chameleon sequences was generated. As can be seen in Fig. 4, the Ramachandran diagram was drawn for each amino acid in the chameleon sequences. Although this diagram has many scattered points, the regions mentioned above are clearly visible in Fig. 4. According to Fig. 4A, B, and C, the γ, γ′, polyproline II, and ε regions contain the phi and psi angles of the chameleon sequences that are classified as coils. Between the alpha-helix region and the beta-sheet region there is a region that belongs to a variety of turns. Since turns are grouped with coils in this classification, this region is covered by them. Next, the mean of the phi and psi angles of each chameleon sequence was calculated with directional statistics and is shown in Fig. 5 22,34,35. The distribution centers of the clusters in Fig. 5 were also calculated.
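Because phi and psi are periodic, an ordinary arithmetic mean can be misleading near the ±180° boundary, which is why directional statistics are used. The following minimal sketch shows one standard way to compute a circular mean, by averaging the sine and cosine components of the angles; it is an illustration of the general technique and not necessarily the exact procedure used for Fig. 5. The function names are hypothetical.

```python
import math


def circular_mean_deg(angles):
    """Circular mean of angles given in degrees, returned in (-180, 180]."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c))


def mean_phi_psi(pairs):
    """Circular mean of phi and of psi over a list of (phi, psi) pairs."""
    phis = [p[0] for p in pairs]
    psis = [p[1] for p in pairs]
    return circular_mean_deg(phis), circular_mean_deg(psis)


if __name__ == "__main__":
    # Two angles on either side of the +/-180 boundary: the arithmetic mean
    # is 0, while the circular mean is at 180 degrees, as expected.
    print(sum([170, -170]) / 2, circular_mean_deg([170, -170]))
    # Helix-like residues (illustrative values, not data from this study).
    print(mean_phi_psi([(-60, -45), (-70, -35), (-65, -40)]))
```

For angles that are tightly clustered away from the boundary, the circular mean and the arithmetic mean nearly coincide; the difference matters mainly for sheet-like psi values and other angles close to ±180°.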
The mean phi and psi values of chameleon sequences with helix structure are slightly different from the mean of these angles in the classical diagram and are close to the mean values of the experimental diagram. Ordered helix and sheet sequences are sequences that do not have chameleon properties and do not change their structure according to the environment. The phi and psi angles of ordered helices are (-57,-47) in theory and (-63,-43) in experimental data.
The phi and psi angles of the ordered sequences selected in this work were (-66,-33). Chameleon sequences with helix structure that can adopt a sheet structure or a coil structure (H-EH and H-CH) had phi and psi angles of (-64,-40) and (-65,-39), respectively.
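The size of the deviation between these centers can be made explicit with a small calculation. The sketch below takes the (phi, psi) centers quoted above and computes the signed angular difference of each chameleon group from the ordered helix center, wrapping differences into the range -180° to 180° so that the calculation remains valid for angles near the periodic boundary. The helper name `wrapped_diff` is hypothetical.

```python
def wrapped_diff(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180) % 360 - 180


# Centers quoted in the text, as (phi, psi) in degrees.
ORDERED_HELIX = (-66, -33)
CHAMELEON_HELIX = {"H-EH": (-64, -40), "H-CH": (-65, -39)}

if __name__ == "__main__":
    for name, (phi, psi) in CHAMELEON_HELIX.items():
        d_phi = wrapped_diff(phi, ORDERED_HELIX[0])
        d_psi = wrapped_diff(psi, ORDERED_HELIX[1])
        print(name, "delta phi =", d_phi, "delta psi =", d_psi)
```

With these values, the chameleon helix centers differ from the ordered helix center by 1° to 2° in phi and by 6° to 7° in psi.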
As can be seen in Fig. 6, the distribution center of the dihedral angles of chameleon sequences with helix structure is very close to, but slightly different from, the distribution center of the ordered helix sequences. Likewise, the distribution center of the dihedral angles of chameleon sequences with sheet structure is very close to, but slightly different from, the distribution center of the ordered sheet sequences. This slight deviation of the dihedral angles of the chameleon sequences from those of the ordered sequences may be what gives chameleon sequences the potential to change their structure under favorable conditions. As shown in Fig. 5, the phi and psi angles of the helix and sheet chameleon sequences are in the allowed regions of the Ramachandran diagram and are quite different from the Ramachandran diagram of the IDRs. The Ramachandran diagram of IDRs is more similar to that of coils 36. Thus, as can be seen from Figs. 4, 5, and 6, the chameleon sequences are structurally similar to the ORD sequences. These diagrams show a slight deviation in the phi and psi angles from the ordered sequences, which allows the chameleon sequences to be restructured and to adopt a new secondary structure in a suitable environment.