1.Kidera A, Konishi Y, Ooi T, Scheraga HA. Relation between Sequence Similarity and Structural Similarity in Proteins - Role of Important Properties of Amino-Acids. J Protein Chem. 1985;4(5):265–97.
2.Krissinel E. On the relationship between sequence and structure similarities in proteomics. Bioinformatics. 2007;23(6):717–23.
3.Uversky VN. Intrinsically Disordered Proteins and Their “Mysterious” (Meta)Physics. Front Phys-Lausanne. 2019;7.
4.Rado-Trilla N, Alba MM. Dissecting the role of low-complexity regions in the evolution of vertebrate proteins. Bmc Evol Biol. 2012;12.
5.Chen JW, Romero P, Uversky VN, Dunker AK. Conservation of intrinsic disorder in protein domains and families: I. A database of conserved predicted disordered regions. J Proteome Res. 2006;5(4):879–87.
6.Kumari B, Kumar R, Kumar M. Low complexity and disordered regions of proteins have different structural and amino acid preferences. Mol Biosyst. 2015;11(2):585–94.
7.Mier P, Paladin L, Taman S, Petrosian S, Hajdu-Soltesz B, Urbanek A, et al. Disentangling the complexity of low complexity proteins. Brief Bioinform. 2019;00(00):1–15.
8.Kajava AV. Tandem repeats in proteins: From sequence to structure. J Struct Biol. 2012;179(3):279–88.
9.Paladin L, Hirsh L, Piovesan D, Andrade-Navarro MA, Kajava AV, Tosatto SCE. RepeatsDB 2.0: improved annotation, classification, search and visualization of repeat protein structures. Nucleic Acids Res. 2017;45(D1):D308-D12.
10.Jorda J, Xue B, Uversky VN, Kajava AV. Protein tandem repeats - the more perfect, the less structured. Febs J. 2010;277(12):2673–82.
11.Cerveny L, Straskova A, Dankova V, Hartlova A, Ceckova M, Staud F, et al. Tetratricopeptide Repeat Motifs in the World of Bacterial Pathogens: Role in Virulence Mechanisms. Infect Immun. 2013;81(3):629–35.
12.Schmitz-Linneweber C, Small I. Pentatricopeptide repeat proteins: a socket set for organelle gene expression. Trends Plant Sci. 2008;13(12):663–70.
13.Renault L, Nassar N, Vetter I, Becker J, Klebe C, Roth M, et al. The 1.7 angstrom crystal structure of the regulator of chromosome condensation (RCC1) reveals a seven-bladed propeller. Nature. 1998;392(6671):97–101.
14.Varela M, Diaz-Rosales P, Pereiro P, Forn-Cuni G, Costa MM, Dios S, et al. Interferon-Induced Genes of the Expanded IFIT Family Show Conserved Antiviral Activities in Non-Mammalian Species. Plos One. 2014;9(6).
15.Jacobsen SE, Binkowski KA, Olszewski NE. SPINDLY, a tetratricopeptide repeat protein involved in gibberellin signal transduction Arabidopsis. P Natl Acad Sci USA. 1996;93(17):9292–6.
16.Pellegrini M, Renda ME, Vecchio A. Ab initio detection of fuzzy amino acid tandem repeats in protein sequences. Bmc Bioinformatics. 2012;13.
17.Marcotte EM, Pellegrini M, Yeates TO, Eisenberg D. A census of protein repeats. J Mol Biol. 1999;293(1):151–60.
18.Kajava AV. Review: Proteins with repeated sequence - Structural prediction and modeling. J Struct Biol. 2001;134(2–3):132–44.
19.Jernigan KK, Bordenstein SR. Tandem-repeat protein domains across the tree of life. Peerj. 2015;3.
20.Schaper E, Kajava AV, Hauser A, Anisimova M. Repeat or not repeat?-Statistical validation of tandem repeat prediction in genomic sequences. Nucleic Acids Res. 2012;40(20):10005–17.
21.Sikorski RS, Boguski MS, Goebl M, Hieter P. A Repeating Amino-Acid Motif in Cdc23 Defines a Family of Proteins and a New Relationship among Genes Required for Mitosis and Rna-Synthesis. Cell. 1990;60(2):307–17.
22.D’Andrea LD, Regan L. TPR proteins: the versatile helix. Trends Biochem Sci. 2003;28(12):655–62.
23.Marold JD, Kavran JM, Bowman GD, Barrick D. A Naturally Occurring Repeat Protein with High Internal Sequence Identity Defines a New Class of TPR-like Proteins. Structure. 2015;23(11):2055–65.
24.Gul IS, Hulpiau P, Saeys Y, van Roy F. Metazoan evolution of the armadillo repeat superfamily. Cell Mol Life Sci. 2017;74(3):525–41.
25.Andrade MA, Petosa C, O’Donoghue SI, Muller CW, Bork P. Comparison of ARM and HEAT protein repeats. J Mol Biol. 2001;309(1):1–18.
26.Andrade MA, Bork P. Heat Repeats in the Huntingtons-Disease Protein. Nat Genet. 1995;11(2):115–6.
27.Andrade MA, Perez-Iratxeta C, Ponting CP. Protein repeats: Structures, functions, and evolution. J Struct Biol. 2001;134(2–3):117–31.
28.Espada R, Parra RG, Sippl MJ, Mora T, Walczak AM, Ferreiro DU. Repeat proteins challenge the concept of structural domains. Biochem Soc T. 2015;43:844–9.
29.Schaper E, Gascuel O, Anisimova M. Deep Conservation of Human Protein Tandem Repeats within the Eukaryotes. Mol Biol Evol. 2014;31(5):1132–48.
30.Schuler A, Bornberg-Bauer E. Evolution of Protein Domain Repeats in Metazoa. Mol Biol Evol. 2016;33(12):3170–82.
31.Sonnhammer ELL, Durbin R. A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis (Reprinted from Gene Combis, vol 167, pg GC1-GC10, 1996). Gene. 1995;167(1–2):Gc1-Gc10.
32.Bateman A, Martin MJ, Orchard S, Magrane M, Alpi E, Bely B, et al. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019;47(D1):D506-D15.
33.Ronquist F, Huelsenbeck JP. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003;19(12):1572–4.
34.Steere AC, Drouin EE, Glickstein LJ. Relationship between Immunity to Borrelia burgdorferi Outer-surface Protein A (OspA) and Lyme Arthritis. Clin Infect Dis. 2011;52:S259-S65.
35.Miras I, Saul F, Nowakowski M, Weber P, Haouz A, Shepard W, et al. Structural characterization of a novel subfamily of leucine-rich repeat proteins from the human pathogen Leptospira interrogans. Acta Crystallogr D. 2015;71:1351–9.
36.Azad A, Pavlopoulos GA, Ouzounis CA, Kyrpides NC, Buluc A. HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks. Nucleic Acids Res. 2018;46(6).
37.Cock PJA, Antao T, Chang JT, Chapman BA, Cox CJ, Dalke A, et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 2009;25(11):1422–3.
38.Frickey T, Lupas A. CLANS: a Java application for visualizing protein families based on pairwise similarity. Bioinformatics. 2004;20(18):3702–4.
39.Pellegrini M, Marcotte EM, Yeates TO. A fast algorithm for genome-wide analysis of proteins with repeated sequences. Proteins-Structure Function and Genetics. 1999;35(4):440–6.
40.Szklarczyk R, Heringa J. Tracking repeats using significance and transitivity. Bioinformatics. 2004;20:311–7.
41.Heger A, Holm L. Rapid automatic detection and alignment of repeats in protein sequences. Proteins-Structure Function and Genetics. 2000;41(2):224–37.
42.Lo Conte L, Ailey B, Hubbard TJP, Brenner SE, Murzin AG, Chothia C. SCOP: a Structural Classification of Proteins database. Nucleic Acids Res. 2000;28(1):257–9.
43.Soding J, Remmert M, Biegert A. HHrep: de novo protein repeat detection and the origin of TIM barrels. Nucleic Acids Res. 2006;34:W137-W42.
44.Pellegrini M, Renda ME, Vecchio A. TRStalker: an efficient heuristic for finding fuzzy tandem repeats. Bioinformatics. 2010;26(12):i358-i66.
45.Jorda J, Kajava AV. T-REKS: identification of Tandem REpeats in sequences with a K-meanS based algorithm. Bioinformatics. 2009;25(20):2632–8.
46.Newman AM, Cooper JB. XSTREAM: A practical algorithm for identification and architecture modeling of tandem repeats in protein sequences. Bmc Bioinformatics. 2007;8.
47.Xing HT, Fu XK, Yang C, Tang XF, Guo L, Li CF, et al. Genome-wide investigation of pentatricopeptide repeat gene family in poplar and their expression analysis in response to biotic and abiotic stresses. Sci Rep-Uk. 2018;8.
48.Rahire M, Laroche F, Cerutti L, Rochaix JD. Identification of an OPR protein involved in the translation initiation of the PsaB subunit of photosystem I. Plant J. 2012;72(4):652–61.
49.Mularoni L, Veitia RA, Alba MM. Highly constrained proteins contain an unexpectedly large number of amino acid tandem repeats. Genomics. 2007;89(3):316–25.
50.Makabe K, McElheny D, Tereshko V, Hilyard A, Gawlak G, Yan S, et al. Atomic structures of peptide self-assembly mimics. P Natl Acad Sci USA. 2006;103(47):17753–8.
51.Holm L, Sander C. An evolutionary treasure: Unification of a broad set of amidohydrolases related to urease. Proteins. 1997;28(1):72–82.
52.Sarti E, Aleksandrova AA, Ganta SK, Yavatkar AS, Forrest LR. EncoMPASS: an online database for analyzing structure and symmetry in membrane proteins. Nucleic Acids Res. 2019;47(D1):D315-D21.
53.Henikoff S, Henikoff JG. Amino-Acid Substitution Matrices from Protein Blocks. P Natl Acad Sci USA. 1992;89(22):10915–9.
54.Kaisers W. seqTools: Analysis of nucleotide, sequence and quality content on fastq files. 2019.
55.Hold-Geoffroy Y, Gagnon O, Parizeau M. Once you SCOOP, no need to fork. Proceedings of the 2014 Annual Conference on Extreme Science and Engineering Discovery Environment; July 13–18, 2014; Atlanta, GA, USA2014.
56.Mullner D. fastcluster: Fast Hierarchical, Agglomerative Clustering Routines for R and Python. J Stat Softw. 2013;53(9):1–18.
57.Galili T. dendextend: an R package for visualizing, adjusting and comparing trees of hierarchical clustering. Bioinformatics. 2015;31(22):3718–20.
58.Xiao N, Cao DS, Zhu MF, Xu QS. protr/ProtrWeb: R package and web server for generating various numerical representation schemes of protein sequences. Bioinformatics. 2015;31(11):1857–9.
59.Pagès H, Aboyoun P, R G, S aD. Biostrings: Efficient manipulation of biological strings. 2.46.0 ed. R2017.
60.Warnes GR, Bolker B, Bonebakker L, Gentleman R, Huber W, Liaw A, et al. Gplots: Various R Programming Tools for Plotting Data. R2016.
61.Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The Protein Data Bank. Nucleic Acids Res. 2000;28(1):235–42.
62.Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic Local Alignment Search Tool. J Mol Biol. 1990;215(3):403–10.
63.Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44(D1):D279-D85.
64.Chojnacki S, Cowley A, Lee J, Foix A, Lopez R. Programmatic access to bioinformatics tools from EMBL-EBI update: 2017. Nucleic Acids Res. 2017;45(W1):W550-W3.