In silico analysis of regulatory elements of the bldD gene of antibiotic-producing Streptomyces species

DOI: https://doi.org/10.21203/rs.3.rs-1637072/v1

Abstract

Background: Streptomyces are known for their ability to produce a great variety of antibiotics and other bioactive compounds. The production of these molecules is temporally and genetically coordinated with bacterial morphological development, which is controlled by transcriptional regulators that also govern antibiotic production. The bldD gene has been identified as a key player in this complex morphogenesis and as an activator of antibiotic production in Streptomyces. Alongside laboratory-based experimental work, genome mining and in silico analysis of the transcription start sites, promoter regions, transcription factors and their binding sites, and CpG islands of the bldD gene of antibiotic-producing Streptomyces species are fundamental steps toward understanding its regulatory mechanisms and their impact on antibiotic production.

Results: Our study identified the promoters upstream of the coding regions of the bldD gene in 13 antibiotic-producing Streptomyces species. All bldD genes (13/13, 100%) have a single transcription start site (TSS) upstream of the coding region. The MEME algorithm revealed five motifs (MtS1-5), of which Motif 1 (MtS1) has the lowest E-value and is therefore considered the key regulatory motif for bldD genes. Using the TOMTOM web program, we identified 11 transcription factors with the capacity to bind MtS1. CpG island analysis of the bldD gene indicated a single putative CpG island per gene sequence. Phylogenetic analysis showed that the bldD genes of the antibiotic-producing species considered in this study are very closely related to those of other groups of Streptomyces.

Conclusions: Our study showed that the regulatory elements of bldD genes in antibiotic-producing Streptomyces are located closely upstream of the genes. A detailed understanding of these regulatory elements of the gene that encodes the key activator of antibiotic biosynthesis in Streptomyces species will enhance laboratory-based experiments on antibiotic production.

Background

Most of the bioactive ingredients used in medicine, today and in the past, come from natural sources such as microorganisms. Natural products can provide new structures with beneficial biological properties [1]. The actinomycetes are potent producers of antibiotics and other therapeutically useful compounds [2]. The vast majority of these metabolites (70%) have been isolated from actinomycetes, with a further 20% from fungi, 7% from Bacillus, and 1–2% from Pseudomonas. The actinomycetes are therefore perhaps the most important group of organisms studied in programs for the discovery of drugs and other bioactive metabolites [3].

Among the actinomycetes, streptomycetes are the major antibiotic-producing organisms used by the pharmaceutical industry because they produce many thousands of bioactive compounds, many of which are secondary metabolites with strong antibiotic activity [3,4]. Despite significant developments in chemical synthesis and engineered biosynthesis of antibacterial compounds, nature remains the richest and most versatile source of novel antibiotics at low cost [2]. Although thousands of antibiotics have been identified to date, only a small proportion of them are useful in humans and animals because of their toxicity. To address this concern, researchers are looking for novel antibiotics that are effective and free of harmful side effects.

Antibiotic resistance is another serious health concern. The need for novel antibiotics is emphasized by the rapid evolution of drug resistance in pathogenic bacteria, particularly multidrug-resistant pathogens [5]. The potential of the genus Streptomyces to produce commercially useful compounds remains critical because of the relatively large genomes of these bacteria [6]. Streptomyces have received considerable medical and commercial attention for three important reasons: i) they are abundant and prominent in soil; ii) they have a fairly wide phylogenetic distribution; and iii) they are among nature's most capable chemists, producing a remarkable range and diversity of bioactive secondary metabolites [7].

Streptomycetes are known for producing bioactive compounds in a manner that overlaps with their developmental program. It is therefore important to understand the signals and mechanisms that trigger both processes. Extensive genome analyses have revealed the regulators required for the onset of development and the activation of antibiotic production. One such regulator of growth and activator of bioactive metabolite production is known as BldD, because mutations in the gene encoding this regulatory factor, bldD, prevent the growth of the reproductive aerial hyphae that give colonies their fuzzy appearance and reduce the production of antibiotics [8,9,10].

In many Streptomyces strains, genome sequencing has led to the discovery of multiple potential gene clusters involved in secondary metabolite synthesis. Each antibiotic is commonly biosynthesized by a large gene cluster that includes a gene encoding a cluster-situated regulator (CSR). Pleiotropic regulators monitor developmental status, nutrient availability, and a variety of stressors before sending signals to the CSR genes, which control antibiotic production [11]. The antibiotic regulatory network has to be elucidated in order to find new approaches to enhance antibiotic production and activate cryptic antibiotic synthesis. Transcriptional regulators at several levels tightly govern the onset of morphological development, which is often associated with antibiotic production, in response to environmental and physiological changes [12,13,14]. In many cases, one or more CSRs control the transcription of structural genes within antibiotic biosynthetic gene clusters. CSRs, in turn, are subject to a complex network of higher-level regulators [15].

In recent years, in silico analysis of gene sequences and their products has become a common method for identifying gene expression patterns and sequences responsible for the synthesis of novel drugs. It has also led to the identification of numerous new medicinal products. A number of computational tools have been developed to assist researchers in this discipline, and the majority of them are based on the in silico study of specific genes and gene products [16]. The aim of this computational study was therefore to analyze the promoter regions and regulatory elements of the transcription-regulatory bldD gene from antibiotic-producing Streptomyces species.

Results

Determination of the Promoter Positions/Transcription Start Site (TSS)  

Determining the location of the transcription start site and promoter region of a gene is vital for studying the mechanism of its regulation. The core promoter is the minimal promoter region capable of initiating basal transcription. It contains a transcription start site (TSS) and typically spans from −60 to +40 relative to the TSS [17]. In this study, we included ≥1 kb upstream of the coding regions to locate the TSS using the NNPP toolset. The analysis indicated that all bldD genes (13/13, 100%) from the antibiotic-producing Streptomyces species included in this study have only one TSS. For 11/13 (84.62%) of the genes, the TSS is located less than 100 bp upstream of the start codon (Table 1), indicating that the transcriptional regulatory elements of the bldD gene in antibiotic-producing Streptomyces species lie close to the gene's start codon. The BPROM program uses a linear discriminant function (LDF) to make a prediction based on characteristics of the -200 to +50 bp region around the TSS [17,18,19]; a higher LDF value indicates a higher probability that the gene is expressed. Accordingly, 150 bp were included to locate the -10 box (highly conserved) and -35 box (less conserved) positions of the gene. The core promoters of the bldD genes 58470067, 66853606, 61473082, and [69807580 & 69764486] have the highest LDF values, 2.18, 2.14, 2.04, and 2.03, respectively, whereas the core promoters of [58431103 & 24306276] have the lowest LDF values, 1.18 and 1.47, respectively.

Table 1

Predicted TSSs, their distance from the start codon, and LDF values for the bldD gene, determined using the NNPP toolset version 2.2 (minimum standard predictive promoter score cutoff of 0.8) and BPROM.

| Gene name | Gene ID | Chromosome location | Number of predicted TSSs | TSS position | -10 box position | -35 box position | Linear discriminant function (LDF) value |
|---|---|---|---|---|---|---|---|
| bldD | 6213046 | NC_010572 | 1 | 97 | 82 | 61 | 1.99 |
| bldD | 24306276 | NC_013929.1 | 1 | 122 | 107 | 86 | 1.47 |
| bldD | 15149186 | NC_020990.1 | 1 | 122 | 107 | 86 | 1.68 |
| bldD | 66853606 | NZ_CP048261.1 | 1 | 98 | 83 | 62 | 2.14 |
| bldD | 63978737 | NZ_CP070242.1 | 1 | 97 | 82 | 61 | 1.99 |
| bldD | 61473082 | NZ_CP065253.1 | 1 | 89 | 74 | 53 | 2.04 |
| bldD | 58431103 | NZ_KV757141.1 | 1 | 97 | 82 | 61 | 1.18 |
| bldD | 58470067 | NZ_BBQG01000011.1 | 1 | 95 | 80 | 60 | 2.18 |
| bldD | 69878388 | NZ_CP086102.1 | 1 | 97 | 82 | 61 | 1.99 |
| bldD | 69863271 | NZ_CP018074.1 | 1 | 98 | 83 | 62 | 1.87 |
| bldD | 69807580 | NZ_JAGJBY010000001 | 1 | 89 | 74 | 53 | 2.03 |
| bldD | 69764486 | NZ_CP043317.1 | 1 | 87 | 72 | 51 | 2.03 |
| bldD | 57807597 | NZ_JABSUS0100000.1 | 1 | 97 | 82 | 61 | 1.99 |
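The summary statistics drawn from Table 1 can be recomputed with a short script. The sketch below is illustrative only: the dictionary `table1` and the helper `summarize` are hypothetical names introduced here, the values are transcribed from Table 1, and the 100 bp cutoff is the one used in the text.

```python
# TSS distance (bp upstream of the start codon) and BPROM LDF value per gene ID,
# transcribed from Table 1. The names table1 and summarize are illustrative only.
table1 = {
    "6213046": (97, 1.99),
    "24306276": (122, 1.47),
    "15149186": (122, 1.68),
    "66853606": (98, 2.14),
    "63978737": (97, 1.99),
    "61473082": (89, 2.04),
    "58431103": (97, 1.18),
    "58470067": (95, 2.18),
    "69878388": (97, 1.99),
    "69863271": (98, 1.87),
    "69807580": (89, 2.03),
    "69764486": (87, 2.03),
    "57807597": (97, 1.99),
}


def summarize(rows, cutoff_bp=100):
    """Count TSSs closer than cutoff_bp to the start codon and rank genes by LDF."""
    near = [gene_id for gene_id, (tss, _) in rows.items() if tss < cutoff_bp]
    percent = 100.0 * len(near) / len(rows)
    # Sort gene IDs from highest to lowest LDF value.
    ranked = sorted(rows, key=lambda gene_id: rows[gene_id][1], reverse=True)
    return len(near), percent, ranked


n_near, pct_near, by_ldf = summarize(table1)
print(f"{n_near}/{len(table1)} TSSs ({pct_near:.2f}%) lie less than 100 bp upstream")
print("Highest LDF:", by_ldf[0], "| lowest LDF:", by_ldf[-1])
```

Keeping the table values in one structure makes it easy to re-derive the proportion and the LDF ranking if further species are added.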

 

Identification of Common Motifs and Transcription Factors (TFs)  

Conserved motifs in the bldD genes of the 13 antibiotic-producing Streptomyces species were analyzed using MEME. Five candidate motifs were identified across the promoter regions (Table 2), and the presence of common motifs that serve as binding sites for transcription factors affecting expression of the gene was determined. The motif with the lowest E-value (MtS1) was submitted to TOMTOM. Our analysis showed that all five motifs occur in every 5' promoter region (13/13, 100%); they differ, however, in their statistical expectation value (E-value). In addition, MtS2, MtS5, MtS1, MtS3, and MtS4 have 19, 18, 16, 13, and 11 binding site matches in the motif database, respectively.

Table 2

List of discovered motifs, number of promoters containing each motif, number of motif binding sites, and total number of binding site matches in the motif database for the bldD gene.

| Discovered motif | Number of promoters containing the motif | E-value | Motif width | Number of motif binding sites | Total number of binding site matches in the database |
|---|---|---|---|---|---|
| MtS1 | 13 (100%) | 1.0e-216 | 50 | 13 | 16 |
| MtS2 | 13 (100%) | 1.3e-215 | 50 | 13 | 19 |
| MtS3 | 13 (100%) | 7.7e-209 | 50 | 13 | 13 |
| MtS4 | 13 (100%) | 8.1e-202 | 50 | 13 | 11 |
| MtS5 | 13 (100%) | 1.6e-197 | 50 | 13 | 18 |

In addition, MEME generated thirteen candidate motifs distributed from the TSS (+1) to ≥1 kb upstream. All candidate motifs were located on the positive strand. The binding sites of MtS1 range from -200 to -700 relative to the TSS and lie closest to the TSS, whereas those of MtS2 (-500 to -1000) and MtS3 (-600 to -1000) are more distant from it. Overall, 53/65 (81.54%) of the identified motif sites were found within the range of +1 to -700. These results suggest that the transcription regulatory factor BldD binds to the motif closest to the TSS and activates antibiotic-synthesizing genes (Fig. 1).

Transcription factors (TFs) are essential regulatory proteins that control gene expression. Using TOMTOM, we compared MtS1 with the publicly accessible prokaryotic motif database. The analysis showed numerous matches between MtS1 and the registered motifs. We identified 11 transcription factors associated with MtS1, including putative DNA-binding protein, integration host factor subunit alpha, RNA polymerase sigma-54 factor, positive alginate biosynthesis regulatory protein, AraC family transcriptional regulator, nucleoid-associated protein EspR, sigma factor PvdS, macrodomain Ter protein, and RNA polymerase sigma-70 family protein (Table 3). These transcription factors have different molecular and biological functions in different groups of organisms, but our study revealed that most of them share common functions across microorganisms. Notably, the predominant functions are DNA-binding transcription activator activity and transcription cis-regulatory region binding. Positive and negative regulation of DNA-templated transcription are also common features of these transcription factors.

Table 3

List of matching candidate transcription factors (TFs) that could bind to the common motif MtS1, with the GO terms for motif MtS1.

| Organism | Transcription factor/protein | Gene name | Functions: GO molecular function | Functions: GO biological process | E-value | Gene expression database |
|---|---|---|---|---|---|---|
| Streptomyces coelicolor A3(2) | Putative DNA-binding protein | SCO1489 | DNA-binding transcription repressor activity, nucleotide binding, sequence-specific DNA binding & transcription cis-regulatory region binding | Negative regulation of transcription, DNA-templated | 5.84e-02 | Collectf/EXPREG_00000fc0 |
| Pseudomonas putida (strain ATCC 47054) | Integration host factor subunit alpha | ihfA | DNA-binding transcription activator activity, DNA-binding transcription repressor activity & transcription cis-regulatory region binding | DNA recombination & regulation of translation | 1.20e-01 | Collectf/EXPREG_000006f0 |
| Vibrio cholerae serotype O1 (strain ATCC 39315) | RNA polymerase sigma-54 factor | VC_2529 | DNA binding, DNA-binding transcription activator activity, DNA-directed 5'-3' RNA polymerase activity & sigma factor activity | DNA-templated transcription, initiation | 2.71e+00 | Collectf/EXPREG_000016e0 |
| Pseudomonas aeruginosa (strain ATCC 15692) | Positive alginate biosynthesis regulatory protein | algR | DNA-binding transcription activator activity, DNA-binding transcription repressor activity, phosphorelay response regulator activity, sequence-specific DNA binding & transcription cis-regulatory region binding | Alginic acid biosynthetic process, bacterial-type flagellum-dependent swarming motility, negative regulation of transcription, DNA-templated, positive regulation of cell motility, positive regulation of single-species biofilm formation, positive regulation of transcription, DNA-templated, regulation of response to reactive oxygen species, regulation of transcription, DNA-templated & type IV pilus-dependent motility | 3.44e+00 | Collectf/EXPREG_000009d0 |
| Xanthomonas oryzae pv. oryzae | AraC family transcriptional regulator | hrpXo | DNA-binding transcription activator activity & transcription cis-regulatory region binding | Positive regulation of transcription, DNA-templated | 4.72e+00 | Collectf/EXPREG_000017f0 |
| Streptomyces coelicolor (strain ATCC BAA-471) | AraC-family transcriptional regulator | SCO2792 | DNA-binding transcription factor activity & sequence-specific DNA binding | Positive regulation of transcription, DNA-templated | 4.33e+00 | Collectf/EXPREG_00001770 |
| Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv) | Nucleoid-associated protein EspR | espR | DNA binding & identical protein binding | Regulation of protein secretion, regulation of transcription, DNA-templated & response to host immune response | 5.32e+00 | Collectf/EXPREG_00000c30 |
| Pseudomonas putida (strain ATCC 47054) | Integration host factor subunit alpha | ihfA | DNA-binding transcription activator activity, DNA-binding transcription repressor activity & transcription cis-regulatory region binding | DNA recombination & regulation of translation | 8.00e+00 | Collectf/EXPREG_000006f0 |
| Pseudomonas aeruginosa (strain ATCC 15692) | Sigma factor PvdS | pvdS | DNA-binding transcription activator activity, sigma factor activity & transcription cis-regulatory region binding | Cellular response to iron ion, DNA-templated transcription, initiation, positive regulation of secondary metabolite biosynthetic process, positive regulation of transcription, DNA-templated & regulation of transcription, DNA-templated | 8.14e+00 | Collectf/EXPREG_000004b0 |
| Escherichia coli (strain K12) | Macrodomain Ter protein | matP | Sequence-specific DNA binding | Cell cycle, cell division, chromosome organization, chromosome segregation & regulation of transcription, DNA-templated | 8.19e+00 | Collectf/EXPREG_000007b0 |
| Pseudomonas syringae pv. tomato (strain ATCC BAA-871 / DC3000) | RNA polymerase sigma-70 family protein | PSPTO_2133 | DNA-binding transcription activator activity, sigma factor activity & transcription cis-regulatory region binding | DNA-templated transcription, initiation, positive regulation of transcription, DNA-templated & response to stimulus | 8.19e+00 | Collectf/EXPREG_00001150 |

  

Determinations of Transcription Factors Binding Sites (TFBS) 

Transcription factor binding sites (TFBSs) are also crucial for understanding the regulation of gene expression [20]. The bldD gene promoter sequences of the thirteen antibiotic-producing Streptomyces species were submitted to GLAM2, and the GLAM2 HTML output was searched for TFBSs. As depicted in Fig. 2, a TFBS of 49 nucleotides was identified. The alignment was also checked for deletions and insertions among the 13 antibiotic-producing Streptomyces; the aligned columns contained none, so the bldD alignment was ungapped. In addition, GLAM2 analysis showed that 8/13 (61.54%) of the bldD genes have a high marginal score of 91.6. These species have a strong motif and better matches to the overall motif, suggesting a strong binding site for the transcription regulatory factor BldD. In contrast, bldD 69863271 showed a lower marginal score of 75.2, suggesting that this species has a weaker motif and fewer matches to the overall motif, which may in turn indicate a lower gene expression level.

Comparisons of the Candidate Motifs to the Database Motifs 

The thirteen candidate motifs were compared to the motifs in the motif databases (Collectf and EXPREG). All candidate motifs (13/13, 100%) share the same TFBS with VqsM_P.aeruginosa, RpoN_V.cholerae, and H-NS_V.cholerae. Additionally, 9/13 (69.23%) of the candidate motifs share the same TFBS with PhoP_Y.pestis, whose regulatory mode is 52% activation and 47% repression (Table 4). The logos for the optimal alignment of the IHF_P.putida database motif with MtS1 are depicted in Fig. 3. The fact that the candidate motifs share TFBSs with motifs in the publicly available databases (Collectf and EXPREG) could suggest that the bldD gene plays a role in the antibiotic production of Streptomyces species.

Table 4

Motifs in the Collectf and EXPREG databases that match the sequence-enriched candidate motifs with E-values ≤ 10.

| Alternate name for the motif provided in the motif database file | Regulatory mode: Activ. (%) | Regulatory mode: Rep. (%) | Regulatory mode: Dual (%) | Regulatory mode: NS (%) | E-value | Number of primary sequences matching the motif | Motif database file |
|---|---|---|---|---|---|---|---|
| PhoP_Y.pestis | 52 | 47 | 0 | 0 | 1.92e-2 | 9 / 13 (69.2%) | EXPREG_00000050 |
| IHF_P.putida | 55 | 45 | 0 | 0 | 6.92e-2 | 8 / 13 (61.5%) | EXPREG_00000700 |
| ArgR_P.aeruginosa | 61 | 27 | 0 | 11 | 1.01e-1 | 11 / 13 (84.6%) | EXPREG_00000470 |
| Fur_V.cholerae | 0 | 100 | 0 | 0 | 1.01e-1 | 11 / 13 (84.6%) | EXPREG_000008b0 |
| ToxT_V.cholerae | 81 | 18 | 0 | 0 | 2.19e-1 | 7 / 13 (53.8%) | EXPREG_00000240 |
| OmpR_Y.pestis | 100 | 0 | 0 | 0 | 2.19e-1 | 7 / 13 (53.8%) | EXPREG_00001000 |
| CcpA_S.suis | 17 | 25 | 0 | 57 | 2.19e-1 | 7 / 13 (53.8%) | EXPREG_00001810 |
| CRP_V.vulnificus | 100 | 0 | 0 | 0 | 3.03e-1 | 12 / 13 (92.3%) | EXPREG_00001030 |
| Fur_N.gonorrhoeae | 60 | 10 | 0 | 30 | 4.05e-1 | 10 / 13 (76.9%) | EXPREG_00000ec0 |
| VqsM_P.aeruginosa | 7 | 0 | 0 | 92 | 4.38e-1 | 13 / 13 (100.0%) | EXPREG_00001670 |
| RpoN_V.cholerae | 100 | 0 | 0 | 0 | 4.38e-1 | 13 / 13 (100.0%) | EXPREG_000016e0 |
| Lrp_E.coli | 1 | 1 | 0 | 97 | 4.54e-1 | 12 / 13 (92.3%) | EXPREG_00000840 |
| Zur_N.meningitidis | 15 | 84 | 0 | 0 | 1.42e0 | 10 / 13 (76.9%) | EXPREG_000016a0 |
| IHF_P.putida | 100 | 0 | 0 | 0 | 1.93e0 | 9 / 13 (69.2%) | EXPREG_000006f0 |
| PvdS_P.aeruginosa | 100 | 0 | 0 | 0 | 3.44e0 | 8 / 13 (61.5%) | EXPREG_000004b0 |
| Fur_A.ferrooxidans | 0 | 63 | 0 | 36 | 4.02e0 | 4 / 13 (30.8%) | EXPREG_00000370 |
| Vfr_P.aeruginosa | 41 | 11 | 0 | 47 | 4.02e0 | 4 / 13 (30.8%) | EXPREG_00000b50 |
| PhhR_P.putida | 90 | 10 | 0 | 0 | 4.02e0 | 4 / 13 (30.8%) | EXPREG_00001190 |
| Fur_P.aeruginosa | 0 | 100 | 0 | 0 | 5.11e0 | 11 / 13 (84.6%) | EXPREG_00000c80 |
| CsgD_E.coli | 33 | 22 | 0 | 44 | 5.81e0 | 9 / 13 (69.2%) | EXPREG_00000b00 |
| LexA_P.difficile | 0 | 0 | 0 | 100 | 6.02e0 | 6 / 13 (46.2%) | EXPREG_00000120 |
| CRP_E.coli | 82 | 17 | 0 | 0 | 6.02e0 | 6 / 13 (46.2%) | EXPREG_00000850 |
| H-NS_V.cholerae | 0 | 100 | 0 | 0 | 6.38e0 | 13 / 13 (100.0%) | EXPREG_00001730 |
| CcpA_C.difficile | 9 | 36 | 0 | 53 | 6.73e0 | 5 / 13 (38.5%) | EXPREG_00000d10 |
| LasR_P.aeruginosa | 98 | 1 | 0 | 0 | 9.24e0 | 3 / 13 (23.1%) | EXPREG_000009b0 |
| OxyR_P.aeruginosa | 3 | 0 | 0 | 96 | 9.24e0 | 3 / 13 (23.1%) | EXPREG_00001560 |
| AdpA_S.coelicolor | 100 | 0 | 0 | 0 | 9.40e0 | 9 / 13 (69.2%) | EXPREG_00001770 |

 

Activ.: activation; Rep.: repression; NS: not specified; IHF: integration host factor; ArgR: arginine-responsive regulator; Fur: ferric uptake regulator; OmpR: outer membrane protein regulator; CcpA: catabolite control protein A; CRP: cyclic AMP (cAMP) receptor protein; VqsM: virulence and quorum-sensing modulator protein; RpoN: RNA polymerase sigma-54 factor; Lrp: leucine-responsive regulatory protein; Zur: zinc uptake regulator; PvdS: sigma factor PvdS (siderophore pyoverdine synthesis); Vfr: virulence factor regulator; PhhR: phenylalanine hydroxylase regulator; CsgD: curlin subunit gene D; H-NS: histone-like nucleoid structuring protein; OxyR: oxidative stress regulator.

 

Analysis of CpG Islands

Two techniques were used to analyze CpG islands. The first was the offline tool CLC Genomics Workbench version 8.5, which was used to search for MspI restriction sites yielding fragments between 40 and 220 bp. Among the bldD genes of the 13 Streptomyces species, only one (1/13, 7.69%; GI: 69878388) was classified as single-cut, yielding one fragment in this size range, whereas the remaining genes were multiple-cut (Table 5). The second was the Takai and Jones algorithm; the possible CpG island regions and CpG island density are shown in Fig. 4. Only one putative CpG island was detected for each gene sequence. The GC content, however, varies between species, ranging from 68 to 73%: the bldD genes of GI: 61473082 and GI: 24306276 had the highest (73%) and lowest (68%) GC content, respectively.

Table 5

MspI cutting sites and fragment sizes (40 to 220 bp) identified for the bldD gene of Streptomyces species.

| Gene ID | Nucleotide positions of MspI cutting sites | Fragment sizes (40-220 bp) |
|---|---|---|
| 6213046 | Multiple cut (29, 64, 78, 115, 243, 402, 434, 443, 467, 686, 819, 849, 876, 890, 1109, 1121, 1205, 1269, 1278, 1309, 1313, 1322, 1350, 1534, 1569, 1588, 1619, 1670, 1676, 1684, 1713, 1826, 1841) | 128, 159, 219, 133, 219, 84, 64, 184, 51, 113 |
| 24306276 | Multiple cut (19, 55, 100, 112, 384, 719, 729, 810, 840, 854, 867, 881, 1100, 1112, 1206, 1432, 1523, 1577, 1608, 1618, 1659, 1669, 1673, 1702, 1847, 1976, 1987, 1992) | 45, 81, 219, 94, 91, 54, 145, 129 |
| 15149186 | Multiple cut (49, 130, 143, 175, 184, 265, 427, 460, 467, 488, 569, 599, 613, 626, 784, 790, 859, 871, 973, 1013, 1035, 1042, 1050, 1057, 1072, 1099, 1136, 1143, 1207, 1282, 1364, 1418, 1432, 1444, 1471, 1500, 1514, 1633, 1645, 1782, 1799, 1804, 1809, 1832, 1861, 1945, 1959, 1987, 1992) | 81, 81, 162, 81, 158, 69, 102, 40, 64, 75, 82, 54, 119, 137, 84 |
| 66853606 | Multiple cut (12, 29, 53, 68, 343, 368, 627, 660, 667, 688, 799, 813, 826, 1059, 1071, 1155, 1212, 1230, 1253, 1262, 1527, 1612, 1663, 1670, 1674, 1700, 1715, 1825) | 111, 84, 57, 85, 51 |
| 63978737 | Multiple cut (17, 53, 95, 110, 382, 727, 747, 808, 838, 852, 865, 879, 1098, 1205, 1433, 1524, 1578, 1609, 1660, 1670, 1674, 1724, 1977, 1987, 1992) | 42, 61, 219, 107, 91, 54, 51, 50 |
| 61473082 | Multiple cut (20, 56, 113, 291, 385, 447, 590, 669, 730, 738, 811, 841, 868, 1026, 1032, 1068, 1101, 1113, 1255, 1273, 1278, 1294, 1502, 1556, 1587, 1609, 1638, 1652, 1681, 1794, 1963, 1977) | 57, 178, 94, 62, 143, 79, 61, 73, 158, 142, 54, 113, 169 |
| 58431103 | Multiple cut (26, 62, 234, 302, 396, 421, 601, 680, 720, 741, 761, 822, 852, 893, 1112, 1124, 1245, 1302, 1423, 1432, 1514, 1568, 1650, 1696, 1928) | 172, 68, 94, 180, 79, 40, 61, 41, 219, 57, 121, 82, 54, 82, 46 |
| 58470067 | Multiple cut (167, 228, 248, 309, 339, 353, 366, 380, 599, 706, 762, 1024, 1078, 1109, 1160, 1170, 1174, 1203) | 61, 61, 219, 107, 56, 54, 51 |
| 69878388 | Single cut (34, 67, 79, 474, 478, 513, 532, 614, 992) | 82 |
| 69863271 | Multiple cut (2, 137, 144, 203, 223, 384, 430, 562, 566, 711, 757, 767, 865, 899, 933, 940, 956, 966, 972, 988) | 135, 59, 161, 46, 132, 145, 46, 98 |
| 69807580 | Multiple cut (15, 88, 109, 149, 244, 576, 660, 690, 704, 731, 875, 950, 1132, 1187, 1207, 1299, 1530, 1585, 1637, 1680, 1712, 1761, 1822, 1882) | 73, 40, 95, 84, 144, 75, 182, 55, 92, 55, 52, 43, 49, 61, 60 |
| 69764486 | Multiple cut (31, 71, 166, 206, 370, 482, 510, 591, 621, 648, 662, 812, 881, 893, 1029, 1101, 1110, 1231, 1322, 1376, 1458, 1504, 1633, 1849) | 40, 95, 40, 164, 112, 81, 150, 69, 136, 72, 121, 91, 54, 82, 46, 129, 216 |
| 57807597 | Multiple cut (185, 217, 502, 530, 611, 655, 668, 682, 901, 913, 1031, 1065, 1149, 1153, 1183, 1367, 1413, 1448, 1467, 1549, 1563, 1592, 1876, 1881, 1892) | 81, 44, 219, 118, 84, 184, 46, 82 |
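The relationship between the cut positions and the fragment sizes in Table 5 can be illustrated with a minimal sketch. It assumes that a fragment is the distance between two consecutive MspI sites and that only sizes from 40 to 220 bp are kept; this is an assumption about how the sizes were derived, not a description of the CLC Genomics Workbench procedure. The function name `fragments_in_range` is hypothetical. Under this assumption the sketch reproduces the listed sizes for the two genes shown (GI: 69878388 and GI: 58470067); other rows may differ.

```python
def fragments_in_range(cut_positions, min_bp=40, max_bp=220):
    """Return sizes of fragments between consecutive cut sites within [min_bp, max_bp].

    Assumption: fragments are measured between neighbouring cut positions only;
    the two end fragments of the sequence are ignored.
    """
    cuts = sorted(cut_positions)
    sizes = [right - left for left, right in zip(cuts, cuts[1:])]
    return [size for size in sizes if min_bp <= size <= max_bp]


# Cut positions transcribed from Table 5.
cuts_69878388 = [34, 67, 79, 474, 478, 513, 532, 614, 992]
cuts_58470067 = [167, 228, 248, 309, 339, 353, 366, 380, 599, 706,
                 762, 1024, 1078, 1109, 1160, 1170, 1174, 1203]

frags_69878388 = fragments_in_range(cuts_69878388)
frags_58470067 = fragments_in_range(cuts_58470067)
print(frags_69878388)
print(frags_58470067)
```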


Phylogenetic Tree Construction and Analysis 

The nucleotide sequences of the bldD gene from the 13 antibiotic-producing Streptomyces species and 20 other related Streptomyces species were combined and aligned, and a phylogenetic tree was constructed. Four main criteria were used to read, compare, and interpret species relationships and divergences: the distance between branch tips, the number of nodes between species, the time since the common ancestor, and the number of shared monophyletic groups. A randomly chosen anchor sequence (a stretch of 3108 nucleotides from the Kitasatospora setae strain KM-6054 23S sequence) was used as the reference point for the distances between antibiotic-producing and other Streptomyces. A combined analysis of the data yielded a single significant cladogram with ten clusters and two clades. The antibiotic-producing Streptomyces species carrying bldD genes fall into clusters II, IV, V, VI, VII, and X (Fig. 5).

Discussion

The genus Streptomyces is considered an important source of bioactive compounds. Several regulatory proteins play a critical role in the activation of antibiotic biosynthetic genes; among them, BldD functions at the top of the regulatory cascade that controls Streptomyces development and the activation of antibiotic production [21]. In addition, several laboratory-based in vitro studies have indicated a significant role for BldD in the regulation of antibiotic production [22]. In Streptomyces coelicolor, for example, BldD is a transcriptional regulator required for morphological development and antibiotic synthesis [22]. In silico analysis of the transcription regulatory elements of the bldD gene in antibiotic-producing Streptomyces species could improve our understanding of drug development and facilitate the implementation of laboratory-based in vitro experiments.

In this study, using the NNPP and BPROM web-based programs, we identified the promoter regions and TSSs located closely upstream of the coding regions of the bldD gene of antibiotic-producing Streptomyces species. The bldD gene from every antibiotic-producing Streptomyces species considered in this study has a single TSS located close to the start codon, suggesting that the gene is expressed from a single TSS. The location of the TSSs in close vicinity to the protein-coding region is consistent with the report of Lee et al. (2022) [23], who showed that most TSSs in Streptomyces are located within 5–100 bp upstream of the start codon. This suggests that the bldD gene in antibiotic-producing Streptomyces regulates the expression of its target genes through closely located regulatory elements. Our result is also comparable to the study by Jeong et al. (2016) [24], who mapped a total of 68 TSSs to 18 of the 28 secondary metabolic gene clusters identified in the S. coelicolor genome and found an average of 1 TSS for every 2.3 protein-coding genes. TSSs located from 500 bp upstream to 150 bp downstream of the annotated start codon of each ORF have been classified as primary (P) or secondary (S) TSSs [23]. The regulation of gene expression at the transcriptional level is a fundamental process found in all biological systems [25,26].

Transcription factors are proteins that bind to DNA regulatory sequences (enhancers and silencers), usually located in the 5' upstream region of target genes, to modulate the rate of gene transcription. This may result in increased or decreased gene transcription and protein synthesis, and subsequently in altered cellular function [27]. In the promoter regions of transcription units, TFs attach to short DNA sequence motifs commonly known as binding sites. Position-specific scoring matrices (PSSMs) can be used to represent all the different binding sites recognized by the same TF in a single model. Such matrices give the probability of observing a particular nucleotide at a particular position and can be displayed as a sequence logo [20]. Bacterial transcriptional regulators are classified into ~50 families on the basis of sequence alignment and structural and functional criteria [26]. Our study identified five significant motifs in the promoter sequence regions of the bldD gene of Streptomyces species. The existence of common motifs acting as TFBSs was also established. Among the five motifs discovered, Motif 1 (MtS1) has the lowest E-value and is therefore considered the key regulatory motif for bldD genes.
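The idea of a position-specific scoring matrix can be made concrete with a small sketch. The example below is not the MEME or TOMTOM implementation; it builds a log-odds matrix from a handful of aligned sites and scans a sequence for the best-scoring window. The aligned sites, the function names (`build_pssm`, `score`, `best_hit`), the pseudocount, and the uniform background of 0.25 are all illustrative assumptions (a GC-rich Streptomyces genome would call for a non-uniform background).

```python
import math

ALPHABET = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PSSM from equal-length aligned binding sites."""
    width = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET)
    pssm = []
    for i in range(width):
        column = [site[i] for site in sites]
        # Smoothed frequency of each base at position i, relative to background.
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pssm


def score(pssm, window):
    """Sum the per-position log-odds scores of a window as wide as the PSSM."""
    return sum(column[base] for column, base in zip(pssm, window))


def best_hit(pssm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pssm)
    hits = [(score(pssm, sequence[i:i + width]), i)
            for i in range(len(sequence) - width + 1)]
    return max(hits)


# Hypothetical aligned sites, used only to illustrate the calculation.
toy_sites = ["TTGACA", "TTGACT", "TTGCCA", "TTTACA"]
toy_pssm = build_pssm(toy_sites)
top_score, top_start = best_hit(toy_pssm, "CCCTTGACACCC")
print(round(top_score, 2), top_start)
```

Windows that resemble the aligned sites receive positive scores, whereas unrelated windows score below zero, which is the property motif-scanning tools rely on when reporting matches.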

The comparative analysis of MtS1 against the known prokaryotic motif databases matched MtS1 with 11 TFs, including putative DNA-binding protein, integration host factor subunit alpha, RNA polymerase sigma-54 factor, positive alginate biosynthesis regulatory protein, AraC family transcriptional regulator, nucleoid-associated protein EspR, sigma factor PvdS, macrodomain Ter protein, and RNA polymerase sigma-70 family protein. The functional roles of these TFs in bacteria include metabolism, virulence and pathogenesis, replication, and regulation of several transcriptional processes (Table 3). Analogous results have been reported in other bacterial species [9,10,26], suggesting that these TFs are conserved across prokaryotes. The identification of TFs associated with metabolism and regulation suggests a potential role for the identified motif in the regulation and activation of secondary metabolism in Streptomyces.

Consistent with our results, Fang et al. (2018) reported that a novel AraC-family transcriptional regulator, SAV742, is a global regulator that negatively controls avermectin biosynthesis and cell growth but positively controls morphological differentiation [28]. Avermectins are useful anthelmintic antibiotics produced by Streptomyces avermitilis. In addition, AraC family members have been reported to be among the key transcription factors in Streptomyces, controlling genes involved in important biological processes such as carbon source utilization, morphological differentiation, secondary metabolism, pathogenesis, and stress responses [26]. Fang et al. (2018) also identified the regulatory role of the AraC-family transcriptional regulator BfvR (YPO1737 in strain CO92) in biofilm formation and virulence of Yersinia pestis biovar Microtus [29].

Recently, nucleoid-associated proteins have also been found to influence the expression of specialized metabolic clusters [30,31]. Leucine-responsive regulatory protein 2 (Lsr2) is a small nucleoid-associated protein found throughout the actinobacteria. Its role is similar to that of the well-studied histone-like nucleoid structuring protein (H-NS), in that it preferentially binds AT-rich sequences and represses gene expression [31,32,33]. In Streptomyces venezuelae, Lsr2 represses the expression of many specialized metabolic clusters, including the chloramphenicol biosynthetic gene cluster, and deleting lsr2 leads to significant upregulation of chloramphenicol cluster expression [32]. Bacteria, including Streptomyces, are also known to utilize protein ADP-ribosylation. Lalić et al. (2016) determined the crystal structure of the macrodomain protein SCO6735 from Streptomyces coelicolor and characterized it biochemically and functionally, showing that it can hydrolyze PARP-dependent protein ADP-ribosylation [34]. Expression of this protein is induced upon DNA damage, and its deletion in S. coelicolor increases antibiotic production.

CpG islands, another type of cis-regulatory element, were also examined and were identified in the bldD gene. A promoter is classified as CpG rich if a CpG island is present within the 5 kb sequence of the promoter, and as CpG poor otherwise. In our case, the CpG island of each Streptomyces bldD gene appeared ≥ 2 kb upstream of the coding region; therefore, the proportion of CpG-rich promoters is higher in our study. It is generally accepted that promoter regions correlate with CpG islands. CpG islands are regions of DNA longer than 200 bp with a G + C content of at least 50% and a number of CpG dinucleotides that is at least 60% of the number expected from the G + C content [18]. The principal difference between CpG island and non-CpG island promoters is how their transcription is repressed or modulated. Non-CpG island promoters maintain transcriptional repression by cytosine methylation at CpG dinucleotides [35]. Methylation of DNA is thought to regulate transcription both directly and indirectly. CpG methylation can directly repress transcription by preventing the binding of some TFs to their recognition motifs [36]. Interestingly, our results showed that a single CpG island was detected for each bldD gene, suggesting that the gene is strongly expressed. Because the gene has a CpG island and is therefore less subject to repression, the bldD gene has the potential to support the production of important antibiotics. The GC content, on the other hand, differed among the bldD genes of the antibiotic-producing Streptomyces species, and the GC content of the species ranged from 68 to 73%. In addition, Streptomyces species containing the bldD gene are very closely related to other groups of Streptomyces.
As a result, not only the Streptomyces species containing the bldD gene, but also the other groups of Streptomyces may have the potential to produce important antibiotics (Fig. 5).

Conclusions

Genome mining and in silico analysis of gene sequences are important for predicting gene expression patterns and for identifying genes responsible for the synthesis of new drugs. Our study revealed that the bldD genes have TSSs and promoter regions in close vicinity to the start codon. It also identified TFs matching MtS1, one of the key motifs identified in this study. Phylogenetic analysis of bldD in the antibiotic-producing and other Streptomyces species indicated that these species are closely related, suggesting a shared evolutionary history. Therefore, this computational study can serve as a basis for laboratory-based experiments aimed at producing essential antibiotics.

Methods

Determination of the Promoter Positions/Transcription Start Site (TSS)  

Promoter regions are intrinsic DNA elements located upstream of genes and required for their transcription by RNA polymerase (RNAP). Some of the first approaches to mapping promoters were based on position weight matrices (PWMs) of the -10 and -35 box motifs, taking into account the distribution of spacer lengths between the motifs and their distance from TSSs [37]. Correct identification of promoters is a crucial step in studying gene expression in bacteria. Here, we consider promoters as the core elements recognized by the sigma subunit of RNAP. The sigma factor recognizes a region of approximately 35 bp with two key elements, the -10 box (with the consensus motif TATAAT) and the -35 box (TTGACA), separated by 17±2 bp [38]. Besides the core promoter region, other cis-regulatory elements may play a relevant role in regulating gene expression [38]. In this study, the sequences of bldD genes from 13 Streptomyces species were retrieved in February 2022 from the Nucleotide database of the National Center for Biotechnology Information, NCBI (http://www.ncbi.nlm.nih.gov) [39]. To determine promoter positions and TSSs, ≥1 kb of sequence upstream of the coding region of each bldD gene was submitted to the NNPP version 2.2 toolset, and ≥150 bp was submitted to BPROM. The TSSs of each species' gene were screened using the NNPP and BPROM algorithms [18].
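The core promoter architecture described above (a -35 box and a -10 box separated by 17±2 bp) can be sketched as a simple consensus scan. This is only an illustration of the geometry: NNPP and BPROM use trained models rather than plain consensus matching, and the function names, the mismatch tolerance and the toy sequence below are hypothetical.

```python
def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_core_promoters(seq, box35="TTGACA", box10="TATAAT",
                        spacers=range(15, 20), max_mismatch=2):
    """Yield (pos35, pos10, spacer, total_mismatches) for candidate promoters.

    Positions are 0-based starts of each hexamer in `seq`; `spacers`
    covers the 17 +/- 2 bp gap between the two boxes.
    """
    seq = seq.upper()
    n35, n10 = len(box35), len(box10)
    for i in range(len(seq) - n35 + 1):
        m35 = mismatches(seq[i:i + n35], box35)
        if m35 > max_mismatch:
            continue
        for spacer in spacers:
            j = i + n35 + spacer
            if j + n10 > len(seq):
                break  # longer spacers would run past the end as well
            m10 = mismatches(seq[j:j + n10], box10)
            if m10 <= max_mismatch:
                yield i, j, spacer, m35 + m10

# Toy sequence (assumption): exact boxes separated by a 17 bp spacer.
toy = "GC" * 5 + "TTGACA" + "G" * 17 + "TATAAT" + "GCGCGCA"
hits = list(find_core_promoters(toy, max_mismatch=0))
print(hits)
```

Consensus hexamers of this kind were defined mainly in Escherichia coli, so a scan like this would be expected to miss many promoters in high-GC Streptomyces genomes.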

Identification of common motifs and transcription factors (TFs)  

Common motifs in the bldD genes were analysed using Multiple Em for Motif Elicitation (MEME) software version 3.5.4 (http://meme-suite.org/tools/meme; http://meme.sdsc.edu), using the sequence ≥1 kb upstream of the promoter positions or TSS [37,40]. MEME finds the most statistically significant (low E-value) motifs; the E-value of a motif is based on its log likelihood ratio, width, number of sites, the background letter frequencies, and the size of the training set. The search results page was linked to the MEME output in HTML format, and the motif with the smallest E-value was considered for further analysis. From the MEME output, each motif was forwarded directly, via the button provided, to TOMTOM, a web-based program that compares a motif against a database of known motifs [41]. For this analysis, CollecTF (bacterial TF motifs) and EXPREG were used as reference databases of binding motifs. The enrichment of all ab initio motifs in the primary sequences was assessed with Simple Enrichment Analysis (SEA), and the enrichment p-values were used to rank the motifs. The motif site distribution was set to zero or one occurrence per sequence (ZOOPS), the maximum number of motifs to 13, the motif E-value threshold to unlimited, the minimum motif width to 5, and the maximum motif width to 50 [42].
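For readers who prefer a scripted run, the parameters listed above could be passed to a locally installed MEME roughly as sketched below. This is an assumption-laden illustration: the analysis in this study was run through the web server, the file and directory names are hypothetical, and the flags are those of current MEME Suite command-line releases, which may differ from version 3.5.4. No E-value threshold flag is passed, on the assumption that the default leaves the threshold unlimited.

```python
import shlex

def meme_command(fasta_path, out_dir, nmotifs=13, minw=5, maxw=50):
    """Build an argument list for the MEME command-line program."""
    return [
        "meme", fasta_path,
        "-dna",                    # nucleotide alphabet
        "-mod", "zoops",           # zero or one occurrence per sequence
        "-nmotifs", str(nmotifs),  # maximum number of motifs to report
        "-minw", str(minw),        # minimum motif width
        "-maxw", str(maxw),        # maximum motif width
        "-oc", out_dir,            # output directory, overwritten if present
    ]

# Hypothetical input and output names.
cmd = meme_command("bldD_upstream.fasta", "meme_out")
print(shlex.join(cmd))
# To execute with a local MEME Suite installation:
#   import subprocess; subprocess.run(cmd, check=True)
```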

Determinations of Transcription Factors Binding Sites (TFBS) 

Sequence motifs are important tools in molecular biology and can describe and identify features in DNA, RNA, and protein sequences such as transcription factor binding sites, splice sites, and protein-protein interaction sites [43]. Several algorithms have been developed to discover motifs, as well as algorithms for searching databases for matches to a given motif or motifs. Gapped Local Alignment of Motifs (GLAM2) is a specialized algorithm for DNA motif discovery and is also important for identifying functional site motifs. The bldD gene promoter sequences of the thirteen antibiotic-producing Streptomyces species were therefore submitted to GLAM2, and the GLAM2 HTML output was searched for transcription factor binding sites (TFBS) [43].

Analysis of CpG Islands 

CpG islands in the genes were determined using two algorithms. The first was the offline tool CLC Genomics Workbench version 8.5 (https://clc-genomics workbench.software.informer.com/8.5/), which was used to search for cutting sites of the restriction enzyme MspI (with fragment sizes between 40 and 220 bp). The second was the algorithm of Kuo et al. (http://dbcat.cgm.ntu.edu.tw/), with search criteria of a GC content greater than or equal to 55 percent and an observed CpG/expected CpG ratio of at least 0.65 [44].
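The two search criteria above can be made concrete with a short Python sketch that computes the GC fraction and the observed/expected CpG ratio of a sequence, taking the expected CpG count as (number of C × number of G) / length. It is a simplified illustration and not the implementation used by CLC Genomics Workbench or by Kuo et al.; in particular, it scores a whole sequence rather than sliding a window, and treating 0.65 as a minimum is an assumption.

```python
def cpg_stats(seq):
    """Return (gc_fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")  # observed CpG dinucleotides
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: (C * G) / N
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp

def meets_criteria(seq, min_gc=0.55, min_obs_exp=0.65):
    """True if the sequence passes both thresholds."""
    gc, ratio = cpg_stats(seq)
    return gc >= min_gc and ratio >= min_obs_exp

print(cpg_stats("CGCGCGCGAT"))
print(meets_criteria("GGGGCCCC"))
```

The second call shows why both thresholds matter: a sequence can be entirely G and C yet contain no CpG dinucleotide at all.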

Phylogenetic Tree Construction and Analysis 

In addition to the 13 bldD genes of the 13 antibiotic-producing Streptomyces species considered in this study, 20 other related Streptomyces sequences were collected on the basis of the lowest E-values, and all sequences were aligned using the MUSCLE multiple alignment tool. The phylogenetic tree was constructed from the aligned prokaryotic sequences by the UPGMA method in MEGA 6.0 [45,46], and the phylogenetic relationships of the Streptomyces species containing the bldD gene were inferred from it. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (100 replicates) is shown next to the branches. The evolutionary distances were computed using the Maximum Composite Likelihood method and are in units of the number of base substitutions per site. The analysis involved 34 nucleotide sequences, including the random anchor (a stretch of 3108 nucleotides), Kitasatospora setae strain KM-6054 23S. All positions containing gaps and missing data were eliminated (complete deletion option).
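The complete deletion option mentioned above removes every alignment column in which at least one sequence has a gap or missing character. The sketch below illustrates that step on a toy alignment and then computes a simple p-distance on the cleaned columns. The p-distance is used here only as an easy-to-follow stand-in; it is not the Maximum Composite Likelihood distance computed in MEGA, and the sequence names and the set of missing-data symbols are assumptions.

```python
def complete_deletion(alignment, missing="-?N"):
    """Drop every column in which any sequence has a gap or missing symbol."""
    names = list(alignment)
    rows = [alignment[name].upper() for name in names]
    if len({len(r) for r in rows}) > 1:
        raise ValueError("aligned sequences must have equal length")
    keep = [i for i in range(len(rows[0]))
            if all(r[i] not in missing for r in rows)]
    return {name: "".join(r[i] for i in keep)
            for name, r in zip(names, rows)}

def p_distance(a, b):
    """Proportion of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Toy alignment (hypothetical sequences, not data from the study).
toy_alignment = {"sp1": "ACG-TAC", "sp2": "ACGTTAN", "sp3": "ATGTTAC"}
clean = complete_deletion(toy_alignment)
print(clean)
print(p_distance(clean["sp1"], clean["sp3"]))
```

Because a single gapped sequence removes the column for all taxa, complete deletion can discard a large share of sites when many sequences are aligned.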

Availability of Data and Materials

The datasets generated and/or analysed during the current study are available in the NCBI repository, a member of the International Nucleotide Sequence Database Collaboration (INSDC).

https://www.ncbi.nlm.nih.gov/gene/?term=bldD+transcriptional+regulator+BldD. 

The anchor dataset generated and/or analysed during the current study is available in the NCBI repository, a member of the International Nucleotide Sequence Database Collaboration (INSDC), under accession number NC_016109.1.

 

Abbreviations

bldD

Bald Gene

BPROM

Bacterial Promoter 

CpG

Cytosine-Phosphate-Guanine

CSRs

Cluster-Situated Regulators

GLAM2

Gapped Local Alignment of Motifs 2

NCBI

National Center for Biotechnology Information

NNPP

Neural Network Promoter Prediction

TFBS

Transcription Factor Binding Site

TF

Transcription Factor

TSS

Transcription Start Site

Declarations

Acknowledgement 

The authors would like to acknowledge Adama Science and Technology University, Applied Biology Department. 

Funding 

Not applicable

Affiliations

Department of Applied Biology, Institute of Pharmaceutical Science, Adama Science and Technology University, Adama, Ethiopia

Sisay Demisie & Ketema Tafess

Contributions

SD and KT conceived and designed the research, and KT supervised the work. SD drafted the manuscript and carried out the computational study. Both authors performed the computational data analysis, revised the manuscript, and read and approved the final version.

Corresponding author

Correspondence to Sisay Demisie

Ethics declarations 

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

References

  1. S. J. Dancer, “How antibiotics can make us sick: The less obvious adverse effects of antimicrobial chemotherapy,” Lancet Infect. Dis., vol. 4, no. 10, pp. 611–619, 2004, doi: 10.1016/S1473-3099(04)01145-4.
  2. H. S. Chaudhary, B. Soni, A. R. Shrivastava, and S. Shrivastava, “Diversity and versatility of actinomycetes and its role in antibiotic production,” J. Appl. Pharm. Sci., vol. 3, no. 8 SUPPL, 2013, doi: 10.7324/JAPS.2013.38.S14.
  3. S. Ramesh and N. Mathivanan, “Screening of marine actinomycetes isolated from the Bay of Bengal, India for antimicrobial activity and industrial enzymes,” World J. Microbiol. Biotechnol., vol. 25, no. 12, pp. 2103–2111, 2009, doi: 10.1007/s11274-009-0113-4.
  4. P. R. Jensen, P. G. Williams, D. C. Oh, L. Zeigler, and W. Fenical, “Species-specific secondary metabolite production in marine actinomycetes of the genus Salinispora,” Appl. Environ. Microbiol., vol. 73, no. 4, pp. 1146–1152, 2007, doi: 10.1128/AEM.01891-06.
  5. A. J. Alanis, “Resistance to antibiotics: Are we in the post-antibiotic era?,” Arch. Med. Res., vol. 36, no. 6, pp. 697–705, 2005, doi: 10.1016/j.arcmed.2005.06.009.
  6. D. I. Kurtböke, “Biodiscovery from rare actinomycetes: An eco-taxonomical perspective,” Appl. Microbiol. Biotechnol., vol. 93, no. 5, pp. 1843–1852, 2012, doi: 10.1007/s00253-012-3898-2.
  7. E. A. Barka et al., “Correction for Barka et al., Taxonomy, Physiology, and Natural Products of Actinobacteria,” Microbiol. Mol. Biol. Rev., vol. 80, no. 4, pp. 1–43, 2016, doi: 10.1128/mmbr.00044-16.
  8. J. R. McCormick and K. Flärdh, “Signals and regulators that govern Streptomyces development,” FEMS Microbiol. Rev., vol. 36, no. 1, pp. 206–231, 2012, doi: 10.1111/j.1574-6976.2011.00317.x.
  9. M. A. Schumacher, W. Zeng, K. C. Findlay, M. J. Buttner, R. G. Brennan, and N. Tschowri, “The Streptomyces master regulator BldD binds c-di-GMP sequentially to create a functional BldD2-(c-di-GMP)4 complex,” Nucleic Acids Res., vol. 45, no. 11, pp. 6923–6933, 2017, doi: 10.1093/nar/gkx287.
  10. C. D. den Hengst, N. T. Tran, M. J. Bibb, G. Chandra, B. K. Leskiw, and M. J. Buttner, “Genes essential for morphological development and antibiotic production in Streptomyces coelicolor are targets of BldD during vegetative growth,” Mol. Microbiol., vol. 78, no. 2, pp. 361–379, 2010, doi: 10.1111/j.1365-2958.2010.07338.x.
  11. G. Liu, K. F. Chater, G. Chandra, G. Niu, and H. Tan, “Molecular Regulation of Antibiotic Biosynthesis in Streptomyces,” Microbiol. Mol. Biol. Rev., vol. 77, no. 1, pp. 112–143, 2013, doi: 10.1128/mmbr.00054-12.
  12. M. J. Choudoir, C. Pepe-Ranney, and D. H. Buckley, “Diversification of secondary metabolite biosynthetic gene clusters coincides with lineage divergence in Streptomyces,” Antibiotics, vol. 7, no. 1, pp. 1–15, 2018, doi: 10.3390/antibiotics7010012.
  13. L. Liu et al., “AveI, an AtrA homolog of Streptomyces avermitilis, controls avermectin and oligomycin production, melanogenesis, and morphological differentiation,” Appl. Microbiol. Biotechnol., vol. 103, no. 20, pp. 8459–8472, 2019, doi: 10.1007/s00253-019-10062-3.
  14. G. P. Van Wezel and K. J. McDowall, “The regulation of the secondary metabolism of Streptomyces: New links and experimental advances,” Nat. Prod. Rep., vol. 28, no. 7, pp. 1311–1333, 2011, doi: 10.1039/c1np00003a.
  15. Y. Yan, N. Liu, and Y. Tang, “Recent developments in self-resistance gene directed natural product discovery,” Nat. Prod. Rep., vol. 37, no. 7, pp. 879–892, 2020, doi: 10.1039/c9np00050j.
  16. N. Ziemert, M. Alanjary, and T. Weber, “The evolution of genome mining in microbes-a review,” Nat. Prod. Rep., vol. 33, no. 8, pp. 988–1005, 2016, doi: 10.1039/c6np00025h.
  17. V. V. Solovyev and I. A. Shahmuradov, “PromH: Promoters identification using orthologous genomic sequences,” Nucleic Acids Res., vol. 31, no. 13, pp. 3540–3545, 2003, doi: 10.1093/nar/gkg525.
  18. J. Wang, L. H. Ungar, H. Tseng, and S. Hannenhalli, “MetaProm: A neural network based meta-predictor for alternative human promoter prediction,” BMC Genomics, vol. 8, pp. 1–13, 2007, doi: 10.1186/1471-2164-8-374.
  19. V. Solovyev and A. Salamov, “The Gene-Finder computer tools for analysis of human and model organisms genome sequences.,” Proc. Int. Conf. Intell. Syst. Mol. Biol., vol. 5, pp. 294–302, 1997.
  20. M. Vahed, M. Vahed, and L. X. Garmire, “BML: a versatile web server for bipartite motif discovery,” Brief. Bioinform., vol. 23, no. 1, pp. 1–11, 2022, doi: 10.1093/bib/bbab536.
  21. M. A. Schumacher et al., “The MerR-like protein BldC binds DNA direct repeats as cooperative multimers to regulate Streptomyces development,” Nat. Commun., vol. 9, no. 1, pp. 1–12, 2018, doi: 10.1038/s41467-018-03576-3.
  22. H. Yan et al., “BldD, a master developmental repressor, activates antibiotic production in two Streptomyces species,” Mol. Microbiol., vol. 113, no. 1, pp. 123–142, 2020, doi: 10.1111/mmi.14405.
  23. Y. Lee et al., “Genome-scale analysis of genetic regulatory elements in Streptomyces avermitilis MA-4680 using transcript boundary information,” BMC Genomics, vol. 23, no. 1, pp. 1–16, 2022, doi: 10.1186/s12864-022-08314-0.
  24. Y. Jeong et al., “The dynamic transcriptional and translational landscape of the model antibiotic producer Streptomyces coelicolor A3(2),” Nat. Commun., vol. 7, pp. 1–11, 2016, doi: 10.1038/ncomms11605.
  25. D. Sun, C. Liu, J. Zhu, and W. Liu, “Connecting metabolic pathways: Sigma factors in Streptomyces spp.,” Front. Microbiol., vol. 8, no. DEC, pp. 1–7, 2017, doi: 10.3389/fmicb.2017.02546.
  26. A. Romero-Rodríguez, I. Robledo-Casados, and S. Sánchez, “An overview on transcriptional regulators in Streptomyces,” Biochim. Biophys. Acta - Gene Regul. Mech., vol. 1849, no. 8, pp. 1017–1039, 2015, doi: 10.1016/j.bbagrm.2015.06.007.
  27. S. D. Minchin and S. J. W. Busby, “Transcription Factors,” Brenner’s Encycl. Genet. Second Ed., vol. 1, pp. 93–96, 2013, doi: 10.1016/B978-0-12-374984-0.01552-7.
  28. D. Sun, J. Zhu, Z. Chen, J. Li, and Y. Wen, “SAV742, a Novel AraC-Family Regulator from Streptomyces avermitilis, Controls Avermectin Biosynthesis, Cell Growth and Development,” Sci. Rep., vol. 6, no. October, 2016, doi: 10.1038/srep36915.
  29. H. Fang et al., “BfvR, an AraC-Family Regulator, Controls Biofilm Formation and pH6 Antigen Production in Opposite Ways in Yersinia pestis Biovar Microtus,” Front. Cell. Infect. Microbiol., vol. 8, no. October, pp. 1–11, 2018, doi: 10.3389/fcimb.2018.00347.
  30. X. Zhang, S. N. Andres, and M. A. Elliot, “Interplay between Nucleoid-Associated Proteins and Transcription Factors in Controlling Specialized Metabolism in Streptomyces,” MBio, vol. 12, no. 4, 2021, doi: 10.1128/mBio.01077-21.
  31. B. R. G. Gordon et al., “Lsr2 is a nucleoid-associated protein that targets AT-rich sequences and virulence genes in Mycobacterium tuberculosis,” Proc. Natl. Acad. Sci. U. S. A., vol. 107, no. 11, pp. 5154–5159, 2010, doi: 10.1073/pnas.0913551107.
  32. E. J. Gehrke et al., “Silencing cryptic specialized metabolism in Streptomyces by the nucleoid-associated protein Lsr2,” Elife, vol. 8, pp. 1–28, 2019, doi: 10.7554/eLife.47691.001.
  33. J. M. Chen et al., “Lsr2 of Mycobacterium tuberculosis is a DNA-bridging protein,” Nucleic Acids Res., vol. 36, no. 7, pp. 2123–2135, 2008, doi: 10.1093/nar/gkm1162.
  34. J. Lalić et al., “Disruption of macrodomain protein SCO6735 increases antibiotic production in Streptomyces coelicolor,” J. Biol. Chem., vol. 291, no. 44, pp. 23175–23187, 2016, doi: 10.1074/jbc.M116.721894.
  35. N. P. Blackledge and R. J. Klose, “CpG island chromatin: A platform for gene regulation,” Epigenetics, vol. 6, no. 2, pp. 147–152, 2011, doi: 10.4161/epi.6.2.13640.
  36. Y. Yin et al., “Impact of cytosine methylation on DNA binding specificities of human transcription factors,” Science, vol. 356, no. 6337, 2017, doi: 10.1126/science.aaj2239.
  37. M. Henrique and A. Cassiano, “Benchmarking Bacterial Promoter Potentialities and Limitations,” vol. 5, no. 4, 2020.
  38. “Benchmarking available bacterial promoter prediction tools: potentialities and limitations,” pp. 1–24, 2020.
  39. V. Solovyev and A. Salamov, “Automatic annotation of microbial genomes and metagenomic sequences. In metagenomics and its applications in agriculture, biomedicine and environmental studies (Li RE, ed).,” Nov. Sci. Publ., no. January, pp. 61–78, 2011.
  40. T. L. Bailey, J. Johnson, C. E. Grant, and W. S. Noble, “The MEME Suite,” Nucleic Acids Res., vol. 43, no. W1, pp. W39–W49, 2015, doi: 10.1093/nar/gkv416.
  41. T. L. Bailey et al., “MEME Suite: Tools for motif discovery and searching,” Nucleic Acids Res., vol. 37, no. SUPPL. 2, pp. 202–208, 2009, doi: 10.1093/nar/gkp335.
  42. C. E. Grant and T. L. Bailey, “XSTREME: Comprehensive motif analysis of biological sequence datasets,” bioRxiv, p. 2021.09.02.458722, 2021, [Online]. Available: https://www.biorxiv.org/content/10.1101/2021.09.02.458722v1.
  43. M. C. Frith, N. F. W. Saunders, B. Kobe, and T. L. Bailey, “Discovering sequence motifs with arbitrary insertions and deletions,” PLoS Comput. Biol., vol. 4, no. 5, 2008, doi: 10.1371/journal.pcbi.1000071.
  44. H. C. Kuo et al., “DBCAT: Database of CpG islands and analytical tools for identifying comprehensive methylation profiles in cancer cells,” J. Comput. Biol., vol. 18, no. 8, pp. 1013–1017, 2011, doi: 10.1089/cmb.2010.0038.
  45. S. Kumar, G. Stecher, and K. Tamura, “MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets,” Mol. Biol. Evol., vol. 33, no. 7, pp. 1870–1874, 2016, doi: 10.1093/molbev/msw054.
  46. L. Newman, A. L. J. Duffus, and C. Lee, “Using the free program MEGA to build phylogenetic trees from molecular data,” Am. Biol. Teach., vol. 78, no. 7, pp. 608–612, 2016, doi: 10.1525/abt.2016.78.7.608.