Escherichia coli Strains, Plasmids and Media
All E. coli strains and media used in this study are presented in Table 2. Escherichia coli BL21(DE3) was used for the expression of the recombinant FDH from Thiobacillus sp. KNK65MA (TsFDH). Two FDH knockout strains, JW3866 and JW4040, were purchased from the Keio Collection.
Strains and plasmids
E. coli BL21(DE3)
E. coli JW3866
E. coli JW4040
[lon] ompT gal (λ DE3) [dcm] ∆hsdS
ApR, T7 promoter, lac operator
pET-21a, containing the TsFDH gene from Thiobacillus sp. KNK65MA
M9 medium with glycerol as the carbon source and LB medium as a complex medium, both containing 30 µg kanamycin, were used for measuring bacterial growth. For BL21 carrying pET+TsFDH, the same media with ampicillin were used. In all samples containing pET+TsFDH, IPTG (0.5 mM final concentration) was added to the medium.
The metabolic reactions consuming or producing formate (map01200 and C00058) were obtained from KEGG (20) (https://www.genome.jp/pathway/map01200+C00058). Using the results of the KEGG pathway search for all carbon fixation reactions, the contributing FDHs were identified. The kinetic parameters of FDHs (EC: 18.104.22.168) relevant to efficient formate formation, including kcat and Km, were obtained from the BRENDA enzyme database (21), and the published articles were reviewed and compared across different bacteria (Online Resource 1). This approach revealed several FDHs with comparatively favorable kinetic parameters. Although the results obtained with TsFDH may be quite satisfactory, we assume that other FDHs still deserve attention as replacements for the indigenous FDHs of E. coli to improve growth efficiency. This assumption is based on the ambiguity of the assay conditions for some of the reported FDHs and the lack of a gold standard for comparing kinetics. Scanning the kinetic parameters for a desirable FDH pointed to the FDH of Thiobacillus sp. KNK65MA (7).
The amino acid and nucleotide sequences of Thiobacillus sp. KNK65MA formate dehydrogenase were obtained from UniProt (accession # Q76EB7). The cDNA of TsFDH was synthesized in pET21a by ZistEghtesadMad based on the reference sequence (Q76EB7). Two knockout strains of Escherichia coli K12, JW4040 and JW3866, carrying deletions of the fdhF and fdhD genes, respectively, were purchased from Dharmacon. The stocks of the knockout E. coli strains were cultured in LB broth and M9+Glycerol media, followed by incubation at 37°C for 24 hours (22).
The expression strain BL21 was also used as a control to compare the growth rates. All strains were cultured at the same time and under the same conditions in LB broth at 37°C and 200 rpm. Competent cells of BL21, E. coli JW4040, and E. coli JW3866 were prepared as previously described (22). pET21, a plasmid containing a fusion gene to express the formate dehydrogenase of Thiobacillus sp. KNK65MA (pET+TsFDH), was transformed into competent BL21, E. coli JW4040, and E. coli JW3866 cells, which were plated on LB agar with Amp (100 mg / ml) and incubated overnight at 37°C. Colonies containing the plasmid were selected and grown in 10 ml of LB broth with Amp as a primary culture at 37°C and 200 rpm for 24 hours. The culture was then scaled up to 200 ml of LB broth containing Amp (100 mg / ml) and incubated at 37°C and 200 rpm for 24 hours. In parallel, BL21, E. coli JW4040, and E. coli JW3866 lacking the plasmid were cultivated in LB broth or M9-Glycerol + 50 µg/ml kanamycin under conditions identical to those of the plasmid-containing samples.
Media and culture conditions
M9 medium + glycerol containing 30 µg kanamycin was used for measuring bacterial growth. For BL21 with pET+TsFDH, the same medium with ampicillin was used. In all samples containing pET+TsFDH, IPTG (0.5 mM final concentration) was added to the medium.
Bacteria were grown in batch cultures at 37°C in a shaker incubator in 50 mL flasks. Plasmid-free and plasmid-containing bacteria were sampled at 2 h, 4 h, 6 h, 8 h, 10 h, 12 h and 24 h of incubation. At each time point, 1000 µL samples were taken in triplicate and the absorbance was measured at 600 nm with a spectrophotometer. Cell growth and growth rates were determined from these optical absorption readings.
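The text does not state how growth rates were derived from the absorbance readings. As a minimal sketch, assuming the common definition of the specific growth rate between two consecutive time points, µ = (ln OD2 − ln OD1) / (t2 − t1), and averaging the triplicate readings first, the calculation could look as follows. The function name and the OD600 values are hypothetical and serve only as an illustration.

```python
import math
from statistics import mean

def specific_growth_rates(times_h, od_triplicates):
    """Average triplicate OD600 readings and compute the specific growth
    rate (per hour) for each consecutive sampling interval.
    Assumption: mu = (ln OD2 - ln OD1) / (t2 - t1)."""
    mean_od = [mean(readings) for readings in od_triplicates]
    rates = []
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        rates.append((math.log(mean_od[i]) - math.log(mean_od[i - 1])) / dt)
    return mean_od, rates

# Made-up readings: the culture doubles every 2 hours.
times_h = [2, 4, 6]
od600 = [(0.10, 0.10, 0.10), (0.20, 0.20, 0.20), (0.40, 0.40, 0.40)]
mean_od, rates = specific_growth_rates(times_h, od600)
doubling_time_h = math.log(2) / rates[0]
```

Interval-wise rates are only meaningful during exponential growth; for the long 12 h to 24 h interval the value mostly reflects the approach to stationary phase.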
In silico analysis
To gain deeper insight into our observations, we applied correlation analysis, PCA and linear model analysis. We looked for an E. coli RNA dataset that reflects the widest possible transcriptional variation, so that significant correlations between genes could be calculated. The number of genes covered by the expression profile was also important, in order to calculate as many correlations as possible. With this aim, we retrieved an E. coli RNA-seq dataset comprising 152 RNA-seq count samples under 34 different growth conditions (GEO accession GSE94117). These samples were taken from both exponential and stationary phases. A distinctive aspect of this dataset is that it was sampled under 34 different growth conditions, which leads to a wider range of differentially expressed genes because of the different metabolic needs (23).
Using Python version 3.6.1, the RNA-seq count files of the 152 samples were merged. The counts were converted into counts per million (CPM) and log2 transformed. The resulting data were z-score transformed per gene across all samples. Quality control was performed with sample-level box plots before and after data preprocessing (Additional file 1: Fig. S3 and S4).
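The preprocessing steps described above (CPM scaling, log2 transformation, per-gene z-scoring) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' script: the pseudocount of 1 added before the log2 transform, the use of the population standard deviation, the function names, and the toy counts are all assumptions introduced here.

```python
import math
from statistics import mean, pstdev

def cpm(counts_by_sample):
    """Scale each sample's raw counts to counts per million (CPM)."""
    scaled = {}
    for sample, counts in counts_by_sample.items():
        total = sum(counts.values())
        scaled[sample] = {gene: c / total * 1e6 for gene, c in counts.items()}
    return scaled

def log2_zscore(cpm_by_sample, pseudocount=1.0):
    """log2-transform CPM values, then z-score each gene across samples.
    Assumption: a pseudocount avoids log2(0); genes with zero variance
    are set to 0 instead of dividing by zero."""
    samples = sorted(cpm_by_sample)
    genes = sorted(cpm_by_sample[samples[0]])
    z = {}
    for gene in genes:
        logged = [math.log2(cpm_by_sample[s][gene] + pseudocount) for s in samples]
        mu, sd = mean(logged), pstdev(logged)
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in logged]
    return samples, z

# Toy merged count table: three samples, three genes (made-up numbers).
counts = {
    "s1": {"fdhF": 10, "fdhD": 90, "gapA": 900},
    "s2": {"fdhF": 50, "fdhD": 50, "gapA": 900},
    "s3": {"fdhF": 90, "fdhD": 10, "gapA": 900},
}
cpm_values = cpm(counts)
samples, zscores = log2_zscore(cpm_values)
```

After this step every gene has mean 0 and unit variance across samples, which puts genes with very different absolute expression levels on a comparable scale for PCA and plotting.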
The Spearman rank-order correlation coefficient is a nonparametric measure of the monotonic relationship between the ranks of two variables. Unlike the Pearson correlation, Spearman's rank correlation does not assume that the variables are normally distributed. The coefficient ranges from -1 to +1, with 0 indicating no correlation and -1 or +1 indicating a perfect monotonic relationship. Using the spearmanr function from the scipy.stats sub-package (24), the correlations between each of the two knocked-out genes and the rest of the genes were calculated. For fdhD and fdhF, the negatively correlated genes with p-values and FDRs below 0.01 and "Spearman's ρ" < -0.2 were selected. Among these anti-correlated genes, the top 20 were chosen for further analysis. Since the top anti-correlated genes already had p-values and FDRs below the specified threshold, there was no need to identify differentially expressed genes in advance. In other words, significant gene expression co-variation is reflected in the significant Spearman correlation coefficients.
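To make the selection rule concrete, the sketch below computes Spearman's ρ as the Pearson correlation of average ranks and keeps the most anti-correlated genes below the ρ cutoff. It is a dependency-free illustration with hypothetical function names and made-up expression values. It deliberately omits the p-value and FDR filters described above; scipy.stats.spearmanr returns the same coefficient together with a p-value, and the FDR procedure is not specified in the text.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant profile
    return cov / (var_x * var_y) ** 0.5

def top_anticorrelated(target, expression, rho_cutoff=-0.2, top_n=20):
    """Genes with rho below the cutoff, most negative first (no p-value/FDR filter)."""
    hits = []
    for gene, values in expression.items():
        if gene == target:
            continue
        rho = spearman_rho(expression[target], values)
        if rho < rho_cutoff:
            hits.append((gene, rho))
    return sorted(hits, key=lambda h: h[1])[:top_n]

# Made-up expression profiles across five samples.
expression = {
    "fdhF":  [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneA": [10.0, 8.0, 6.5, 3.0, 1.0],  # strictly decreasing
    "geneB": [0.1, 0.4, 0.5, 2.0, 9.0],   # strictly increasing
    "geneD": [3.0, 5.0, 1.0, 4.0, 2.0],   # weakly anti-correlated
}
hits = top_anticorrelated("fdhF", expression)
```

Because only ranks enter the calculation, the result is unchanged by any monotonic transformation of the expression values, such as the log2 transform.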
PCA, a dimensionality reduction technique, was used to compare the dispersion of the bacterial gene expression profiles based on the diversity of the expression levels of their genes. For this purpose, the pca function from the mixOmics R package was used (25).
Linear regression analysis
Scatter plots of the top 40 anti-correlated genes were generated against each of the knocked-out genes. Linear regression models were fit to the data points to show the overall trendlines (26).
The BioCyc (27) database of microbial genomes and metabolic pathways was used to find the pathways to which each of the anti-correlated genes is attributed.
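A trendline of this kind can be obtained with ordinary least squares. The following minimal sketch assumes a simple one-predictor model, y = slope · x + intercept, with the knocked-out gene's z-scored expression on the x-axis; the text does not name the fitting routine actually used, and the function name and data points are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up z-scored expression: knocked-out gene (x) vs. an anti-correlated gene (y).
fdh_expr = [-1.0, 0.0, 1.0, 2.0]
partner_expr = [2.1, 0.9, -0.1, -0.9]
slope, intercept = fit_line(fdh_expr, partner_expr)
```

A negative slope corresponds to the anti-correlation seen in the scatter plot. Note that the fitted line summarizes a linear trend, whereas the Spearman coefficient only requires a monotonic one.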