A Rational Design of Pseudomonas putida KT2440 capable of Anaerobic Respiration

Pseudomonas putida KT2440 is a metabolically versatile, HV1-certified, genetically accessible, and thus interesting microbial chassis for biotechnological applications. However, its obligate aerobic nature hampers production of oxygen-sensitive products and drives up costs in large-scale fermentation. The inability to perform anaerobic fermentation has been attributed to insufficient ATP production and an inability to produce pyrimidines under these conditions. Addressing these bottlenecks enabled growth under micro-oxic conditions, but did not lead to growth or survival under anoxic conditions. Here, a data-driven approach was used to develop a rational design for a P. putida KT2440 derivative strain capable of anaerobic respiration. To arrive at the design, data derived from a genome comparison of 1628 Pseudomonas strains were combined with genome-scale metabolic modelling simulations and a transcriptome dataset of 47 samples representing 14 environmental conditions from the facultative anaerobe Pseudomonas aeruginosa. The results indicate that the implementation of anaerobic respiration in P. putida KT2440 would require at least 61 additional genes of known function, at least 8 genes encoding proteins of unknown function, and 3 externally added vitamins.


Introduction
Pseudomonas putida KT2440 is an HV1-certified [1], genetically accessible [2,3,4,5,6,7] and metabolically versatile [8,9] species, which makes it an interesting, adaptable industrial workhorse [10,11,12]. However, its strictly aerobic lifestyle is an industrial disadvantage [4,13,14,15,16]: the strict requirement for dissolved O2 increases the costs of large-scale cultivation and may lead to unstable production rates when oxygen fluctuations cause inadequate local oxygen supply. Its strictly aerobic nature also excludes production of O2-sensitive enzymes, pathway intermediates or target products.
Most Pseudomonas species are facultative anaerobes and can use an inorganic compound such as nitrate as alternative electron acceptor. This includes species that are closely related to the P. putida KT2440 strain, such as P. fluorescens and P. denitrificans. Only one Pseudomonas species is capable of anaerobic fermentation: Pseudomonas aeruginosa [17,18,19,20]. P. aeruginosa is capable of arginine fermentation and pyruvate fermentation, although the latter only leads to prolonged survival under anoxic conditions, not to growth [18,19,20].
The relatively short evolutionary distance between P. putida KT2440 and facultative anaerobic Pseudomonas species suggests that, through the implementation of a rational engineering cycle, this strain can be adapted to a facultative anaerobic lifestyle. A Design, Build, Test, Learn engineering cycle [21] was performed in earlier work [22] in an attempt to obtain a P. putida KT2440 strain capable of anaerobic fermentation. Using the genome-scale metabolic models (GSMs) iJP962 and iJP746 combined with a protein domain comparison (PDC) between six aerobic Pseudomonas putida strains including KT2440 and six facultative anaerobic Pseudomonas strains, three key enzymes were selected and included in the final design: acetate kinase (encoded by ackA), dihydroorotate dehydrogenase (pyrK-pyrDB) and ribonucleotide triphosphate reductase class III (nrdD-nrdG). This design was built and the resulting recombinant strain showed growth under micro-oxic conditions [22]. Earlier work already described an increase in survival rates upon introduction of acetate kinase alone [4,14], and since the model predictions used in the design only considered fully anoxic conditions, survival rates of the recombinant strains under anoxic conditions need to be tested.
Here, we (i) determined the survival rates of the recombinant strains under anoxic conditions, (ii) identified limitations for anaerobic growth through respiration, and (iii) composed a new design for a recombinant P. putida KT2440 capable of anaerobic respiration. In pursuit of this goal we expanded upon earlier work, using the current wealth of genome data available on P. putida and other Pseudomonas species by including 1628 strains in an extensive comparison of the protein domain content [23]. Random forest, a machine learning method, was used to identify key protein domains associated with "anaerobic growth". Transcriptome data of Pseudomonas aeruginosa type strain PA14 cultures grown in 14 different conditions [24] were also taken into account and integrated with previous and newly obtained GSM simulation results to arrive at a final design.

Anoxic survival experiment
Oxygen gradients served to allow the recombinant strains to grow in micro-oxic conditions as described in [22]. Anoxic cultivation of P. putida KT2440 recombinants, unpassed or passed over oxygen gradients, was performed at 30 °C in 50 ml glass vials with 20 mm aluminium crimp caps and rubber stoppers (Glasgerätebau Ochs Laborfachhandel e.K.) containing 30 ml DeBont GA with 1 mg/l resazurin and 50 µg/ml kanamycin as selection marker for recombinant strains. Where indicated, a 1000x diluted vitamin mix was added (0.02 g/l biotin, 0.2 g/l nicotinamide, 0.1 g/l p-aminobenzoic acid, 0.2 g/l thiamin, 0.1 g/l pantothenic acid, 0.5 g/l pyridoxamine, 0.1 g/l cyanocobalamin, 0.1 g/l riboflavin). Before inoculation, the vials were gas exchanged with CO2/N2. Inoculation was done with an aerobically pre-cultured bacterial sample at an OD600 of 0.05. Approximately 8 h after inoculation, the resazurin became completely colourless, indicating full anaerobic conditions. Samples were taken using sterile CO2-flushed 1.5" needles (BD Microlance) and 3-5 ml syringes (ThermoFisher) to avoid O2 exposure. Anoxic conditions were confirmed, as the resazurin turned from colourless to bright pink within seconds in extracted samples. Survival rates were analysed by colony forming unit (CFU) determination. A dilution series was made and five drops of 10 µl per dilution were applied onto LB-agar plates without selection marker, which were incubated overnight at 30 °C. Colonies were counted manually, and photos were taken of the plates. Gram-staining was performed to ensure culture purity, according to the manufacturer's instructions (Gram-staining kit, Macherey-Nagel, Germany).
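As an illustration of how drop-plate counts translate into a viable cell density, the following minimal Python sketch converts the colonies counted in replicate 10 µl drops of one dilution into CFU/ml. The function name, argument names and example counts are hypothetical and are not taken from the study; only the drop volume and the number of drops follow the description above.

```python
from statistics import mean


def cfu_per_ml(drop_counts, dilution_factor, drop_volume_ul=10.0):
    """Estimate CFU/ml from colony counts of replicate drops.

    drop_counts: colonies counted in each drop of one dilution
    dilution_factor: total fold-dilution of the plated sample (e.g. 1e4)
    drop_volume_ul: volume of one drop in microlitres (10 µl in the text)
    """
    if not drop_counts:
        raise ValueError("at least one drop count is required")
    mean_colonies = mean(drop_counts)
    # colonies per µl of diluted sample -> per ml (x1000) -> undiluted (x dilution)
    return mean_colonies / drop_volume_ul * 1000.0 * dilution_factor


# Hypothetical counts for five drops of a 10^4-fold dilution
example = cfu_per_ml([12, 9, 11, 10, 8], dilution_factor=1e4)
print(f"{example:.2e} CFU/ml")
```

Averaging over the five drops before scaling keeps a single aberrant drop from dominating the estimate; in practice only dilutions with a countable number of colonies per drop would be used.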

Statistical analysis
Experiments were independently repeated six times with biological triplicates in each separate experiment. Figures represent the mean values of corresponding biological triplicates and the standard deviation. The level of significance of the differences when comparing results was evaluated by means of analysis of variance (ANOVA), with α=0.05.

Genome annotation
Information on the oxygen requirements of 16989 Pseudomonas strains was obtained from the GOLD database [26]. For each species, an extensive literature search was performed to validate its oxygen requirement (Data S5). Genomes of 1628 facultative anaerobic and strict aerobic strains from the Pseudomonas genus were obtained from the European Nucleotide Archive repository in March 2015 [27]. All genomes were annotated de novo in SAPP [28] using Prodigal version 2.6 [29] for gene prediction and InterProScan version 5.4-47.0 [30] for functional annotation using Pfam [31].

Comparisons of protein domain content
The positions (start and end on the protein sequence) of the protein domains, and their order in a protein when multiple domains were present, were used to identify domain architectures (i.e. combinations of protein domains). Protein domain architectures were labeled by the ordered list of Pfam identifiers as described in [32]. The protein domain architectures identified in each genome sequence were stored in a matrix. From this, a binarized domain architecture presence-absence matrix was extracted and used as input for principal component analysis using the standard R function prcomp and for hierarchical clustering using the standard R function hclust.
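The construction of architecture labels and of the binarized presence-absence matrix can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the SAPP or R code used in the study: the function names, the "-" separator in the labels and the toy Pfam identifiers are all hypothetical.

```python
def architecture_label(domains):
    """Label a protein by its Pfam domains ordered by start position.

    domains: list of (pfam_id, start, end) tuples for one protein.
    """
    ordered = sorted(domains, key=lambda d: (d[1], d[2]))
    return "-".join(pfam_id for pfam_id, _, _ in ordered)


def presence_absence(genomes):
    """Build a binarized architecture presence-absence matrix.

    genomes: dict mapping genome name -> list of proteins,
             each protein being a list of (pfam_id, start, end).
    Returns (sorted architecture labels, dict genome -> list of 0/1).
    """
    per_genome = {
        name: {architecture_label(p) for p in proteins if p}
        for name, proteins in genomes.items()
    }
    columns = sorted(set().union(*per_genome.values()))
    matrix = {
        name: [1 if col in archs else 0 for col in columns]
        for name, archs in per_genome.items()
    }
    return columns, matrix


# Hypothetical toy input: two genomes, identifiers chosen for illustration only
toy = {
    "genome_A": [[("PF00001", 50, 120), ("PF00002", 5, 40)], [("PF00003", 1, 90)]],
    "genome_B": [[("PF00003", 3, 95)]],
}
columns, matrix = presence_absence(toy)
print(columns)
print(matrix)
```

Each row of the resulting matrix corresponds to one genome, so it can be passed directly to a principal component analysis or a hierarchical clustering routine.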

Gene persistence
The persistence of a gene in a taxonomic group or group of genomes can be defined as persistence = N(orth) / N, where N(orth) is the number of genomes carrying a given ortholog and N is the number of genomes considered [23]. For the set of 1628 considered genomes, orthologous genes were identified through identity of protein domain architectures, taking copy number into account. The resulting protein domain contents were analysed through protein domain comparison (PDC).
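The persistence calculation can be illustrated with a short Python sketch. It assumes that each genome has already been reduced to the set of domain architectures it contains; the function name and the toy data are hypothetical, and copy number is ignored here for brevity.

```python
def persistence(presence_by_genome):
    """Return N(orth) / N for every architecture.

    presence_by_genome: dict genome -> set of architecture labels present.
    """
    n_genomes = len(presence_by_genome)
    if n_genomes == 0:
        raise ValueError("no genomes supplied")
    counts = {}
    for architectures in presence_by_genome.values():
        for arch in architectures:
            counts[arch] = counts.get(arch, 0) + 1
    return {arch: n / n_genomes for arch, n in counts.items()}


# Hypothetical toy data: architecture "a" occurs in 4 of 4 genomes, "b" in 2 of 4
toy_presence = {
    "g1": {"a", "b"},
    "g2": {"a"},
    "g3": {"a"},
    "g4": {"a", "b"},
}
print(persistence(toy_presence))
```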

Feature selection using random forest
The random forest classification algorithm was used to classify the genome sequences into aerobic and facultative anaerobic species, with the goal of identifying the domains (features) responsible for the separation into these two groups (feature selection). Three hundred genomes were randomly selected from aerobic and anaerobic Pseudomonas species to train a random forest model. The process was repeated one hundred times. The resulting 100 different models were used to weigh 5831 protein domains from both aerobic and anaerobic Pseudomonas species. Variable selection was used to identify the most influential domains for classification into aerobic and facultative anaerobic strains, yielding 100 Gini coefficients per protein domain, each representing the importance of that domain for the separation. Gini coefficients were combined into the cumulative Gini coefficient. The resulting protein domains were separated into aerobic- and anaerobic-specific protein domains before further analysis.
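How per-model importance scores can be combined into a cumulative score is sketched below in Python. The sketch assumes that each of the repeated models has already produced one importance value per domain; the training step itself is not shown, and the function names, domain names and values are hypothetical.

```python
from collections import defaultdict


def cumulative_importance(per_model_scores):
    """Sum importance values per domain over all trained models.

    per_model_scores: iterable of dicts, one per model, mapping
                      domain identifier -> importance in that model.
    """
    totals = defaultdict(float)
    for scores in per_model_scores:
        for domain, value in scores.items():
            totals[domain] += value
    return dict(totals)


def select_domains(totals, threshold):
    """Return domains whose cumulative score reaches the threshold,
    ordered from most to least important."""
    selected = [d for d, v in totals.items() if v >= threshold]
    return sorted(selected, key=lambda d: -totals[d])


# Hypothetical scores from three models (the text uses 100 models)
models = [
    {"PF_A": 0.5, "PF_B": 0.25},
    {"PF_A": 0.75},
    {"PF_B": 0.25, "PF_C": 0.125},
]
totals = cumulative_importance(models)
print(select_domains(totals, threshold=0.5))
```

Summing over repeated models rewards domains that are informative across many random subsets of genomes, rather than in a single favourable draw.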

Transcriptome data analysis
A publicly available P. aeruginosa transcriptome data set was retrieved from the GEO database (accession number GSE55197) [24]. This dataset contains 47 samples corresponding to 14 environmental conditions, including changes in growth temperature, growth stage, osmolarity, concentration of ions in the media, surface attachment and anaerobic respiration. For every gene, the log2 fold change of its expression values was calculated by comparing each of the other conditions with anaerobic respiration. Missing or infinite values arising from genes with very low counts in some condition(s) were imputed to 0 or ±4, according to the significance of the differential expression (false discovery rate, FDR < 0.05). Normalization, fold change computations and differential expression analysis were performed using the R package DESeq [33].

Results
Insertion of acetate kinase in P. putida KT2440
Previous designs to obtain P. putida strains surviving anoxic conditions were conceptually based on the hypothesis that anoxic survival was prevented by a lack of energy conservation and redox balancing [4,13,14,15,16]. Expression of the acetate kinase gene from P. aeruginosa and E. coli was reported to result in extended survival under anoxic conditions [4,14]. Expression of the acetate kinase gene (ackA) from E. coli combined with class I dihydroorotate dehydrogenase (pyrK-pyrDB) and class III ribonucleotide triphosphate reductase (nrdD-nrdG) from L. lactis successfully led to growth under micro-oxic conditions [22]. The tested strains were Pseudomonas putida KT2440 with acetate kinase (pS2213 ackA), unpassed (p+0) or passed three consecutive times over oxygen gradients (p+3), and Pseudomonas putida KT2440 with acetate kinase, dihydroorotate dehydrogenase and class III ribonucleotide triphosphate reductase (pS2213 ackA-(pyrK-pyrDB)-(nrdD-nrdG)), unpassed (p+0) or passed three consecutive times over oxygen gradients (p+3).
To determine the tolerance to anoxic conditions of a P. putida KT2440 negative control carrying an empty plasmid and of the recombinant strain enriched with ackA, pyrK-pyrDB and nrdD-nrdG, and to analyse the effect of an adaptation over oxygen gradients as performed earlier [22], an anoxic survival experiment of 18 days was performed. After inoculation at a standardized cell density under oxic conditions, cultures were incubated overnight in capped gas-exchanged vials in oxygen-depleted medium (see Materials and Methods). The survival rate was determined by performing colony forming unit (CFU) counts at set time points over a period of 18 days, with T0 being the start of the experiment in anoxic conditions (Figure 1, supplementary Figures S1, S2, S3, S4). The results showed that in anoxic conditions there is no significant difference in survival rates between the negative control and any of the recombinant strains tested (ANOVA, α = 0.05). Under these conditions, only the positive control, E. coli BW25113 harbouring an empty plasmid, survived.

Design requirements for a P. putida KT2440 derivative strain capable of anaerobic respiration
The failure of the previous, fermentative, design [22] to grow under anoxic conditions could be explained by its heavy reliance on the two state-of-the-art genome-scale models (GSMs) used, which currently do not include an accurate representation of the complete redox balance and its intricate involvement in metabolism. Additionally, while the protein domain comparison performed in that study showed apparent differences between aerobic and anaerobic strains in the availability of protein domains, the analysis was performed on a limited set of strains.
Many facultative anaerobic Pseudomonas species are incapable of anaerobic fermentation, but rather perform anaerobic respiration. The close phylogenetic distances between some of these facultative anaerobic Pseudomonas species and P. putida KT2440 suggest that acquiring a facultative anaerobic lifestyle via anaerobic respiration would require fewer genetic changes. To arrive at a rational design of P. putida KT2440 capable of anaerobic respiration, the previous methods were thus expanded upon by (i) using significantly more facultative anaerobic and aerobic Pseudomonas strains for domain analysis, (ii) inclusion of iJN1411, the latest metabolic reconstruction of P. putida KT2440 [34], and (iii) incorporation of an elaborate transcriptome analysis of anaerobic respiration in P. aeruginosa grown under anoxic conditions in comparison with 13 other, aerobic, growth conditions [24]. Inclusion of such transcriptome data would show gene regulation due to growth under anoxic conditions, improving the design as it complements genome-based methods.
For protein domain comparisons, the Pfam domain content of P. putida KT2440 was compared with 1627 other Pseudomonas strains with fully sequenced genomes. For each strain a literature search was performed to determine oxygen requirements, yielding 344 obligate aerobic strains including KT2440 and 1284 facultative anaerobic strains. Strain-specific differences in protein domain content were visualized using principal component analysis (PCA) and hierarchical clustering, with domain presence/absence as input (Figure 2). Both the PCA and the hierarchical clustering show a separation between a number of the facultative anaerobic strains and the rest of the considered strains (among which P. putida KT2440). However, it should be noted that only a small fraction of the total variance is explained by the first two principal components. This separation is also apparent in the dendrogram, suggesting that significant differences could be found in protein domain content.
We assumed that domains essential for anaerobic respiration are highly persistent in facultative anaerobic strains, but show a lower persistence in obligate aerobic strains. The strategy to obtain this protein domain core is outlined in Figure 3. A "long list" of anaerobic protein domains was generated by comparing domain persistence between aerobic and anaerobic strains. First, a 95% persistence threshold was applied to obtain a "domain core" of domains present in at least 95% of the genomes of the "aerobic" strains and of the "anaerobic" strains analysed. These aerobic and anaerobic domain cores were used as input for subsequent comparative analysis and, for the first list, split into "shared between aerobic and anaerobic species" (Shared domain core), "specific for aerobic species" (Aerobe specific domain core) and "specific for anaerobic species" (Anaerobe specific domain core), creating a long list of 427 anaerobe-specific protein domains. A second long list was created from the same input but searching for the reverse: a separation based on domains with a very low persistence in aerobic or anaerobic strains. For this, a threshold of no more than 1% was applied, creating the second long list of 167 anaerobe-specific protein domains. The dendrogram presented in Figure 2 indicated a possible early branch split between a large group of exclusively anaerobic Pseudomonas strains and a mixed group, including P. putida KT2440, containing 138 facultative anaerobic and 87 obligatory aerobic Pseudomonas strains (Figure 2 panel C). Using this split, two "restricted" lists were built by comparing domain persistence as outlined above, but now evaluating only Pseudomonas strains present in the mixed branch. For the restricted lists, a 90% persistence threshold and a 1% persistence threshold were used, creating two lists of anaerobe-specific protein domains containing 170 and 248 domains, respectively.
The four different lists enriched in protein domains essential for anaerobic growth were compared to each other and manually further annotated. Results are summarized in Table 1 and Figure 3.
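The split of the domain cores into shared, aerobe-specific and anaerobe-specific sets can be illustrated with the Python sketch below. It covers only the high-persistence (95% or 90%) lists and assumes that per-domain persistence values have already been computed separately for the aerobic and the anaerobic group; the function names and toy values are hypothetical.

```python
def domain_core(persistence_by_domain, threshold=0.95):
    """Domains present in at least `threshold` of the genomes of one group."""
    return {d for d, p in persistence_by_domain.items() if p >= threshold}


def split_cores(aerobic_persistence, anaerobic_persistence, threshold=0.95):
    """Split two domain cores into shared and group-specific sets."""
    aerobic_core = domain_core(aerobic_persistence, threshold)
    anaerobic_core = domain_core(anaerobic_persistence, threshold)
    return {
        "shared": aerobic_core & anaerobic_core,
        "aerobe_specific": aerobic_core - anaerobic_core,
        "anaerobe_specific": anaerobic_core - aerobic_core,
    }


# Hypothetical persistence values per domain in each group
aerobic = {"PF_X": 1.0, "PF_Y": 0.96, "PF_Z": 0.10}
anaerobic = {"PF_X": 0.99, "PF_Z": 0.97, "PF_W": 0.50}
cores = split_cores(aerobic, anaerobic)
print(cores)
```

Passing `threshold=0.90` together with persistence values computed on the mixed branch only would correspond to the restricted variant of the list.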
As outlined in the Materials and Methods section, the domain content of the facultative anaerobic and the obligatory aerobic Pseudomonas strains was used to train a random forest classifier, with the goal of identifying those domains (features) that are most responsible for the classification. Gini coefficients and cumulative Gini coefficients for each domain are provided in Data S9. Of the 5831 domains that were used as input for the classifier, 5 had a cumulative Gini coefficient ≥100, as summarized in Table 1. Gini scores were added as weights to the four protein domain lists derived above.
Transcriptome data obtained from P. aeruginosa PA14 grown under 14 different environmental conditions, including anoxic conditions [24], were re-analyzed for genes that were consistently differentially expressed during anaerobic respiration (see the Materials and Methods section for details). By calculating for every gene the log2 fold change of its expression values in each of the other conditions compared with anaerobic respiration, 175 protein domains were identified. A heatmap was used to visualize up- and down-regulated genes under anoxic conditions. Regulation due to anoxic growth was considered to be significant when the same behaviour (up- or downregulation) was observed in at least 7 of the 13 pair-wise comparisons and a fold change of at least 4 was observed in at least three of these comparisons. Protein domain architectures corresponding to the selected locus tags were identified. Based on the differential expression and similar efforts in literature [13], 22 genes encompassing 35 protein domains were selected.
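The selection rule for consistently regulated genes can be expressed as a short Python sketch. It assumes that positive log2 fold changes denote higher expression under anaerobic respiration, that a fold change of 4 corresponds to an absolute log2 fold change of 2, and that the strong comparisons are counted among those sharing the same direction; the function name and example values are hypothetical, and the FDR filtering step is not shown.

```python
import math


def classify_gene(log2_fold_changes, min_consistent=7, min_strong=3, strong_fold=4.0):
    """Return 'up', 'down' or None for one gene.

    log2_fold_changes: one log2 fold change per pair-wise comparison
                       (13 comparisons in the text).
    """
    strong_log2 = math.log2(strong_fold)
    for label, sign in (("up", 1), ("down", -1)):
        # comparisons in which the gene changes in this direction
        same_direction = [v for v in log2_fold_changes if sign * v > 0]
        # of those, the comparisons reaching the strong fold-change cutoff
        strong = [v for v in same_direction if sign * v >= strong_log2]
        if len(same_direction) >= min_consistent and len(strong) >= min_strong:
            return label
    return None


# Hypothetical gene: 7 of 13 comparisons positive, 3 of them at least 4-fold
up_gene = [2.5, 3.0, 2.0, 0.5, 0.4, 0.3, 0.2, -0.1, -0.2, 0, 0, 0, 0]
print(classify_gene(up_gene))
print(classify_gene([-v for v in up_gene]))
```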
Genome-scale models were used to simulate anoxic conditions. Reaction products whose absence impeded growth under the simulated lack of oxygen were pinpointed and traced back to a list of genes that either need oxygen as a substrate or that cannot be made without oxygen present, and to the resulting substrates that could thus not be produced. Genes and substrates were manually verified to be essential for growth (Table 1).
Table 1 (recoverable rows): domains with a cumulative Gini coefficient ≥20: 360; domains with a cumulative Gini coefficient ≥100: 5.
Figure 3. Overview of in silico approaches to identify limitations to anaerobic respiration in P. putida. A) Comparative genomics workflow. Genomes of the P. putida group and the anaerobic Pseudomonas group were systematically annotated using SAPP [23,28], the protein domains were extracted, and either all domains or only the domains common to all anaerobic Pseudomonas species (the core domains) were selected using a 95% persistence threshold. Analysis was performed on the whole set of genomes (left) or on a genome cluster of closely related strains (right). Each of these methods resulted in a list of protein domains related to an aerobic lifestyle (purple) or an anaerobic lifestyle (light green). B) Transcriptome analysis. C) GSM simulations. GSM iJP962 [5] and iJN1411 [34] were expanded with the indicated reaction sets and tested for growth under anaerobic conditions. Colours indicate final implementation in the design (green). Model- and genome-based predictions were combined to obtain a final design.

Design Considerations
To obtain further insight into the requirements to build a P. putida KT2440 derivative strain capable of anaerobic respiration, a comparison was made between the different lists obtained (Table 1) and previous efforts [4,13,14,22], resulting in an extensive overview of the many hurdles that need to be overcome to build such a strain. Lists were compared by evaluating the function of each gene, starting with the encoded domain annotation, checking for domain co-existence in operonic structures, and comparing metabolic functions with GSM data and with gene regulation. The weight of each protein domain was determined using the random forest analysis (Data S9). In this way the list could be reduced to 69 genes to be included in the design and a supplement of 3 vitamins to the medium that are deemed essential for P. putida KT2440 to enable anaerobic respiration.
The selected genes can be separated into various categories based on their functions: Nitrogen metabolism (49 domains in 37 genes), Hydrogenases (18 domains in 16 genes), Cytochrome C (3 domains in 3 genes), Pyrimidine and amino acid biosynthesis (4 domains in 2 genes if 3 vitamins added), ATP production (3 domains in 3 genes), and Domains of Unknown Function (indirectly associated with anaerobic respiration) (8 domains).

Nitrogen metabolism
Of the 61 known genes found vital for anaerobic respiration, 37 are either directly or indirectly involved in nitrogen metabolism. Compared to other final electron acceptors such as sulfate, iron(III), manganese(II), or selenate, nitrate as the final electron acceptor allows the largest amount of energy to be conserved in anaerobic respiration [35]. P. putida KT2440 lacks the nitrate/nitrite respiration pathway, which was resolved in earlier studies by inserting either a Nir-Nar or a Nor plasmid [13]. This resulted in extended survival under anoxic conditions, but not growth. Our transcriptomics and protein domain analysis indicated that the combination of both the Nir-Nar and the Nor operon is required (Table 2). The operons include genes required for energy conservation, cofactor biosynthesis, amino acid biosynthesis and nitrogen metabolism, nitrate, nitrite and nitrogen transporters, nitrate, nitrite, nitric oxide and nitrous oxide reductases, and several regulatory proteins (Table 2). Of the 49 protein domains or 37 genes we identified within this category, only 15 genes had been previously found (narK1, narK2, narG, narH, narJ, narI, narX, narL, nirF, nirQ, nirM, nirS, nirJ, nirL within the nir-nar operon, norC, norB, norD, nosR within the Nor operon) [13].
Previously unidentified genes in this category include many transporters and alternative mechanisms to tap indirect sources of nitrate or nitrite. Pseudomonas species capable of anaerobic respiration use these alternatives when nitrate or nitrite is scarce. Only genes uniformly present in species capable of anaerobic respiration and strongly associated with those of the nitrogen metabolism were considered for the design. Allantoicase (or allantoate amidinohydrolase) participates in purine metabolism, facilitating the use of purines as secondary nitrogen sources under nitrogen-limiting conditions. This results in the production of ammonia and carbon dioxide via the uricolytic pathway, which is absent in P. putida [36]. A second example of an enzyme required for sourcing secondary nitrogen sources is methylaspartate ammonia-lyase. This enzyme catalyses the second step of glutamate fermentation, a process in which L-threo-3-methylaspartate is converted to mesaconate and ammonia. Ureohydrolases facilitate the ammonia to urea conversion, with urea as the principal product of nitrogen excretion.
In the accompanying table, classes of genes are printed in bold, with the genes belonging to each class listed directly underneath.

Hydrogenases
Included in the list are 16 hydrogenases. Hydrogenases catalyse the reversible oxidation of molecular hydrogen, fulfilling a regulatory role in balancing the redox state. The redox state of the cell and the availability of O 2 are regulatory signals in facultative anaerobic species [37].
[FeFe]- and [NiFe]-hydrogenases are widely distributed among anaerobic species. These hydrogenases are only produced under anoxic conditions, and most [NiFe]-hydrogenases are inactivated by oxygen, only to be re-activated under reducing conditions [38].
Hydrogen oxidation is coupled to the reduction of electron acceptors (such as oxygen, nitrate, sulphate, carbon dioxide and fumarate). P. putida KT2440 lacks the hydrogenases necessary for the reduction of nitrogen compounds, as well as the necessary hydrogenase chaperones and assembly, maturation and formation proteins (Table 3).
Of the 16 proteins vital for maintaining the redox balance in anaerobic conditions, only transcriptional regulator DNR has been found in previous work [13].

Cytochrome C
Included in the list are 3 C-type cytochromes. C-type cytochromes account for a vital step in ATP bio-generation via the proton motive force (Table 4). Aerobically, the cytochrome BC1 complex requires oxygen as electron acceptor, yielding H2O. Anaerobically, cytochrome C551 (NirN), C552 and cytochrome C oxidase CBB Q transfer electrons to nitrite reductase (NirS) and nitric-oxide reductase (NorB-NorC). The importance of NirN and NirC (the precursor of NirN) was demonstrated in [13] (Table 2).
In addition, cobalamin-independent methionine synthase is important. This methionine synthase is a precursor of C551 that can be produced without using vitamin B12 (see Pyrimidine and amino acid biosynthesis, Table 5). This might be a key component for anaerobic growth, since both the protein domain analysis and the GSM iJN1411 [34] predict that, amongst other vitamins, the active form of vitamin B12 can only be bio-generated in the presence of oxygen in P. putida KT2440.
The PDC also indicates the need for cytochrome C552, and for cytochrome C oxidase CBB Q and its maturation protein (Tables 2 and 4). The enzyme cytochrome C nitrite reductase (C552) catalyses the six-electron reduction of nitrite to nitrogen as one of the key steps in denitrification. Nitrogen is then reduced to ammonium in the nitrogen fixation pathway, where it participates in the anaerobic energy metabolism of dissimilatory nitrate ammonification. Expression of cytochrome CBB Q oxidase allows agronomically important diazotrophs to sustain anaerobic respiration [39].

Pyrimidine and amino acid biosynthesis
Included in the list are 2 genes involved in pyrimidine and amino acid synthesis, and additional bottlenecks that can be solved by adding 3 vitamins to the medium. Earlier GSM simulations with iJP962 indicated that alternate genes must be inserted for dihydroorotate dehydrogenase and ribonucleotide triphosphate reductase type II for pyrimidine and ultimately DNA and RNA biosynthesis [22]. Both the protein domain analysis and GSM simulations using the iJN1411 metabolic model predict that cobalamin (vitamin B12), pyridoxal-5-phosphate (vitamin B6) and menaquinone (vitamin K2) cannot be produced under anoxic conditions.
Crespo et al. showed that class II RNRs depend on adenosylcobalamin or vitamin B12 (cobalamin) to generate their radical independently of oxygen [40]. Cobalamin is a complex essential cofactor for many enzymes mediating methylation, reduction and intramolecular rearrangements, and for methionine synthase. There is a recognised distinction between aerobic and anaerobic generation of cobalamin [41,42]. The routes differ in terms of cobalt chelation (via the CobNST complex in the aerobic pathway, via precorrin-2 with CbiK in the anaerobic pathway) and oxygen requirements. The enzymes CobI, CobG, CobJ, CobM, CobF, CobK, CobL, CobH, CobB and CobNST form the aerobic pathway. CbiK, CbiL, CbiH, CbiF, CbiG, CbiD, CbiJ, CbiET, CbiC and CbiA form the anaerobic route [30,41,43]. Surprisingly, the protein domain comparison yielded none of the enzymes of the anaerobic pathway for vitamin B12 synthesis, but instead CobT and CbtB, both described as important for the aerobic pathway [41]. According to our analysis, the protein domains linked to these genes are not present in the aerobic species analysed, but only in anaerobic species. In the anaerobic bacterium Eubacterium limosum, CobT was found to function as an activator for a range of lower ligand substrates including DMB, determining cobamide diversity. The specific function of CbtB is unknown [41,42].
Vitamin B6 is required for a wide variety of processes [44]. There are many vitamin B6-dependent proteins involved in amino acid biosynthesis, amino acid catabolism, antibacterial functions, iron metabolism, carbon metabolism, nucleotide utilization, cofactors for biotin, folate and heme, NAD biosynthesis, cell wall metabolism, tRNA modification, regulation of gene expression and biofilm formation.
Vitamin K2 is responsible for electron transport during anaerobic respiration. However, knock-out experiments in E. coli showed that upon loss of menaquinone and vitamin K1 only 3% of the theoretical yield was obtained, but that this was instantly restored to 44% upon supplementation with vitamin K1 or vitamin K2 [45], indicating that vitamin K1 can partially make up for the loss of vitamin K2.
Rather than inserting all missing genes, in a minimal design setup these vitamins can be supplemented to the medium (indicated in Table 5 with *). To determine any immediate effect on growth or survival rates, vitamin supplementation through the medium was tested, monitoring performance of all recombinant strains under anoxic conditions. This was done in parallel with a survival experiment without added vitamin mix. No difference in growth rates or survival rates was found (Figure S4, Figure S5, Data S10, Data S11).

ATP Generation
Of the 61 genes of known function required for anaerobic respiration, 3 are involved in ATP generation. The protein domain analysis, the transcriptomics data and metabolic modelling with iJP962 and iJN1411 all indicate that ATP production remains one of the main bottlenecks to tackle. Earlier work has come to the same conclusion and tackled this by insertion of genes for acetate production or ethanol production [4,14]. Our protein domain analysis has elucidated specific ATPases that only occur in anaerobic strains, providing an alternative to ATP production by fermentation (Table 6).

Domains of Unknown Function
The protein domain analysis resulted in 270 unique protein domains of unknown function occurring in the genomes of anaerobic strains but not in aerobic strains. Based on contextual information, 8 were identified as important for anaerobic respiration. These were included in the design (Table 7). Similarly, 28 protein domains of unknown function were associated with virulence factors or immunity, and could be excluded from the design. This leaves 244 protein domains of which the function is unknown and which can thus not be completely excluded from this design.

Discussion
No extended survival under anoxic conditions after acetate kinase integration
Our previous rational design [22] was based on two genome-scale models and a genome domain comparison analysis of six facultative anaerobic Pseudomonas species compared to six obligatory aerobic Pseudomonas putida strains. Under micro-oxic conditions, the addition of acetate kinase, dihydroorotate dehydrogenase and class III ribonucleotide triphosphate reductase led to growth. In our hands there was no extended survival under anoxic conditions of the recombinant strains upon introduction of ackA. It is extremely challenging to achieve anoxic conditions. Both the medium and the headspace must be treated to completely remove oxygen from the start of the experiment, otherwise oxygen depletion takes up to 12 h. Further, the medium must be prepared with L-cysteine or sodium thioglycollate to actively remove oxygen. Without these precautions, the medium is very easily oxygenated. Small stopper-capped vials are strongly preferred over screw-cap vials, in which oxygen leaks frequently occurred [22]. Resazurin staining indicates when oxygen levels drop below a detectable level (determined with a micro-electrode at 0.01 g/l dissolved oxygen, as seen in previous work [22]), but does not distinguish micro-oxic conditions from anoxic conditions.
The lack of improvement in survival rates is readily explained by the novel design assembled in this research, as numerous essential factors, such as an alternative electron acceptor or an anaerobically active cytochrome C, are missing.

Technical design issues
To enable an anaerobic lifestyle, previous designs introduced between 3 and 24 genes into the P. putida KT2440 genome [13,14,4,22], but our in silico methods suggest that at least three times more genes are required. Novel methods developed specifically for the integration of large operons or multiple genes, such as yTREX [46], allow incorporation of up to 14 genes at one time in P. putida.
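To give a rough sense of scale, the sketch below computes a lower bound on the number of integration rounds for a design of a given size. It assumes that every round can be filled to the 14-gene maximum quoted above and ignores operon boundaries, construct size and integration efficiency, so the real number of rounds could well be higher. The function name is illustrative and not part of any published tool.

```python
import math


def min_integration_rounds(n_genes, max_genes_per_round):
    """Lower bound on integration rounds, assuming every round is filled."""
    return math.ceil(n_genes / max_genes_per_round)


# 69 genes (61 of known function plus 8 of unknown function),
# at most 14 genes per integration round.
print(min_integration_rounds(69, 14))  # 5

# Largest previous design (24 genes), same per-round limit.
print(min_integration_rounds(24, 14))  # 2
```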
The 69 genes in our design do not take into account the 244 genes of unknown function, which complicate the task even further. Without knowledge of their exact function, these genes cannot be entirely excluded from the design. At least eight of these are somehow associated with survival and/or growth under anoxic conditions [30]. The crucial role that genes of unknown function might play was demonstrated by Hutchison and colleagues [47], who, in their attempt to make a minimal bacterial genome, unexpectedly found 149 genes of unknown function to be essential for growth.
Many of the genes found in the design are closely linked to metal transport, including many hydrogenases and genes for pyrimidine and amino acid biosynthesis.
It should be considered that changes in oxygen availability drastically alter metal bioavailability, as extensively reviewed in [48].

The new design compared to previous designs
We elucidated that for anaerobic growth both the nir-nor and nar operons are vital. Some Pseudomonas species naturally carry only one of these operons and are capable of converting nitrate to nitrite. However, these strains respire nitrogen under oxic conditions only, and have been shown to be incapable of growth under anoxic conditions [49,50]. Building upon that, if P. putida KT2440 were equipped with both the denitrification pathway and the nitrogen fixation pathway, it could reduce nitrate or nitrite to ammonium, which can then be assimilated into organic compounds, transforming P. putida KT2440 into a diazotroph of agronomic importance [39].
The most prevalent anaerobic dissimilatory nitrate respiration regulator, DNR, is one of the key elements obtained from the protein domain comparison. In the facultative anaerobe E. coli, knock-out mutants of fnr, an ortholog of dnr, were unable to grow under anoxic conditions. DNA microarray analysis showed that in E. coli 49% of the genes that differ in expression between anoxic and oxic conditions are regulated by FNR [37]. The two-component aerobic respiratory control system (ArcA and ArcB) controls gene transcription in E. coli under anoxic conditions. Mutations in this system are known to affect expression of over 30 operons. Most of these are repressed under anoxic conditions, but cytochrome C oxidase and pyruvate formate lyase are activated. In E. coli, ArcA and FNR are deemed essential for anaerobic activation [51]. For an anaerobic respiratory design of P. putida KT2440, it is debatable whether regulatory genes are required. We deem their inclusion advisable, so that the strain retains optimal functionality under oxic conditions while gaining the anaerobic respiration trait. These genes are thus included in the final design. However, the necessary fine-tuning of the expression levels of the regulatory genes would pose its own challenge.
We argue that all these genes are required to shift P. putida KT2440 from a strictly aerobic lifestyle to one based on anaerobic respiration. However, improved strain performance under micro-oxic conditions or prolonged survival under anoxic conditions would already significantly improve strain robustness in large-scale bioreactors with fluctuating oxygen levels. Hence, each step towards an anaerobic lifestyle may substantially ease processes in large-scale bioreactors. For enhanced performance under micro-oxic conditions, it was demonstrated that increasing ATP production through acetate production alone is enough [22]. For prolonged survival, however, the key elements include the nir-nor and nar operons for denitrification and nitrogen fixation, cytochrome C 552, and external supplementation of the lacking vitamins. This conclusion is supported by previous findings that energy supply and redox balancing are the main bottlenecks in an anaerobic lifestyle [22,4,13,14,15,16].

Conclusion
Increased ATP generation through plasmid-based insertion of acetate kinase does not prolong the survival of Pseudomonas putida KT2440 under anoxic conditions. This shows that increased performance under micro-oxic conditions does not guarantee prolonged survival under anoxic conditions. A P. putida KT2440 strain capable of anaerobic respiration would require the insertion of at least 69 genes into the genome and the supplementation of the medium with 3 vitamins. Converting a strictly aerobic species to a facultative anaerobic lifestyle based on anaerobic respiration is a much more elaborate process than previously thought. In particular, the function of DUFs and their role in anaerobic respiration must be investigated, as it remains unknown how many of these should be added to this design.