Identification of alkaline phosphatase in the genomes of marine bacteria and global expression among all APases
A. mediterranea DE was predicted to encode four extracellular APases: PhoA, PhoD, PhoX, and PafA (UniProtKB accession numbers F2G8N3, F2G754, T2DMD9 and F2G994, respectively), as described below. In addition, regulatory elements upstream of a gene help to identify genes involved in P uptake. In E. coli and a diversity of other bacteria, the expression of Pi-scavenging genes is regulated by the pho regulon. Through the concerted action of a σ factor, σ70, genes with a typical pho box in their promoter sequence are regulated by the Pi concentration 32. Pho box sequences were searched for with EMBOSS fuzznuc version 6.6.0.0 using the pho box sequences known for E. coli 32,33. Regulatory pho regulon sequences were also predicted using MEME version 5.5.4 34 and FIMO version 5.1.0 35, both with default parameters, within the 500 nucleotides upstream of each gene. Sequences with a predicted pho box within 500 nucleotides upstream of the gene in the set of Alteromonas genomes are shown in Extended Data Fig. 1.
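The upstream pattern search can be illustrated with a minimal sketch. It is not the fuzznuc implementation; it only mimics the idea of scanning the 500 nucleotides preceding a gene for a degenerate pattern while tolerating a set number of mismatches. The pattern `EXAMPLE_PHO_BOX` is an illustrative 18-nt placeholder resembling a commonly quoted E. coli pho box consensus, not necessarily the pattern used here, and the function names are hypothetical.

```python
# IUPAC degenerate nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}

# Illustrative placeholder pattern (assumption, not taken from the study).
EXAMPLE_PHO_BOX = "CTGTCATAWAWCTGTCAY"


def upstream_region(genome, gene_start, length=500):
    """Return up to `length` nucleotides preceding a forward-strand gene.

    `gene_start` is the 0-based index of the first base of the gene.
    """
    return genome[max(0, gene_start - length):gene_start]


def find_motif(upstream, pattern, max_mismatches=0):
    """Scan `upstream` for windows matching the IUPAC `pattern`.

    Returns a list of (offset, matched_sequence, mismatch_count) tuples
    for every window with at most `max_mismatches` mismatches.
    """
    upstream = upstream.upper()
    hits = []
    for start in range(len(upstream) - len(pattern) + 1):
        window = upstream[start:start + len(pattern)]
        mismatches = sum(
            base not in IUPAC[code] for base, code in zip(window, pattern)
        )
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits
```

Genes on the reverse strand would need the reverse complement of the downstream flank, which this sketch does not handle.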
Each of the APases was quantified in the global ocean using 187 metatranscriptomes from the Tara Oceans expedition, corresponding to 109 stations 36, following the methods described in Baltar et al., 2023 37. Briefly, reference phylogenetic trees were built, and environmental peptide sequences were placed on each tree to assign a taxonomic label. The reference phylogenetic trees were constructed with a set of prokaryotic marine genomes as described in Baltar et al., 2023, along with peptides from eukaryotic genomes in EukProt proteins, version 3 38. To build the phylogenetic tree for each gene, an HMM profile was needed to search for peptides in the genome database. Initial searches in the metatranscriptome sequences were carried out with the same HMM and Blastp before the phylogenetic placement of the environmental peptide sequences on the reference trees. Depending on the gene, these HMMs were either Pfam or ad hoc HMM profiles, as described below. The contribution to the expression of phosphatase genes in the global ocean was quantified using the OM-RGC table of abundances (https://www.ocean-microbiome.org) 36 as detailed in Baltar et al., 2023 37. The OM-RGC contains genes and associated tables of estimated abundance, after normalization, for each metatranscriptome sample in the Tara Oceans database, as detailed in Sunagawa et al., 2015 39. Since the Tara Oceans gene catalog includes a functional profile annotated with the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Clusters of Orthologous Genes (COG), a comparison between the different quantifications of gene expression is shown in Fig. 3.
For PhoA, the reference phylogenetic tree was built using the peptide sequences in our genome database retrieved with its corresponding Pfam profile, PF00245, as described in Baltar et al., 2023. The corresponding Pfam profile for PhoD (PF09423; PhoD-like phosphatase) yielded too many peptide sequences, some of which were likely to be paralogs. For example, a search of the complete genomes in our database retrieved some PhoD-like peptides that, according to SignalP version 6.0, did not follow any of the secretion pathways and are therefore expected to be cytoplasmic (Extended Data Fig. 2). In this case, an HMM profile was designed for PhoD peptides enriched in TAT and TAT lipoprotein signal peptides that cluster together in the phylogenetic tree. Similarly, the corresponding Pfam for PhoX (PF05787; Bacterial protein of unknown function, DUF839) retrieved too many peptide sequences from our genome database. Instead, the PhoX peptides were chosen by their corresponding KOfam, the K07093 family in KEGG orthology, using HMMER version 3.3.2 with its adaptive score threshold 40. K07093 corresponds to COG3211 (PhoX Secreted phosphatase, PhoX family). The number of sequences thus retrieved from our genome database was smaller than with the Pfam, but their diversity and number were still too high, suggesting that this set also included paralogs. Therefore, another HMM profile was designed for sequences that clustered together and included mostly peptides with a predicted TAT or TAT lipoprotein signal peptide (Extended Data Fig. 3–4).
PafA did not correspond to any profile in the KO database. Its corresponding Pfam, PF01663, retrieved too many peptides in our genome database. In this case, an HMM was designed to target PafA in A. mediterranea DE, Elizabethkingia meningoseptica (previously Chryseobacterium meningosepticum) and related sequences classified as PafA in Lidbury et al., 2022 16. For both A. mediterranea DE and E. meningoseptica PafAs there is experimental evidence of their APase activity in this study and in Lidbury et al., 2022, as well as a crystallographically determined structure for E. meningoseptica PafA 41. The profile HMM retrieved peptides with a predicted Sec and lipoprotein signal peptide (Extended Data Fig. 3–4). All peptides annotated as “alkaline phosphatase PafA” in RefSeq, 10,424 sequences (RefSeq Release 221, November 13, 2023), were targeted by the HMM designed for PafA.
Cloning and expression
Four APase gene sequences from A. mediterranea DE were used for de novo biosynthesis. For the transformation step, BL21(DE3) competent cells were transformed with a pET29 plasmid encoding one of the four APase genes. Thawed cells were mixed with the DNA and incubated on ice for 15 minutes. Heat shock was performed at 42°C for 45 seconds, followed by a 2-minute incubation on ice. Next, 400 µl of SOC media (at room temperature) was added, and the cells were incubated at 37°C with agitation at 250 rpm for at least 1 hour. To prepare the preculture, 50 µl of the transformation reaction was added to culture tubes containing 5 ml of Luria Broth (LB) media and 5 µl of kanamycin (KAN; final concentration 50 µg/ml). The tubes were then incubated overnight at 37°C and 250 rpm. For the expression, 1 l of TB media supplemented with KAN (final concentration 50 µg/ml) and 1.5% w/v of lactose was inoculated with 4 ml of the overnight preculture. The cells were grown at 25°C and 220 rpm for 24 hours. Afterward, the cells were harvested by centrifugation at 4°C and 4000 rpm for 30 minutes. To prepare the extract, the pellet was resuspended in 100 ml of lysis buffer and subjected to sonication. A sample from the whole-cell lysate (WC) was taken. The intracellular soluble fraction was obtained by centrifugation at 20000 × g for 30 minutes at 4°C, and a sample from the soluble fraction (SN) was collected. For IMAC purification via the His tag, the cleared lysate (SN) was applied to a 5 ml HisTrap column, pre-equilibrated with IMAC buffer A, at a flow rate of 2 ml/min. The column was washed with IMAC buffer A until UV readings returned to baseline, followed by a 45 ml wash with 4% IMAC B buffer. Elution of the protein was carried out using increasing concentrations of imidazole: 50, 100, 150, 250, and 520 mM.
After assessing the purity of different fractions by SDS-PAGE, the 45 ml wash with 4% IMAC B buffer and the 520 mM elution fractions were combined and digested with 3C protease to remove the His tag. The digestion process took place overnight at 4°C in dialysis buffer to decrease the imidazole concentration. The next day, the buffer was changed to a fresh dialysis buffer, and dialysis continued for an additional 2 hours. Next, the digested and dialyzed protein solution was injected into a 5 ml HisTrap column, equilibrated with IMAC buffer A. After loading at a rate of 2 ml/min, the column was washed with IMAC buffer A until UV readings returned to baseline, followed by a 45 ml wash with 4% IMAC B buffer. The elution of the 3C enzyme and the cleaved His tag was achieved using 500 mM imidazole. This step allows binding of the His tag to the column and recovery of the His-tagged 3C protease, while the flow-through should contain the digested protein of interest. To further purify the protein, the flow-through from the IMAC purification was concentrated to a final volume of 5 ml, corresponding to approximately 65 mg of protein. The concentrated protein solution was then injected into a HiLoad16/600 Superdex200 pg column at a flow rate of 1 ml per minute. Fractions of 1.8 ml were collected, and the purity of each fraction was assessed using SDS-PAGE.
Determination of enzyme kinetic parameters of APase
All purified APases, as well as whole Alteromonas cells, were tested for P-monoesterase, P-diesterase, P-triesterase and sulfatase activities using p-nitrophenol-based substrates: p-nitrophenyl phosphate disodium salt hexahydrate, bis(p-nitrophenyl) phosphate sodium salt, paraoxon-methyl and 4-nitrophenyl sulfate potassium salt, respectively. Measurements were carried out according to previously described methods 24. All experiments were performed using filter-sterilized (0.2 µm) artificial seawater, specifically Aquil* culture medium at a pH of 8.1 without major nutrients (P, N, Si) or vitamins 42. To ensure the presence of all required cofactors, including trace metals, we supplemented the medium with the highest concentrations observed in the open ocean: MnCl2 (5 nM), FeCl3 (2 nM), NiCl2 (12 nM), ZnSO4 (9 nM), Na2SeO3 (2.3 nM) and Na2MoO4 (105 nM) 43. For each substrate and substrate concentration, blank controls were run without the addition of any enzyme, and the obtained values were subtracted from the corresponding substrate results. The enzyme concentration varied depending on the substrate: 1 nM for p-nitrophenyl phosphate and bis(p-nitrophenyl) phosphate, and 10 nM for paraoxon-methyl and 4-nitrophenyl sulfate. The kinetic parameters kcat and KM and the catalytic efficiency kcat/KM were calculated as described in Srivastava et al. 24. Calculations and visualizations were done in R.
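For orientation, the kinetic parameters follow from the Michaelis-Menten relation v = kcat [E] [S] / (KM + [S]). The sketch below shows one simple way to estimate them from blank-corrected initial rates. It uses a Hanes-Woolf linearization, [S]/v = [S]/Vmax + KM/Vmax, chosen here only because it needs no external libraries. This is an assumption made for illustration and not necessarily the fitting procedure of Srivastava et al. 24 or of the R scripts used in this study. All function names and units are hypothetical.

```python
def hanes_woolf_fit(substrate, rates):
    """Estimate (Vmax, KM) by ordinary least squares on [S]/v versus [S].

    The Hanes-Woolf form is [S]/v = [S]/Vmax + KM/Vmax, so the slope is
    1/Vmax and the intercept is KM/Vmax. Rates must be non-zero.
    """
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def kinetic_parameters(substrate, raw_rates, blank_rates, enzyme_conc):
    """Subtract enzyme-free blanks, fit, and derive kcat and kcat/KM.

    `substrate`, `enzyme_conc` and KM share one concentration unit;
    rates are in that unit per second, so kcat is in 1/s.
    """
    corrected = [v - b for v, b in zip(raw_rates, blank_rates)]
    vmax, km = hanes_woolf_fit(substrate, corrected)
    kcat = vmax / enzyme_conc
    return {"kcat": kcat, "KM": km, "kcat_over_KM": kcat / km}
```

A direct nonlinear least-squares fit of the Michaelis-Menten curve weights measurement errors differently and is generally preferred for noisy data; the linearized form is used here only to keep the example self-contained.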
Alteromonas mediterranea growth and sample preparation
A. mediterranea strain DE was purchased from the Leibniz-Institut Deutsche Sammlung von Mikroorganismen und Zellkulturen (DSMZ No.: 17117). The main salts for the medium were prepared according to the Aquil* medium formula 44. The media contained 50 µM of carbon in the form of glucose, NH4Cl (10 µM), KH2PO4 (1 µM for the control and 0.1 µM for P limitation), vitamin B12 (0.37 nM) and the following trace metals: MnCl2 (5 nM), FeCl3 (2 nM), NiCl2 (12 nM), ZnSO4 (9 nM), Na2SeO3 (2.3 nM) and Na2MoO4 (105 nM). Trace metal concentrations used in this study reflect the highest levels observed in the open ocean 43. Additionally, to test growth with a monoester as the sole P source (Extended Data Fig. 5), adenosine monophosphate (AMP; Sigma-Aldrich) was added. 1 µM of AMP (C10H14N5O7P) already contains 10 µM of carbon and 5 µM of nitrogen. Therefore, to maintain a C:N:P ratio of 50:10:1, only 40 µM of carbon in the form of glucose and 5 µM of NH4Cl were added. The AMP + Pi culture was prepared with an additional 1 µM of KH2PO4. Cell abundance was measured by flow cytometry using an Accuri™ C6 Plus Flow Cytometer (Becton Dickinson, Franklin Lakes, USA) after fixing with glutaraldehyde (2% v/v final concentration) and staining with 1x SYBR® Green I (Sigma-Aldrich). The cultures were inoculated to a starting cell density of 1 × 10^4 cells/l. All treatments were cultivated in biological triplicates of 400 ml in a shaker (100 rpm) at 15°C. Cells were harvested during early stationary phase (after 38 hours, Extended Data Fig. 5). Enzymatic measurements were conducted with whole cells as described above. Only cultures that showed a clear distinction in growth and enzymatic potential (Control and P-limited, Extended Data Fig. 5, Extended Data Table 1) were further collected for protein extraction. Approximately 200 ml per treatment was filtered through a 0.22-µm pore-size polycarbonate membrane (47 mm diameter).
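The nutrient bookkeeping for the AMP treatment can be checked with a short calculation. The sketch below counts the carbon, nitrogen and phosphorus supplied by AMP (from its formula C10H14N5O7P) and returns the remaining glucose carbon and NH4Cl needed to reach the target concentrations. The function name and the returned keys are hypothetical.

```python
# Atoms of each element per AMP molecule, from the formula C10H14N5O7P.
AMP_ATOMS = {"C": 10, "N": 5, "P": 1}

# Glucose (C6H12O6) carries six carbon atoms per molecule.
CARBONS_PER_GLUCOSE = 6


def supplements_with_amp(amp_um, target_c_um, target_n_um):
    """Return the C and N still to be added once AMP has been accounted for.

    All concentrations are in µM of the element (or of the compound where
    the key name says so).
    """
    glucose_carbon = target_c_um - AMP_ATOMS["C"] * amp_um
    ammonium = target_n_um - AMP_ATOMS["N"] * amp_um
    if glucose_carbon < 0 or ammonium < 0:
        raise ValueError("AMP alone already exceeds the C or N target")
    return {
        "glucose_carbon_uM": glucose_carbon,
        "glucose_uM": glucose_carbon / CARBONS_PER_GLUCOSE,
        "NH4Cl_uM": ammonium,
        "P_from_AMP_uM": AMP_ATOMS["P"] * amp_um,
    }


# 1 µM AMP with a 50:10:1 C:N:P target (50 µM C, 10 µM N, 1 µM P).
amp_treatment = supplements_with_amp(1.0, target_c_um=50.0, target_n_um=10.0)
```

With 1 µM AMP this gives 40 µM of glucose carbon and 5 µM of NH4Cl, matching the amounts stated above.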
The lysis buffer was prepared with the following concentrations: 50 mM Tris-HCl adjusted to pH 8.0; 150 mM NaCl; 1% Triton X-100; 0.1% SDS; 1 mM EDTA; and one protease inhibitor cocktail tablet (Thermo Scientific). The filters were covered with lysis buffer and subjected to three freeze-thaw cycles alternating between −80°C and +80°C. For precipitation, ice-cold EtOH was added in a 9:1 ratio, followed by storage at −20°C overnight, and the precipitate was pelleted by centrifugation for 30 minutes at 14,000 rpm at 4°C. The pellet was air-dried at room temperature for 30 minutes. The protein pellet was re-suspended in 50 µl of 50 mM TEAB buffer. To this mixture, 1 µl of the reduction buffer (with a final DTT concentration of 20 mM) was added, followed by incubation at 60°C for 30 minutes. After cooling to room temperature, 1 µl of the alkylation buffer (with a final iodoacetamide concentration of 20 mM) was added, and the mixture was incubated at room temperature in the dark for 30 minutes. Subsequently, trypsin digestion was performed with 1 µg of trypsin per 100 µl of sample (MERCK Trypsin Sequencing Grade) at 37°C for 16 hours. The reaction was halted by adding trifluoroacetic acid to a final concentration of 1%. Desalting was performed using Pierce™ C18 Tips following Thermo Scientific's instructions. After drying the peptides in a speed vacuum, the samples were stored at −80°C until analysis.
Proteomics
Online nano-reversed phase (RP)-liquid-chromatography high-resolution mass spectrometry (MS)
Purified tryptic peptides were dissolved in a 0.1% formic acid (FA) solution (v/v) prepared with high-purity water (MilliQ). Subsequently, approximately 1 µg of peptides was separated using an online reversed-phase (RP) HPLC (Thermo Scientific Dionex Ultimate 3000 RSLC nano LC system) connected to a benchtop Quadrupole Orbitrap (Q-Exactive Plus; QE Plus) mass spectrometer (Thermo Fisher Scientific). The online separation was performed on an Easy-Spray analytical column (PepMap RSLC C18, 2 µm, 100 Å, 75 µm i.d. × 50 cm, Thermo Fisher Scientific) with an integrated emitter. The column was heated to 55°C. The flow rate was set to 300 nl/min. The LC separation followed a two-hour gradient method, running from 5 to 50% buffer B (v/v) [79.9% ACN, 0.1% FA, 20% high-purity water (MilliQ)] over 105 minutes and then to 80% buffer B over 5 more minutes. Buffer A consisted of 0.1% (v/v) FA in MilliQ water.
The LC eluent was introduced into the mass spectrometer through an Easy-Spray ion source (Thermo Scientific). The emitter was operated at 1.9 kV. Mass spectra were measured in positive ion mode applying a top ten data-dependent acquisition (DDA) approach. A full mass spectrum was acquired at a resolution of 70,000 at m/z 200 [automatic gain control (AGC) target at 1e6, maximum injection time (IT) of 120 ms]. The scan range was set from 400 to 1,800 m/z. Following the MS scan, an MS/MS scan was performed at a resolution of 17,500 at m/z 200 [AGC target at 1e5, 3.0 m/z isolation window and maximum IT of 80 ms]. For MS/MS fragmentation, stepped normalized collision energy (NCE) values were employed: 27%, 30% and 33% for higher energy collisional dissociation (HCD). A dynamic exclusion time of 50 seconds was implemented. Unassigned precursors as well as +1, +7, +8, and >+8 charged precursors were excluded from selection. An intensity threshold of 6.3e3 was set, and isotopes were excluded as well.
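The precursor selection rules above can be summarized in a simplified sketch. It is not the instrument's acquisition logic; it only illustrates how a top ten choice, the charge-state filter, the intensity threshold, the scan range and the 50-second dynamic exclusion combine. Isotope exclusion is not modeled, and matching excluded precursors by m/z rounded to two decimals is an assumption made only for this example. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Precursor:
    mz: float
    charge: int       # 0 stands for an unassigned charge state
    intensity: float


ALLOWED_CHARGES = {2, 3, 4, 5, 6}   # excludes unassigned, +1, +7, +8, >+8
MIN_INTENSITY = 6.3e3               # intensity threshold
SCAN_RANGE = (400.0, 1800.0)        # full-scan m/z range
EXCLUSION_S = 50.0                  # dynamic exclusion time in seconds
TOP_N = 10                          # top ten DDA


def select_precursors(survey_scan, time_s, excluded_until):
    """Pick up to TOP_N precursors from one survey scan for MS/MS.

    `excluded_until` maps a rounded m/z to the time (s) at which its
    dynamic exclusion ends; it is updated in place for chosen precursors.
    """
    eligible = []
    for p in survey_scan:
        key = round(p.mz, 2)  # assumed matching tolerance for exclusion
        if p.charge not in ALLOWED_CHARGES:
            continue
        if p.intensity < MIN_INTENSITY:
            continue
        if not SCAN_RANGE[0] <= p.mz <= SCAN_RANGE[1]:
            continue
        if excluded_until.get(key, float("-inf")) > time_s:
            continue
        eligible.append(p)
    # Most intense precursors are fragmented first.
    chosen = sorted(eligible, key=lambda p: p.intensity, reverse=True)[:TOP_N]
    for p in chosen:
        excluded_until[round(p.mz, 2)] = time_s + EXCLUSION_S
    return chosen
```

Calling `select_precursors` repeatedly with the same `excluded_until` dictionary and increasing `time_s` mimics successive duty cycles: a precursor chosen at time t is skipped until t + 50 s.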
Raw data were analyzed using MaxQuant 2.4.2.0, searching against the Alteromonas mediterranea DE UniProt proteome (downloaded on 15.6.2023; 3935 entries). The search parameters were: 20 ppm precursor tolerance for the first peptide search (FTMS); 0.5 Da precursor tolerance for the main search (ITMS); 0.01 Da match tolerance for ITMS scanned fragment ions; a maximum of three of the following variable modifications per peptide: oxidation of methionine and acetylation of the N-terminus; and a maximum of two missed cleavages. Raw data were also matched against contaminants. LFQ intensities were log2-transformed, and differences between samples were considered statistically significant when Student's t-test p ≤ 0.05.
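As an illustration of the final statistical step, the sketch below log2-transforms LFQ intensities for one protein and applies a two-sample Student's t-test with pooled variance to biological triplicates. To stay free of external libraries it compares |t| with the two-sided 5% critical value for 4 degrees of freedom (2.776), which is equivalent to p ≤ 0.05 only for a three-versus-three comparison. The function names are hypothetical, and the handling of missing (zero) intensities is not covered.

```python
import math
from statistics import mean, variance

# Two-sided 5% critical value of Student's t for 4 degrees of freedom,
# i.e. for two groups of three replicates each.
T_CRIT_DF4 = 2.776


def student_t(a, b):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def compare_protein(control_lfq, p_limited_lfq):
    """Compare one protein between control and P-limited triplicates.

    Inputs are raw LFQ intensities (all > 0); the test is run on
    log2-transformed values.
    """
    if len(control_lfq) != 3 or len(p_limited_lfq) != 3:
        raise ValueError("critical value is only valid for 3 vs 3 replicates")
    control = [math.log2(v) for v in control_lfq]
    limited = [math.log2(v) for v in p_limited_lfq]
    t = student_t(limited, control)
    return {
        "log2_fold_change": mean(limited) - mean(control),
        "t": t,
        "significant": abs(t) >= T_CRIT_DF4,
    }
```

Testing thousands of proteins this way would usually call for a multiple-testing correction; whether one was applied is not stated here, so none is included in the sketch.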