A Simple and Rapid System for Proteomic Analysis of the Archaeon Candidatus Vulcanisaeta moutnovskia

This protocol describes a rapid protein extraction method for the archaeon Candidatus Vulcanisaeta moutnovskia, which can also be applied to other archaea. The main step of the protocol is the use of two different protein extraction methods. Method I involves extraction with a multi-chaotropic lysis buffer containing a non-denaturing zwitterionic detergent, which is most efficient for extracting cytosolic proteins. Method II involves a denaturing anionic detergent that allows total disruption of the membranes and is capable of extracting both membrane (hydrophobic) and non-membrane (water-soluble, hydrophilic) proteins. The main advantage of the methods is that they use general laboratory chemicals to make powerful extraction buffers, yielding proteins of high quality and in high quantity. The methods are probably usable for any other archaeal or microbial cells, and take about 14-22 h. Following extraction and protein digestion, 1D-nano Liquid Chromatography Electrospray Ionization Tandem Mass Spectrometric (LC ESI-MSMS) analysis with Triple TOF 5600 and Orbitrap technologies was used for protein identification and further quantification.


Introduction
Gene expression is routinely quantified by measuring mRNA levels. However, there is uncertainty about how closely levels of mRNAs relate to levels of their corresponding proteins; several papers comparing transcriptomic and proteomic data suggest only a modest correlation 1,2 . This is why the analysis of the proteins actively synthesized by a microbe, the last step of the functional hierarchy, is now widely recognized as a way to associate them with functions that are part of the active metabolism.
Many protocols have been devised for extracting and analyzing proteins from diverse samples. However, case-by-case adaptations of those protocols are needed to guarantee a correct representation of cytosolic and membrane-bound proteins, the latter being inherently difficult to extract. Here we developed a two-step extraction method that can be universally used. It is based first on a "soft" lysis buffer containing 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate hydrate (CHAPS) to extract mostly cytosolic proteins.
Second, to improve the extraction efficiency of membrane proteins, we additionally used a new extraction method based on a multi-chaotropic lysis buffer containing 7 M urea, 2 M thiourea, 100 mM triethylammonium bicarbonate, a protease/phosphatase inhibitor cocktail and, most importantly, 5% sodium dodecyl sulfate (SDS). The addition of 5% SDS guarantees the extraction of membrane-bound proteins. Indeed, denaturing detergents such as SDS totally disrupt membranes, bind to both membrane (hydrophobic) and non-membrane (water-soluble, hydrophilic) proteins, and denature proteins by breaking protein-protein interactions. Non-denaturing detergents such as CHAPS, used in the first step, have rigid and bulky nonpolar heads that do not penetrate into water-soluble proteins; consequently, they generally do not disrupt membranes, or do so only to a small extent 3 .
The binary culture 768-28 (referred to as "the binary culture") containing "Candidatus V. moutnovskia" was used for testing the protocol 4 . Cultures growing robustly on medium with yeast extract and sulphur or sulphate were used.
I. Proteomic analyses based on non-denaturing detergent.
Cell lysis and In-Gel protein digestion (stacking gel). Cell pellets were dissolved in lysis buffer (8 M urea, 2 M thiourea, 5% CHAPS, 2 mM TCEP-HCl and protease inhibitor). Homogenisation of the cells was achieved by ultrasonication (10 strokes, low amplitude) on ice. After homogenisation, the lysed cells were centrifuged at 20,000 × g for 10 min at 4 °C, and the supernatant containing the solubilised proteins was used for the LC-MS/MS experiment. Total protein concentration was determined using the Pierce 660 nm protein assay (Thermo). An aliquot of every sample was diluted with loading sample buffer and then applied onto 1.2 cm wide wells of a conventional SDS-PAGE gel (1 mm thick, 4% stacking and 12% resolving). The run was stopped as soon as the front had entered 1 cm into the resolving gel, so that the whole proteome became concentrated at the stacking/resolving gel interface. The unseparated protein bands were visualised by Coomassie staining, excised, cut into cubes (1 mm²), deposited in 96-well plates and processed automatically in a Proteineer DP (Bruker Daltonics, Bremen, Germany). The digestion protocol used was based on 66 with minor variations: gel plugs were washed first with 50 mM ammonium bicarbonate and then with ACN prior to reduction with 10 mM DTT in 25 mM ammonium bicarbonate solution, and alkylation was carried out with 55 mM IAA in 50 mM ammonium bicarbonate solution. Gel pieces were then rinsed first with 50 mM ammonium bicarbonate and then with ACN, and were dried under a stream of nitrogen. Proteomics Grade Trypsin (Sigma Aldrich) at a final concentration of 16 ng/μl in 25% ACN/50 mM ammonium bicarbonate solution was added and the digestion took place at 37 °C for 4 h. The reaction was stopped by adding 50% ACN/0.5% TFA for peptide extraction. The eluted tryptic peptides were dried by speed-vacuum centrifugation, desalted on StageTip C18 pipette tips (Thermo Scientific) and kept until the mass spectrometric analysis.
Liquid Chromatography Electrospray Ionization Tandem Mass Spectrometric (LC-ESI-MS/MS). A 1 µg aliquot of each sample was subjected to 1D-nano LC ESI-MSMS analysis using a nano liquid chromatography system (Eksigent Technologies nanoLC Ultra 1D plus, SCIEX, Foster City, CA) coupled to a high-speed Triple TOF 5600 mass spectrometer (SCIEX, Foster City, CA) with a Nanospray III source. The analytical column used was a silica-based reversed phase Acquity UPLC M-Class Peptide BEH C18 Column, 75 µm × 150 mm, 1.7 µm particle size and 130 Å pore size (Waters). The trap column was a C18 Acclaim PepMap™ 100 (Thermo Scientific), 100 µm × 2 cm, 5 µm particle diameter, 100 Å pore size, switched on-line with the analytical column. The loading pump delivered a solution of 0.1% formic acid in water at 2 µl/min. The nano-pump provided a flow rate of 250 nl/min and was operated under gradient elution conditions. Peptides were separated using a 250 min gradient ranging from 2% to 90% mobile phase B (mobile phase A: 2% acetonitrile, 0.1% formic acid; mobile phase B: 100% acetonitrile, 0.1% formic acid). The injection volume was 5 µl.
Data acquisition was performed with a Triple-TOF 5600 System (SCIEX, Concord, ON). Data were acquired using an ion spray voltage floating (ISVF) of 2300 V, curtain gas (CUR) 35, interface heater temperature (IHT) 150, ion source gas 1 (GS1) 25 and declustering potential (DP) 100 V. All data were acquired in information-dependent acquisition (IDA) mode with Analyst TF 1.7 software (SCIEX, USA). For IDA parameters, a 0.25 s MS survey scan in the mass range of 350-1250 Da was followed by 35 MS/MS scans of 100 ms in the mass range of 100-1,800 (total cycle time: 4 s). Switching criteria were set to ions greater than mass to charge ratio (m/z) 350 and smaller than m/z 1,250, with a charge state of 2-5 and an abundance threshold of more than 90 counts (cps). Former target ions were excluded for 15 s. The IDA rolling collision energy (CE) parameters script was used for automatically controlling the CE.
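The reported cycle time can be checked from the scan settings given above. The following Python sketch is only an arithmetic illustration; the function name is hypothetical, and any instrument overhead between scans is ignored, which is an assumption.

```python
def ida_cycle_time(survey_s: float, n_msms: int, msms_s: float) -> float:
    """Return the nominal IDA cycle time in seconds (scan time only, no overhead)."""
    return survey_s + n_msms * msms_s


# Values from the acquisition settings: one 0.25 s survey scan
# followed by 35 MS/MS scans of 100 ms each.
cycle = ida_cycle_time(survey_s=0.25, n_msms=35, msms_s=0.100)
print(f"Nominal cycle time: {cycle:.2f} s")
```

The nominal scan time sums to slightly less than the rounded total cycle time of 4 s stated in the settings.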

II. Proteomic analyses based on denaturing detergent.
Lysis and protein extraction. Cell pellets were dissolved with multi-chaotropic lysis buffer containing 7 M urea (USB Corporation, Cleveland, OH), 2 M thiourea (Sigma-Aldrich), 5% SDS (Sigma-Aldrich), 100 mM triethylammonium bicarbonate (TEAB) (Thermo Fisher Scientific) and a protease/phosphatase inhibitor cocktail (Thermo Fisher Scientific). Samples were reduced and alkylated by adding 5 mM TCEP-HCl and ultrasonic lab homogenizer (Hielscher Ultrasonics). The homogenate was centrifuged at 16,000 × g for 15 min at 4 °C, and the supernatant containing the solubilized proteins was used for further analysis. Protein concentration was estimated by the Pierce 660 nm protein assay (Thermo Fisher Scientific).
S-Trap™ Digestion. Protein digestion in the S-Trap filter (ProtiFi, Huntington, NY, USA) was performed following the manufacturer's procedure with slight modifications. Briefly, 20 µg of protein of each sample was diluted to 40 µL with 5% SDS. Afterwards, 12% phosphoric acid and then seven volumes of binding buffer (90% methanol; 100 mM TEAB) were added to the sample (final phosphoric acid concentration: 1.2%). After mixing, the protein solution was loaded onto an S-Trap filter in two consecutive steps, separated by a 2 min centrifugation at 3,000 × g. Then the filter was washed 3 times with 150 μL of binding buffer.
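The acidification and binding-buffer volumes are not stated explicitly, but they can be estimated from the concentrations given. The Python sketch below rests on two assumptions that the protocol does not confirm: that the 1.2% final phosphoric acid concentration refers to the acidified sample before binding buffer is added, and that "seven volumes" is relative to the acidified sample. The function name is hypothetical.

```python
def acid_volume_ul(sample_ul: float, stock_pct: float, final_pct: float) -> float:
    """Volume of acid stock to add so the mixture reaches final_pct (C1*V1 = C2*V2)."""
    if not 0 < final_pct < stock_pct:
        raise ValueError("final_pct must be between 0 and stock_pct")
    return sample_ul * final_pct / (stock_pct - final_pct)


sample_ul = 40.0  # 20 µg protein diluted to 40 µL with 5% SDS
acid_ul = acid_volume_ul(sample_ul, stock_pct=12.0, final_pct=1.2)
acidified_ul = sample_ul + acid_ul

# Assumption: seven volumes of binding buffer relative to the acidified sample.
binding_ul = 7 * acidified_ul

print(f"12% phosphoric acid to add: {acid_ul:.1f} µL")
print(f"Binding buffer to add:      {binding_ul:.0f} µL")
print(f"Total volume to load:       {acidified_ul + binding_ul:.0f} µL")
```

Under these assumptions the total volume is several hundred microlitres, which may explain why the solution is loaded in two consecutive steps.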
Finally, 1 µg of Pierce MS-grade trypsin (Thermo Fisher Scientific) in 20 μL of a 100 mM TEAB solution was added to each sample at a ratio of 1:20 (w/w) and spun through the S-Trap prior to digestion. Flowthrough was then reloaded to the top of the S-Trap column and allowed to digest in a wet chamber at 37 °C overnight. To elute peptides, two step-wise buffers were applied (1)  analysis using a Thermo Easy-nLC 1000 HPLC system (Thermo Fisher Scientific) coupled online to a Q Exactive HF Orbitrap (Thermo Fisher Scientific). Peptides were eluted onto a 50 cm × 75 μm Easy-spray PepMap C18 analytical column at 45 °C and were separated at a flow rate of 250 nL/min using a 210 min gradient ranging from 2% to 95% mobile phase B (mobile phase A: 0.1% formic acid (FA); mobile phase B: 100% acetonitrile (ACN), 0.1% FA). The loading solvent was 2% ACN in 0.1% FA and the injection volume was 10 µl.
Data acquisition was performed using a data-dependent top-15 method, in full scan positive mode, scanning 380 to 1800 m/z. Survey scans were acquired at a resolution of 60,000 at m/z 200, with an Automatic Gain Control (AGC) target of 3 × 10⁶ and a maximum fill time (IT) of 40 ms. The top 15 most intense ions from each MS1 scan were selected and fragmented via higher energy collisional dissociation (HCD). Resolution for HCD spectra was set to 15,000 at m/z 200, with an AGC target of 2 × 10⁵ and a maximum ion injection time of 120 ms. Isolation of precursors was performed with a window of 2 m/z and the normalized collision energy (NCE) was 20. Precursor ions with single, unassigned, or six and higher charge states were excluded from fragmentation selection.

MS/MS data analysis and determination of protein abundances.
MS/MS data obtained for individual samples, using either Method I or II, were processed using Analyst® TF 1.7 Software (SCIEX) (for Method I) and Proteome Discoverer v2.4 (Thermo Fisher Scientific) (for Method II). Raw data file conversion tools generated mgf files, which were searched against a protein database that included "Candidatus V. moutnovskia" and Thermoproteus uzoniensis protein sequences from the Uniprot/Swissprot Knowledgebase (last update: 2020/03/04, 4,573 sequences) using Mascot Server v2.5.1 (Matrix Science, London, UK) (for Method I) or Mascot Server v2.7.0.1 (Matrix Science, London, UK) (for Method II). Search parameters were set as follows: enzyme, trypsin; allowed missed cleavages, 2; carbamidomethyl (C) as fixed modification; and acetyl (Protein N-term), Gln to pyro-Glu (N-term Q), Glu to pyro-Glu (N-term E) and oxidation (M) as variable modifications. Mass tolerance was set to 25 ppm for precursors and 0.05 Da for fragment masses for Method I, and to ± 10 ppm for precursors and 0.02 Da for fragment masses for Method II. The confidence interval for protein identification was set to ≥ 95% (p<0.05), and only peptides with an individual ion score above the 1% False Discovery Rate (FDR) threshold at the spectrum level were considered correctly identified. A threshold of only one identified peptide per protein identification was used because FDR-controlled experiments counter-intuitively suffer from the two-peptide rule 5 . For estimating the FDR, a target-decoy (TD) search strategy was used. In this approach, the database search is performed on the concatenated true (target) and null (decoy) databases. The decoy database is constructed by simply reversing the target database. This is the simplest approach to calculating the FDR and assumes that the number of false Peptide Spectrum Matches (PSMs) in the decoy search (D) is equal to the number of false PSMs in the target search (T) above a given threshold score.
Adding up the false hits in the decoy and target searches, the number of false positives is therefore double the decoy count above the threshold. The FDR is therefore calculated as twice the number of decoy hits divided by the total number of hits: FDR = 2 × D / (T + D), where D and T are the numbers of decoy and target PSMs above the threshold.
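As an illustration of the concatenated target-decoy estimate, FDR = 2D / (T + D), the Python sketch below counts target and decoy PSMs at or above a score threshold. It is a minimal sketch and not the Mascot implementation; the function name and the ion scores are hypothetical, and treating "above threshold" as "greater than or equal to" is an assumption.

```python
def target_decoy_fdr(target_scores, decoy_scores, threshold):
    """Estimate FDR = 2D / (T + D) for PSMs scoring at or above `threshold`.

    T and D are the numbers of target and decoy PSMs passing the threshold
    in a concatenated target-decoy search.
    """
    t = sum(1 for s in target_scores if s >= threshold)
    d = sum(1 for s in decoy_scores if s >= threshold)
    if t + d == 0:
        return 0.0  # no PSMs pass the threshold, so nothing to estimate
    return 2 * d / (t + d)


# Hypothetical ion scores, for illustration only.
target = [55, 48, 41, 39, 33, 30, 27, 21, 15]
decoy = [31, 18, 12]

for cutoff in (25, 35):
    print(f"score >= {cutoff}: FDR = {target_decoy_fdr(target, decoy, cutoff):.3f}")
```

In practice the score threshold is raised until the estimated FDR falls to the desired level, here 1% at the spectrum level.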
To rank protein abundance in each sample, the Exponentially Modified Protein Abundance Index (emPAI) was used as a relative quantitation score of the proteins in a complex mixture, based on protein coverage by the peptide matches in a database search result 6 . Although emPAI is not as accurate as quantification using synthesized peptide standards, it is quite useful for obtaining a broad overview of proteome profiles. In this work we used normalized emPAI values (nemPAI), obtained by dividing each individual emPAI value by the sum of all emPAI values in a given experiment 6 . Student's t-tests were used for pairwise comparisons of the relative abundance of proteins of interest, using a script in the R programming language to automate the process.
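The normalization and comparison steps can be sketched as follows. This is a Python illustration rather than the R script used in this work, which is not reproduced here. The emPAI definition (10 raised to the ratio of observed to observable peptides, minus 1) is the standard one; in practice emPAI values would be taken from the search engine output. The function names, protein labels and numeric values are hypothetical, and the t statistic assumes equal variances.

```python
from math import sqrt
from statistics import mean, variance


def empai(n_observed: int, n_observable: int) -> float:
    """emPAI = 10**PAI - 1, with PAI = observed / observable peptides."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def normalize_empai(empai_by_protein: dict[str, float]) -> dict[str, float]:
    """Divide each emPAI value by the sum of all emPAI values in one experiment."""
    total = sum(empai_by_protein.values())
    if total <= 0:
        raise ValueError("sum of emPAI values must be positive")
    return {protein: value / total for protein, value in empai_by_protein.items()}


def student_t(a: list[float], b: list[float]) -> float:
    """Two-sample Student's t statistic with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical emPAI values for three proteins in one experiment.
experiment = {"protein_A": empai(5, 10), "protein_B": 1.0, "protein_C": 0.5}
nempai = normalize_empai(experiment)

# Hypothetical nemPAI replicates of one protein under two growth conditions.
t_stat = student_t([0.010, 0.012, 0.011], [0.020, 0.022, 0.021])
print(nempai, t_stat)
```

The p-value would then be obtained from the t distribution with n_a + n_b - 2 degrees of freedom, as done by standard statistical software.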
Proteomics data repository.
The mass spectrometry proteomics data have been deposited in the ProteomeXchange Consortium via the PRIDE 7 partner repository with the dataset identifier PXD012750 and 10.6019/PXD012750.

Troubleshooting
To avoid liquid leakage from the S-Trap column, a customized yellow tip with 9 Empore 3M C18 disks (Sigma-Aldrich) was placed at the bottom tip of the S-Trap column during digestion.

Anticipated Results
The main purpose of this research was to establish a simple protocol by which cytosolic and membrane-bound proteins can be extracted from the archaeon Candidatus Vulcanisaeta moutnovskia, and which can be performed with access to general laboratory chemicals only. We have been successful in generating proteomic results from the binary sulphate-reducing archaeal culture 768-28, grown with sulphate and with sulphur as electron acceptors.