A novel MALDI-TOF mass spectrometry sample preparation strategy for improved proteomic biotyping of clinically significant Mycobacteria

A precise routine method is essential for the rapid identification of pathogenic mycobacterial species, so that publicly maintained health care management systems can efficiently treat and control an emerging tuberculosis pandemic that threatens populations in third-world economies. To date, many conventional and more recently developed molecular genotyping methods are employed to this end. However, current technologies are limited in that they are time consuming and depend on highly skilled technical personnel. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has recently been reported as a reliable, economical, and highly efficient method for the identification of bacteria and yeast and, to a limited extent, of mycobacterial strains. Unlike with other microbes, a major impediment exists in the efficient extraction of cellular proteins from mycobacteria, mainly because of their complex, lipid-rich cell wall. In this study, the manufacturer-prescribed mycobacterial sample preparation method for MALDI-TOF MS was modified and optimised to extract cellular proteins more efficiently for biotyping. The newly developed sample preparation method includes glass bead cellular disruption/delipidation followed by a chloroform-methanol solvent extraction step. The data from this locally based study show that the newly developed method generates unique and highly reproducible mass spectral profiles. A new and independent main spectral profile reference library (CMEFA-MSP) representing clinically relevant American Type Culture Collection (ATCC) mycobacterial strains and clinical isolates was established and subsequently used to unequivocally identify 110 blind-coded clinical mycobacterial isolates (n = 110) to the species level, all of which displayed log score values of ≥ 2.3.
This strongly suggests that MALDI-TOF MS, when used in conjunction with the CMEFA sample preparation protocol, has potential as a simple and cost-effective alternative for the unambiguous identification of clinically important mycobacteria.


Introduction
Members of the Mycobacteriaceae family are responsible for a wide range of diseases in humans and lower animals 1. In humans, these diseases range from superficial infections of the skin caused by M. marinum [2][3][4] to pulmonary disease caused by both Mycobacterium tuberculosis (MTB) and non-tuberculous mycobacteria (NTM). Tuberculosis (TB) is a disease of great antiquity and was one of the first diseases for which the World Health Organization (WHO) declared a global public health emergency, in 1993 3. Presently, it is estimated that one-third of the world's population is infected with MTB 5. An estimated 8 to 9 million new cases occur each year, with 2 to 3 million deaths 5,6. Pulmonary TB is characterized by prolonged cough, hemoptysis, chest pain and dyspnea, whilst symptoms of systemic and disseminated disease include fever, malaise, weight loss, weakness and night sweats [7][8][9].
A widespread occurrence of NTM-related diseases due to Mycobacterium avium Complex (MAC), M. kansasii and other NTMs has been observed in the period of HIV-AIDS 3,10,11. During this period, a salient observation was that patients with CD4 counts of less than 50/ml presented with bloodstream-associated MAC infections. Of note, NTMs, particularly M. avium, M. kansasii, and M. intracellulare, produce pulmonary disease in humans that may be indistinguishable from that caused by the members of the M. tuberculosis complex (MTBC; M. tuberculosis and M. bovis) 3. There is a significant difference between the treatment of NTM-related pulmonary infections and that of tuberculosis caused by MTB. As a result, species-level identification of mycobacteria is necessary for the selection of appropriate antimicrobial therapy 12,13.
Microscopic evaluation of an auramine stain, although widely used in low-income countries for the diagnosis of TB, is unable to confirm the presence of viable mycobacteria 14. Sputum smear microscopy performed on a sample from a patient on treatment may therefore lead to an over-diagnosis of treatment failure 14. Routinely employed laboratory methods for the species identification of mycobacteria, such as growth rate, biochemical tests, analysis of antibiotic resistance patterns and HPLC (high pressure liquid chromatography) analysis of fatty acid constituents of the bacterial cell wall, are time-consuming, requiring as long as 6-12 weeks to confirm a positive identification 3,15,16. According to Buchan et al., HPLC is insufficient to accurately differentiate between closely related species such as M. chelonae/M. abscessus, M. avium/M. intracellulare, and species within the M. mucogenicum group, which may be important to patient care decisions 12,17. In addition, mycobacterial isolates from patients with chronic infections often lose their characteristic phenotypes, thus hindering the performance of several conventional diagnostic methods 3.
In most pathology TB laboratories in South Africa, commercially available genotypic PCR-based identification assays coupled with reverse line hybridization are widely used for the detection of polymorphisms within the 16S or 23S rRNA. These methods have been widely utilised for the species-specific identification and differentiation of more than 30 clinically relevant mycobacterial species 13,18 and, when coupled with other diagnostic modalities, have been shown to greatly reduce the diagnostic turnaround time (TAT) to a week. However, limitations such as misidentification, an inability to discriminate closely related members of the MTBC, and the emergence of new species restrict these diagnostic assays to detecting only a few mycobacteria 13,19,20. A more expensive approach involving sequencing of the 16S rRNA has been widely accepted as the reference method for species identification 21. However, the analysis of sequence data can be quite cumbersome and the retrieval of valid identification results requires the use of a comprehensive quality-controlled database 22. As a result, many researchers are looking for a cost-effective, rapid and reproducible method to differentiate between a wider range of mycobacterial species.
Matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) has recently been recognized for its ability to identify bacteria to the species level through the analysis of the protein composition of a bacterial cell 18,23. A wide variety of prokaryotes (clinically and environmentally relevant Gram positive and negative bacteria) have been characterized using the MALDI-TOF MS approach. The principle of MALDI-TOF MS for the identification of bacterial species is based on the ability of the assay to measure the exact sizes of peptides and proteins through the generation of a unique mass spectral profile 18,24. Various factors, including low reproducibility of results, variations in the sample preparation protocol and a limited availability of reference data, have so far prevented the routine use of MALDI-TOF MS for the speciation of mycobacteria in a clinical laboratory 22. Unlike with other bacteria, the presence of mycolic acids and the pathogenicity of mycobacteria render reliable inactivation and reproducible sample extraction protocols challenging, which might partly explain the great variance in the findings among researchers [25][26][27]. However, recent advances in mass spectrometry have shown that comparative proteomic profiling can be employed in the identification of mycobacterial species 22,28,29.
Earlier studies by Hettick and co-workers 22 using MALDI-TOF MS analysis of whole-cell and acetonitrile/trifluoroacetic acid (ACN/TFA) extracted protein samples demonstrated that both sample types were capable of resolving the identification of 6 ATCC-typed mycobacterial strains. Thereafter, Pignone and coworkers used the whole-cell sample preparation method and were able to clearly distinguish 36 of the 37 screened ATCC-typed mycobacterial strains. In a study undertaken by Saleeb and coworkers, a modified ethanol-formic acid (EFA) protein extraction protocol using glass beads (EFAGB) was used to construct a mycobacterial mass spectral reference database containing 42 clinically relevant ATCC-typed strains. The database was then used, albeit with varied success, to biotype 104 mycobacterial clinical isolates. These studies have thus used different sample preparation methods for the proteomic profiling of mycobacteria by MALDI-TOF MS.
Numerous reports have adopted MALDI-TOF MS technology for the identification of Mycobacterium spp., but the methods and results have varied greatly 27,[30][31][32][33][34]46,47. To date there seems to be no agreement on which sample preparation method is preferred for the discrimination of mycobacteria. Even though the application of MALDI-TOF MS for the discrimination of clinically relevant mycobacteria using a reference library database has been documented in a few studies 12,20,22,28,35,36, the coverage of Mycobacterium spp. in the different commercial database platforms is variable. Furthermore, there is variation in the sample preparation method used, since whole-cell deposits, cell lysates, and crude bacterial extracts have all been previously described. Given the pathogenic nature of mycobacteria, cellular extraction has been preferred over whole-cell deposits since it minimises contact with the infectious organism.
Thus far, and to the best of our knowledge, no study has demonstrated the application of MALDI-TOF MS for the identification of clinically relevant mycobacterial isolates in KwaZulu-Natal, South Africa. The present study is therefore one of the first of relatively few studies to evaluate a newly developed chloroform-methanol-ethanol-formic acid (CMEFA) cellular extraction protocol for the identification and differentiation of clinically significant mycobacterial isolates using MALDI-TOF MS. The MALDI-TOF MS bioinformatics and statistical algorithms, such as the pattern-matching algorithm, cluster analysis, and principal component analysis (PCA), were also evaluated during the identification and discrimination of clinical mycobacterial isolates. The reference spectral database library (CMEFA-MSP) was challenged using blind-coded clinically relevant mycobacterial isolates sourced from an independent facility. We present the MALDI-TOF MS-derived proteomic profiling data on mycobacteria obtained with the EFA, EFAGB and a new organic solvent delipidation-modified EFA (CMEFA) sample preparation protocol.

Mycobacterial strains and clinical isolates
Ten different mycobacterial species consisting of American Type Culture Collection (ATCC) strains and clinical isolates (n = 110) were used in this study (Table 1). All the ATCC strains and clinical isolates (n = 3) per species were used for the initial local database creation. The ATCC strains were purchased as a KWIK-STIK device (Microbiologics, USA) consisting of lyophilised pellets. The clinical isolates were obtained from the Department of Medical Microbiology, National Health Laboratory Services (Inkosi Albert Luthuli Central Hospital), and had been well characterised using the Genotype CM (Common Mycobacteria), AS (Additional Species) and MTBC (Mycobacterium tuberculosis Complex) assays (Hain-Lifescience GmbH, Nehren, Germany).

Media and Culture Preparation of ATCC strains
The mycobacterial ATCC strains were purchased as lyophilised pellets in the form of a KWIK-STIK (Microbiologics, USA) device packaged within a pouch. Mycobacterial pellets contained within the KWIK-STIK device were allowed to equilibrate to room temperature and were rehydrated according to the manufacturer's guidelines. The rehydration process was performed in a Level II biosafety cabinet, with full personal protective equipment, including an N95 respiratory mask. A swab was heavily saturated with the fluid containing the mycobacterial organism and then transferred to a drug-free Middlebrook 7H11 agar plate (Becton Dickinson, USA). The Middlebrook 7H11 agar plate was inoculated by gently rolling the swab over one-third of the plate. The culture plate was incubated at 37°C for four and twenty-one days to cultivate fast-growing (M. fortuitum) and slow-growing (M. gordonae and M. tuberculosis) mycobacterial strains, respectively. Once confluent growth was obtained, a single loop-full was transferred into a Micro Bank cryovial (Pro-Lab Diagnostics, Canada) and stored at -70°C as per the manufacturer's instructions.
The following aspects relating to the collection, cultivation and identification of the 39 clinical isolates of mycobacteria used in this study were performed by the NHLS (National Health Laboratory Service) TB reference pathology laboratory based at Inkosi Albert Luthuli Central Hospital (IALCH), Durban, South Africa. This study was approved by the University of KwaZulu-Natal Biomedical Research Ethics Committee (BREC) for the use of stored isolated clinical samples (BEO85/12). Our study was performed in line with the Declaration of Helsinki and International Committee of Harmonisation good clinical practice. The isolates are presented in Table 1, and it must be highlighted that this screening process only sourced clinically relevant mycobacteria associated with patient disease in the Durban metropolis of KwaZulu-Natal, South Africa.

Collection and culturing of clinical isolates of mycobacteria
A large number of sputum samples representing putative clinical mycobacterial infections were collected from different regional hospitals of Durban, South Africa. Sputum samples (1 ml) were decontaminated with an equal volume of N-acetyl-L-cysteine-sodium hydroxide (NALC-NaOH). All putative mycobacterial specimen manipulations were performed in a class II biological safety cabinet equipped with high-efficiency particulate air (HEPA) filters. The samples were allowed to stand at room temperature for 15-20 min. Sodium phosphate buffer (pH 6.8; 48 ml) was added to the samples, which were then centrifuged at 3,000 x g for 15 min. The sediment was retained with approximately 3 ml of supernatant. The samples were vigorously vortexed and 100 µl of the suspension was subjected to auramine-O staining to confirm the presence of acid-fast bacilli 37.
Mycobacterial growth indicator tubes (MGIT) supplemented with BBL MGIT OADC enrichment and BBL MGIT PANTA antibiotic mixture (Becton Dickinson, USA) were inoculated with decontaminated sputum sediments and incubated in the Bactec 960 mycobacterial detection system (Becton Dickinson, USA). Only MGIT-positive cultures were further assessed for the presence of acid-fast bacilli (AFB) using a Ziehl-Neelsen (ZN) stain. All AFB-positive MGIT cultures were then sub-cultured onto drug-free solid Middlebrook 7H11 agar and incubated at 37°C. Genotypic-based identification of isolates was performed on cultures that displayed confluent growth between 4 and 21 days.

Genotypic-based screening of clinical isolates of mycobacteria
Speciation of isolates was performed according to the manufacturer's instructions using the Genotype CM (Common Mycobacteria), AS (Additional Species) and MTBC assays (Hain-Lifescience GmbH, Nehren, Germany). The protocol consisted of DNA extraction, PCR amplification, and reverse hybridization of the amplified PCR products to species-specific oligonucleotide probes (probe sequences immobilized on nitrocellulose strips), followed by chemical detection. Specific banding patterns composed of clear-cut hybridisation and colorimetric staining signals on a nitrocellulose membrane strip were employed for strain identification as per the manufacturer's guideline, which stipulates the theoretical banding patterns of different mycobacterial species.

MALDI-TOF MS Proteomic Profiling of Clinical Isolates

Standard ethanol-formic acid (EFA) sample preparation protocol
A disposable 10 µl inoculating loop was used to obtain a single mycobacterial colony grown on a Middlebrook 7H11 agar plate. The colony was suspended in 600 µl of HPLC grade distilled water contained in a 1.5 ml screw cap "O" ring tube (Sarstedt, Germany). The tube containing the cell suspension was vortexed for approximately 60 s and heat inactivated at 98°C for 30 min in a heating block. The heat inactivated cell suspension was then centrifuged at 13 000 x g for 5 min in a tabletop centrifuge with an aerosol-tight rotor. The supernatant was discarded, and the pellet was re-suspended in 300 µl of HPLC grade distilled water and 900 µl of absolute ethanol (HPLC grade). The cell suspension containing absolute ethanol was vortexed and then centrifuged for 2 min at 13 000 x g, and the supernatant was discarded using a pipette. An additional centrifugation step was performed to completely remove residual ethanol. The pellet was air dried at room temperature (RT) for 5 min. Depending on the volume of the pellet (ranging from 20-30 µl), 40-80 µl of 70% formic acid was added and the tube was thoroughly vortexed for 30 s. An equivalent volume of absolute acetonitrile (HPLC grade) was added to the suspension, which was homogenized by vortexing for 10 s. The homogenized suspension was centrifuged for 2 min and the supernatant was transferred to a sterile 1.5 ml screw cap "O" ring tube. One microliter of the supernatant from each isolate was spotted onto an MTP 384 ground steel target plate (Bruker Daltonics, Germany) and allowed to air-dry. One microliter of the Bruker Bacterial Test Standard (BTS) containing an extract of Escherichia coli DH5 alpha was spotted onto a separate spot for instrument calibration. The applied samples were air-dried and overlaid with a 1 µl aliquot of freshly prepared saturated matrix solution [10 mg of α-cyano-4-hydroxycinnamic acid (HCCA) dissolved in 1 ml of a solvent mixture containing 50% acetonitrile, 47.5% water and 2.5% trifluoroacetic acid]. The samples containing matrix were dried for 5 min. The samples on the ground steel target plate were then subjected to MALDI-TOF MS analysis.
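The matrix recipe above scales linearly with batch size. As a convenience, the following minimal sketch computes the component amounts for an arbitrary volume of matrix solution. The function name and the returned keys are hypothetical and chosen only for illustration; the proportions are those stated in the protocol (10 mg HCCA per ml; 50% acetonitrile, 47.5% water, 2.5% trifluoroacetic acid by volume).

```python
def hcca_matrix_recipe(total_ml: float) -> dict:
    """Component amounts for `total_ml` of HCCA matrix solution.

    Hypothetical helper: proportions follow the protocol text,
    i.e. 10 mg HCCA per ml of a 50% ACN / 47.5% water / 2.5% TFA mixture.
    """
    if total_ml <= 0:
        raise ValueError("total_ml must be positive")
    total_ul = total_ml * 1000.0
    return {
        "hcca_mg": 10.0 * total_ml,             # 10 mg per ml of solvent
        "acetonitrile_ul": 0.50 * total_ul,     # 50% (v/v)
        "water_ul": 0.475 * total_ul,           # 47.5% (v/v)
        "tfa_ul": 0.025 * total_ul,             # 2.5% (v/v)
    }


if __name__ == "__main__":
    for name, amount in hcca_matrix_recipe(2.0).items():
        print(f"{name}: {amount:g}")
```

Because the matrix is prepared fresh, a helper of this kind would typically be called once per run with the volume needed for the number of spots on the target plate.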

Ethanol-formic acid-glass bead (EFAGB) sample preparation protocol
Micro-glass beads were used to disrupt the cell envelope of mycobacteria to facilitate the release of cytoplasmic proteins. The EFAGB mycobacterial sample preparation protocol for MALDI-TOF MS analysis was performed as per Saleeb and co-workers 35.

Chloroform-methanol ethanol-formic acid (CMEFA) extraction protocol
The newly developed CMEFA protocol 38 was evaluated for its effective delipidation of mycobacterial cells. Lipids and mycolic acids that constitute the mycobacterial cell envelope serve as barriers and hinder the protein extraction process, resulting in fewer proteins being available for accurate identification by MALDI-TOF MS. The CMEFA sample preparation protocol for mycobacteria was as follows: a 10 µl disposable inoculating loop was used to obtain a single mycobacterial colony from a Middlebrook 7H11 agar plate containing growth of either an ATCC mycobacterial strain or a clinical isolate. The colony was suspended in 600 µl of HPLC grade distilled water contained in a 1.5 ml screw cap "O" ring tube (Sarstedt, Germany). The tube containing the cell suspension was vortexed for approximately 60 s and then heat inactivated at 98°C for 30 min. The heat inactivated cell suspension was then centrifuged at 13 000 x g for 5 min in a tabletop centrifuge with an aerosol-tight rotor. The delipidation of mycobacterial cells was initiated by the addition of 600 µl of a chloroform/methanol (1/1, v/v) solvent mixture. The cell suspensions were vigorously vortexed for 60 s and centrifuged at 13 000 x g for 5 min, and the supernatants were discarded. The delipidation treatment was repeated twice more to enhance the removal of lipids and mycolic acids. The supernatant was discarded, and the pellet was re-suspended in 300 µl of HPLC grade distilled water followed by 900 µl of absolute ethanol (HPLC grade). The cell suspension was vortexed and then centrifuged for 2 min at 13 000 x g, and the supernatant was carefully removed. An additional centrifugation step was performed to completely remove all residual ethanol. The pellet was air dried at room temperature for 5 min. Depending on the volume of the pellet (ranging from 20-30 µl), 40-80 µl of 70% formic acid was added and the suspension was thoroughly vortexed for 30 s. Acetonitrile (50 µl) was added to the suspension and the tubes were vortexed for another 30 s. The samples were centrifuged for 2 min at 13 000 x g. A 1 µl aliquot of the supernatant was spotted on an MTP 384 ground steel target plate (Bruker Daltonics, Germany). The applied sample was air-dried and subsequently coated with 1 µl of freshly prepared saturated HCCA matrix solution. The extracted samples containing matrix were allowed to air-dry for 5 min before mass spectral analysis.

MALDI-TOF MS instrumentation
The spotted MTP 384 ground steel target plate was inserted into the tray of the Autoflex III smartbeam MALDI-TOF MS instrument (Bruker Daltonics, Germany) and processed using the Flex Control software. Instrument parameters for ion source 1 and ion source 2 were set at 20 kV and 18.57 kV, respectively. The lens was set at 8.5 kV and a pulsed ion extraction of 250 ns was used. Spectra were acquired in linear positive mode with a mass-to-charge (m/z) range of 2 to 20 kDa.

Acquisition of MALDI-TOF MS data
Prior to sample analysis, the MALDI-TOF MS instrument was calibrated with the Bruker bacterial test standard (BTS) (Bruker Daltonics, Germany) with a mass range of 3.6 to 17 kDa. The "MBT_FC.par" standard Flex Control method was selected for both analysis and internal calibration of BTS. Each mycobacterial ATCC-typed strain or clinical isolate was spotted at 6 positions on the MALDI target plate and measured four times per spot. Thus, twenty-four spectra were generated using replicate cultures; this was repeated on three separate occasions, thereby generating 72 spectra per mycobacterial sample. The spectra fulfilled the specifications of Bruker Daltonics for construction of a main spectral profile (MSP), in that each spectrum must contain a minimum of 25 peaks with a resolution >400, with a minimum of 20 of the peaks having a resolution >500. A main spectral profile (MSP) consisting of a minimum of 24 of the most reproducible spectra was individually created using the Biotyper 3.0 software for each mycobacterial sample.
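The acquisition scheme and the spectrum acceptance criteria described above can be expressed compactly. The sketch below is illustrative only: the `Spectrum` class and function names are hypothetical, and the criteria are read as "at least 25 peaks with resolution >400, of which at least 20 have resolution >500". It is not part of the Biotyper software.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Spectrum:
    """Hypothetical container: one resolution value per detected peak."""
    peak_resolutions: List[float]


def spectra_per_sample(spots: int = 6, reads_per_spot: int = 4,
                       occasions: int = 3) -> int:
    """Total spectra acquired for one strain or isolate."""
    return spots * reads_per_spot * occasions


def meets_msp_criteria(spectrum: Spectrum, min_peaks: int = 25,
                       min_resolution: float = 400.0,
                       min_high_peaks: int = 20,
                       high_resolution: float = 500.0) -> bool:
    """Check the stated requirements for inclusion in an MSP."""
    usable = [r for r in spectrum.peak_resolutions if r > min_resolution]
    high = [r for r in usable if r > high_resolution]
    return len(usable) >= min_peaks and len(high) >= min_high_peaks


if __name__ == "__main__":
    print(spectra_per_sample())
    print(meets_msp_criteria(Spectrum([550.0] * 20 + [450.0] * 5)))
```

With the defaults, six spots read four times on three occasions give the 72 spectra per sample stated in the text, from which at least 24 are retained for each MSP.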

Principal component analysis (PCA)
Principal component analysis (PCA) was employed to statistically evaluate MALDI-TOF mass spectra. PCA is a broadly used mathematical technique for reducing the dimensionality of multivariate data whilst preserving most of the variance. PCA was performed using ClinProTools Software 2.2 (Bruker Daltonics, Germany). Complex relationships between samples of mycobacterial isolates were explored and illustrated with the use of a PCA plot. This model provides insights into similarities or dissimilarities between samples of mycobacterial isolates. For example, samples that cluster closer together are more similar than samples that are further apart.
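To make the idea concrete, the following is a minimal, dependency-free sketch of PCA applied to binned peak intensities. It is not the ClinProTools implementation: it assumes each spectrum has already been reduced to a fixed-length intensity vector, and it extracts only the first principal component by power iteration. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def center_columns(rows: Matrix) -> Matrix:
    """Subtract the per-bin mean so each column has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered: Matrix) -> Matrix:
    """Sample covariance matrix of mean-centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov: Matrix, iterations: int = 200) -> Tuple[List[float], float]:
    """Leading eigenvector and eigenvalue of `cov` by power iteration.

    The start vector is arbitrary; in the unlikely case that it is exactly
    orthogonal to the leading eigenvector, a different start is needed.
    """
    d = len(cov)
    v = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, eigenvalue


def project(centered: Matrix, component: List[float]) -> List[float]:
    """Score of each spectrum along the given component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]
```

On such data, replicate spectra of the same organism would be expected to receive similar scores, while spectra of different organisms separate along the component, which is the behaviour a PCA plot visualises in two or three dimensions.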

Biotyping of blind-coded isolates using MALDI-TOF MS
To assess the accuracy and reliability of the CMEFA-based MALDI-TOF MS strategy in the identification of mycobacteria, a blind-coded batch of ten mycobacterial species (Table 1) comprising American Type Culture Collection (ATCC) strains and clinical mycobacterial isolates (n = 110) was obtained from the Department of Medical Microbiology, NHLS, Inkosi Albert Luthuli Central Hospital. Samples for mass spectral analysis of the blind-coded isolates were prepared using the CMEFA extraction method. Following acquisition, the mass spectra from the blind-coded samples were processed into their respective MSPs. Each MSP was then identified or speciated by the Bruker Daltonics Biotyper 3 software by comparison to those contained within the MSP reference library of ATCC-typed mycobacterial strains and clinical isolates. Biotyping was performed on CMEFA-derived samples of three independent cultures of each blind-coded isolate. The results from the profile-matching process were expressed as log (score) values and ranged from 0 to 3, as shown in Table 2.
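The interpretation of log (score) values can be sketched as a simple lookup. Table 2 is not reproduced in this section, so the cut-offs below are an assumption based on the commonly published Bruker Biotyper scheme (2.3, 2.0 and 1.7); only the ≥ 2.3 category is confirmed by the text. The function names are hypothetical.

```python
from statistics import mean
from typing import Iterable

# Assumed cut-offs (commonly published Bruker scheme); see Table 2 for
# the values actually applied in the study.
SPECIES_CUTOFF = 2.3
GENUS_CUTOFF = 2.0
PROBABLE_GENUS_CUTOFF = 1.7


def interpret_log_score(score: float) -> str:
    """Map a log (score) value in [0, 3] to a descriptive category."""
    if not 0.0 <= score <= 3.0:
        raise ValueError("log (score) values range from 0 to 3")
    if score >= SPECIES_CUTOFF:
        return "highly probable species identification"
    if score >= GENUS_CUTOFF:
        return "secure genus identification, probable species identification"
    if score >= PROBABLE_GENUS_CUTOFF:
        return "probable genus identification"
    return "no reliable identification"


def interpret_replicates(scores: Iterable[float]) -> str:
    """Interpret the average score of independent cultures of one isolate."""
    return interpret_log_score(mean(scores))
```

Averaging over the three independent cultures mirrors the way average log score values are reported for each blind-coded isolate.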

Results
To minimize the health risk associated with handling viable pathogenic mycobacteria for MALDI-TOF MS analysis, all mycobacterial culture samples were heat inactivated prior to their extraction (EFA, CMEFA or EFAGB). The integrity of this inactivation measure was validated, as no detectable mycobacterial growth was observed after 6 weeks of incubation at 37°C on Middlebrook 7H11 agar plates. This inactivation procedure effectively rendered the mycobacteria non-viable and non-infectious, significantly reducing the risk to laboratory personnel, and such inactivated samples were therefore deemed safe to work with in an open laboratory 35,40.
The MALDI-TOF MS analysis of samples generated using the EFA protocol as suggested by the manufacturer yielded mass spectral profiles with no discernible peaks and poor resolution (results not shown, similar to Figure 1C). These profiles were ineffective in differentiating mycobacterial species, and the protocol was deemed unsatisfactory for the creation of a local mass spectral database for the envisioned consistent biotyping of mycobacteria (Table 3).
The EFAGB protocol suggested by Saleeb and co-researchers 35, which includes mechanical disruption of mycobacterial cell envelope glycopeptidolipids (GPL) with micro-glass beads, yielded protein mass spectral profiles that were characterised in certain instances by increased peak numbers with low intensities. This was especially evident for the majority of the ATCC-typed mycobacterial strains (M. fortuitum, M. abscessus, M. gordonae and M. bovis) when compared to M. tuberculosis (Figure 1). Smaller proteins with a mass/charge (m/z) ratio of <3.5 kDa were detected in most instances, with little to no detection in the higher mass/charge range. As presented in Table 3, the EFAGB sample preparation method is limited in consistently yielding spectral profiles for mycobacterial species identification, as only a single strain was positively identified. Most importantly, proteins with abundant mass peaks throughout the range (2-10 kDa) were not consistently detected, and the method was therefore discontinued as it did not align with the intended outcome of this study.
MALDI-TOF MS samples prepared using the newly developed CMEFA protocol, which incorporates a novel delipidation step of the mycobacterial cell wall, distinctly and consistently produced spectra with stronger protein signals, more abundant bacterial protein peaks and low signal-to-noise ratios across the entire 2 to 10 kDa mass/charge range (Figure 2). The superior effectiveness of the CMEFA-derived samples in the biotyping of mycobacteria to the strain level is clearly indicated in Table 3 when compared to the EFA and EFAGB methods used in this study.
In view of this, the CMEFA-prepared extracts were used to create a local MSP mycobacterial database. To exclude any interference from reagents of the CMEFA method on mass spectra, negative controls consisting of culture-free chloroform/methanol (1:1, v/v) and distilled water were included. The MALDI-TOF MS analyses of chloroform/methanol and distilled water revealed no protein peaks (results not shown). The CMEFA sample preparation protocol was therefore deemed the method of choice for all further sample preparations. The blind-coded ATCC mycobacterial strains were identified by MALDI-TOF MS using our locally created CMEFA-MSP reference library (Table 3). The log score values were interpreted in accordance with stipulations by Bruker Daltonics (Table 2). CMEFA-prepared samples of the 10 blind-coded ATCC mycobacterial strains displayed average log score values ≥ 2.3, which placed them in the highest descriptive category, and were identified correctly to the genus and species level. Of significance, the two MTBC-related strains (M. tuberculosis and M. bovis) were unambiguously differentiated from the other nine NTM species to the species level. In this study, MALDI-TOF MS-based biotyping correctly identified blind-coded strains to the genus and species level with 100% accuracy.
The mass spectral fingerprints generated from CMEFA-extracted protein samples of clinical mycobacterial isolates were defined by strong signals across the entire mass/charge range; spectral patterns of 3 selected strains are presented in Figure 3. The mass spectral signatures of individual isolates were characterized by high peak intensities and a relative abundance of protein peaks with low signal-to-noise ratios.
The identification results for MALDI-TOF MS samples extracted using the CMEFA protocol from clinically relevant mycobacterial isolates are presented in Table 4. All 110 blind-coded mycobacterial isolates, when best matched by the Bruker Daltonics Biotyper 3 software to MSPs of the CMEFA-MSP reference library, displayed average log score values ≥ 2.3 and were identified correctly to the genus and species level (Table 4). The M. tuberculosis isolate was unambiguously differentiated from the NTM isolates to the species level. In this study, MALDI-TOF MS-based biotyping using the CMEFA-extracted samples correctly identified blind-coded mycobacterial isolates to the genus and species level with 100% accuracy.
A set of mass spectral patterns for each ATCC-typed mycobacterial strain prepared using the CMEFA protocol was further analysed using the MALDI-Biotyper 3.0 software to create the main spectral profiles (MSP). Using default analysis parameters as assigned by the Biotyper software (distance measure, correlation; linkage, average), the differentiation and clustering of closely related mycobacteria were ascertained from a dendrogram (Figure 4). Notably, pathogenic members of the MTBC (red) were differentiated to the species level and correctly clustered away from the NTM. Mycobacteria of the M. avium-intracellulare complex (MAC) clustered together, indicating that the spectra of these species are more similar to one another than to the spectra of other mycobacterial species. A similar scenario prevails for the organisms of the MTBC, M. bovis and M. tuberculosis.
A set of CMEFA-derived mass spectral patterns for each ATCC-typed mycobacterial strain and clinical isolate was further analysed using the Biotyper 3.0 software to create a main spectral profile (MSP). A dendrogram was created using the default analysis parameters assigned by the Biotyper software (distance measure, correlation; linkage, average) to assess the differentiation and clustering of these mycobacteria (Figure 5). Importantly, the M. tuberculosis and M. bovis organisms of the MTBC were clearly differentiated from each other to the species level. In addition, the MTBC-related organisms (ATCC strains and clinical isolates) correctly clustered away from the NTM-related mycobacteria and were differentiated to the species level.
The MSPs of NTM-related mycobacteria fell into two major clades, with M. avium, M. intracellulare, M. scrofulaceum, M. marinum and M. gordonae in one clade whilst the other contained M. fortuitum and M. abscessus. It was also noted that each of the NTM family members formed individually discrete clusters. This clustering is indicative of high levels of spectral similarity in terms of the location and intensity of diagnostic ions. Given the negligible differences in distance, the proteomic profiles of the 15 clinical isolates of M. fortuitum seem to be highly conserved, whilst there appears to be greater variation among the other NTM-related mycobacteria. It should be noted that M. abscessus, like M. fortuitum, is a rapidly growing mycobacterium (RGM) and forms part of the M. fortuitum complex. However, although M. abscessus shared the same clade as M. fortuitum, the formation of discrete clusters allowed for a phylogenetic discrimination of these two closely related organisms to the species level.
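The dendrograms described above rely on the Biotyper defaults (distance measure, correlation; linkage, average). The following is a minimal Python sketch of that general technique, correlation distance combined with average-linkage agglomerative clustering. It is not the Biotyper implementation, whose internals are not described here. The profile names and intensity values are hypothetical toy data, and the assumption that each profile is a vector of intensities over shared m/z bins is ours.

```python
from math import sqrt


def correlation_distance(a, b):
    """Return 1 - Pearson correlation for two equal-length intensity vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / sqrt(var_a * var_b)


def average_linkage(profiles):
    """Cluster named profiles bottom-up with average linkage.

    Returns the merge history as (members_a, members_b, distance) tuples,
    which is the information a dendrogram is drawn from.
    """
    names = list(profiles)
    # Pairwise correlation distances between all individual profiles.
    dist = {
        frozenset((p, q)): correlation_distance(profiles[p], profiles[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }

    def linkage(c1, c2):
        # Mean of all pairwise distances between members of the two clusters.
        total = sum(dist[frozenset((p, q))] for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical binned intensities (toy values, not measured spectra).
toy_profiles = {
    "MTBC_a": [9.0, 1.0, 7.0, 0.5, 3.0, 0.2],
    "MTBC_b": [8.5, 1.2, 7.4, 0.4, 2.6, 0.3],
    "NTM_a": [0.5, 8.0, 1.0, 6.5, 0.3, 4.0],
    "NTM_b": [0.7, 7.6, 1.3, 6.9, 0.2, 3.5],
}

for left, right, distance in average_linkage(toy_profiles):
    print(left, "+", right, "at distance", round(distance, 3))
```

In this toy example the two hypothetical MTBC profiles and the two hypothetical NTM profiles each merge at a small distance, and the two groups join only at the final, much larger distance, which is the pattern a dendrogram shows as two well-separated clades.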
The results of the MALDI-TOF MS-based identification of 110 blind-coded clinical
To further characterise the main spectral profiles of CMEFA-extracted samples, the MSPs of five clinical isolates were subjected to three-dimensional principal component analysis (PCA, Figure 6). Replicate spectra from closely related species formed discrete clusters with little variation. Importantly, the MSPs of the MTB and NTM isolates occupied distinct, non-overlapping spatial arrangements (Figure 6). The MTB isolate clustered away from the clinically significant NTM isolates. Of relevance is that M. avium and M. intracellulare, although closely related at a biological level, clustered on different planes, emphasising their dissimilarity.
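The principle behind such a PCA plot can be sketched in a few lines of Python. This is an illustrative sketch only, assuming that each replicate spectrum is represented as a vector of intensities over shared m/z bins. The values are hypothetical toy data, the function names are ours, and the eigenvectors are obtained by simple power iteration with deflation rather than by whatever routine the Biotyper software uses.

```python
from math import sqrt


def center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already-centred rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_components(cov, k, iterations=500):
    """Leading k (eigenvalue, eigenvector) pairs of a symmetric matrix."""
    d = len(cov)
    cov = [row[:] for row in cov]  # work on a copy; it is deflated below
    components = []
    for c in range(k):
        # Fixed, deterministic start vector, normalised to unit length.
        v = [1.0 / (i + 1 + c) for i in range(d)]
        norm = sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for unit vector v.
        eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                         for i in range(d))
        components.append((eigenvalue, v))
        # Deflation: remove the component just found from the matrix.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigenvalue * v[i] * v[j]
    return components


def pca_scores(rows, k=3):
    """Project centred rows onto the first k principal components."""
    centred = center(rows)
    comps = top_components(covariance(centred), k)
    scores = [[sum(x * w for x, w in zip(row, vec)) for _, vec in comps]
              for row in centred]
    return scores, comps


# Hypothetical replicate spectra: three replicates each of two isolates.
replicates = [
    [9.0, 1.0, 7.0, 0.5, 3.0, 0.2],
    [8.5, 1.2, 7.4, 0.4, 2.6, 0.3],
    [9.2, 0.8, 6.8, 0.6, 3.1, 0.1],
    [0.5, 8.0, 1.0, 6.5, 0.3, 4.0],
    [0.7, 7.6, 1.3, 6.9, 0.2, 3.5],
    [0.4, 8.3, 0.9, 6.2, 0.4, 4.2],
]

scores, comps = pca_scores(replicates, k=3)
for row in scores:
    print([round(x, 2) for x in row])
```

Each printed row holds the three coordinates of one replicate in principal component space. With these toy values, the first three replicates and the last three fall on opposite sides of the first component, analogous to the non-overlapping clusters described for Figure 6.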

Discussion
This is the first study that has generated a local MALDI-TOF main spectral profile (MSP) database for the explicit purpose of biotyping clinically significant mycobacterial isolates using the CMEFA sample preparation protocol. The diagnosis of MTB as the causative agent of pulmonary TB using only clinical symptoms may be inadequate and inaccurate, since M. bovis, M. avium, M. kansasii, and M. intracellulare all produce pulmonary disease in humans that may be indistinguishable from that caused by MTB 4,41. In this regard, the potential of the CMEFA sample preparation protocol was evaluated for the rapid and reliable discrimination of clinically relevant mycobacterial isolates using MALDI-TOF MS.
The data from the present study demonstrate that the newly developed organic solvent-based CMEFA protein extraction method, which is cost-effective and simple to perform, is highly efficient in producing mycobacterial cellular protein samples for MALDI-TOF MS analysis and in generating high-resolution MSPs. In addition, the independent CMEFA-MSP reference library containing MSPs of ATCC-typed strains was used in the indisputable differentiation of blind-coded clinical mycobacterial isolates sourced from an independent research facility. According to Gutacker et al. 40, genomic differences can be seen at the protein level. Thus, it can be suggested that the unambiguous classification of the clinical mycobacterial isolates employed in this study was achieved through the recognition of differences in their proteomic profiles. It has also been reported that the proteomic profiling of intact mycobacteria using MALDI-TOF MS analysis is based on conserved ribosomal proteins that are abundantly expressed within the cells 35,42,43. In this study, cluster analysis based on a matrix of pairwise correlation values of all clinical mycobacterial MSPs resulted in two separate clusters, one for MTBC members and the other including all clinical isolates of the NTM. Within the second cluster, NTM isolates formed discrete clusters, with biologically similar isolates clustering closer together. Of significance is that the closely related organisms M. abscessus and M. fortuitum (M. fortuitum complex) shared the same clade but formed discrete clusters, resulting in phylogenetic discrimination of these two organisms. Similar findings were observed for M. avium and M. intracellulare isolates. In this respect, it seems that the CMEFA-based MALDI-TOF MS mycobacterial biotyping strategy employed in this study is superior to the EFAGB-based strategy used by Saleeb et al. 35, which did not clearly resolve or distinguish similar NTM-related mycobacteria.
The use of the newly developed CMEFA protein extraction protocol seems to have facilitated the removal of unwanted lipids and physiological salts, both of which are likely to interfere with matrix crystallization and spectral quality 44, thereby increasing the quality of the mass signals. Overall, in comparison to the spectra obtained by Saleeb and co-workers, a markedly increased number of diagnostic ions in the mass/charge range of 2 to 10 kDa was observed for CMEFA-derived samples, most probably reflecting the higher protein content of these samples.
This study contributes to the relatively few available studies that describe a specific protein extraction protocol for MALDI-TOF MS that is useful for the identification of all types of mycobacteria. Although the newly developed reference library was not as comprehensive as those described in previous studies 12,20,22,28,35, its potential was demonstrated by the unequivocal discrimination of clinically relevant mycobacterial isolates frequently associated with disease in our local setting. It should be noted that a unique microcosm prevails in KwaZulu-Natal, South Africa, in terms of mycobacterial infections. Unique socio-economic conditions and health management practices have contributed to this, and a variety of tuberculosis-related pathological conditions exist, including infections caused by multidrug-resistant (MDR) and extensively drug-resistant mycobacterial strains. In addition, a platform is now available for the addition of other species of mycobacteria to the database as needs arise. This is an important milestone, since the number of species within the genus Mycobacterium is ever expanding; hence, diagnostic laboratories need to continuously employ strategies to identify newly emerging mycobacterial pathogens. Although the CMEFA-based MALDI-TOF MS mycobacterial biotyping strategy remains to be ratified in other independent laboratories, this study demonstrates its potential as a simple and cost-effective diagnostic tool that provides rapid results (approximately 2 hours). Moreover, the system lends itself to automation, thereby permitting a high-throughput workflow. In conclusion, the data from this comparative study suggest that the newly developed CMEFA protocol is highly effective for the identification of clinically relevant ATCC-typed and clinical strains of mycobacteria representing members of the MTBC and NTM families that are prevalent in our locality.
Dendrogram created from hierarchical cluster analysis of the main spectral profiles (MSP) of mycobacteria using the CMEFA sample preparation protocol.

Figure 5
Dendrogram created from hierarchical cluster analysis of the main spectral profiles (MSP) of ATCC-typed strains and clinical isolates of mycobacteria using the CMEFA sample preparation protocol.