Molecular epidemiology of tuberculosis in northeastern Ethiopia

Fikru Gashaw (fikrug2012@gmail.com), Kotebe Metropolitan University College of Natural and Computational Sciences, https://orcid.org/0000-0002-8962-9686
Aboma Zewde, Ethiopian Public Health Institute
Endalkachew Tedla, Bikat Higher Diagnostic Laboratory
Biniam Wondale, Arbaminch University
Yalemtsehay Mekonnen, Addis Ababa University College of Natural Sciences
Berhanu Erko, Addis Ababa University Aklilu Lemma Institute of Pathobiology
Gobena Ameni, United Arab Emirates University

geographical region on an evolutionary time scale [3]. Spoligotyping is further used for the simultaneous detection and molecular typing of the Mycobacterium. It is based on visualization of the spacer DNA sequences between the 36-bp direct repeats (DRs). The DR region is identified in individual M. tb strains and in different members of the MTBC, and the spoligotype patterns are aligned to group the isolates by similarity into clades or strain families. Spoligotyping is also used to genotype MTBC isolates and to identify the circulating lineages, clusters and orphans in a targeted study area [4].
MIRU-VNTR typing is a more advanced typing technique and is more widely accepted than spoligotyping in terms of sensitivity. It offers an adequate balance of variability and has the features needed to differentiate unrelated isolates. MIRU-VNTR typing also has greater discriminatory power than spoligotyping, and performance is better still when the two are combined. It is also crucial for determining recent transmission in the community [5]. The 24-locus set has greater discriminatory power than the initial 12-locus or 15-locus sets and is suggested as the current gold standard for molecular typing of the Mycobacterium [6][7][8].
In Ethiopia, most molecular characterization relies on region of differentiation typing and spoligotyping, while other forms of characterization are done outside the country. The MIRU-VNTR technology was established for the first time in the country and applied in this study using Mycobacterium isolates from the northeastern part of the country. Of the molecular characterization studies done in Ethiopia, most were conducted in the Northwest [9][10][11][12], whereas studies from other parts of the country are rare. Some of these studies found novel Ethiopian phylogenetic lineages that had not been reported elsewhere in the world. The scarcity of precise and rapid M. tb detection technologies in the country is an additional concern. The nationwide status of the molecular epidemiology of the disease is not yet well established. Further studies at different sites are needed to build a countrywide picture of the bacterial strains at regional and zonal level. To the best of our knowledge, no study has been conducted on the molecular characterization of the Mycobacterium in Oromia Special Zone, and only limited investigation has been done in South Wollo Zone, the two settings of this study.

The study area
The study was conducted in Oromia Special Zone and South Wollo Zone of the Amhara Regional State, northeastern Ethiopia. A preliminary survey was carried out for 3 months in governmental health facilities of the seven districts in Oromia Special Zone; based on the availability of samples and transportation access, data were then collected continuously from Kemise and Bati Town health centers. However, because of the low number of TB cases accessed from the Special Zone, the study was extended to four neighboring health institutes in Dessie Town (Dessie Referral Hospital, Bikat Higher Diagnostic Laboratory, Dessie Health Center and Boru Meda Hospital) as additional study sites.

Study design and laboratory processing
An institution-based cross-sectional study was conducted from April 2015 to January 2017. Dry, translucent, leak-proof 50 ml Falcon tubes were used to collect 3-5 ml of sputum per sample. Disposable gloves and respiratory masks were worn when samples were collected from suspected TB patients. For all study participants, socio-demographic data were also recorded at the time of sample collection.

Bacteriological examinations
Ziehl-Neelsen staining. Ziehl-Neelsen staining and direct microscopic examination for acid-fast bacteria were performed at the sample collection sites. The positive samples were temporarily stored at -20 °C in the refrigerators of the health institutes and transported to the Aklilu Lemma Institute of Pathobiology (ALIPB) in a cold chain at 4 °C for further laboratory processing [13][14][15].
Mycobacterial culturing. A stock of selective Lowenstein-Jensen (LJ) medium, glycerol, pyruvate and homogenized whole eggs was prepared for Mycobacterium culture. Both the sputum and the FNA samples were cultured following the procedures in [16]. The inoculated LJ slants were incubated aerobically at 37 °C and monitored every week for the formation of mycobacterial colonies until the 8th week. Grown colonies were collected, heat killed and frozen, whereas weakly grown colonies were subcultured. The heat-killed and frozen isolates were kept at -80 °C until molecular characterization was done using region of differentiation (RD), spoligotyping and MIRU-VNTR.

Molecular typing
Speciation of the isolates using RD9-based polymerase chain reaction. Heat-killed cells, as the source of genomic DNA, and the oligonucleotide primers RD9 FW and RD9 REV, each at a concentration of 100 mM, were used for PCR amplification. The reaction mixture was prepared and subjected to 35 cycles of 95°C for 1 min, 55°C for 1 min, and 72°C for 1 min in a PCR thermocycler (VWR, UK). Finally, the reaction mixture was maintained at 72°C for 10 min. The resulting PCR products were separated by 1% standard agarose gel electrophoresis at 110 V and 400 mA for 35 minutes. The gel was visualized using a UVP photodoc imaging system, and the resulting bands were interpreted by comparison with positive (M. tb H37Rv and M. bovis) and negative (distilled water) controls [11,17].
Spoligotyping and its result interpretation. The DR region was amplified by PCR using oligonucleotide primers (DRa, biotinylated at the 5' end, and DRb) derived from the DR sequence [4]. After PCR, hybridized DNA was detected with enhanced chemiluminescence detection liquid followed by exposure to X-ray film, as described by the manufacturer (Hain Life Science Company). The autoradiograph spoligotyping results were checked visually by three experienced operators, and the spacers were recorded in binary format using the lowercase letters 'o' and 'n' to indicate absent and present spacers, respectively. The binary representation was converted to octal code and entered into the international database SITVIT2 (Institut Pasteur de la Guadeloupe) to determine and interpret the specific M. tb complex strain and SIT. Novel isolates whose spoligotype profiles were not yet described in the SITVIT2 database were considered orphans [4].
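The binary-to-octal conversion described above can be sketched as follows. This is a minimal illustration, assuming the standard 43-spacer format in which the first 42 spacers are read as 14 triplets (one octal digit each) and the 43rd spacer contributes a final digit of 0 or 1. The function name and the example pattern (spacers 20-21 and 33-36 absent) are illustrative assumptions, not data from this study.

```python
def spoligotype_to_octal(pattern: str) -> str:
    """Convert a 43-character 'n'/'o' spoligotype pattern to a 15-digit octal code.

    'n' marks a present spacer and 'o' an absent spacer, as in the text.
    """
    if len(pattern) != 43 or set(pattern) - {"n", "o"}:
        raise ValueError("expected 43 characters, each 'n' or 'o'")
    bits = pattern.replace("n", "1").replace("o", "0")
    # Spacers 1-42: each group of three binary digits becomes one octal digit.
    digits = [str(int(bits[i:i + 3], 2)) for i in range(0, 42, 3)]
    # Spacer 43 stands alone and is written as 0 or 1.
    digits.append(bits[42])
    return "".join(digits)


# Illustrative pattern (assumption): spacers 20-21 and 33-36 absent.
absent = {20, 21, 33, 34, 35, 36}
example = "".join("o" if i in absent else "n" for i in range(1, 44))
print(spoligotype_to_octal(example))  # 777777477760771
```

The resulting 15-digit string is the form that would be submitted to a spoligotype database query.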
MIRU-VNTR typing. In this study, the protocol was performed and optimized at ALIPB-AAU. The PCR amplification Master Mix was prepared for 26 PCR tubes, and 24 pairs of primers [Additional file 1] were used for each isolate (Table 1).
Twenty-four μl of the Master Mix was distributed to each of 24 PCR tubes. Then 1 μl of the respective 24-locus MIRU-VNTR primers was added to each tube to make a final volume of 25 μl for amplification.
The thermocycler was set with an enzyme activation step of 15 minutes at 95°C, followed by 40 cycles of 1 minute at 94°C for denaturation, 1 minute at 59°C for annealing, and 1 minute 30 seconds at 72°C for extension. Thereafter, the reactions were incubated for 10 minutes at 72°C for final extension. A positive control (H37Rv) and a negative control (sterile distilled water) were used, and the amplified PCR products were run on gel electrophoresis [6,18,19].
The amplified PCR products of the 69 MGIT sub-culture-positive isolates were electrophoresed to determine amplicon size. Electrophoresis was performed on a 1.8% (w/v) 300 ml agarose gel with 15 μl ethidium bromide in 1X Tris-Borate-EDTA (TBE) buffer, run at 120 V and 400 mA for 5 hours. The product size of each band was determined by comparison with standard DNA ladder bands (100 bp and 50 bp) after a photograph was taken under an ultraviolet (UV) transilluminator.
The numbers of MIRU-VNTR alleles were determined from the band sizes using interpretation tables [19]. The 24-locus MIRU-VNTR profiles were then entered into the freely accessible online strain identification database MIRU-VNTRplus (http://www.miru-vntrplus.org) for the main phylogenetic predictions. The isolate patterns were compared with the reference strains in the database to assign MTBC species, lineages and genotypes. A phylogenetic dendrogram was constructed using the neighbor-joining (NJ) clustering algorithm, and minimum spanning tree (MST) analysis was also performed.
TB cases confirmed by health personnel who fulfilled the inclusion criteria were included in the study. The samples (sputum and fine needle aspirates (FNAs)) were collected on the spot from consenting participants aged 18 years and older. Those with severe TB who were unable to provide sputum specimens were excluded from the study.
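The allele-calling step in the MIRU-VNTR procedure above (inferring repeat numbers from band sizes) rests on the relation amplicon size = flanking length + repeats × repeat-unit length, which is what locus-specific interpretation tables encode. The sketch below illustrates that relation only. The function name, the flanking length, the repeat-unit length and the tolerance are hypothetical placeholders and are not taken from the interpretation tables used in this study.

```python
def call_repeat_number(band_bp, flank_bp, unit_bp, tolerance_bp=10):
    """Return the repeat count whose expected amplicon size best matches band_bp.

    Returns None when the nearest expected size differs from the observed
    band by more than tolerance_bp, so that the locus can be re-examined.
    """
    if unit_bp <= 0:
        raise ValueError("repeat-unit length must be positive")
    repeats = max(round((band_bp - flank_bp) / unit_bp), 0)
    expected_bp = flank_bp + repeats * unit_bp
    if abs(band_bp - expected_bp) > tolerance_bp:
        return None
    return repeats


# Hypothetical locus: 150 bp of flanking sequence and a 53 bp repeat unit.
print(call_repeat_number(310, flank_bp=150, unit_bp=53))  # 3
print(call_repeat_number(335, flank_bp=150, unit_bp=53))  # None (ambiguous band)
```

Returning None for ambiguous bands, rather than forcing the nearest allele, mirrors the practice of treating such loci as incomplete until re-amplified.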

Data analysis
Descriptive statistics were used to determine frequencies and percentages. For spoligotyping, the online reference database at http://www.pasteur-guadeloupe.fr:8081/SITVITDemo/ was used to assign shared international spoligotype (SIT) numbers; if no SIT number was found, the pattern was considered an 'orphan' type. The online tool Run TB-Lineage (http://tbinsight.cs.rpi.edu/run_tb_lineage.html) was also used to identify the family/clade, lineages and sublineages of the isolates. Spoligotypes shared by more than one isolate were classified as clustered types, while those with only one isolate were classified as singletons. The recent transmission index (RTI) was calculated from the cluster analysis using the formula (nc - c)/n, where nc is the total number of clustered patients, c is the number of clusters, and n is the total number of patients in the sample. The discriminatory power of each locus was also evaluated using the Hunter and Gaston Discriminatory Index (HGDI) [20].
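The two indices named above can be sketched in a few lines. The RTI follows the formula given in the text. For the HGDI, the sketch assumes the usual Hunter and Gaston form, D = 1 - [1/(N(N - 1))] Σ n_j(n_j - 1), where N is the number of isolates and n_j the number of isolates of the j-th type. The function names and the toy genotype labels are illustrative assumptions, not study data.

```python
from collections import Counter


def recent_transmission_index(n_clustered, n_clusters, n_total):
    """RTI = (nc - c) / n, as defined in the text."""
    return (n_clustered - n_clusters) / n_total


def hgdi(type_labels):
    """Hunter and Gaston Discriminatory Index for a list of genotype labels."""
    n = len(type_labels)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels).values()
    return 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Toy data (assumption): seven isolates, two clusters (A x2, C x3), two singletons.
labels = ["A", "A", "B", "C", "C", "C", "D"]
print(round(recent_transmission_index(5, 2, 7), 3))  # 0.429
print(round(hgdi(labels), 3))                        # 0.81
```

When every isolate has a unique pattern, the sum in the HGDI is zero and the index equals 1, which is the maximum possible value.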

Tuberculosis infections and demographic characteristics of the study patients
A total of 384 TB cases (213 males and 171 females) were involved in the study. Both forms of TB were identified, with smear-positive pulmonary cases more predominant (74.5%) than EPTB. TB lymphadenitis was the most prevalent form of EPTB (85.9%), with cervical adenopathy (75.3%) the most common presentation. A diverse range of spoligotype patterns was identified, 86.5% of which were not registered in the global spoligotyping database. A low proportion of the isolates (20.2%) was clustered by spoligotyping. Both modern and ancient lineages were identified, with the modern Euro-American lineage predominant. The 24-locus MIRU-VNTR typing showed all the isolates to be orphan and highly diverse.
Most of the cases, 64.3% (247/384), were recruited from South Wollo Zone, and the remaining subjects were from Oromia Special Zone and elsewhere. Of the identified TB cases, 96% (369/384) were recorded at the district level (Table 2).
Pulmonary TB cases accounted for 74.5% (286/384), and the overall prevalence of TB was highest (67.0%) in the 18-37 years age group.

Molecular epidemiology of tuberculosis
Speciation of the isolates using genomic deletion and spoligotyping. Deletion typing via region of differentiation was performed on all 112 LJ-culture-positive isolates, of which 77.7% had an intact RD9 (396 bp) and were identified as M. tuberculosis. A total of 92.9% (104/112) of the isolates gave interpretable spoligotyping patterns. Ten clusters comprising 20.2% (21/104) of the isolates were identified, while the remaining isolates were unique (singletons). Nine of the clusters had two isolates each, and the remaining cluster had three isolates with the same pattern. These findings indicate that the majority of the patients had different strains of M. tuberculosis. Taking the number of cases with clustered genotypes into account, the RTI was calculated as 0.12.
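The cluster counts reported above can be cross-checked with a short tally. The sketch uses only the figures stated in the text (nine clusters of two isolates, one cluster of three, 104 typed isolates); the variable names are illustrative.

```python
# Cluster sizes as reported: nine clusters of two isolates and one of three.
cluster_sizes = [2] * 9 + [3]
typed_isolates = 104  # isolates with interpretable spoligotype patterns

clustered = sum(cluster_sizes)               # isolates that fall in any cluster
singletons = typed_isolates - clustered      # isolates with a unique pattern
clustering_rate = clustered / typed_isolates

print(f"{len(cluster_sizes)} clusters, {clustered} clustered isolates, "
      f"{singletons} singletons, clustering rate {clustering_rate:.1%}")
# 10 clusters, 21 clustered isolates, 83 singletons, clustering rate 20.2%
```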
The proportion of clustered isolates differed significantly between the Oromia Special Zone and South Wollo Zone isolates (p < 0.001). In addition, all the isolates from Oromia Special Zone were orphans (Table 3).

Mycobacterial interspersed repetitive unit variable number tandem repeat
Of the 69 MGIT sub-culture-positive isolates (69/112), 56 had valid amplification products for the 24-locus MIRU-VNTR, while the remaining 13 isolates had either incomplete or negative MIRU-VNTR profiles. For one of the 24 loci (MIRU-VNTR locus 4052), with the primer sequences FW (5'-AACGCTCAGCTGTCGGAT-3') and REV (5'-CGGCCGTGCCGGCCAGGTCCTTCCCGAT-3'), the band was absent in the gel for all isolates. The 24 MIRU-VNTR loci of the different M. tb strains were amplified by PCR and separated by agarose gel electrophoresis. Allele numbers differed between strains and ranged from 0 to 9 repeats. Highly diverse genotypes were observed: all valid patterns were unique and no clustered isolates were detected. The discriminatory power of 24-locus MIRU-VNTR typing, alone and in combination with spoligotyping, was the highest in this study, with an HGDI of 1.000 in each case, higher than that of spoligotyping alone (HGDI = 0.996). The minimum spanning tree (MST) analysis, based on MIRU-VNTRplus data, examined the evolutionary relationships among the strains, and none of the strains formed a distinct complex [Additional file 3].

Discussion
This is the first molecular epidemiologic study of MTBC strains from Oromia Special Zone in Amhara region, with additional characterization of Mycobacterium isolates from South Wollo, using region of differentiation, spoligotyping and 24-locus MIRU-VNTR typing. Importantly, the MIRU-VNTR technique was established for the first time in the country at ALIPB-AAU using gel electrophoresis, and it identified all the characterized isolates as orphan. The study also found pulmonary TB to be the dominant form of TB.

Culture positivity, pulmonary and extra-pulmonary tuberculosis
The low proportion of culture positivity in this study is close to that of a study in Addis Ababa [21]. It might be due to delayed culturing, power interruptions while the specimens were kept in refrigerators at the sample collection sites, and long-distance transport from the temporary storage sites to ALIPB, where the specimens were cultured. These factors increase the chance of bacterial death in the collected sputum samples. Culture positivity would also be expected to decrease when samples are stored in the refrigerator for a long time rather than cultured immediately.
The extent of pulmonary and extra-pulmonary TB culture positivity in this study is lower than that of a study in India [22]. A study from different sites in Ethiopia also reported greater culture positivity for both clinically manifested smear-positive pulmonary TB, 79% (753/953), and extra-pulmonary TB, 38% (456/1198) [15]. Likewise, studies from northwestern Ethiopia and Addis Ababa showed greater culture positivity for PTB than for EPTB [23,24]. In all instances, culture positivity was lower for EPTB than for PTB. This might be because EPTB is suspected on cytological grounds by a pathologist, whereas in PTB the causative organism itself is detected. Moreover, the suspected mycobacterial infection could be paucibacillary, which decreases the sensitivity of diagnostic tests in EPTB.
The findings of this study agree with the global report that the proportion of EPTB is higher; this might be due to diagnostic challenges, including a shortage of pathologists to identify and treat cases on time in most health institutions. None of the governmental health institutes where this study was conducted, including the referral hospital, had a pathologist to diagnose EPTB. As a result, all suspected cases were referred to a single private diagnostic laboratory (Bikat Higher Diagnostic Laboratory), which remains a diagnostic challenge in the area.
EPTB has different manifestations depending on the organs affected and the extent of dissemination in the body. As in other relevant studies in the country, lymph nodes were the leading organs affected in this study. The infection rate differs among cervical, axillary, inguinal, supraclavicular, submandibular and anterior neck lymph nodes [15,23]. The high rate of lymph node infection is similar to that reported from Germany [25], but the most common sites involved were bones/joints and lymph nodes in the United States of America [26], whereas the genitourinary system and skin were the common sites reported from Hong Kong [27]. Such differences might be attributable to host- or pathogen-related factors as well as to access to patient samples in clinical settings.
The larger number of TB cases in Dessie Town compared with other districts might be due to the higher population density in the town. This agrees with the WHO report that the prevalence of TB is considerably higher in urban than in rural areas [28]. Easier access to health institutes and better diagnosis could be another reason for the higher number of cases found in Dessie Town. The male-to-female ratio in this study is the same as that of bacteriologically confirmed pulmonary TB patients in the WHO report for Ethiopia (M:F ratio of 1.2) [1]. The greater proportion of male patients might be due to biological factors and to social and economic activities that bring men into contact with many people. The disease was also more common in the 18-37 years age range, which could be due to active movement for economic engagement, leading to a greater risk of exposure.

Molecular epidemiology of tuberculosis
The proportion of culture-positive samples identified as M. tb using RD9 in this study was lower than in many other studies. Studies from western [29], northeastern [14], central [30] and northwestern [12] Ethiopia reported M. tb detection proportions of 97.1% in the first study and 100% in the rest. By contrast, a report from Addis Ababa [31] showed a lower proportion of M. tb by RD9 typing, 47.7% (41/86), than this study.
The difference in spoligotype clustering between this study and many other reports might be due to differences in geographical study settings. Higher clustering rates were reported from most studies in Ethiopia [30,32,33]. Similarly, studies outside the country reported greater proportions of clusters [34][35][36][37] than the overall clustering rate of this study, 20.2% (21/104), which might be due to the poor recovery of smear-positive bacteria on LJ culture. This finding is, however, close to the 18.8% (6/32) [14] and 23% (6/26) [38] reported from Dessie and Addis Ababa, respectively. The finding that all isolates from Oromia Special Zone were orphans implies that they were not registered in the database, as no study had been done in the area so far. The low clustering rate in this study could imply that the Mycobacterium infections came from unrelated sources or were caused by activation of latent tuberculosis. The predominance of families T1, family33, H37Rv and CAS is in line with a systematic review from the country [39] as well as a research finding [40].
The higher proportion of the modern lineages, Lineage 3 (East-African Indian) and Lineage 4 (Euro-American), than of the ancient Lineage 1 (Indo-Oceanic) agrees with a study from Dessie, 71.4% (20/28) [14]. This greater proportion of modern than ancient strains could be due to the recent expansion of tuberculosis in Ethiopia, whereas the ancient Indo-Oceanic strains are more common in populations living around the Indian Ocean.
The higher proportion of the T sublineage followed by CAS1-Delhi in the present study is compatible with a study from Addis Ababa [41]. The higher proportion of the T3-ETH sublineage implies that it has the greatest transmission rate in the area. Unlike in other studies [12,30], each cluster in this study contained only a small number of isolates (2 or 3) with the same spoligotype pattern. The small number of isolates per cluster agrees with a study from Gambella region, southwest Ethiopia [42]. The high proportion of orphans could be due to limited reporting to the spoligotype database. On the other hand, SIT149, the leading shared type frequently reported from Ethiopia [39][43][44][45], was also detected at a high rate in this study.
The higher discriminatory power of 24-locus MIRU-VNTR typing compared with spoligotyping agrees with other similar studies [30,46], and the method can be used alone or in combination with spoligotyping for discrimination. The failure of some of the 24 loci to amplify in this study might be due to amplification errors in the PCRs, as also seen in the validation of 24-locus variable-number tandem-repeat typing for M. tuberculosis [8].
The low clustering rate by spoligotyping and the absence of clustering by MIRU-VNTR in this study indicate minor or no recent transmission of the Mycobacterium in the area, which suggests that the disease is mainly due to endogenous reactivation of latent TB infection. Other studies have likewise shown that the clustering rate by MIRU-VNTR is lower than that by spoligotyping [32,35,47,48]. Differences in clustering rates between study areas might be due to differences in geography, population density and socio-economic diversity [49].
In agreement with previous studies, the predominance of Lineage 3 (Delhi/CAS; 32.1%) by 24-locus MIRU-VNTR genotyping reflects its wide distribution throughout the country [10,50,51]. By contrast, it was not the predominant strain in studies of prisoners and communities in southern, southwestern and southeastern Ethiopia [52] and in eastern Ethiopia [32], where H37Rv-like and Ethiopia_3, respectively, were the most common sub-lineages. This shows that the predominant lineage of M. tb varies across the country.

Conclusions
The findings of this study show that all of the causative agents were M. tuberculosis, with pulmonary TB in the greater proportion. The majority of the isolates were singletons rather than clustered, suggesting low recent transmission and pointing instead to endogenous reactivation. Family T1 and family33 were the most frequent families, with the Euro-American lineage predominant. The study also found that the majority of the isolates were orphans. All the isolates typed by 24-locus MIRU-VNTR were singletons, implying very high genetic diversity. The large number of orphan isolates therefore calls for further investigation with more isolates so that they can be reported and registered in the international database. Characterizing this genetic diversity could also help to strengthen TB prevention and control programs in the study area.

Consent for publication
Not applicable

Availability of data and materials
All data generated or analyzed during this study are included in this published article and its supplementary information files.

Competing interests
The authors declare that they have no competing interests.