The identity and distribution of striped bagrid catfish, Mystus tengara (Hamilton 1822) revealed through integrative taxonomy

The taxonomic status and geographical distribution of M. tengara are unclear. No genetic diversity or phylogenetic study has so far been carried out to resolve its identity and distribution. In the present study, an integrated taxonomic approach was applied to clarify the taxonomic status, identity, and distribution of the bagrid catfish Mystus tengara. Comparative morphometric evaluation of M. tengara identified in the present study from distant geographical locations revealed variation in traits in response to body length and environment, without significant genetic distance. The observed morphometric traits of M. tengara overlapped with the available morphometric traits of M. tengara, M. carcio, and M. vittatus. Maximum likelihood and Bayesian phylogenetic analyses based on the mitochondrial cytochrome c oxidase subunit I (COI) gene also could not resolve their identity, and five paraphyletic clades comprising M. tengara, M. vittatus, and M. carcio from India, Nepal, and Bangladesh were observed. Based on morphological and genetic evidence, along with comparative evaluation of M. tengara from its type locality, we consider M. tengara identified in the present study to be true, with its distribution extending from North East India to West Bengal, North India, Central India, Northern peninsular India, and Bangladesh. The observation of paraphyletic subclades and the evaluation of genetic distance between subclades reveal the presence of four cryptic species. Further confirmation of the identity of M. vittatus and M. carcio, by an integrated taxonomic approach based on fresh specimens collected from the type locality, is required.


Introduction
The genus Mystus Scopoli 1777 (Teleostei: Bagridae) comprises small to medium-sized freshwater and estuarine catfishes distributed from the Middle East to South and South East Asia [1]. Currently, 42 species are considered valid within the genus, of which 15 are reported from India. The taxonomic validity of six additional species described from India requires confirmation, as they were published in 'predatory journals' and are considered 'unavailable' [2].
The taxonomy of the genus Mystus is in flux, as many species are morphologically similar and subtle diagnostic characters have been used to delimit them [1]. Accurate species-level identification using morphological characters alone is therefore problematic [3]. Further, as the monophyly of the genus has been considered doubtful [4], studies continue to be carried out on the molecular phylogenetics and genetic resolution of species-level identities [5,6].
Several authors have attempted to clarify the long-standing confusion in the literature by re-describing M. carcio and M. tengara. They also confirmed that M. tengara, M. carcio, and M. vittatus are distinct species. Nevertheless, the molecular phylogeny and geographical distribution of these three species have not been studied. Further, studies have reported the occurrence of these species far from their type localities. For example, M. vittatus, described from the south-eastern part of India (Tamil Nadu), was subsequently recorded from North-East India [20][21][22]. Similarly, various authors have recorded M. tengara, a species described from Bengal, in Southern peninsular India [23]. Although the validity of these records has been debated [11,24], several genetic sequences presumably belonging to the three species, collected from distinct geographical regions, are available. This necessitates a study to clarify the identity and distribution of M. tengara, M. carcio, and M. vittatus. In the present study, we have attempted to fill this knowledge gap using an integrated taxonomic approach.

Study area and sampling
Specimens of M. tengara were collected from the Sodepur fish market (n=5), West Bengal, and from Nath Sagar (n=9), Godavari River (19°32′05.9″ N, 75°20′09.7″ E), Maharashtra, India. For molecular analysis, muscle tissue from the left side of the specimens was stored in 95% ethanol. All specimens were preserved in 10% formalin for morphological studies.

Morphometrics and meristics
Morphometric characters were measured with a digital caliper (to the nearest 0.1 mm), and counts were recorded from the left side of the fish, following Chakraborty and Ng [25]. Measurements were reported as percentages of standard length (SL), whereas subunits of the head region were presented as percentages of head length (HL). Species-level identification was confirmed using the available taxonomic literature [3,10,11,17,19,26,27].
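The conversion of raw measurements to proportional values can be illustrated with a minimal sketch. The character names, the function names, and the numeric values below are hypothetical and serve only to show the arithmetic; they are not data from the present study.

```python
def to_percent(value_mm, reference_mm):
    """Express a measurement as a percentage of a reference length."""
    if reference_mm <= 0:
        raise ValueError("reference length must be positive")
    return 100.0 * value_mm / reference_mm


def proportional_measurements(raw_mm, head_characters):
    """Convert raw measurements (mm) to % SL, or % HL for head subunits.

    raw_mm must contain the keys "SL" and "HL"; head_characters names
    the entries that are subunits of the head (hypothetical labels).
    """
    sl = raw_mm["SL"]
    hl = raw_mm["HL"]
    result = {}
    for name, value in raw_mm.items():
        if name == "SL":
            continue  # SL is the reference itself
        if name in head_characters:
            result[name + "_pct_HL"] = round(to_percent(value, hl), 1)
        else:
            result[name + "_pct_SL"] = round(to_percent(value, sl), 1)
    return result


# Hypothetical specimen, values in mm (illustration only)
example = {"SL": 80.0, "HL": 20.0, "body_depth": 18.0, "eye_diameter": 5.0}
percentages = proportional_measurements(example, head_characters={"eye_diameter"})
print(percentages)
```

Head length itself is expressed against SL, while only the characters listed in `head_characters` are expressed against HL.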

DNA isolation and PCR amplification
Total genomic DNA was isolated from muscle tissue (n=6) using the phenol-chloroform method [28]. A partial fragment of the mitochondrial cytochrome c oxidase subunit I (COI) gene was amplified following the method described by Ward et al. [29]. The PCR-amplified products were purified using a Gel Extraction Kit (Qiagen, Germany), and both sense and antisense strands were sequenced by Xcelris Lab Limited (Gujarat, India). The generated sequences were deposited in GenBank under accession numbers MT928144-MT928148 and MT928150.

Molecular data analysis
A dataset was prepared that included the sequences generated in the present study (five COI sequences of Mystus tengara and one COI sequence of Mystus cf. tengara) and those available in NCBI GenBank (M. tengara, 27; M. vittatus, 41; M. carcio, 8; and other species of the genus Mystus, 22) (Online Resource 1). Sequences of Hemibagrus menoda and H. punctatus were used as outgroups. All sequences were aligned using the Clustal W program [30] (Online Resource 2). Phylogenetic trees were built using the maximum likelihood (ML) approach with the PhyML plugin and the Bayesian inference (BI) approach with the MrBayes plugin in Geneious Prime v 2019.1.3. The most appropriate model was selected with jModelTest v2.1 [31] under the Akaike information criterion (AIC) [32]. The best-fit model of sequence evolution was HKY+I+G. The gamma distribution parameter was obtained using jModelTest v2.1, and the robustness of the tree topology was estimated by bootstrap analysis based on 1000 replicates. Intra- and inter-specific genetic distance values were estimated under the Kimura 2-parameter model in MEGA7 [29,33].
Body moderately compressed. Dorsal profile rising evenly from tip of snout to origin of dorsal fin and sloping ventrally from origin of dorsal fin to end of caudal peduncle. Ventral profile more convex up to anal fin base, then sloping slightly dorsally to end of caudal peduncle. Bony elements of the dorsal surface of head covered with thin skin. Anterior cranial fontanel extending from the level of posterior nasal opening to posterior orbital margin. Posterior cranial fontanel long, invading the region of supraoccipital bone and reaching the base of the occipital process in juvenile specimens. Occipital process reaching basal bone of dorsal fin (West Bengal specimens), and in some cases a considerable gap seen between occipital process and basal bone of dorsal fin (Maharashtra specimens).

Coloration
In fresh condition, body greenish to bright yellow with dark brown to black stripes on either side of the body along with a dark tympanic spot above the pectoral fin. In 10% formalin, the dorsal surface of the head and body pale brown; the ventral surface of the head and body dirty white. Dark spot in tympanic region present. Four pale brown lateral stripes separated by pale interspaces on both sides.

Phylogenetic and genetic distance analysis
The maximum likelihood (Figs. 2, 3, and 4) and Bayesian (Online Resource 3) trees revealed a similar topology. In the phylogenetic tree, sequences labeled as M. tengara, M. vittatus, and M. carcio formed four paraphyletic clades with significant bootstrap values. However, these values were not high enough to resolve the relationships between clades.
In the maximum likelihood tree, Clade I comprises Mystus tengara (samples collected from Maharashtra, Western India, and West Bengal, Eastern India, as a part of present [...]
The average genetic distance values within and between clades are provided in Table 2. Within M. tengara (the present study samples), the genetic distance values ranged from 0.2% (West Bengal-Maharashtra) to 0.4% (West Bengal). Between Mystus cf. tengara and M. tengara, the average genetic distance value was 12.6%. The genetic divergence value among clades ranged from 9.0 to 11.3 (Clade I-II),

Comparative morphometric evaluation
The present study used an integrated taxonomic approach to resolve the identity and distribution of Mystus tengara. Comparative morphological evaluation of freshly collected specimens of M. tengara from West Bengal showed close similarity to the original description [10] in having four longitudinal stripes separated by three pale interspaces, a large tympanic spot above the pectoral fin, four barbels longer than the head, and an occipital process reaching the basal bone of the dorsal fin. Specimens of M. tengara collected from Maharashtra also match Hamilton's description of M. tengara, except for the presence of a small interspace between the occipital process and the dorsal fin base.
M. tengara is differentiated from M. vittatus [10] by the absence of serrations on the dorsal spine (vs. presence), a character that has been suggested to be an error [27] on the basis of Günther's description of "Macrones tengara". M. tengara is further differentiated from M. vittatus [26] by the median longitudinal groove reaching the base of the occipital process and the occipital process reaching the basal bone of the dorsal fin (vs. the median longitudinal groove reaching midway between the hind edge of the eye and the base of the occipital process, and a short interspace between the occipital process and the basal bone of the dorsal fin). In the present study, we observed variation in the median longitudinal groove with size and geographical location.
Darshan et al. [11] re-described M. tengara to establish and confirm its taxonomic identity and differentiated this species from M. vittatus. The specimens of M. tengara identified by Darshan et al. [11] varied from M. vittatus in having  [11,24], which shows that the effects of environmental variations in this trait cannot be ruled out.

DNA barcoding and phylogenetic study
DNA barcodes have been used to confirm the identity of species and their distribution [35]. Previous studies have shown that a genetic divergence of 2-3% at the DNA barcoding gene (COI) can be used as a threshold to discriminate species [36,37]. Accordingly, conspecific individuals show a genetic divergence of <3%, while congeneric species show >3%. In the present study, sequences in Clade I identified as M. vittatus, M. carcio, and M. horai from different geographical locations clustered with 'M. tengara' (collected in the present study), with a genetic distance of <3%. The average genetic distance among M. tengara specimens of the present study is 0.3%. These observations suggest that the sequences identified and labeled as M. carcio/M. vittatus/M. horai in GenBank could be misidentifications of M. tengara.
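The distance-and-threshold reasoning above can be sketched in a few lines. The example computes the standard Kimura 2-parameter distance, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitions and transversions, and compares it with a 3% threshold. It is an illustration only, not the MEGA7 implementation used in this study; the function names, the pairwise skipping of gaps and ambiguous bases, and the toy sequences are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: sites with gaps or ambiguity codes are skipped pairwise
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


def within_species_threshold(distance, threshold=0.03):
    """True if the distance falls below the barcode threshold (default 3%)."""
    return distance < threshold


# Toy 100-bp alignment with a single transition (A -> G); not real COI data
ref = "ACGT" * 25
alt = "GCGT" + "ACGT" * 24
d = k2p_distance(ref, alt)
print(round(d * 100, 2), within_species_threshold(d))
```

A fixed threshold is a heuristic; in the present context it is only one line of evidence alongside tree topology and morphology.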
Hamilton [10] described M. carcio and distinguished it from M. tengara by the length of the maxillary barbel (extending beyond pectoral vs. reaching to end of caudal) and serrations on the dorsal spine (presence vs. absence). Darshan et al. [19] revalidated M. carcio and distinguished it from M. tengara and M. vittatus based on a shorter adipose-fin base length (8.5-11.9 vs. 24.0-31.7 and 21.5-26.0, respectively) and posterior fontanel length (reaching the base of the supraoccipital process vs. not reaching the middle of the supraoccipital bone vs. terminating at the anterior tip of the supraoccipital, respectively). The description by Darshan et al. [19] of the median longitudinal groove in M. vittatus as terminating at the anterior border of the supraoccipital bone, without invading the supraoccipital region, is not in agreement with Day [26]. Further, the description by Day [26] of the median longitudinal groove in adult specimens of M. tengara as not reaching beyond the middle of the supraoccipital bone is not in agreement with the present study. However, M. carcio, as re-described by Darshan et al. [19], was distinct in other characters from M. tengara identified in the present study. The re-description of M. carcio [19] was based on specimens collected from Assam, Tripura, and Bangladesh, but without any molecular evidence. In our phylogenetic analysis, specimens identified as M. carcio from Assam grouped with M. tengara sensu stricto, whereas specimens identified as M. carcio from Bangladesh grouped with M. tengara clade II with significant genetic variation.
Further, specimens collected from West Bengal and Maharashtra could be distinguished by morphometric characters such as dorsal spine length (20.06-26.00), maxillary barbel length (reaching the posterior tip of the anal fin base, or the caudal fin base in smaller specimens, vs. reaching the anal fin base), and occipital process (reaching the basal bone of the dorsal fin vs. a considerable gap present), which reveals geographical variation in these diagnostic phenotypic characters. Similar to our observations, maxillary barbel length varies geographically [26] in M. tengara from Punjab and Assam (reaching the middle of the pectoral fin vs. reaching the base of the pelvic fin). These findings indicate that these characters may be influenced by the size of the fish and the environment it inhabits, and cannot be considered good diagnostic characters for these species [3].

Molecular evidence reveals cryptic species
In clade II, sequences identified and labeled as M. tengara are likely misidentifications, as the genetic distance between these sequences and those in clade I is higher than 3%. M. tengara recorded from Assam (MH156942) also showed a higher genetic distance from sequences in clade I and could be a distinct species. Clade III comprises M. vittatus, and this species was confirmed to be distributed in northern, north-eastern, western, and central India. Interestingly, specimens of Mystus cf. tengara from the present study clustered with M. vittatus recorded from north-east India and formed clade IV. Though a sister-group relationship was observed between clade III and clade IV, the average genetic distance between these two clades was 3.2%, suggesting the occurrence of distinct lineages or cryptic species in this group. Further, based on morphometric and meristic data (Online Resource 4), we could not differentiate Mystus cf. tengara from M. tengara, and its identity requires further confirmation based on a larger number of specimens. Species names and identities in clade V are also likely to be erroneous due to morphological ambiguities. Generating reference DNA barcodes without morphological taxonomy can often lead to species misidentification [38,39], as has been demonstrated recently in hill stream loaches of the Western Ghats [40]. Due to overlapping diagnostic characters and morphological similarities, various authors could have misidentified M. tengara, M. vittatus, and M. carcio, resulting in the deposition of erroneous sequences in NCBI GenBank. Based on morphological and genetic evidence from freshly collected M. tengara from its type locality, we consider sequences that form part of clade I to be M. tengara sensu stricto, with the distribution extending from North East India to West Bengal, North India, Central India, Northern peninsular India, and Bangladesh. Further confirmation of the identity of M. vittatus and M. carcio, by an integrated taxonomic approach based on fresh specimens collected from the type locality, is required. The observation of paraphyletic subclades and the evaluation of genetic distance between subclades reveal that there could be at least four cryptic species in this group, opening up avenues for future research on the group.