Our 88.8% ORP BUSCO completion score was similar to or higher than those seen in recently published insect transcriptome data (e.g. [53, 54, 55]). Likewise, our ORP TransRate assembly score of 0.215 is within the range of scores currently reported for insect transcriptome data (e.g. [56, 57]), and is of higher quality than nearly 50% of transcriptomes deposited in the NCBI TSA database as of 2016 [46]. The TransRate assembly score is significantly impacted by both read quality and read duplication during PCR amplification [46]. Thus, the low quality seen in many of our R3 reads may have depressed the assembly score despite read trimming done by the ORP. These factors also highlight the importance of read trimming in assembly quality, as trimming was only conducted for the ORP assembly.
Mayfly Hox peptides are highly conserved relative to other hexapods
H. limbata Hox peptides contain all functional domains and motifs widely conserved in insect Hox genes, including a homeodomain, linker region, and hexapeptide motif or residue. Several functional regions specific to particular Hox genes are present in the H. limbata peptides. These include the presence of an N-terminal SSYF motif in Scr, Antp, Ubx, and abd-A, and its absence in lab and Abd-B; TDWM and PFER motifs in the linker region of abd-A; the C-terminal UbdA peptide in both Ubx and abd-A; and C-terminal QAQA and poly-A sequences in Ubx [19].
Further evidence of high sequence conservation in H. limbata Hox genes comes from specific residues within these functional regions. For example, there are four residues unique to Hox homeodomains: a glutamic acid in alpha-helix 1, an arginine and glutamic acid in alpha-helix 2, and a methionine in alpha-helix 3 [19]. These residues were all identified in our H. limbata sequences (e.g., residues Glu-136, Arg-148, Glu-150, and Met-171 in Fig. 1; see Figs. 2-8 and [19]). Additional unique homeodomain residues exist that are specific to particular Hox genes. Homeodomains for the Hox genes lab and pb have the largest number of unique residues, primarily within the N-terminal arm and first and third alpha-helices [19].
Three residues unique to the homeodomain N-termini of Antp, Ubx, and abd-A (e.g., Gly-22, Gln-24, and Thr-25 in Fig. 5; see also Figs. 6-7) and of Abd-B (Lys-12, Lys-13, and Pro-16, Fig. 8) [19] were identified in the corresponding H. limbata homologs. The SSYF and hexapeptide motifs are likewise present in the putative H. limbata Hox peptides.
A number of residues in the linker regions of lab, pb, Dfd, and Scr peptides are also conserved, though these vary more than the homeodomain and hexapeptide regions. For example, many animal lab linker regions share a VKRXXPKTXKXE sequence [19], which in H. limbata is represented by VKRXXPKP (Fig. 1, residues 7-14; with the conserved threonine replaced by proline). The rest of the linker sequence varies in most aligned hexapods, a phenomenon that is also seen in the conserved XKKXXK sequence for pb (Fig. 2, residues 7-13), and the KVHL sequence in Dfd and Scr (Figs. 3, 4, residues 11-14 and 9-12 respectively; [19]).
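As a minimal sketch of how a consensus motif of this kind can be located in a peptide, the X positions can be treated as single-residue wildcards and the motif translated into a regular expression. The function names and the toy peptide below are hypothetical and serve only as illustration; the toy sequence is not the H. limbata lab sequence, and this is not the alignment procedure used in this study.

```python
import re


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Translate a consensus motif into a regex, reading 'X' as 'any one residue'."""
    return re.compile("".join("." if ch == "X" else re.escape(ch) for ch in motif))


def find_motif(motif: str, peptide: str) -> list:
    """Return 1-based, inclusive (start, end) positions of each non-overlapping match."""
    pattern = motif_to_regex(motif)
    return [(m.start() + 1, m.end()) for m in pattern.finditer(peptide)]


# Hypothetical toy peptide, invented for illustration (NOT real sequence data).
toy_linker = "AAGSSNVKRNAPKPGKAE"

# The truncated motif with proline in place of threonine matches at residues 7-14.
print(find_motif("VKRXXPKP", toy_linker))

# The full consensus requires threonine at that position, so it does not match.
print(find_motif("VKRXXPKTXKXE", toy_linker))
```

An exact-match search like this only flags strict conservation; substitutions such as the threonine-to-proline replacement noted above are only visible against an alignment.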
Antp and Ubx/abd-A embryonic expression is highly conserved amongst insects
Expression of both Antp and Ubx/abd-A during H. limbata embryogenesis closely resembles that of other insects, particularly non-holometabolous species. In the case of H. limbata Antp, we documented expression primarily through the embryonic thorax and abdominal midline. During segmentation in D. melanogaster, Antp expression occurs from the posterior of the labial segment to the abdominal segments, with the strongest expression in the thorax. During germ band retraction, expression remains strongest in the thorax, while abdominal expression is limited to the midline [58, 59, 60]. Most studies of Antp in holometabolous (Apis mellifera, [61]) and non-holometabolous species (Schistocerca americana, [59]; Gryllus bimaculatus, [62]) reveal an anterior expression boundary in the posterior labial segment, as we observed in H. limbata. Similarly, Antp expression in H. limbata occurs throughout the thorax and midline abdominal segments, and closely matches that of other holometabolous and non-holometabolous insects, though some species show lateral staining in the abdominal tracheal pits [61]. The reduced midline thoracic staining and stronger proximal staining of H. limbata thoracic limb buds are also observed in orthopterans [59, 62], providing further evidence that Antp expression is highly conserved between H. limbata and other insects, particularly non-holometabolous species.
Similar to Antp expression, Ubx/abd-A expression is highly conserved between H. limbata and other insects despite the distinct differences in development between many holometabolous and non-holometabolous species. Ubx/abd-A expression was strongest at the T3/A1 border and along the lateral portions of the A1-A8 abdominal segments, with weaker expression from A8-A10. In D. melanogaster, Ubx and abd-A show largely overlapping and complementary expression profiles. D. melanogaster Ubx is expressed before segmentation in the presumptive T3 and A1-A7 segments, particularly at the T3/A1 border and within the anterior of each segment; this pattern persists after complete segmentation, with additional expression along the abdominal midline and weakly in A8 [63, 64, 65]. After the development of all body segments, D. melanogaster abd-A expression is seen nearly simultaneously within A1-A7, most strongly within the posterior of each; like Ubx, it later extends to the abdominal midline and into A8 [65, 66, 67]. Ubx and abd-A expression is similar in the honeybee Apis mellifera, but begins in A1-A4 before spreading through A1-A7 and does not extend to the abdominal midline [61], a highly conserved pattern seen in both H. limbata and many other insects. In the orthopteran Gryllus bimaculatus, Ubx is first expressed in the posterior growth zone and in the presumptive T3, with expression after segmentation strongest on the T3/A1 border [62]. Ubx expression in the apterygote Thermobia domestica is similar, but also extends anteriorly around the T2 and T3 limb buds during germ band extension, similar to H. limbata lateral staining in the T2 and T3 segments [68]. The extension of Ubx and abd-A lateral expression through the developing abdomen until A10, followed by a post-segmentation weakening of expression from A8-A10, is widely conserved between H. limbata and other insect taxa [68, 62].
The role of Hox genes in nymphal body patterning
The body plan of mayfly nymphs diverges from that of most other insects in possessing unique abdominal appendages. Developmental regulation of thoracic legs and abdominal appendages in insects is controlled by the Hox genes Antp, Ubx, and abd-A, each of which contains functional regions well conserved between H. limbata and many insects. These include the SSYF motif necessary for the transcriptional activation of downstream target genes [69], and the hexapeptide motif, which contributes to extradenticle protein binding [70]. The length of each Hox gene linker region is likewise widely conserved between H. limbata and other insects, and facilitates proper gene function [71, 19]. However, as little is known regarding the functional significance of most linker region residues [19], the potential impact of both H. limbata-specific and gene-specific differences in linker region sequences remains unknown. Conserved N-terminal residues of Antp, Ubx, and abd-A were also identified in H. limbata, and play a major role in specifying DNA binding affinity [72, 19]. Several functional regions are specific to Ubx and abd-A, including the UbdA peptide, QAQA, and poly-A sequences; all three have been demonstrated in Drosophila to repress the gene Distal-less (Dll) [73, 74, 75], and are present in the identified H. limbata homologs. H. limbata abd-A also contains the TDWM and PFER linker region motifs, which regulate extradenticle binding and wingless transcription, respectively [76, 77].
Collectively, these shared functional regions between studied insects and H. limbata suggest similar functional roles for Antp, Ubx, and abd-A at the phenotypic level. Antp promotes leg development in the thorax [78], and is abdominally expressed as part of the developing central nervous system in both insects [61, 79, 58, 59, 62] and crustaceans [80]. In D. melanogaster, Ubx and abd-A prevent abdominal limbs from developing through their inhibition of Dll [81], which specifies the distal portion of developing appendages [82, 83]. Appendage inhibition by Ubx and abd-A is also seen in lepidopterans, which require both the repression of Ubx and abd-A, and the expression of Antp and Dll in abdominal limb primordia for the larval prolegs to develop [84, 85, 86]. In coleopterans and orthopterans, Ubx serves as an appendage modifier that is co-expressed with Dll in the A1 segment, resulting in pleuropod development during embryogenesis [87, 88, 62]. Some insect taxa with abdominal structures do not show novel changes in Ubx and abd-A expression or function. For example, Ubx and abd-A expression does not appear to play a role in the development of styli on the A7-A9 segments in firebrats, possibly because styli may not be homologous to true appendages [89, 68]. In sawfly embryos, abdominal prolegs develop despite both the abdominal expression of Ubx and abd-A and the lack of abdominal Dll expression, suggesting that sawfly prolegs consist exclusively of morphologically proximal structures [86, 90].
Novel Ubx and abd-A expression profiles are not observed in H. limbata embryos; in fact, H. limbata Antp, Ubx, and abd-A expression all closely resemble the expression patterns of several other insects with abdominal appendages, including the orthopteran G. bimaculatus, the firebrat T. domestica, and hymenopteran sawfly larvae [68, 86, 62, 90]. Notably, the gill-bearing abdominal segments of H. limbata fall within the range of both Ubx and abd-A expression, indicating they experience limb repression during embryogenesis.
While it remains unknown if H. limbata Dll is abdominally expressed, it is possible that mayfly gills are exclusively proximal structures, akin to the prolegs of sawfly larvae, and thus not subject to Dll inhibition by Ubx or abd-A. Such a hypothesis is further supported by previous work that posits apterygote styli and mayfly gills as proximal appendicular structures possibly related to the proximal appendicular structures of crustacean epipodites [3, 91].