Between 2002 and 2017, 186 samples were collected in the Toulon Little Bay (Fig. 1A), from which O. nana female and male adults were isolated (Fig. 1B). Across fifteen years of observation, we noted minimum male/female ratios in February (0.11), maxima in September, October, and November (0.17), and a mean sex-ratio of 0.15 ± 0.11 over all years (Fig. 1C). This monitoring of the sex-ratio showed a strong bias toward females, with relative stability over the years (ANOVA, p = 0.87).
Central nervous system labeling by immunofluorescence
The central nervous system labeling with β3-tubulin in O. nana (Fig. 2A–B) showed the male-specific presence of a lateral ganglion with a high density of β3-tubulin in the anterior part of the ganglion, and a higher density of nuclei in its posterior part. We also observed the presence of β3-tubulin-rich post-ganglionic nerves that possibly connect the anterior part of the lateral ganglion to the tritocerebrum and/or the subesophageal ganglion. Not all males presented this labeling, and certain males contained one ganglion symmetrically on each side. However, no females presented this ganglion (Fig. 2C). The α-tubulin labeling showed about seven to nine parallel afferent nerve fibers on the lateral part of the ganglion (Fig. 2D–E) connected to free nerve endings located in the ventral part of the ganglion and in the external environment. Such labeling was absent in O. nana females. Together, these assays indicated the presence of a new nerve ganglion present only in O. nana males, located on the anterolateral part of the prosome.
Transcriptomic support for Oithona nana male homogamety
To determine whether this sex-ratio bias is more likely caused by environmental sex determination (ESD) or by higher male mortality, we first used SD-pop on four individual transcriptomes of each sex to determine the O. nana sexual system. According to SD-pop, the ZW model was preferred (lowest Bayesian information criterion (BIC)) for O. nana. This result is unlikely to be due to chance: in none of the 69 runs on datasets with permuted sex labels did the ZW model have the lowest BIC. Eleven genes had a probability of being sex-linked greater than 0.8 in O. nana; however, none of the SNPs in these genes showed the typical pattern of a fixed ZW SNP. The four genotyped females were heterozygous and the four males were homozygous (except for some SNPs for which one male was not genotyped), indicating that recombination suppression between the gametologs is recent and that no or few mutations have been fixed independently in the two gametolog copies. Annotation of these eleven genes showed that only one, ATP5H, shares homology with other metazoan genes; it encodes a subunit of the mitochondrial ATP synthase (Supplementary Notes S5). As in Drosophila, O. nana ATP5H is encoded in the nucleus [21].
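The logic of the permutation control can be illustrated with a short sketch. It assumes, as an illustration only, that the permuted datasets correspond to every alternative way of labeling four of the eight individuals as female; the text does not state how the permutations were generated, but this enumeration yields exactly 69 alternatives. The individual identifiers and the `empirical_p` helper are hypothetical.

```python
from itertools import combinations

# Hypothetical identifiers for the eight genotyped individuals.
individuals = ["F1", "F2", "F3", "F4", "M1", "M2", "M3", "M4"]
observed_females = frozenset(["F1", "F2", "F3", "F4"])

# Every way of labeling four of the eight individuals as female.
labelings = [frozenset(c) for c in combinations(individuals, 4)]

# Drop the true labeling; the rest are the sex-permuted datasets
# (assumption: this is how the permutations were built).
permuted = [lab for lab in labelings if lab != observed_females]
n_permuted = len(permuted)


def empirical_p(n_as_extreme: int, n_permutations: int) -> float:
    """Permutation p-value with the usual +1 correction,
    which counts the observed dataset among the outcomes."""
    return (n_as_extreme + 1) / (n_permutations + 1)


# The ZW model had the lowest BIC in none of the permuted runs.
p_value = empirical_p(0, n_permuted)
print(n_permuted, round(p_value, 4))
```

Under this assumption the smallest attainable empirical p-value is 1/70, which is what zero permuted runs favoring the ZW model would give.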
LNR domains burst in the O. nana proteome
To identify LDPs, we developed an HMM dedicated to O. nana LNR identification based on 31 conserved amino acid residues. In the O. nana proteome, 178 LNR and LNR-like domains were detected, encoded by 75 LDPGs, while a maximum of eight domains encoded by six LDPGs were detected in four other copepods (Fig. 3A–B). Among the 178 O. nana domains, 22 were canonical LNR and 156 were LNR-like domains (Fig. 3C). By comparing the structure of Notch, LNR, and LNR-like domains, we observed the loss of two cysteines (Fig. 3C) in the LNR-like domains. Among the 75 LDPs, we identified nine different protein structure patterns (Fig. 3D), including notably 47 LNR-only proteins, 12 trypsin-associated LDPs, and eight metallopeptidase-associated LDPs. Overall, LDPs were predicted to contain a maximum of 5 LNR domains and 13 LNR-like domains.
Forty-nine LDPs were predicted to be secreted (eLDPs), six membranous (mLDPs), and twenty intracellular (iLDPs) (Supplementary Notes S6). Among the iLDPs, two were associated with proteolytic domains, three were associated with sugar-protein or protein-protein interaction domains (PAN/Appel, lectin, and ankyrin domains), and 13 (65%) were LNR-only proteins. Among the eLDPs, 18 (37%) contained proteolytic domains, corresponding to a significant enrichment of proteolysis in eLDPs (hypergeometric test, p = 2.13e-17); the other eLDPs corresponded to LNR-only proteins (63%). The mLDPs were represented by one Notch protein, two proteins with LNR domains associated with lectin or thrombospondin domains, respectively, and three LNR-only proteins.
In phylogenetic trees based on nucleic acid sequences of the LNR and LNR-like domains (Fig. 3E), only 17% of the nodes had support over 90%. Twenty-seven branch splits corresponded to tandem duplications involving 15 LDPGs, including Notch and a cluster of five trypsin-associated LDPGs coding three eLDPs and two iLDPs.
Oithona nana male gene expression
Among the 15,399 genes predicted in the O. nana reference genome, 1,233 (~8%) were significantly differentially expressed in at least one of the five developmental stages. Among them, 619 genes were specifically upregulated in one stage, with 53 genes upregulated in the egg, 19 in nauplii, 75 in copepodids, 27 in adult females, and 445 in adult males (Fig. 4A). The male-upregulated genes were categorized based on their functional annotation (Fig. 4B).
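The counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures reported in this paragraph; the variable names are illustrative.

```python
# Counts taken from the text.
total_genes = 15_399
de_genes = 1_233
stage_specific = {
    "egg": 53,
    "nauplii": 19,
    "copepodids": 75,
    "adult females": 27,
    "adult males": 445,
}

# Share of predicted genes that are differentially expressed.
de_fraction = de_genes / total_genes

# Genes specifically upregulated in exactly one stage.
n_stage_specific = sum(stage_specific.values())

# Share of those stage-specific genes that are male-specific.
male_share = stage_specific["adult males"] / n_stage_specific

print(f"DE fraction: {de_fraction:.1%}")
print(f"stage-specific genes: {n_stage_specific}")
print(f"male share of stage-specific genes: {male_share:.1%}")
```

The per-stage counts sum to the 619 stage-specific genes, and adult males account for roughly 72% of them.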
Upregulation of LNR-coding and proteolytic genes in males
The 1,233 differentially expressed genes contained 27 LDPGs (36% of total LDPGs) (Fig. 4C). Of these 27 genes, 18 were specifically upregulated in adult males, producing a significant and robust enrichment of LDPGs in the adult male transcriptomes (fold change > 8; p = 2.95e-12) (Fig. 4C). Among the 445 male-specific genes, 27 were predicted to play a role in proteolysis, including 16 trypsins with three trypsin-associated LDPGs, showing significant enrichment of trypsin-coding genes in males (p = 1.73e-05), as well as three metalloproteinases and five protease inhibitors.
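An enrichment of this kind can be sketched as a fold enrichment plus a hypergeometric upper tail. The background used here (all 15,399 predicted genes, 75 of which are LDPGs, with 445 male-specific genes drawn from them) is an assumption; the exact test and background behind the reported p-value are not stated in this section, so the p-value computed below is not expected to reproduce it exactly. The function name is illustrative.

```python
from math import comb


def hypergeom_upper_tail(N: int, K: int, n: int, k: int) -> float:
    """P(X >= k) when n items are drawn without replacement
    from N items, K of which are 'successes'."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Counts from the text; the choice of background is an assumption.
total_genes = 15_399      # predicted genes in the reference genome
total_ldpgs = 75          # LDPGs in the genome
male_specific = 445       # genes specifically upregulated in adult males
male_ldpgs = 18           # LDPGs among the male-specific genes

# Observed LDPG frequency in the male set over the genome-wide frequency.
fold_enrichment = (male_ldpgs / male_specific) / (total_ldpgs / total_genes)

# Probability of seeing at least 18 LDPGs among 445 genes drawn at random.
p_enrichment = hypergeom_upper_tail(total_genes, total_ldpgs, male_specific, male_ldpgs)

print(f"fold enrichment: {fold_enrichment:.2f}")
print(f"hypergeometric P(X >= {male_ldpgs}): {p_enrichment:.2e}")
```

With this background the fold enrichment exceeds 8, in line with the value reported above.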
Upregulation of nervous system-associated genes in adult males
Forty-eight upregulated genes in males had predicted functions in the nervous system (Supplementary Notes S7). These included 36 genes related to neuropeptides and hormones, either through their metabolism (10 genes, seven of which encode enzymes involved in neuropeptide maturation and one of which is an allatostatin), through their transport and release (9 genes), or through neuropeptide or hormone receptors (17 genes, seven of which are FMRF amide receptors). Six genes were predicted to be involved in neuron polarization, four in the organization and growth guidance of axons and dendrites (including homologs to B4GAT1 and zig-like genes), one in the development and maintenance of sensory and motor neurons (IMPL2), and one in synapse formation (SYG-2, futsch-like).
Upregulation of amino-acid conversion into neurotransmitters in male adults
We observed 10 upregulated genes in males predicted to play a role in amino acid metabolism (Fig. 4D). These include five enzymes that directly convert lysine, tyrosine, and glutamine into glutamate through the activity of one α-aminoadipic semialdehyde synthase (AASS), one tyrosine aminotransferase (TyrAT), and three glutaminases, respectively. Three other upregulated enzymes play a role in the formation of pyruvate: one alanine dehydrogenase (AlaDH), one serine dehydrogenase (SDH), and, indirectly, one phosphoglycerate mutase (PGM). Furthermore, two other upregulated enzymes are involved in the formation of glycine: one sarcosine dehydrogenase (SARDH) and one betaine-homocysteine methyltransferase (BHMT) (Fig. 4D).
Downregulation of food uptake regulation in male adults
Four genes with predicted functions in food uptake regulation showed specific patterns in males. These included increased levels of the mRNA encoding allatostatin, a neuropeptide known in arthropods to reduce food uptake, and three genes under-expressed in males: one encoding a crustacean cardioactive peptide (CCAP, a neuropeptide that triggers digestive enzyme activation) and two encoding the bursicon protein subunits, hormones known to be involved in intestinal and metabolic homeostasis.
Protein-protein interaction network involving LDPs and IGFBP
To further characterize the role of LDPs, we studied their potential protein interactions using a yeast two-hybrid (Y2H) system. To this end, we selected 11 genes: seven male-overexpressed LDPGs and four potential IGFBPs. The IGFBPs were chosen on the hypothesis that they could be partners of an insulin-like androgenic gland hormone, as described in decapods [22].
We performed Y2H analysis using two different approaches (Supplementary Notes S3): the first was a matrix-based screen with pairwise interaction assays, and the second aimed to identify potential interactors in the entire O. nana proteome by a random library screen. The latter approach was more time-consuming and was applied to only a subset of four genes (two LDPGs and two IGFBPs) used as bait proteins against a Y2H library constructed from O. nana cDNAs. Together, these two approaches allowed the reconstruction of a protein network containing 17 proteins (Fig. 5A) (Supplementary Notes S4): two LDPs and one IGFBP used as baits, and 14 interacting partners, of which six have orthologs in other metazoans and five have no orthologs but contain at least one domain detected by InterProScan (Fig. 5B).
On_LDP1, a putative extracellular trypsin-containing LDP, was found to form a homodimer and interact with a trypsin, two extracellular matrix proteins, and also an insulin-like growth factor binding protein (On_IGFBP7) that contained a Kazal trypsin inhibitor domain. Based on its phylogeny, this protein is homologous to IGFBP7, also present in vertebrates (Supplementary Notes S8).
On_IGFBP7 formed a homodimer and interacted with three other proteins: one spondin-1 like protein (On_Spon1-like) containing a Kazal domain, one thrombospondin domain-containing protein, and one vitellogenin 2-like protein (On_Vtg2).
On_LDP2, coded by a gene upregulated in males (Fig. 5C), interacted with nine proteins: one vitellogenin 2-like protein, three uncharacterized proteins, one homolog to somatomedin-B thrombospondin type 1 domain-containing protein, one homolog to the neuroendocrine protein 7b2 that contains a secretogranin V-like domain, one wnt5-like protein, one laminin 1 subunit β, and one furin-like 1 protein. No PPIs with IGF were detected, and no homologs to insulin-like androgenic gland hormone [22] were found in the O. nana proteome.
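The interactions described in this section can be assembled into a small undirected graph to check that they add up to the reported network size. This is a sketch under two assumptions: the vitellogenin 2-like partner of On_LDP2 is the same protein as On_Vtg2, and the two extracellular matrix proteins bound by On_LDP1 are distinct from all other partners. Node names other than On_LDP1, On_LDP2, On_IGFBP7, On_Spon1-like, and On_Vtg2 are placeholders, not identifiers from the study.

```python
from collections import defaultdict

baits = {"On_LDP1", "On_LDP2", "On_IGFBP7"}

# Undirected edges; self-pairs stand for homodimers.
edges = [
    ("On_LDP1", "On_LDP1"),
    ("On_LDP1", "trypsin"),
    ("On_LDP1", "ECM_protein_1"),
    ("On_LDP1", "ECM_protein_2"),
    ("On_LDP1", "On_IGFBP7"),
    ("On_IGFBP7", "On_IGFBP7"),
    ("On_IGFBP7", "On_Spon1-like"),
    ("On_IGFBP7", "thrombospondin_domain_protein"),
    ("On_IGFBP7", "On_Vtg2"),
    ("On_LDP2", "On_Vtg2"),  # assumption: same vitellogenin 2-like protein
    ("On_LDP2", "uncharacterized_1"),
    ("On_LDP2", "uncharacterized_2"),
    ("On_LDP2", "uncharacterized_3"),
    ("On_LDP2", "somatomedinB_TSP1_protein"),
    ("On_LDP2", "7b2-like"),
    ("On_LDP2", "wnt5-like"),
    ("On_LDP2", "laminin_subunit_beta"),
    ("On_LDP2", "furin-like_1"),
]

# Build a symmetric adjacency map.
adjacency = defaultdict(set)
for a, b in edges:
    adjacency[a].add(b)
    adjacency[b].add(a)

nodes = set(adjacency)
partners = nodes - baits

print(len(nodes), "proteins,", len(partners), "non-bait partners")
print("On_LDP2 partners:", len(adjacency["On_LDP2"]))
```

Under these assumptions the graph contains 17 proteins, 14 of them non-bait partners, which matches the network size reported above; On_Vtg2 is then the only partner shared by two baits.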