Due to specific biological characteristics of plant-parasitic nematodes, host plant resistances tend to be a remarkably durable means to manage this category of soil-borne pathogens. The main challenge is the actual developing and breeding resistant host-plant varieties. As the genetic basis for virulence in plant-parasitic nematodes is unknown, breeding for resistance can only be done on a trial-and-error basis. The whole process is, therefore, inefficient, and thus time consuming and expensive. The availability of molecular-based pathotyping methods of plant-parasitic nematode populations would allow for the design of more targeted resistance. Here we concentrated on two Globodera rostochiensis inbred lines, Gr-Line19 and Gr-Line22, with distinct pathotypes . Resulting from a single male-female crossing by Janssen et al. (1990) , each of these lines' genomic background is small, with a maximum of 4 haplotypes per locus. These small genomic backgrounds signifcantly simplify the generation of high-quality reference genomes, which has been a challenge for sexually reproducing plant-parasitic nematodes in the past. Therefore, we expect that the reference genome sequences of Gr-Line19 and Gr-Line22 are a more accurate representation of the G. rostochiensis genome, bringing the concept of molecular pathotyping a step closer. Furthermore, long-read sequencing technology allowed us to generate reference genomes about 24 fold less fragmented than the current reference genome . This higher continuity made it possible to pinpoint the physical distribution and the diversification in a way that was not possible with the highly fragmented JHI-Ro1 reference genome sequence.
One of the main technical challenges we tried to overcome was generating high-quality reference genome sequences of a highly heterozygous nematode species. The assembly sizes of the novel reference genomes are comparable to the G. rostochiensis JHI-Ro1 genome  as well as to the estimated G. rostochiensis genome size . Therefore, it is likely that the high levels of heterozygosity did not negatively impact the assembly by the presence of haplotigs . To further reduce the number of scaffolds, possibly to a chromosome level, it might be advantageous in the future to supplement long-read sequencing with other techniques such as optical mapping [32, 33].
We furthermore assessed the effect of generating a genome assembly of a highly inbred line instead of a field population. A comparison was made between SNPs' zygosities called on the basis of short-read data and we observed that Gr-Line19 had a 10% higher proportion of homozygous SNPs than the JHI-Ro1 population. Since more than 50% of the called SNPs in Gr-Line19 are still heterozygous, it seems reasonable to assume that the currently measured heterozygosity levels provide a more realistic picture of the heterozygosity that is present in an individual. This observation is in line with previous findings for the cyst nematode Heterodera glycines by Ste-Croix et al. (2021)  showing that individuals from a single population can be homozygous or heterozygous vis-à-vis putative virulence genes.
A more detailed analysis of four effector families for which at least a subset of members are known to be expressed in the dorsal gland of nematodes during feeding site induction or maintenance revealed dozens of novel potentially virulence-associated variants.
To some extent, our starting point was comparable to the approach taken by Bekal et al. (2008) . Within the soybean cyst nematode Heterodera glycines subsets of populations with comparable pathogenicity have been defined based on their multiplication characteristics on a set of seven soybean indicator lines. Populations that shared multiplication characteristics were coined ‘HG types’ . Subsequently,  used two inbred lines that were either avirulent (‘TN10’; HG type 0)) or virulent (‘TN20’; HG type 1, 2, 3, 4, 5, 6, 7). 454 micro-bead sequencing of these indicator lines resulted in the generation of tens of millions of short reads (110–120 bp), which allowed for whole-genome comparative analysis. These efforts resulted in 239 homozygous SNPs between TN20 and TN10 . Although the relationship between these SNPs and pathogenicity is unclear, these SNPs could be considered one of the first molecular markers for pathogenicity in cyst nematodes. Here we took it one step further, by identifying copy number variation that might serve as potential pathotype-specific molecular markers. Copy number variation is relevant, as it has been linked to virulence in various pathogens [37, 38] including plant-parasitic nematodes .
For potato cyst nematodes, Folkertsma et al. (1996)  used AFLP assays  to characterize pathotypes of the potato cyst nematodes G. rostochiensis and G. pallida. Almost 1,000 marker loci were employed to genotype populations of both potato cyst nematode species. These analyses revealed genetic markers that can distinguish between the G. rostochiensis pathotypes Ro1, Ro3 and Ro4, while such loci appeared to be absent for the G. pallida pathotypes Pa2 and Pa3. In a more extensive approach focussing on G. rostochiensis only, Mimee et al. (2015)  employed a restriction enzyme-based genotyping-by-sequencing approach. The genotypic characterization of 23 populations, covering all five pathotypes, revealed a clear distinction between pathotypes Ro1 and Ro2 on the one hand, and Ro,3, Ro4, and Ro5 on the other. Moreover, their analyses seemed to demonstrate intra-pathotype variation within Ro1. However, it is noted that with 14 populations from 9 different countries, Ro1 was substantially overrepresented in this research.
The first reference genome for G. rostochiensis was published by Eves-van den Aker (2016) , and - in conjunction with this - the intra-species variation regarding members of known effector families was mapped. All five G. rostochiensis pathotypes were represented in this study, and whereas homozygous molecular markers to discriminate between Ro4 and Ro5 could be identified, this was not possible for the remaining three pathotypes. Moreover, this research confirmed the large genotypic diversity of populations that are all labeled Ro1, indicating that there are many possible genotypes that yield a similar Ro1-like virulence. Here it might be mentioned that Ro1 and Ro4 share the inability to parasitize potato genotypes that harbor the H1 resistance gene from Solanum tuberosum ssp. andigena CPC 1673. Moreover, the H1 resistance genes have been introgressed in most commercial potato varieties, and potato cyst nematode populations worldwide have been exposed to these resistance genes likely including the ones characterized by Eves-van den Akker (2016) . So, although these pathotypes share their avirulence with regard to the H1 gene, they belong to another G. rostochiensis genotype and differ significantly in intra-pathotype variation.
We used pathotypically characterized inbred lines as a starting point from which we generated new reference genome sequences. On this basis, complete effector families could be mapped and compared. In essence, the make-up of effector families in lines with distinct pathogenic characteristics could vary because of (1) non-synonymous variants in sequence in a given set of effector genes and/or (2) effector gene loss or gain. The balance between these two (dependent) sources of variation varies in a pathogen-dependent manner. The genome-wide comparison of three Microbotryum species parasitizing distinct Caryophyllaceae allowed Beckerson et al. (2019)  to define the secretomes of the individual species. Their analyses revealed that host specificity was explained by rapid changes in effector genes rather than by variation in the effector copy numbers. With a similar underlying question, Qutob et al. (2009)  investigated two effector genes families of Phytophthora sojae, Avr1a and Avr3a in a range of races. The presence of multiple copies of nearly identical genes on the Avr1a and the Avr3a locus was suggested to contribute to the fitness of these races, and races with distinct pathogenicities were characterized by variations in effector gene numbers. These examples demonstrate that both sources of variation can generate differences in pathogenicity among plant pathogens. Here we specifically focussed on effector gene loss and gain effects, and observed that both events happen in the avirulent Gr-Line19 as well as the virulent Gr-Line22. Previous studies show that, at least in potato cyst nematodes, single nucleotide polymorphisms are also related to virulence (e.g. ), which indicates that both types of genomic variance are relatable with virulence.
In case of the tropical root-knot nematode Meloidogyne incognita,  tried to pinpoint the genetic basis of avirulence and virulence with regard to the tomato resistance gene Mi-1.2. Genome-wide characterization of two pairs of avirulent and virulent lines revealed 20 gene families that all showed a lower number of copies in both virulent M. incognita lines. It is noted that the 20 families included pioneers and household genes, and not known effector families. Hence, although a lower copy number per gene family was associated with virulence, this research did not identify gene loss events that could be causally related to virulence.
We separately considered the dorsal esophageal gland-expressed effectors that are thought to be involved in immune response suppression and feeding site induction, and the subventral gland-expressed effectors that are active during plant penetration. Concerning dorsal gland-expressed effector families, G. rostochiensis Gr-Line19 harboured on average 14 genes per effector family, while on average, 19 members were identified per effector family in Gr-Line22, a homozygous virulent line with regard to the H1 resistance gene. Two exceptions to this overall increase in effector copy number should be mentioned; GSS appeared to be absent in Gr-Line22, while lower copy numbers were observed for Hg-GLAND6. GSS is a peculiar example as this was the first example of a cyst nematode effector family that arose from neofunctionalization of a household gene, glutathione synthetase . Two expansion waves of GS-like genes were described, and only in the second wave GS-like genes with a signal peptide for secretion arose. The resulting clade 3 comprised cyst nematode effectors. It should be noted that GSS genes in this clade share only 36% protein identity. The thresholds used in this study to identify GSS family members were presumably too strict to identify more GS-like members.
The effector families expressed in the subventral glands that are included in this study showed less expansion than the dorsal gland specific families, with only small differences in copy numbers between the two lines. Strikingly, a substantial number of genes belonging to this category lack a signal peptide presence. Since many of these genes (e.g., glycoside hydrolases, pectate lyases) code for cell wall-degrading or modifying enzymes, the proteins would have to be secreted to make physical contact with the plant in order to perform their function. One hypothesis could be that these genes are, in fact, pseudogenes. However, this seems unlikely as manual inspection showed that most of these SP lacking genes show a RNAseq signal (results not shown). Whereas ample RNAseq data allowed for an accurate prediction of the intron-exon structure, the transcription start site is more difficult to predict without additional experimental data. If transcription start sites were misplaced, we could have missed a preceding signal peptide. Alternatively, it could be that this cyst nematode genuinely harbours effector variants without apparent signal peptide similar to the invertases identified in Meloidogyne incognita. These effectors were suggested to be acquired at a late stage during cyst nematode evolution .
Phylogenetic analysis of effector families as presented here takes along both effector diversification and effector loss and gain. These data clearly demonstrate that the balance between both sources of variation differs per effector family. Whereas effector family 1106 showed overall little copy number variation, SPRYSECs showed a 65% increase in copy number in Gr-Line22, and in case of the GLAND5 family significant diversification was accompanied by a sharp increase of the copy number in Gr-Line22. Other population genetic studies on plant-pathogenic fungi and oomycetes showed exclusively low  or high  levels of diversification between effector genes. We are not aware of other plant pathogens for which such drastic contrasts in diversification pattern between effector families were described.
Because of its extreme level of physical clustering of CLE effectors in both G. rostochiensis inbred lines we investigated its genomic organisation in more detail. Potato cyst nematodes produce and secrete mimics of plant CLEs. Plant CLEs are signalling components that were shown to be conserved in both Arabidopsis and potato roots . Among Globodera CLE genes two functional classes are distinguished, CLE1 and CLE4. The main difference between these classes is the composition of CLE peptides that are present as small cleavable units separated by small spacers at the protein’s C terminus. Mitchum et al. (2012)  described a single CLE1 representative, and here a second potential CLE1 variant is identified in both G. rostochiensis lines (Fig. 8a). This second variant has a domain structure similar to GrCLE1 (Supplemental Fig. 3), and the conservation of the domain structure makes it plausible this variant has a CLE1-like function. Notable is the putative duplication event of Gr-CLE4 genes in Gr-Line22. Gr-CLE-4 genes are highly conserved, even between pathotypes and we assume that this duplication event might result in a higher production of GrCLE4 peptides. A dose effect for a nematode effector was previously reported for the 32E03 effector of the beet cyst nematode Heterodera schachtii . So our finding might suggest that the virulent G. rostochiensis line 22 might exert a stronger CLE4 peptide-based effects on its host.