Understanding variation in oleic acid content of high-oleic Virginia-type peanut

Contamination at the FAD2B locus due to inadequate screening protocols is the primary cause of sporadic, insufficient oleic acid content in Virginia-type peanut. The high oleic trait in peanut is conditioned by loss-of-function mutations in a pair of homeologous enzymes and is well known to improve the shelf life of peanut products. As such, the trait is given high priority in current and future cultivars by the North Carolina State University peanut breeding program. For unknown reasons, high oleic cultivars and breeding lines intermittently failed to meet self-imposed thresholds for oleic acid content in internal testing. To determine why, a manual seed chipper, crude DNA isolation protocol, genotyping assays for both mutations, and a web-based SNP calling application were developed. The primary cause was determined to be contamination with normal oleic seeds resulting from inadequate screening protocols. In order to correct the problem, a faster screening method was acquired to accommodate a higher oleic acid threshold. Additionally, results showed the mutation in one homeolog is fixed in the program, dig date had no significant effect on oleic acid content, and minor modifiers segregating within the program explained 6% of the variation in oleic acid content.


Introduction
The two major market types of allotetraploid peanut (Arachis hypogaea, 2n = 4x = 40) in the US are runner-types and Virginia-types, which constitute 85% and 15% of the market share, respectively. The North Carolina State University (NCSU) peanut breeding program is the primary purveyor of Virginia-type peanut cultivars for the tristate production region of Virginia, North Carolina, and South Carolina (also known as the Virginia-Carolinas (VC) region). A major destination for Virginia-type peanuts is the in-shell market where, due to packaging constraints, short shelf life and the resulting rancid taste are a frequent consumer complaint (Mozingo et al. 2004). As with other market-types, the high oleic (HO) trait extends the shelf life of in-shell peanuts from 2-4 weeks to over 32 weeks (Norden et al. 1987; O'Keefe et al. 1993; Braddock et al. 1995; Mozingo et al. 2004). The NCSU program has committed to the release of only HO Virginia-type cultivars and thus no premium is currently offered for the trait in the VC region. However, adoption of the HO trait has proceeded more slowly in runner-type peanuts due to packaging flexibility, and premiums of $25/ton are common. Based on this premium, and with the VC region producing 457,050 tons of peanut in 2021 (USDA NASS 2021), the HO trait could have been valued at $11,426,250 in the VC region that year.
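The valuation above is a simple product of production and premium. A minimal sketch of that arithmetic follows; the function and variable names are illustrative and not part of the study.

```python
# Figures quoted in the text; names are illustrative assumptions.
VC_PRODUCTION_TONS = 457_050   # VC region production in 2021 (USDA NASS 2021)
PREMIUM_USD_PER_TON = 25       # premium commonly paid for HO runner-type peanut


def premium_value(tons: int, premium_per_ton: int) -> int:
    """Total value if every ton produced earned the premium."""
    return tons * premium_per_ton


value = premium_value(VC_PRODUCTION_TONS, PREMIUM_USD_PER_TON)
print(f"${value:,}")
```

This assumes the runner-type premium would apply uniformly to the whole VC crop, which is the same simplification the text makes.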
In normal oleic (NO) peanut seed, oleic acid is present in a ~2:1 ratio over linoleic acid (the O/L ratio) with oleic constituting 50% of the total fatty acid content, linoleic accounting for 28% and other fatty acids (predominantly palmitic acid) constituting the remainder (Mozingo et al. 2004). As a polyunsaturated fatty acid, linoleic acid is more prone to rapid oxidative rancidity than monounsaturated oleic acid (Moore and Knauft 1989; Ray et al. 1993). In HO peanut seed, the O/L ratio can reach 40:1 with 80% oleic and 2% linoleic (Mozingo et al. 2004). Thus, due to reduced linoleic acid concentration, HO peanut takes considerably longer to reach detectable levels of rancidity.
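The O/L ratios quoted above follow directly from the fatty acid percentages. A short sketch, using only the percentages given in the text (the function name is hypothetical):

```python
def ol_ratio(oleic_pct: float, linoleic_pct: float) -> float:
    """Oleic-to-linoleic (O/L) ratio from percent of total fatty acids."""
    return oleic_pct / linoleic_pct


normal_oleic = ol_ratio(50, 28)  # NO seed: 50% oleic, 28% linoleic
high_oleic = ol_ratio(80, 2)     # HO seed: 80% oleic, 2% linoleic

print(f"NO seed O/L = {normal_oleic:.2f}:1")
print(f"HO seed O/L = {high_oleic:.0f}:1")
```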
The high oleic trait appeared as a spontaneous mutation within the experimental breeding line 'F435' in the University of Florida peanut breeding program (Norden et al. 1987) and was subsequently determined to be controlled by two genes (Moore and Knauft 1989). The enzyme fatty acid desaturase (FAD), also known as microsomal oleoyl-PC desaturase or Δ12 fatty acid desaturase, converts oleic acid to linoleic acid by adding a second double bond to the carbon chain, and a reduction in the activity of this enzyme was shown to be the causal defect underlying the trait in developing seed (Ray et al. 1993). The homeologous genes were later named FAD2A and FAD2B and placed on Chromosomes A09 and B09, respectively (Jung et al. 2000b; Pandey et al. 2014). The gene action is considered partially recessive with a double homozygous mutant needed to fully condition the trait (Isleib et al. 2006b; Barkley et al. 2011, 2013). The HO trait is considered pleiotropic on levels of other fatty acids in the seed (Isleib et al. 1996, 2006b; Barkley et al. 2011, 2013) and does not affect flavor in fresh samples (Pattee et al. 2002a, b; Talcott et al. 2005; Isleib et al. 2006a, 2015). Because the trait is simply inherited, environment plays little to no role (Andersen and Gorbet 2002; Mozingo et al. 2004; Tonnis et al. 2020), and minor modifiers are thought to exist but have yet to be identified (Isleib et al. 2006b; Barkley et al. 2013; Tonnis et al. 2020; Otyama et al. 2022).
Communicated by Volker Hahn.
The causal mutation in FAD2A is a G to A transition resulting in a change from aspartic acid to asparagine at amino acid position 150 (D150N) of the resulting protein (Jung et al. 2000a; López et al. 2000; Bruner et al. 2001). This mutation occurs near a histidine-rich motif that is highly conserved among desaturases and is involved in the metal ion binding necessary for oxygen reduction (Jung et al. 2000a; López et al. 2000). While the origin of the FAD2A mutation is unknown, it is common in modern U.S. runner and Virginia-types but rarer in less adapted material, as well as Spanish and Valencia-types (Isleib et al. 1996; Lopez et al. 2001; Chu et al. 2007).
The FAD2B causal mutation is a single base insertion of adenosine (A) at 442 bp of the coding sequence, which is 15 bp after the second of three histidine motifs and causes a frameshift that eliminates the third histidine motif (López et al. 2000; Patel et al. 2004). In 1990, the NCSU peanut breeding program began incorporating the FAD2B mutation from F435 via backcrossing (Isleib et al. 1996). Despite the development of numerous molecular marker systems to facilitate this process (Chu et al. 2007, 2009; Barkley et al. 2010, 2011; Chen et al. 2010; Tonnis et al. 2020), selection for the HO trait remained phenotypic in the NCSU program.
Although the HO trait is under relatively simple genetic control, maintaining a high level of purity for it throughout all stages of cultivar development and release has proven problematic (Davis et al. 2021). Bailey II is the HO, near-isogenic replacement for the popular cultivar Bailey and was created by a 4X backcross of Bailey to a HO source. In a recent test, Bailey II narrowly passed the self-imposed threshold of 74% oleic acid content to be considered a HO cultivar (Table 1). Nevertheless, in other studies the oleic acid content of Bailey II is comfortably above 74% while other cultivars or breeding lines struggle to achieve this threshold (Balota et al. 2021). Three hypotheses were proposed for the sporadic low oleic acid content of HO material in the NCSU program: (1) Seed lots are contaminated with NO seeds lacking the mutation in FAD2B which, at high enough levels of contamination, lower the lot average. (2) Anecdotal reports implied that seed maturity influences oleic acid content, and that either immature seeds had insufficient time to produce oleic acid or overmature seeds lost oleic acid via metabolism. Thus, proper harvest timing may play a significant role in oleic acid content, and this may vary by line. (3) Minor modifiers influencing oleic acid content may be segregating within the program and cause significant reductions in oleic acid content in certain combinations. This study aims to determine the cause of suboptimal oleic acid content in NCSU HO peanut lines and initiate corrective action. To accomplish this objective, a manual seed chipping device, crude DNA isolation protocol, genotyping strategy and SNP caller were developed to economically and efficiently genotype a large number of individuals. Going forward, this approach will enable substantial expansion of marker-assisted selection (MAS) in the breeding program.

FAD2A allele frequency
The causal mutation in FAD2A is present on the Affymetrix Axiom Arachis2 48 K SNP array (Clevenger et al. 2018) as AX_147234396. Data on 200 Virginia-type lines from the NCSU program run on the array was collected to investigate the allele frequencies of the FAD2A mutation (Hancock 2018). DNA was isolated from young leaf tissue of an additional 136 lines not run on the array with the Qiagen  (Table 2). Thermocycling conditions followed the manufacturer's instructions with 42 cycles during Step 3 and afterwards were read on a BMG Labtech GmbH PHERAstar plate reader.

Plant material and field design
The eight lines listed in Table 1 were planted for 2 years (2019 and 2020) in two-row 31-foot plots at the Peanut Belt Research Station in Lewiston-Woodville, NC. The same seed source was used in both years. Of the eight lines, only Bailey is NO while the remaining seven are HO. Peanuts were dug at three dig dates in order to test for the effects of maturity: an optimum dig date (145 days after planting (DAP)), early (131 DAP, i.e., 2 weeks early) and late (159 DAP, i.e., 2 weeks late). Dig dates necessitated a strip-plot design in order to accommodate machinery: plots that needed to be dug on the same date had to be in the same strip. After digging, a small sample bag was harvested and shelled from each plot. There were two replications for 96 total experimental units or plots.

Phenotyping
Oleic acid content was determined by running 96 seeds from each plot (9,216 total seeds) on a Brimrose Luminar 3076 Seedmeister. This is the same machine used by the North Carolina Foundation Seed Producers (NCFSP) to verify seed lots as high oleic. This machine transmits light to a seed, measures how the light is reflected by the seed, and uses those measures in a model determined by gas chromatography to predict oleic acid content. After phenotyping and in an orderly fashion, each seed was placed in an individual well of a 24-square well microplate (Spex Sample Prep, Product # 2230) for tissue collection. This allowed the phenotype of each individual seed to be paired with its genotypic data.
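The plot and seed counts reported above can be cross-checked from the design factors. The sketch below assumes the 96 experimental units arise as lines × dig dates × replications × years, which is an interpretation of the design description and not an explicit statement in the text.

```python
# Design factors taken from the text; their multiplication is an assumption.
N_LINES = 8
N_DIG_DATES = 3
N_REPS = 2
N_YEARS = 2
SEEDS_PER_PLOT = 96

plots = N_LINES * N_DIG_DATES * N_REPS * N_YEARS
total_seeds = plots * SEEDS_PER_PLOT

# Each chipper stroke samples one 24-well plate of seeds.
chipper_strokes = total_seeds // 24
# Each 96-well plate receives chips from four 24-well plates.
dna_plates = total_seeds // 96

print(plots, total_seeds, chipper_strokes, dna_plates)
```

Under these assumptions, genotyping every phenotyped seed corresponds to 384 chipper strokes and 96 DNA plates.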

Design of manual seed chipper
To facilitate the tissue collection and individual genotyping of all 9,216 phenotyped seeds, a manual seed chipper was designed and constructed in the NCSU Biological and Agricultural Engineering Research Shop. A detailed overview of the seed chipper and its operation is presented in Supplementary File 1. The seed chipper consisted of a 2-ton Dayton Arbor Press (Model 467L16) with 24, 3.0 mm reusable rapid punch biopsy kits (World Precision Instruments # WP3030) mounted in a 6-by-4 layout within a stainless steel bracket. The bracket included an ejector plate that, when lowered, depressed the plunger on all 24 biopsy kits simultaneously, thus ejecting all 24 seed chips at once. The entire apparatus was mounted at the base of the arbor press ram and moved in concert with the ram. A stainless steel base was designed that could be rapidly alternated to accommodate the two different microplates needed, as described below. A spring-loaded lever arm was also added to maintain a consistent sampling height for the biopsy kits.

Tissue collection
The 24-square well microplate fit perfectly in the groove of the seed chipper's base, centering each biopsy kit over its respective seed. Therefore, samples were always taken from the middle of the seed, leaving the embryo at either end of the seed intact. With depression of the lever arm, all 24 biopsy kits moved down simultaneously into the 24-well plate and sampled the respective seed. After pushing the lever arm back up, seeds remained in their wells while samples were within the biopsy kit. The 24-well plate was removed and a flat metal sheet placed over the grooved base.
A 96-round well microplate (Spex Sample Prep, Product # 2210) was placed onto the metal sheet and moved under the biopsy kits. By design, the 96-well plate could fit under the biopsy kits in four different positions designated by ordinal

Crude DNA Isolation and Genotyping at FAD2B
The DNA isolation protocol was loosely based on an existing protocol used in cotton (Zheng et al. 2015). To each well, 100 μL of 100 mM NaOH, 2% Tween 20 was added and plates were vortexed for one minute. Plates were then incubated at 65 °C for ten minutes in a forced air oven. Following incubation, 100 μL of 100 mM Tris-HCl, 2 mM EDTA was added to each well and shaken vigorously for ten seconds. After a brief spin to collect all material in a well, 400 μL molecular biology grade water (VWR Cat No. 02-0201-1000) was added to each well. Then, 30 μL of supernatant was transferred to a new plate and mixed with 30 μL of molecular biology grade water. All centrifuge steps were performed in a Beckman Coulter Avanti J-15 benchtop centrifuge with a JS-4.750 swingbucket rotor. Without quantification or normalization, 1 μL of DNA from each seed was genotyped as described above for FAD2A except the FAD2B genotyping assay (Table 2) was used instead.
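The volumes in the protocol above imply a fixed overall dilution of the crude lysate. The following sketch works through that arithmetic; it ignores the volume contributed by the seed chip itself, and the variable names are illustrative.

```python
# Volumes (in microliters) taken from the protocol described in the text.
LYSIS_UL = 100           # 100 mM NaOH, 2% Tween 20
NEUTRALIZATION_UL = 100  # 100 mM Tris-HCl, 2 mM EDTA
WATER_UL = 400           # water added after the brief spin
TRANSFER_UL = 30         # supernatant moved to the new plate
DILUENT_UL = 30          # water mixed with the transferred supernatant

well_volume = LYSIS_UL + NEUTRALIZATION_UL + WATER_UL
first_dilution = well_volume / LYSIS_UL                      # relative to the lysate
second_dilution = (TRANSFER_UL + DILUENT_UL) / TRANSFER_UL   # in the new plate
overall_dilution = first_dilution * second_dilution

print(f"well volume: {well_volume} uL")
print(f"overall dilution of lysate: {overall_dilution:.0f}-fold")
```

With 1 μL used per genotyping reaction, the 60 μL working plate leaves ample template for repeat assays, dead volume aside.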

Data analysis
Statistical analysis for the strip-split plot design was performed using the PROC GLM procedure in SAS version 9.4 (Cary, NC, USA). Within each year, the eight peanut lines were arranged among each of the dig dates (strip-plots) and then grouped in two blocks, representing replications. For each factor, the strip- and split-plots were randomly assigned within the blocks and strip-plots, respectively. The analysis of the strip-split plot factors was conducted using appropriate F-tests based on the expected mean squares. Observations for the analysis were limited to seeds with a homozygous FAD2B genotype (i.e., seeds that 'failed' to produce a FAD2B genotype or were heterozygous were excluded). All means separation tests were conducted using Fisher's protected least significant difference (LSD) at a significance level of α = 0.05, and are presented for dig date and line.

The FAD2A mutation is fixed in the NCSU peanut breeding program
Of the 336 lines genotyped with either the Axiom Arachis2 array or the FAD2A PACE assay, all carried the mutant allele at FAD2A. This includes any cultivar released by the NCSU program since its inception and any elite breeding line from 2002-2018 (which includes the eight lines from Table 1). Thus, the program can be considered fixed for the FAD2A mutant allele and variation at FAD2A cannot explain any observed inadequacies of oleic acid values in the program's HO material.

Seed chipping did not impact germination rate
Of the 96 seeds chipped during the germination test, 74 remained intact after tissue collection, and the remaining 22 split to some degree, mostly along the sagittal plane. However, germination was not affected, as 92 of 92 unchipped seeds germinated within 1 week of planting while 94 of 95 chipped seeds germinated. The chipped seed that did not germinate was a split seed. Five seeds (four unchipped and one chipped) were not found when digging through the soil at the end of the germination test. All five were planted in close proximity and showed signs of being excavated by rodents.

Genotyping at FAD2B
In batches of 384, tissue collection and DNA isolation took approximately two hours at an estimated cost of ~ $0.26 per sample. An example marker figure downloaded from the SNP caller is presented in Fig. 1. Seeds carrying the HO mutation in FAD2B are colored blue and cluster in the top left corner while seeds carrying the wild type NO allele are green and cluster in the lower right. Heterozygous seeds are red and lie between the two main clusters. Seeds that failed genotyping are yellow and cluster in the bottom left corner. In total, 24 such genotyping figures were produced and each had a corresponding marker call file.
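The cluster layout described above (HEX-dominant HO calls near the Y-axis, FAM-dominant NO calls near the X-axis, heterozygotes in between, and failures near the origin) can be illustrated with a simple angle-and-magnitude rule. This is only a sketch of one common way to call two-dye endpoint data; it is not the algorithm used by the SNP caller in this study, and the thresholds `min_signal` and `hom_margin_deg` are hypothetical values chosen for illustration.

```python
import math


def call_genotype(fam: float, hex_: float,
                  min_signal: float = 0.3,
                  hom_margin_deg: float = 25.0) -> str:
    """Classify one well from normalized, non-negative FAM (x) and HEX (y) signal.

    Thresholds are illustrative assumptions, not values from the study.
    """
    # Wells with little total signal sit near the origin and are failed.
    if math.hypot(fam, hex_) < min_signal:
        return "fail"
    # Angle from the FAM axis: 0 degrees = pure FAM, 90 degrees = pure HEX.
    angle = math.degrees(math.atan2(hex_, fam))
    if angle >= 90.0 - hom_margin_deg:
        return "HO"   # homozygous mutant, HEX-dominant
    if angle <= hom_margin_deg:
        return "NO"   # homozygous wild type, FAM-dominant
    return "het"      # mixed signal between the two homozygous clusters


wells = {"A1": (0.1, 0.9), "A2": (0.9, 0.1), "A3": (0.6, 0.6), "A4": (0.05, 0.05)}
calls = {well: call_genotype(f, h) for well, (f, h) in wells.items()}
print(calls)
```

Fixed thresholds are the simplest design choice; a production caller would more likely fit clusters per plate, since signal intensity varies between runs.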
Overall genotyping success rate, defined as the proportion of individual seeds that produced a high-confidence genotype call, was 98.4% across both years with 97.1% in 2019 and 99.7% in 2020. The proportion of seeds that produced a heterozygous call was 1.1% overall with 1.2% in 2019 and 1.0% in 2020. Natural cross-pollination in peanut is negligible and, since all lines were highly inbred, residual heterozygosity was unlikely. Therefore, heterozygous seeds were considered contaminants or genotyping errors and dropped from the study along with the fails, leaving 8,964 seeds.

Contamination at FAD2B is the primary cause of suboptimal oleic acid content
As can be seen from Table 3, no line in the study was completely pure. While some lines, such as N14023 (0.72%) and Sullivan (1.34%), had relatively low levels of contamination, others, such as Bailey II (19.71%) and N15041 (16.35%), had high levels. When looking at genotypically NO vs HO seeds within each line, the mean percent oleic acid content separated as expected, with a 1% increase in contamination causing roughly a 0.2% decline in mean oleic acid content. Of the 7,855 seeds from the seven HO lines, 645 (8.21%) were genotypically NO, while for Bailey, the only NO line in the study, 36 of 1,109 seeds (3.25%) were HO contaminants. Therefore, throughout the study, 681 of 8,964 seeds (7.60%) were genotypically confirmed to be contaminants. Most importantly, when looking only at genotypically confirmed HO seed (Tables 3 and 4), all seven HO lines in the study comfortably met the 74% threshold.
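The contamination percentages above can be reproduced from the seed counts, and the effect of contamination on a lot average can be illustrated with a simple two-component mixture. In the sketch below the seed counts come from the text, while the HO and NO mean values passed to `lot_mean` are hypothetical and chosen only so that the slope matches the roughly 0.2-point decline per 1% contamination reported.

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


def lot_mean(contam_frac: float, ho_mean: float, no_mean: float) -> float:
    """Mean oleic acid content of an HO lot containing a fraction of NO seeds."""
    return (1.0 - contam_frac) * ho_mean + contam_frac * no_mean


# Seed counts reported in the text.
no_in_ho_lines = percent(645, 7855)
ho_in_bailey = percent(36, 1109)
all_contaminants = percent(645 + 36, 7855 + 1109)
print(f"{no_in_ho_lines:.2f}% {ho_in_bailey:.2f}% {all_contaminants:.2f}%")

# Hypothetical means (NOT from the study): a 20-point gap between HO and NO seed.
HO_MEAN, NO_MEAN = 83.0, 63.0
drop_per_percent = lot_mean(0.00, HO_MEAN, NO_MEAN) - lot_mean(0.01, HO_MEAN, NO_MEAN)
print(f"decline per 1% contamination: {drop_per_percent:.2f} points")
```

In this mixture model the decline per 1% contamination is simply one hundredth of the gap between the HO and NO means, so lines with different means would show slightly different slopes.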

Minor modifiers explaining 6% of variation in oleic acid content likely exist
When looking at genetically confirmed HO seed in the seven high oleic lines, there remained significant differences in mean oleic acid content between lines. Bailey II (85.50%) and N14004 (85.03%) were statistically the greatest, confirming that pure lots of Bailey II should comfortably meet oleic acid thresholds and perform near the top of HO lines. NC 20 (79.00%) was statistically the worst performing line but still comfortably above the 74% threshold. N15017 (84.15%), Sullivan (83.59%), Emery (83.14%), and N15041 (79.74%) showed intermediate levels of oleic acid content with varying statistical differences. Thus, genes with a minor effect on oleic acid content (i.e., minor modifiers) likely exist that explain approximately 6% of the variation in oleic acid content from 79 to 85%.
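The spread between line means quoted above can be computed directly. The sketch below uses the seven means reported in the text and checks them against the 74% threshold; the dictionary layout is illustrative.

```python
# Mean oleic acid content (%) of genotypically confirmed HO seed, from the text.
HO_LINE_MEANS = {
    "Bailey II": 85.50,
    "N14004": 85.03,
    "N15017": 84.15,
    "Sullivan": 83.59,
    "Emery": 83.14,
    "N15041": 79.74,
    "NC 20": 79.00,
}
THRESHOLD = 74.0

highest = max(HO_LINE_MEANS, key=HO_LINE_MEANS.get)
lowest = min(HO_LINE_MEANS, key=HO_LINE_MEANS.get)
spread = HO_LINE_MEANS[highest] - HO_LINE_MEANS[lowest]
all_pass = all(mean > THRESHOLD for mean in HO_LINE_MEANS.values())

print(f"{highest} - {lowest} = {spread:.2f} percentage points; all pass: {all_pass}")
```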

Dig date had no effect on oleic acid content
After combining all genetically confirmed HO seed across the seven lines, dig date, used as a proxy for seed maturity, did not have a significant effect on oleic acid content (Table 5). There was a slight numeric increase in oleic acid content as dig date progressed but this did not approach statistical significance. Figure 2 shows the oleic acid content of all Bailey seeds genotypically confirmed as NO in red and all Bailey II seeds confirmed as HO in grey. Of note is the pronounced overlap between the two distributions highlighted by the black circle. This shows that HO seeds with the lowest oleic acid content are phenotypically indistinguishable from NO seeds at the higher end of the oleic acid distribution. Thus, when an arbitrary threshold of 74% was imposed to differentiate HO vs NO seed, many NO seeds were inadvertently retained. These are represented by any red bars to the right of the black line in Fig. 2. Additionally, many true-breeding HO seeds were discarded, as shown by any grey bars to the left of the black line. Eliminating all genetically confirmed NO seed would have required an oleic acid threshold of 83% (green line in Fig. 2).
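The trade-off described above, where a low threshold retains NO seed and a high threshold discards true HO seed, can be sketched with a small screening function. The seed values below are invented toy data for illustration only and do not come from the study.

```python
def screen(seeds, threshold):
    """Count screening errors at a given oleic acid threshold.

    `seeds` is a list of (oleic_pct, genotype) pairs with genotype "HO" or "NO".
    Seeds strictly above the threshold are retained as HO.
    Returns (NO seeds wrongly retained, HO seeds wrongly discarded).
    """
    no_retained = sum(1 for oleic, g in seeds if g == "NO" and oleic > threshold)
    ho_discarded = sum(1 for oleic, g in seeds if g == "HO" and oleic <= threshold)
    return no_retained, ho_discarded


def min_clean_threshold(seeds):
    """Lowest threshold that rejects every NO seed: the highest NO value."""
    return max(oleic for oleic, g in seeds if g == "NO")


# Toy data (hypothetical): overlapping NO and HO distributions.
toy = [(55, "NO"), (68, "NO"), (76, "NO"), (82, "NO"),
       (72, "HO"), (79, "HO"), (84, "HO"), (88, "HO")]

print(screen(toy, 74))
clean = min_clean_threshold(toy)
print(clean, screen(toy, clean))
```

Raising the threshold until no NO seed passes removes the contamination at the cost of discarding more genuine HO seed, which is the compromise a stricter screen implies.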

Discussion
Seed chipping was facilitated by the large size and soft texture of peanut seed relative to that of most other row crops. In addition, the germ is small and located at one end of the seed. While determining which end of the seed contains the germ was difficult and time-consuming, the middle of the seed represented a large target area to hit without compromising viability. The seed chipper should work well with all market-types of peanut as well as tree nuts and seeds of similar size and texture. With modifications, the seed chipper could work on plant species with smaller or harder seeds.
The genotyping strategy developed here (seed chipper, crude DNA isolation, PACE 2.0 genotyping, and SNP caller) will allow the implementation of low-cost, high-throughput, early generation MAS in a small, public sector, cultivar development program. This pertains to not only the HO mutation in FAD2B demonstrated here, but also any subsequent markers developed for other traits of interest. This might include the HO mutation in FAD2A, which while fixed within elite material, could be absent in external material used in crossing, particularly sources of disease resistance such as wild species and landraces. Genotyping directly off seed, as opposed to leaf tissue, will allow MAS to be conducted quicker while reducing resources committed to greenhouse and field space by only planting genotypically selected seed. This is especially important for programs, such as ours, whose primary field sites are a considerable distance from their laboratory space. The SNP caller provides a free, simple, high-throughput method to convert any microplate reader output file to actionable genotypic data that can be accessed anywhere with an internet connection without the need to purchase and install additional software. The SNP caller will be continuously updated as the need for additional features arises.
Fig. 1 Example marker figure produced by the SNP caller. Seeds homozygous for the HO mutation in FAD2B will produce predominantly HEX fluorescence signal and plot towards the Y-axis (blue cluster in top left corner). Seeds homozygous for the wild type NO allele in FAD2B will produce predominantly FAM fluorescence signal and plot towards the X-axis (green cluster in bottom right corner). Heterozygous seeds will produce an equal mix of both signals and cluster in red between the two homozygous clusters. Seeds that fail genotyping will cluster near the origin (yellow in bottom left corner)
While early generation MAS should alleviate the contamination issue in new crosses, purifying existing cultivars and breeding lines by genotyping is inefficient. To adopt the 83% threshold proposed above for Bailey II would create three problems. First, large amounts of seed would be discarded, which while wasteful, should be manageable given the large amounts of seed on hand for material at this stage in the breeding process. Second, different thresholds may be needed for different lines depending on minor modifiers. Lastly, existing equipment is too slow to screen the volume of seed that would be required. Scanning each seed took approximately 20 s, or 52 h for all 9,216 seeds in the experiment. Constant manual intervention is also needed to recognize if a seed was misread and the measurement needed to be repeated. To rectify this, with the help of NCFSP, a Qualysense QSorter Explorer was purchased. This machine is capable of processing 20 seeds/sec and will also modernize and improve the seed grading process (Davis et al. 2021).
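The throughput gap between the two instruments can be made concrete with the figures given above. The sketch below takes the scan time as exactly 20 s per seed, whereas the text gives it as approximate, so the result lands slightly below the roughly 52 h reported; variable names are illustrative.

```python
TOTAL_SEEDS = 9216
SEEDMEISTER_S_PER_SEED = 20      # approximate scan time per seed, from the text
QSORTER_SEEDS_PER_S = 20         # stated QSorter processing rate

seedmeister_hours = TOTAL_SEEDS * SEEDMEISTER_S_PER_SEED / 3600
qsorter_minutes = TOTAL_SEEDS / QSORTER_SEEDS_PER_S / 60
speedup = SEEDMEISTER_S_PER_SEED * QSORTER_SEEDS_PER_S  # ratio of seeds per second

print(f"existing equipment: {seedmeister_hours:.1f} h")
print(f"QSorter: {qsorter_minutes:.1f} min ({speedup}x faster)")
```

These figures cover scanning time only and exclude the manual intervention and re-reads mentioned in the text.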
The QSorter provides enough throughput to screen every line in every generation, which is not possible with existing equipment. As the shift to HO nears completion for Virginia-type peanut production in the VC region, opportunities for contamination will continue to decrease as the NO allele at FAD2B becomes less frequent. However, the NCSU program will likely continually cross with NO material, particularly for disease resistance; therefore, HO selection at both loci will be perpetual.
What remains an open question at present is the utility of breeding for the minor modifiers affecting oleic acid content. Identifying the underlying, small-effect genes would likely take considerable time and effort, as would selecting for them with molecular markers. Ultimately, the answer to this question likely depends on the success of future efforts to reduce contamination. If maintaining pure seed lots, particularly at large scale, remains problematic then selection for minor modifiers may be worthwhile. However, if contamination issues can be resolved, selection for minor modifiers is likely unnecessary.
The insignificance of dig date, and by extension seed maturity, simplifies both breeding and production. Due to practical considerations, nursery plots are generally dug and harvested at the same time. Had maturity affected oleic acid content, this practice could have caused otherwise promising material to fail the HO screen and be discarded despite carrying the desired mutations, and greater care would have been needed to harvest each line at optimum maturity to assess oleic acid content accurately. This result also enables flexibility during production harvests, as weather and limited availability of specialized equipment can be accommodated without fear of affecting oleic acid content.
The contamination issues are not endemic to any particular line and vary based on seed source or even random sampling error within a large seed bag. Repeating the experiment with different seed sources may reveal different lines with heavy contamination than those reported here. Contamination can occur at many points in the breeding process including planting, digging, harvesting, shelling, and seed handling. Relying on a slow machine to screen limited quantities of seed, and failure to recognize the large overlap in oleic acid distributions between NO and HO seed, led to the present contamination issues (Fig. 2). As healthy peanut plants are prolific seed producers, even a single inadvertently advanced NO seed could quickly lead to contamination issues since screening was performed sparingly due to labor and time constraints. Therefore, through a combination of genotyping and generational, high throughput screening, low oleic contamination issues in Virginia-type peanuts for the VC region should be resolved.
Fig. 2 Oleic acid distribution of pure Bailey and Bailey II seeds in the experiment. The black circle highlights the sizable overlap between the two distributions particularly for near-isogenic lines. Red bars to the right of the black line indicate NO seeds that would have passed the HO screen and contributed to the contamination issue. The green line indicates the threshold needed to eliminate contamination. Figure was created using the seaborn package in Python and edited in Microsoft PowerPoint