The genome is 137116 bp long (Table 1). The LSC is 81132 bp long, the SSC is 12820 bp long and each of the two IR copies is 21582 bp long. No ambiguous bases were found in the genome.
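The reported region lengths can be cross-checked against the total, assuming the usual quadripartite chloroplast structure in which the IR is present in two copies. The short Python sketch below only restates the figures given above; the function name is illustrative.

```python
# Region lengths in bp, as reported in the text.
LSC = 81132
SSC = 12820
IR = 21582      # length of one inverted-repeat copy
TOTAL = 137116  # reported genome length


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Length of a circular genome made of one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# 81132 + 12820 + 2 * 21582 = 137116
print(quadripartite_length(LSC, SSC, IR) == TOTAL)  # True
```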
As stated above, SNP calling was hampered by the numerous unassigned bases present in 7 of the 8 other available genomes. Analyses therefore focused on indels. To this end, chloroplast genomes were partitioned by sub-unit, aligned using MAFFT 7 [12] and then visualized using MEGA X [13].
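As an illustration of how unassigned bases can be quantified before attempting SNP calling, the following minimal Python sketch counts every character other than A, C, G or T in an unaligned sequence. The function name and the example string are hypothetical and are not taken from the genomes discussed here.

```python
from collections import Counter

UNAMBIGUOUS = set("ACGT")


def count_ambiguous(seq: str) -> dict:
    """Count every character that is not A, C, G or T (case-insensitive).

    Intended for unaligned sequences: alignment gaps ('-') would be
    reported as well, since they are not in UNAMBIGUOUS.
    """
    counts = Counter(seq.upper())
    return {base: n for base, n in counts.items() if base not in UNAMBIGUOUS}


# Hypothetical example sequence containing N, R and Y codes.
example = "ACGTNNACGRYACGT"
print(count_ambiguous(example))  # {'N': 2, 'R': 1, 'Y': 1}
```

A genome for which this returns an empty dictionary contains no ambiguous bases.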
Results provided evidence of the close relationship between S. sylvestre Host introd. no. 6047 and Secale strictum voucher R 1108 (KY636137). A total of 16 indels were shared by these two genomes and discriminated them from all the others (KC912691, KY636135, KY636136, KY636132, KY636134, KY636133, KY636138). These indels ranged in size from 2 to 36 bp. Thirteen of them were found in intergenic sequences (rpl32 – tRNA-L; psaC – ndhE; rrn16 – trnI-GAU; atpH – atpF; psaA – ycf3; trnT-UGU – trnL-UAA; trnF-GAA – ndhJ; atpB – rbcL; ycf4 – cemA; trnP-UGG – psaJ; psaJ – rpl33; clpP – psbB; rpl16 – rps3). Notably, the remaining three indels occurred in intronic sequences, one inside a tRNA gene (trnK-UUU intron) and two inside protein-coding genes (rps16 intron; petD intron), a feature that has received recent attention [14, 15], especially for distinguishing closely related species [16].
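The criterion used here, an indel shared by two genomes and absent from all the others, can be expressed programmatically. The Python sketch below is not the procedure followed in this work, where alignments were inspected visually; it is one possible way to scan an existing alignment for gap runs that separate a focal group from the remaining sequences. The function name, labels and toy sequences are hypothetical.

```python
def shared_indels(alignment: dict, focal: set) -> list:
    """Find gap runs that separate the focal sequences from all others.

    alignment maps sequence names to aligned strings of equal length.
    Returns (start, end, kind) tuples with 0-based, end-exclusive columns;
    kind is 'gap_in_focal' or 'gap_in_others'.
    """
    others = [name for name in alignment if name not in focal]
    if not focal or not others or not focal <= set(alignment):
        raise ValueError("need focal and non-focal sequences in the alignment")
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("aligned sequences must have equal length")

    def pattern(col):
        focal_gap = [alignment[name][col] == "-" for name in focal]
        other_gap = [alignment[name][col] == "-" for name in others]
        if all(focal_gap) and not any(other_gap):
            return "gap_in_focal"
        if all(other_gap) and not any(focal_gap):
            return "gap_in_others"
        return None

    runs, start, current = [], None, None
    # Iterate one column past the end so a run touching the end is closed.
    for col in range(length + 1):
        kind = pattern(col) if col < length else None
        if kind != current:
            if current is not None:
                runs.append((start, col, current))
            start, current = col, kind
    return runs


# Toy alignment (hypothetical sequences, not real data).
alignment = {
    "focal_1": "ACG--TTAGC",
    "focal_2": "ACG--TTAGC",
    "other_1": "ACGGATT--C",
    "other_2": "ACGGATT--C",
}
print(shared_indels(alignment, {"focal_1", "focal_2"}))
# [(3, 5, 'gap_in_focal'), (7, 9, 'gap_in_others')]
```

The length of each indel is `end - start`. Whether a gap run represents an insertion or a deletion cannot be decided from the alignment alone, which is why the labels describe only where the gap lies.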
Limitations
The protocol itself showed no limitation, as it yielded a complete and unambiguous genome sequence. However, many more clean genome sequences are needed to identify the most reliable molecular markers for species identification and phylogeny, especially with regard to SNPs.