Genetic diversity and heterotic orientation of South Africa maize inbred lines towards tropical and temperate testers

An efficient hybrid breeding program defines and utilizes a few heterotic groups. The objectives of this study were to determine the genetic diversity of a collection of South Africa maize inbred lines and their alignment towards tropical and temperate testers. Forty-four maize inbred lines were genotyped with 56,110 single nucleotide polymorphism (SNP) markers. A total of 40 lines were crossed to two tropical and two temperate inbred line testers. Due to seed shortages, testcross hybrids containing 14 and 18 lines were used in the 2014/2015 and 2015/2016 seasons, respectively. Genotypes and specific combining ability (SCA) effects of hybrids were significantly different (P < 0.05) for grain yield. There was a weak correlation between molecular genetic distances and both grain yield mean performance and SCA effects of hybrids, indicating that productivity of maize inbred lines could not be reliably determined based on molecular genetic distances. The SCA data classified these maize inbred lines into three heterotic groups with respect to both tropical and temperate testers. The study also indicated a high level of diversity among the maize inbred lines, which was shown by both the dendrogram and molecular genetic distances. The SNP marker data classified the inbred lines into 11 clusters that could be simplified into three major groups of normal maize endosperm and two groups of quality protein maize endosperm types. However, the SNP data indicated that maize lines were more aligned towards the tropical than temperate inbred testers. This information would be useful for simplifying heterotic classification of the lines with profound implications for breeding progress.


Introduction
Advances in maize breeding have an important role in continually developing and deploying new maize cultivars that exhibit improved grain yield and desirable agronomic characters. The strength and success of any breeding program depends on a well-established and properly arranged germplasm (Rajendran et al. 2014), which provides potential exploitation of maximum heterosis in crosses (Bidhendi et al. 2012; Librando and Magulama 2008). Therefore, identification of heterotic groups (HGs) and heterotic patterns is the principal step towards developing a well-established germplasm (Barata and Carena 2006; Delucchi et al. 2012) for a hybrid-oriented breeding programme. Melchinger and Gumber (1998) defined a heterotic group as a group of related or unrelated genotypes from the same or different populations, which show similar combining ability and heterotic response when crossed with genotypes from other genetically different groups, whereas a heterotic pattern is observed when a cross between a specific pair of two heterotic groups expresses a high level of heterosis.
Classification of germplasm into heterotic groups offers several advantages in most breeding programs. In recent years, bio-fortification, such as the use of quality protein maize, has been highly encouraged to curb malnutrition. When inbred lines from different known and unknown origins are used, a large number of hybrid combinations are made which are later evaluated in multi-location environments (Bidhendi et al. 2012). This is labour-intensive, lengthy and costly (Aguiar et al. 2008). When heterotic groups are formed, however, undesirable combinations are avoided by crossing inbred lines from the existing and divergent groups (Aguiar et al. 2008). This allows the exploitation of maximum heterosis (Melchinger and Gumber 1998) among selected lines in hybrid combination, reduces the chance of missing superior hybrids and subsequently improves breeding efficiency. According to Fan et al. (2008), breeding efficiency is the percentage of superior hybrids obtained relative to the total number of crosses made between lines from distinct heterotic groups in a breeding program. The employment of simple heterotic groups would simplify germplasm management (Fan et al. 2003). Selecting parental lines from defined heterotic groups is rapid and efficient (Kanyamasoro et al. 2012). The information on simplified heterotic orientation is crucial for setting up an efficient maize breeding program.
Knowledge of combining ability among maize inbred lines in a program is paramount for establishing heterotic orientations. General combining ability (GCA) and specific combining ability (SCA), as defined by Sprague and Tatum (1942), are used to determine the utility of maize inbred lines. They reveal the ability of inbred lines to combine with each other during hybridization and transmit desirable genes to the F1 crosses (Fasahat et al. 2016). A survey of the literature indicates that SCA is an important statistical tool for classifying maize germplasm into heterotic groups. The SCA data have a higher predictive value for F1 grain yield than heterosis data (Betran et al. 2003). However, the SCA data can be used together with other methods that are cheaper and even more precise, such as molecular markers, which are not affected by genotype x environment interaction effects. These molecular markers include single nucleotide polymorphisms (SNPs), which are abundant and easily automated.
Genotyping maize inbred lines with molecular markers and pedigree analyses are some of the methods which can be used to establish levels of diversity and classify germplasm into potential heterotic groups (Parentoni et al. 2001; de Pinto et al. 2003; Fan et al. 2003, 2004; Menkir et al. 2004; Barata and Carena 2006; Aguiar et al. 2008; Dhliwayo et al. 2009; Semagn et al. 2012; Badu-Apraku et al. 2013; Rajendran et al. 2014; Nyombayire et al. 2016). Effectiveness of these methods in classifying germplasm is mainly validated by estimating SCA of the hybrids using a specific mating design such as the line x tester method. The line x tester mating scheme utilizes a few divergent testers of known heterotic groups to determine heterotic orientation of inbred lines. Among other tools, the line x tester mating scheme has been widely used and reported in the literature (Vasal et al. 1992; Menkir et al. 2003; Li et al. 2007; Aguiar et al. 2008; Fan et al. 2010; Rajendran et al. 2014; Fasahat et al. 2016).
The public maize germplasm lines in South Africa have been maintained in at least seven heterotic groups since the beginning of professional maize breeding in the country. These heterotic groups have been utilised extensively in different combinations to develop commercial maize hybrids since the Saunders and Gevers eras (Fourie and Gevers 1987; Gevers and Whythe 1987; Gevers and Lake 1998). The use of a large number of heterotic groups from different populations complicates the breeding process and decision making during selection of parental lines. Employment of many heterotic groups is associated with several major drawbacks, including high costs of making and evaluating hybrids, an extended timeline to obtain results (de Pinto et al. 2003; Aguiar et al. 2008; Bidhendi et al. 2012) and a subsequent reduction in breeding efficiency (Fan et al. 2014). This prompts the need to simplify the heterotic classification of the lines in South Africa for use in breeding.
Due to climate change, there is a need to continuously introgress and incorporate germplasm from other regions, such as tropical Africa and tropical South America, as well as temperate maize inbred lines from the USA. This would enhance genetic diversity and adaptation of the lines to climate change effects, such as increasing frequency and severity of drought, and global warming among other challenges.
Given the foregoing, the objectives of the current study were to determine diversity of South Africa maize inbred lines, and investigate their orientation towards the widely utilized tropical and temperate testers. The use of tropical and temperate testers was prudent because South Africa maize production falls within subtropical to warm temperate conditions. In this regard, both tropical and temperate maize germplasm could be exploited to develop productive hybrids, using a simplified heterotic classification system.

Germplasm
The forty maize inbred lines and the two temperate testers (B73 and MO17) were selected from the Agricultural Research Council of South Africa maize germplasm database, while the two tropical testers (CML312 and CML444) came from CIMMYT. Six representative maize inbred lines were randomly selected from each of the seven traditional heterotic groups. The seven heterotic groups were described by the forebears of the public maize breeding programs in the region, such as Saunders, Gevers, Olver, Fourie and others (Gevers and Whythe 1987; Gevers and Lake 1998; Olver 1998) and Fourie (2017, personal communication). According to the literature, the inbred lines used were developed from germplasm which was introduced from the USA and Australia, and mainly from local open-pollinated varieties which mostly originated from the USA Corn-belt (Gevers 1997). There is a possibility of infusion of temperate and tropical genetic backgrounds through both incorporation and introgression of the USA's temperate germplasm in South Africa. Therefore, it was prudent that the 40 lines were crossed with four testers, representing equally the heterotic Groups A and B of both tropical and temperate origin. The tropical testers were the International Maize and Wheat Improvement Center's (CIMMYT) lines from Group A (CML312) and Group B (CML444). The temperate testers were the Group A (B73) and Group B (MO17) inbred lines from the USA. These testers and their derivatives are widely used worldwide. The B73 and MO17 maize testers are temperate inbred lines representing the US Reid (heterotic group A) and Lancaster (heterotic group B), respectively. These testers were developed by Iowa State University and the University of Missouri, respectively. The CML312 and CML444 are CIMMYT tropical maize inbred lines that were developed by CIMMYT breeding programs in Mexico and Harare in Zimbabwe, respectively.
The CML312 is an intermediate maturing and subtropical line that is drought tolerant. The CML444 is a late maturing and mid-altitude or subtropical adapted inbred line. According to CIMMYT (2001), the heterotic Group A to which CML312 belongs consists of the pro-Tuxpeno (Mexico), Kitale (Kenya), BSSS and N3 (Zimbabwe) germplasm types. Most of the germplasm which is aligned to this group is more of dent grain texture. The CML444 is the counterpart of the CML312 in CIMMYT's heterotic Group B. It corresponds to the ETO, Ecuador 573, Lancaster and the SC (Zimbabwe) germplasm (CIMMYT 2001). The germplasm lines in this group tend to be of a flint grain texture type in the CIMMYT heterotic system. CML444 is drought and low nitrogen tolerant (CIMMYT 2001). The N3 and SC heterotic pattern has been reviewed and discussed in detail (Derera and Musimwa 2015; Musimwa and Derera 2017).
[...] (29 years). During the 2015/16 season, the Potchefstroom Station received 67% and Cedara got 78% of the long-term average total rainfall of 541 mm and 662 mm, respectively, while both stations received 93-95% of the long-term total rainfall in the 2014/15 season. The 2015/16 season was characterised by the El Niño associated drought, which affected the whole Southern Africa region. Maximum daily average temperatures were at least 2°C above the long-term average of 29°C and 25°C at Potchefstroom and Cedara stations, respectively. The average night or minimum temperatures were at least 1°C below the long-term average of 15°C and 14°C at Potchefstroom and Cedara, respectively.
All trials were laid out as α-lattice designs with two replicates within the sites. A compound fertiliser (NPK, 3:2:1) was applied as a basal fertiliser prior to sowing, at a rate of 25 kg N ha⁻¹, 17 kg P ha⁻¹ and 8 kg K ha⁻¹. Each hybrid entry was planted to a two-row plot of 4 m length, and the intra- and inter-row spacing were 0.25 m and 0.75 m, respectively. Thirty-four seeds (two per hill) were initially sown and later thinned to 17 vigorous plants per row to achieve 53,333 plants ha⁻¹. Lime ammonium nitrate (LAN, 33% N) was applied as a top dressing at a rate of 150 kg ha⁻¹, at four weeks after crop emergence. Weeds were controlled using the pre-emergence herbicide Bateleur Gold 650EC (flumetsulam (sulfonanilide), S-metolachlor) at 1.3 L ha⁻¹ and the post-emergence herbicide Basagran (480 g/L bendioxide (thiadiazine)) at 2.5 L ha⁻¹. This chemical weed control was augmented by occasional manual weeding when needed. Insecticides were applied to control stalk-borer using Karate (50 g/L lambda-cyhalothrin) at 70 mL ha⁻¹. Due to the below normal total rainfall which was experienced at both stations, supplementary irrigation water was applied until crops reached physiological maturity, especially during the 2015/16 season. Grain yield per plot was estimated and converted to tonnes ha⁻¹ at 12.5% grain moisture content in line with the standard practice for maize production in South Africa.
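The plot arithmetic described above can be checked with a short sketch. The spacings, the two-row 4 m plot and the 12.5% standard moisture come from the text; the function names, the moisture-adjustment formula (a common dry-matter basis conversion, assumed here rather than stated by the authors) and the example plot weight and field moisture are hypothetical.

```python
# Illustrative sketch only: helper names and the example plot values are
# hypothetical; spacings, plot size and 12.5% moisture are from the text.

def plants_per_hectare(intra_row_m: float, inter_row_m: float) -> float:
    """Plant population implied by a regular planting grid."""
    return 10_000.0 / (intra_row_m * inter_row_m)


def yield_t_per_ha(plot_grain_kg: float, field_moisture_pct: float,
                   plot_area_m2: float, standard_moisture_pct: float = 12.5) -> float:
    """Convert plot grain weight to t/ha at a standard moisture content.

    Assumed formula: adjust to the standard moisture on a dry-matter basis,
    then scale from plot area to one hectare.
    """
    dry_matter_kg = plot_grain_kg * (100.0 - field_moisture_pct) / 100.0
    adjusted_kg = dry_matter_kg / ((100.0 - standard_moisture_pct) / 100.0)
    kg_per_ha = adjusted_kg / plot_area_m2 * 10_000.0
    return kg_per_ha / 1000.0


# Two rows x 4 m length x 0.75 m between rows
PLOT_AREA_M2 = 2 * 4.0 * 0.75

density = plants_per_hectare(0.25, 0.75)
# Hypothetical plot: 4.0 kg of grain harvested at 18% moisture
example_yield = yield_t_per_ha(4.0, 18.0, PLOT_AREA_M2)

print(f"{density:.0f} plants/ha")
print(f"{example_yield:.2f} t/ha at 12.5% moisture")
```

With 0.25 m x 0.75 m spacing each plant occupies 0.1875 m², which reproduces the reported 53,333 plants ha⁻¹.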
Field data analysis and classification of lines
Analysis of variance (ANOVA) was performed on plot means for grain yield data across sites for each season. Lines that did not cross successfully with all testers were excluded from the analysis; hence, owing to seed shortages, testcross hybrids involving 14 and 18 lines were used in the 2014/2015 and 2015/2016 seasons, respectively. Furthermore, one of the four testers did not cross well in the hybrids used in the 2014/15 season. Specific combining ability effects for grain yield were estimated using the line x tester procedure, which is embedded in the Analyses of Genetic Designs computer software with 'R' (AGD-R, version 3.0) (Rodriguez et al. 2015). Lines and testers were treated as fixed effects. The statistical model used for the combined analysis was as follows:
$$Y_{ijkl} = \mu + L_i + T_j + E_l + (L \times T)_{ij} + (L \times E)_{il} + (T \times E)_{jl} + R_k(E_l) + (L \times T \times E)_{ijl} + \varepsilon_{ijkl}$$
where $Y_{ijkl}$ is the measured grain yield of the testcross of the ith line crossed to the jth tester in the lth site and kth replicate; $\mu$ is the grand mean; $L_i$ is the effect of the ith line; $T_j$ is the effect of the jth tester; $E_l$ is the effect of the lth site; $(L \times T)_{ij}$ is the interaction effect of the cross between the ith line and the jth tester; $(L \times E)_{il}$ is the interaction effect of the ith line and the lth site; $(T \times E)_{jl}$ is the interaction effect of the jth tester and the lth site; $R_k(E_l)$ is the effect of the kth replicate nested in the lth site; $(L \times T \times E)_{ijl}$ is the interaction effect of the ith line, jth tester and the lth site; and $\varepsilon_{ijkl}$ is the random error term.
Classification of the inbred lines into heterotic groups was based on the specific combining ability (SCA) value of a line x tester cross for grain yield, as proposed by Vasal et al. (1992). Lines that showed negative SCA effects for grain yield when crossed to a particular tester were deemed to belong to the same heterotic group as the tester, whereas lines that exhibited positive SCA effects with a tester were deemed to belong to the opposite heterotic group. The lines displaying positive SCA with both testers (A and B) were allocated to the AB heterotic group, and those displaying negative SCA with both testers could not be classified on the basis of SCA data.
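As a rough illustration of how SCA effects arise from a line x tester table, the sketch below computes them from balanced cell means as the line x tester interaction residual. This is not the AGD-R procedure used in the study; the function name, the line and tester labels and the yields are hypothetical, and site and replicate terms are assumed to have been averaged out already.

```python
from statistics import mean

# Illustrative sketch only: assumes a balanced table of testcross means
# (every line crossed to every tester). All names and values are hypothetical.

def combining_ability(cell_means):
    """Return (gca_lines, gca_testers, sca) from {(line, tester): mean yield}."""
    lines = sorted({line for line, _ in cell_means})
    testers = sorted({tester for _, tester in cell_means})
    grand = mean(cell_means.values())
    line_mean = {l: mean(cell_means[(l, t)] for t in testers) for l in lines}
    tester_mean = {t: mean(cell_means[(l, t)] for l in lines) for t in testers}

    gca_lines = {l: line_mean[l] - grand for l in lines}
    gca_testers = {t: tester_mean[t] - grand for t in testers}
    # SCA is what remains of a cross mean after removing both parental effects
    sca = {
        (l, t): cell_means[(l, t)] - line_mean[l] - tester_mean[t] + grand
        for l in lines
        for t in testers
    }
    return gca_lines, gca_testers, sca


# Hypothetical testcross means (t/ha) for two lines and two testers
yields = {
    ("L1", "A"): 5.0, ("L1", "B"): 7.0,
    ("L2", "A"): 6.0, ("L2", "B"): 6.0,
}
gca_l, gca_t, sca = combining_ability(yields)
for cross, effect in sorted(sca.items()):
    print(cross, round(effect, 2))
```

In this toy table L1 has a negative SCA with tester A and a positive SCA with tester B, the pattern that the classification rule of Vasal et al. (1992) interprets as alignment with group A.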
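The SCA-based decision rule described above can be written as a small function. This is a minimal sketch: the function name and example SCA values are hypothetical, and the handling of SCA effects that are exactly zero is an assumption, since the text only covers positive and negative effects.

```python
# Illustrative sketch only: the rule follows the description attributed to
# Vasal et al. (1992); names, example values and zero handling are assumptions.

def heterotic_group(sca_with_a: float, sca_with_b: float) -> str:
    """Assign a line to group A, B, AB or leave it unclassified."""
    negative_a = sca_with_a < 0
    negative_b = sca_with_b < 0
    if negative_a and negative_b:
        return "unclassified"   # negative SCA with both testers
    if negative_a:
        return "A"              # negative SCA with tester A only
    if negative_b:
        return "B"              # negative SCA with tester B only
    if sca_with_a > 0 and sca_with_b > 0:
        return "AB"             # positive SCA with both testers
    return "unclassified"       # assumption: exact zeros carry no information


# Hypothetical SCA effects (t/ha) with a Group A and a Group B tester
examples = {
    "line1": (-0.45, 0.45),
    "line2": (0.30, -0.30),
    "line3": (0.20, 0.10),
    "line4": (-0.15, -0.25),
}
groups = {name: heterotic_group(a, b) for name, (a, b) in examples.items()}
print(groups)
```

Applied once with the tropical pair (CML312, CML444) and once with the temperate pair (B73, MO17), the same rule yields the two parallel classifications reported in the results.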

SNP genotyping and diversity analysis
The 44 maize inbred lines were planted in the glasshouse for DNA sampling. Leaf samples bulked from four plants of each inbred line were taken at two weeks after planting. Genomic DNA was extracted at the Agricultural Research Council-Grain Crops laboratory, following a modified CTAB procedure (Doyle 1987). DNA samples were sent to the Agricultural Research Council-Biotechnology platform in South Africa for genotyping using single nucleotide polymorphism (SNP) markers. Genotyping was carried out using the Illumina MaizeSNP50 BeadChip, which was derived from the B73 reference genome (www.illumina.com). The protocol provided 56,110 SNP markers, which densely covered all ten chromosomes of the maize genome. A quality control check was applied to the SNP data by filtering based on a SNP call rate of > 90% and a minor allele frequency of > 5%. The genetic distance (dissimilarity) estimates among inbred lines were calculated based on Gower's distance (Gower 1971). Genotypes were subsequently grouped by the Unweighted Pair Group Method using Arithmetic average (UPGMA) algorithm, using the R Statistical Program version 3.3 (R Core Team 2015).
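The three steps of this pipeline (marker filtering, pairwise distance, UPGMA clustering) can be sketched in plain Python. This is not the R workflow used in the study: the genotype coding (count of one allele, 0, 1 or 2, with None for a missing call), the treatment of genotype calls as nominal categories in the Gower distance, the line names and the toy data are all assumptions made for illustration.

```python
from itertools import combinations

# Illustrative sketch only; coding, names and data are hypothetical.

def passes_qc(snp, min_call_rate=0.90, min_maf=0.05):
    """Keep a SNP only if call rate > 90% and minor allele frequency > 5%."""
    called = [g for g in snp if g is not None]
    if not called or len(called) / len(snp) <= min_call_rate:
        return False
    p = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(p, 1 - p) > min_maf


def gower_nominal(a, b):
    """Share of mismatching calls among SNPs scored in both lines."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("no SNP scored in both lines")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(dist, names):
    """Naive UPGMA: repeatedly merge the two clusters with the smallest
    average pairwise distance; returns the merges with their heights."""
    clusters = {name: [name] for name in names}
    merges = []

    def average_distance(c1, c2):
        members1, members2 = clusters[c1], clusters[c2]
        total = sum(dist[frozenset((x, y))] for x in members1 for y in members2)
        return total / (len(members1) * len(members2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        height = average_distance(c1, c2)
        clusters[f"({c1},{c2})"] = clusters.pop(c1) + clusters.pop(c2)
        merges.append((c1, c2, height))
    return merges


lines = ["temp1", "temp2", "sa1", "sa2"]
snps = [                      # one row per SNP, one column per line
    [0, 0, 2, 2],
    [0, 0, 2, 2],
    [0, 2, 2, 2],
    [2, 2, 0, 2],
    [0, 0, 0, 0],             # monomorphic: fails the MAF filter
    [0, None, 2, 2],          # one missing call in four: fails call rate
]
kept = [snp for snp in snps if passes_qc(snp)]
profiles = {name: [snp[i] for snp in kept] for i, name in enumerate(lines)}
dist = {
    frozenset((a, b)): gower_nominal(profiles[a], profiles[b])
    for a, b in combinations(lines, 2)
}
merges = upgma(dist, lines)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

For real marker panels a library routine is the usual choice; for example, SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"` performs UPGMA on a condensed distance matrix.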

Diversity analysis based on SNP markers
The UPGMA dendrogram of the 40 maize inbred lines and the four testers is shown in Fig. 1. The SNP markers divided the South Africa germplasm into 11 clusters (Table 1). The principal clusters were the USA Corn-belt temperate cluster, containing the two testers B73 and MO17, and a mega cluster consisting of South Africa maize inbred lines (Fig. 1). The South Africa maize inbred lines were on the same side as the tropical testers, CML444 and CML312 (Fig. 1), indicating that they were more inclined towards the tropical CIMMYT testers than the USA temperate testers. The South Africa maize inbred lines had a shorter average genetic distance from the tropical testers than from the temperate testers (Table 1).
The average genetic distances between lines and different testers are shown in Table 2. The widest mean genetic distance (0.95, an average of 0.901 and 0.993) was between the South Africa lines and the USA Corn-belt testers. The distance ranged from 0.68 to 1.19, with the minimum distance between P598MSV and B73 and the maximum distance between J80W and MO17. The mean genetic distance with the tropical testers was 0.65 (an average of 0.647 and 0.648) and ranged from 0.52 to 0.80. The lowest genetic distance was estimated between U127Y and both CIMMYT testers, CML312 and CML444, while the largest genetic distance was found between J80W and RO452W with both tropical testers.
There were generally large average genetic distances between lines within clusters, ranging from 0.31 to 0.54 (Table 3). The minimum distance was between the temperate inbred lines in cluster 1, whereas the maximum was in clusters 3 and 6. The distance between lines within cluster 11 showed the widest range (0.17-0.70). The average genetic distance between clusters ranged from 0.70 to 0.86. The minimum genetic distance was recorded between clusters 6 and 8 and clusters 8 and 9; while the maximum genetic distance was between clusters 1 and 11 (Table 3).
The average distances among the current South Africa heterotic groups and among tester lines are shown in Table 4. The average genetic distance among the South Africa heterotic groups ranged from 0.74 (L and K) to 0.86 (I and M as well as M and R). The average genetic distance between the South Africa heterotic groups and the tester maize inbred lines was highest (1.03, which is an average of 1.08 and 0.98) between group M and the temperate testers, while the minimum genetic distance (0.59, which is an average of 0.58 and 0.59) was between heterotic group F and the tropical testers (Table 4). The distance between testers was generally large, ranging from 0.31 for B73 and MO17 to 0.91 for MO17 and CML444 (Table 4).

Analyses of variance
The analysis of variance results for grain yield in 2014/15 and 2015/16 are presented in Table 5. During the 2014/15 season, the mean squares associated with the site main effects were highly significant (P < 0.001) for grain yield. The results also showed significant differences due to line main effects (P < 0.05) and tester main effects (P < 0.001). Although the genotype x site interaction effects were not significant (P > 0.05), the site x tester interaction effects were highly significant (P < 0.001) for grain yield. The mean squares associated with the line x tester, site x line, and site x line x tester interaction effects were not significant (P > 0.05). There were no significant differences for site main effects, tester main effects and tester x site interaction effects during the 2015/16 season (Table 5). However, the mean squares associated with genotypes, line main effects and the site x line interaction were highly significant (P < 0.001) for grain yield during the 2015/16 season. The line x tester and site x genotype interactions were significant (P < 0.01) during the 2015/16 season. The mean squares associated with the site x line x tester interaction effects were also significant (P < 0.05). The results indicated the presence of sufficient variability among the lines for them to be classified into heterotic groups.
Heterotic grouping of lines based on SCA effects
Classification of 26 South Africa maize inbred lines in accordance with their specific combining ability effects with tropical testers is shown in Table 6. Heterotic classification of South Africa maize inbred lines based on their SCA with the tropical CIMMYT maize inbred testers indicated that 10 maize inbred lines were oriented towards CML312 and were fitted in heterotic Group A, and 12 inbred lines with negative SCA with CML444 were placed in Group B, while two inbred lines which exhibited positive SCA with both tropical testers were classified as heterotic Group AB. Only two lines, which displayed negative SCA effects with both tropical testers, were not classified on the basis of SCA data. The results indicated a weak correlation (|r| < 0.30) between genetic distance and both grain yield mean and specific combining ability data of the testcrosses of South Africa lines with tropical inbred testers (Table 6).
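The correlation referred to above is an ordinary correlation coefficient between the line-to-tester genetic distance and the testcross SCA effect (or mean yield). A minimal sketch of that calculation follows, assuming Pearson's r; the paired values are invented for illustration and are not taken from Table 6.

```python
from math import sqrt

# Illustrative sketch only: the paired values below are hypothetical.

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical genetic distances between lines and a tester, with the
# SCA effects (t/ha) of the corresponding testcrosses
genetic_distance = [0.52, 0.60, 0.65, 0.70, 0.80]
sca_effect = [0.3, -0.4, 0.1, -0.2, 0.2]

r = pearson_r(genetic_distance, sca_effect)
print(f"r = {r:.3f}; weak association: {abs(r) < 0.30}")
```

A coefficient this close to zero means that knowing how far a line sits from a tester says little about how well their hybrid performs, which is the practical point made in the text.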
Heterotic orientation of 18 South Africa maize inbred lines on the basis of their SCA effects with temperate USA Corn-belt testers is shown in Table 7. Five inbred lines which showed negative SCA with B73 were fitted into heterotic Group A, while six inbred lines, which showed negative SCA with MO17, were put in Group B. Three inbred lines, which exhibited positive SCA with both temperate testers, were allocated to Group AB. Four maize inbred lines, which showed negative SCA with both temperate testers, could not be fitted into any of the heterotic groups on the basis of SCA data.
Fig. 1 UPGMA dendrogram based on the analysis of 40 maize inbred lines and four testers using the Gower's distance (Gower 1971)

Classification of maize inbred lines based on SNP markers
The SNP markers were highly effective in confirming classification of the germplasm collection in South Africa. As expected, the SNP markers divided the maize lines into two principal clusters: the USA Corn-belt temperate testers, and the South Africa maize lines clustering with the tropical testers from CIMMYT. This indicated that South Africa public lines were more inclined towards the tropical CIMMYT than the USA temperate testers. However, there were 11 subclusters within the South Africa germplasm collection, indicating tremendous diversity. Only the lines from the current I heterotic group were placed in the closest cluster with both of CIMMYT's tropical testers, CML444 and CML312, indicating that these lines had limited introgression of temperate germplasm and were more pro-tropical than the rest of the 40 lines in the sample, which is consistent with the SNP data. Based on the pedigree data (Table 1), lines with a history of temperate germplasm introgression, such as P614MSV and P612MSV with B73 introgression and P590MSV and P594MSV with MO17 introgression, were placed in clusters 3 and 5, which are near the temperate cluster (Cluster 1) on the dendrogram, indicating their orientation towards the temperate material. Clusters 10 and 11, which are furthest away from the temperate cluster on the dendrogram, are formed by a group of quality protein maize lines, which is a pro-tropical germplasm collection. These lines resulted from the opaque-2 maize breeding program, which led to the release of yellow and white opaque-2 modified inbred lines (Gevers 1972). These lines were bred from tropical germplasm.
Table 4 Average genetic distances between current South Africa heterotic groups (HG) and between the inbred testers* (genetic distances between the inbred line testers, and between the inbred testers and the heterotic groups are italicised and in bold)
However, within the QPM group (Table 1), the inbred line MO17HtHtN from the current L heterotic group was non-QPM by origin. Clustering of this line within the QPM collection is an implication of more QPM alleles than normal maize in its genetic background, which could have been introduced along with Northern corn leaf blight (Ht) resistance. Although genotypic classification by SNP markers was closely linked with the origin of inbred lines and pedigree records, there were deviations that were observed. The lines NC258 and U127Y which are currently classified in L and K heterotic groups, respectively, in South Africa, were actually placed in the same sub-cluster with the I and CIMMYT's tropical testers, respectively. This indicates that the classification of South Africa lines in the whole collection might require confirmation with SNP genotyping. In the same vein, lines RO421W (M), K64 (K) and U2540W (K) which were expected to cluster with the current M and K heterotic groups of South Africa, formed singleton clusters, and were not associated with any of the South Africa lines. They have also displayed a lack of association with the USA temperate and tropical CIMMYT lines, qualifying them as representing distinct germplasm groups within the South Africa inbred lines collection.

Classification of maize inbred lines based on specific combining ability
Based on SCA for yield data, inbred lines were grouped into three heterotic groups by both temperate and tropical testers, which is consistent with the system that was described by previous researchers (Gevers 1972). The SCA value reveals the genetic relationship between two parents or between a line and a tester (Vasal et al. 1992). A high (positive) SCA value between two parents indicates a distant relationship and a low (negative) SCA value is an indication of a close relationship (Fan et al. 2003). The tropical testers were capable of discriminating the lines, such that six lines were allocated into tropical group A, indicating that they were oriented towards CML312, whereas nine lines were placed into tropical group B, indicating their orientation towards CML444. Only three lines, which showed negative SCA with both tropical testers, could not be classified on the basis of SCA data. The classification of the South Africa lines using their interaction (SCA effects) with the temperate testers allocated five lines into temperate group A, indicating that they were B73-orientated, while six lines were placed in temperate group B, indicating that they were orientated towards MO17. Three lines, which exhibited positive SCA with both B73 and MO17, were classified as temperate AB group on the basis of SCA data, while four lines, which displayed negative SCA effects with both temperate testers, could not be classified.
Table footnote: grand mean, 5.98 t ha⁻¹; standard error (SCA) at P = 0.05, 0.98; * significant at P < 0.05; SCA, specific combining ability

Relationship between SNP and SCA-based classification
Heterotic orientation of inbred lines corresponded with the SNP marker classification. It was noted that lines that did not belong to any heterotic group when tropical testers were used could be classified clearly using temperate testers and vice versa. For example, in SNP cluster analysis, the I-group representatives were more aligned to both tropical testers. However, heterotic orientation by temperate testers did not classify them into any heterotic group, as revealed by negative SCA effects with both B73 and MO17. The grouping of the I-group lines with only tropical testers was an indication of a distant relationship between the I heterotic group and the temperate lines. This was further supported by the maximum average genetic distance between the I-group and both temperate testers, B73 (0.97) and MO17 (1.07). Therefore, delineation of inbred lines by either temperate or tropical testers is influenced by the predominance of either temperate or tropical genome in their genetic background, which was clearly confirmed by the SNP marker data. Findings from this study are consistent with previous reports. Adetimirin et al. (2008) also observed that markers clearly discriminated temperate from tropical germplasm. They further observed that some lines that were known to be temperate x tropical were aligned with the tropical lines, indicating that they were more closely related to the tropical than the temperate material. In this study, some lines were consistent in their heterotic orientation with both USA temperate and CIMMYT tropical testers. However, the strongest relationship towards either temperate or tropical testers was observed from the magnitude of SCA effects and their placement in the dendrogram. Some inbred lines, including RO421W, K64 and U254OW, formed distinct singleton clusters. Use of the SCA heterotic orientation approach did not clearly classify these lines.
For example, RO421W did not belong to any heterotic group under tropical classification, while classification based on temperate testers allocated this line into the temperate heterotic group AB. The inbred line K64 was assigned into temperate group AB by temperate testers and into heterotic tropical group A by tropical testers.
Grouping lines by SCA effects therefore revealed that when inbred lines are classified into heterotic groups using two sets of testers, failure of one set of testers to classify lines could be an indication of distant relationship between the line and the particular set of testers. Alternatively, classification of some lines might not be clear with both sets, such lines could represent distinct germplasm groups within the germplasm collection. Therefore, when molecular data is not available for validation, there should be no preference for any set of testers over another. However, inclusion of two sets of testers could be a more accurate approach, especially in distinguishing germplasm like the South Africa collection with a history of both tropical and temperate introgression.
The inbred lines used in the study were pooled from seven known South Africa heterotic groups to minimise the number of heterotic groups for improved breeding efficiency. The 11 clusters obtained could be simplified into a broader group of normal endosperm maize and quality protein maize lines. Normal maize endosperm lines may be grouped into three groups, with one group forming temperate germplasm (clusters 1 and 3) with pro-MO17 and B73. The second group is a group of South Africa lines in clusters five and six, with no temperate or tropical alignment, while the I-group lines and those lines with tropical alignment formed the third group. The QPM lines on the far right of the dendogram could be split into two groups (clusters 10 and 11). Intra-group combinations among the normal maize groups and between the QPM groups are expected to give high heterosis in crosses. Cross combinations between normal maize and some lines in the QPM group, particularly the white versions are also recommended to maximise heterosis. The heterotic potential of these groups was shown by a cross between FO215W from the QPM cluster and the tropical tester, CML444 (Table 6). Lines that represented distinct germplasm should also be considered for utilisation in crosses; for example in the current study, high heterosis was revealed by crosses RO421W x B73 and RO421W x MO17 (Table 7). Previous investigations revealed high heterosis in crosses including F and I, M and K, K and P, M and P and crosses between M and F and R and L group combinations (Gevers and Whythe 1987). Based on the new clusters formed, these combinations also included temperate x tropical lines. The identified groups will likely result in superior tropical x temperate combinations with new favourable alleles, wider genetic base, and consequently, improvement in agronomic performance and adaptation of local varieties.
Of the seven current South Africa heterotic groups studied, some clearly fell into the identified clusters. The study therefore indicated that the established patterns would be useful for inferring the classification of lines that were not included in the study but are represented in the seven heterotic groups. The F, P and M group lines and the rest of the Opaque-2 modified lines from other South Africa heterotic groups belong to the QPM group. Lines from the L group form a distinct group of normal maize endosperm lines; the I group lines form a distinct group closest to the CIMMYT lines, while the temperate group includes the R lines and the K group lines with a normal maize endosperm type. However, classification of some South Africa heterotic groups, particularly the K group, may not be straightforward, as lines from this group were distributed across different clusters. Such deviations may indicate diversity within the K group; they also suggest that inbred lines extracted from the same population do not always cluster together. Similar observations were made in previous investigations (Vasal et al. 1992;Yu et al. 2001;Fan et al. 2003;Barata and Carena 2006), which reported that maize inbred lines derived from the same population were classified differently.

Genetic distances
Genetic distances summarise the extent of genetic differences between populations or species (Dogan and Dogan 2016). Genetic distances were estimated to determine the relatedness among the inbred lines studied. The average genetic distances computed between clusters, between South Africa heterotic groups, between inbred lines within clusters and among tester inbred lines indicated sufficient variability between the identified clusters, and therefore the possibility of obtaining superior hybrid combinations from inter-cluster crosses. The average genetic distances between lines within clusters, which ranged from 0.31 to 0.54 across clusters, indicated that lines in the same cluster were genetically related. However, for some clusters, the range of genetic distances among inbred lines showed considerable diversity within the cluster. For example, the genetic distance in cluster 11 ranged from 0.17 to 0.70, showing that the QPM lines that formed this cluster were derived from distinct genetic backgrounds. The maximum average genetic distance, observed between clusters 1 and 11, further demonstrated the reliability of SNP markers in distinguishing between inbred lines. Genetic distances between testers clearly revealed that the two temperate USA Corn Belt testers, MO17 and B73, are closely related, which reflects their temperate adaptation domain. This close relationship was further confirmed by the grouping together of these inbred testers on the dendrogram. Similarly, the close genetic relationship between the tropical CIMMYT testers, CML312 and CML444, was confirmed by the minimum genetic distance between them and their placement within the same cluster, reflecting their tropical adaptation domain.
The genetic distances among the current South Africa heterotic groups were generally large, indicating diversity within the South Africa germplasm collection. The largest genetic distances were between groups R and M and between groups I and M, indicating genetic dissimilarity between these heterotic groups. The R group lines were grouped in cluster 3, with temperate alignment, whereas the M group lines were mainly grouped with the QPM lines in cluster 11; the M group also showed the highest genetic dissimilarity with the I group. Inbred lines from heterotic group I formed a distinct group in cluster 7 that was closest to the tropical testers, and were thus not closely associated with the M group lines. The highest average genetic distances between these heterotic groups thus indicated that they are the most distinct of the seven South Africa heterotic groups. The minimum distance, between heterotic groups K and L, was an indication of genetic similarity. Although these groups did not cluster together, their placement by the SNP markers in neighbouring clusters (the K lines in clusters 3 and 4 and the L group lines in cluster 5) could explain the minimum genetic distance estimate between them. Overall, the average genetic distance was generally larger between the South Africa inbred lines and the temperate testers than between these lines and the tropical testers, indicating that, although some lines may be aligned more towards the temperate than the tropical testers, the lines are of South Africa origin and were closer to the tropical testers. This is expected because South Africa falls within a subtropical to warm temperate environment, which places its germplasm between tropical and temperate, although with a bias towards tropical orientation.
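The within-group and between-group averages discussed in this section can be illustrated with a minimal sketch. It assumes a simple allele-mismatch proportion as the distance measure, which may differ from the estimator actually used in the study, and all line names, cluster labels and SNP calls below are invented for illustration.

```python
from itertools import combinations, product

# Hypothetical toy data: one homozygous allele call per SNP locus for each line.
# Names, calls and cluster labels are illustrative assumptions, not study data.
genotypes = {
    "line_A": "AACCGGTT",
    "line_B": "AACCGGTA",
    "line_C": "TTCCAATA",
    "line_D": "TTGCAATA",
}
clusters = {"line_A": 1, "line_B": 1, "line_C": 2, "line_D": 2}


def mismatch_distance(g1, g2):
    """Proportion of SNP loci at which two lines carry different alleles."""
    if len(g1) != len(g2):
        raise ValueError("genotypes must cover the same loci")
    return sum(a != b for a, b in zip(g1, g2)) / len(g1)


def mean_distance(lines_x, lines_y=None):
    """Average distance within one group (lines_y is None) or between two groups."""
    if lines_y is None:
        pairs = list(combinations(lines_x, 2))
    else:
        pairs = list(product(lines_x, lines_y))
    dists = [mismatch_distance(genotypes[a], genotypes[b]) for a, b in pairs]
    return sum(dists) / len(dists)


group1 = [name for name, c in clusters.items() if c == 1]
group2 = [name for name, c in clusters.items() if c == 2]

within_1 = mean_distance(group1)             # average distance inside cluster 1
between_1_2 = mean_distance(group1, group2)  # average distance between clusters 1 and 2
```

In this toy example the between-cluster average exceeds the within-cluster average, which is the pattern expected when clusters separate genetically related lines from unrelated ones.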

Correlation of genetic distance with grain yield and specific combining ability
Weak correlations were detected between genetic distance and both mean grain yield and specific combining ability, indicating that hybrid performance could not be efficiently predicted from molecular marker-based genetic distances. The low predictive value of genetic distance was also confirmed by the degree of heterosis observed in crosses (data not shown). For example, pairs of lines separated by the widest genetic distances did not necessarily show maximum heterosis in their testcrosses. These observations agree with previous studies in which poor (Makumbi 2005;Parentoni et al. 2001;Xu et al. 2004) to no (Rajendran et al. 2014) correlation was found between genetic distance and either grain yield or SCA. The low correlation observed in several studies has been associated with the markers used to compute genetic distances. These markers are generally numerous and evenly distributed across the genome, and may not be closely linked to loci controlling grain yield and associated quantitative characters (Melchinger 1999), which weakens the association between genetic distance and heterosis. Furthermore, the genetic differences between parents involve many traits and loci, not all of which are correlated with yield. Zhang et al. (2010) thus suggested using only yield-related markers to analyse genetic differences and, subsequently, to predict hybrid performance reliably.
In sharp contrast to the findings of the current study, Schrag et al. (2010) and Gichuru et al. (2017) reported a high predictive value of molecular marker genetic distances for heterosis; the observed degree of correlation was associated with the genetic differences and complementary traits of the lines used. The ability of genetic distance to predict heterosis in crosses therefore varies with the germplasm used (Betran et al. 2003) and probably with the set of molecular markers used. In the current study, the observed levels of correlation therefore underline the need to conduct field trials to evaluate hybrid performance, irrespective of known genetic distances between inbred lines. However, despite their inability to predict hybrid performance reliably, genetic distances remain useful as a guide when selecting inbred lines for hybrid combinations.
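The relationship between genetic distance and SCA discussed above can be checked with a simple correlation, sketched below. The sketch assumes a Pearson correlation coefficient; the statistic used in the study may differ. The paired values are hypothetical and do not come from the trial data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: genetic distance between each line and a tester,
# paired with the SCA effect of the corresponding testcross hybrid.
genetic_distance = [0.30, 0.45, 0.50, 0.62, 0.70]
sca_effect = [0.4, -0.2, 0.5, -0.1, 0.1]

r = pearson_r(genetic_distance, sca_effect)
r_squared = r ** 2  # share of SCA variance linearly associated with genetic distance
```

A small `r_squared` means that genetic distance accounts for little of the variation in SCA, which is the situation in which field evaluation of hybrids remains necessary.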

Conclusion
The study confirmed the existence of a high level of genetic diversity among the South Africa line collection. These lines showed alignment towards both tropical and temperate testers, but were more inclined towards the tropical than the temperate testers. SNP marker genotyping was highly effective in confirming the classification of the germplasm collection in South Africa. On the basis of the SNP genotyping data, the germplasm could be simplified into two broader groups, which are consistent with the biochemical composition of the grain endosperm. The first group comprised quality protein maize lines and the second comprised normal endosperm maize inbred lines. However, there are further subdivisions within these broad germplasm groups, indicating that the programme in South Africa has maintained diversity in both quality protein and normal endosperm maize germplasm. The SNP data were complemented by SCA data, which grouped the germplasm into three heterotic groups (A, B and AB), indicating that South Africa germplasm lines have orientation towards both tropical and temperate testers, and that some lines showed heterosis with both testers within the two broad adaptation groups. These findings have profound implications for optimising breeding programmes that target the warm temperate production environments in South Africa.