Published Date
June 2016, Vol.4(3):177–187, doi:10.1016/j.cj.2016.03.004
Keywords
Corn grain; Genotype; Environment; GC–MS; Metabolomics
For further details, see:
http://www.sciencedirect.com/science/article/pii/S2214514116300241
Open Access, Creative Commons license, Funding information
Title
Metabolite variation in hybrid corn grain from a large-scale multisite study
Received 26 September 2015. Revised 15 March 2016. Accepted 29 March 2016. Available online 6 April 2016.
Abstract
Metabolite composition is strongly affected by genotype, environment, and interactions between genotype and environment, although the extent of variation caused by these factors may depend upon the type of metabolite. To characterize the complexity of genotype, environment, and their interaction in hybrid seeds, 50 genetically diverse non-genetically modified (GM) maize hybrids were grown in six geographically diverse locations in North America. Polar metabolites from 553 harvested corn grain samples were isolated and analyzed by gas chromatography–mass spectrometry, and 45 metabolites detected in all samples were used to generate a data matrix for statistical analysis. There was moderate variation among biological replicates and across genotypes and test sites. Genotype effects were detected by univariate analysis and hierarchical clustering analysis (HCA) when environmental effects were excluded. Overall, environment exerted larger effects than genotype, and polar metabolite accumulation showed a geographic effect. We conclude that it is possible to increase seed polar metabolite content in hybrid corn by selection of appropriate inbred lines and growing regions.
1 Introduction
Corn is one of the most important cereal crops worldwide for food, feed, and energy. Starch, protein, oil, and fiber represent the major nutritional components and economic value in corn grain [1], and are the main targets of plant breeding and biotechnology [2], [3], [4], [5] and [6]. In contrast, polar metabolites in corn grain are of low abundance (~5%) relative to the cumulative biomass of major seed components [1]. However, seed metabolite content is closely associated with food and feed quality, and increasing attention is being paid to foods that provide health benefits beyond basic nutrition (functional foods). Two different strategies are widely applied to improve the nutritional value of corn grain. One is to increase the content of beneficial metabolites: breeding for high vitamin A and high lysine corn are two noteworthy examples [7], [8] and [9]. A second strategy is to reduce anti-nutritional metabolite content, for example, developing lower-phytic-acid (PA) corn [10], [11], [12], [13], [14] and [15].
A common feature of these two strategies is targeting a single metabolite or a few metabolites at a time for corn breeding. Generally, both are successfully implemented for seed trait improvements of greatest importance for industrial or nutritional value. However, this approach does not take into account the complexity of metabolic networks. Alteration in the amount of one particular metabolite could potentially affect other metabolites, with some changes desirable and others not. This tradeoff was demonstrated by the recent detailed characterization of a low-PA mutant: lower PA and higher free phosphorus (P) in the PA mutant improved grain food quality, but a decrease in gamma tocopherol and increase in Hg, As, and Al content compromised its nutritional value [16] and [17].
In recent years the genetic basis of seed metabolite profiles has been investigated, and several metabolite QTLs have been identified [18] and [19]. These studies also revealed that seed metabolite content is sensitive to environmental fluctuations, so that the heritability of metabolite traits is usually low. In addition, improved seed metabolite traits were often accompanied by yield penalties in several crop species [20] and [21]. An exception was recently discovered in hybrid plants, in which certain metabolite contents can be enhanced by mechanisms that do not incur a yield penalty [22]. These findings showed that it is practical to improve some seed metabolite traits without yield loss.
The metabolite inheritance pattern in hybrid seeds still remains largely unexplored. However, it is clear that genotype, environment, and the interactions between these two factors play different roles in shaping the seed metabolite profile and contents. Yang et al. [23] generated 30 genetically related corn hybrids by crossing six female inbred lines (from a common Stiff-Stalk progenitor) with five different male inbred lines (from Non-Stiff Stalk), and grew them in two geographically similar locations in Illinois. Parallel metabolic and transcriptional profiling revealed marked variation even among genetically related corn hybrids. This study also suggested that hybrid seed metabolite content is a multigene trait and that the genetic interactions among these genes remain poorly understood. Reynolds et al. [24] performed composition analysis of corn grain from seven hybrids grown at four test sites and found both genotype and environment to be determinants of seed biochemical composition. Harrigan et al. [18] reported that corn hybrid (genotype) influenced the number and type of grain metabolites in response to water deficit. Metabolic profiling of corn hybrids derived from 48 inbreds crossed to two different testers showed that metabolite pool size was highly dependent on genotype; certain metabolite classes showed a tester effect, while others showed either non-interacting or interacting tester and location effects [25] and [26]. These studies suggested that manipulation of seed metabolite content can be achieved by selection of appropriate tester lines for hybrid production. Röhlig et al. [27] showed that growing season was the most prominent factor influencing metabolite variation when four cultivars were grown at one location for three consecutive seasons.
Furthermore, by metabolite profiling of one cultivar grown for three years at four locations, the authors found that natural variation in corn grain metabolite pools was the result of interplay between location, season, and genotype. Different chemical classes could show differences in a genotype- or environment-dependent manner. Recently, Cong et al. [28] reported that crude protein, manganese, β-carotene, and all amino acids except lysine in maize grain were more affected by environment than by genotype. In contrast, most proximates and fibers, all fatty acids, lysine, and most minerals, vitamins, and secondary metabolites in maize grain were affected by genotype more than by environment. A strong interaction between genotype and environment was seen for some analytes.
In this study, 50 genetically diverse non-GM maize hybrids were grown at six locations representing three different climate zones in North America. Forty-five polar metabolites in corn grain were then quantified and different statistical methods were compared for metabolite signature extraction. This study provides additional insight about metabolite variability in hybrid corn seed and its implications for nutritional improvement.
2 Materials and methods
2.1 Plant materials
Fifty non-GM maize hybrid varieties (or genotypes) from DuPont-Pioneer HiBred were grown in six locations (Texas, Kansas, Illinois, Nebraska, Minnesota, and Ontario). At each site 20 varieties were selected based on their maturity zone (Table S1). Each test site was divided into three blocks and the selected 20 varieties were grown in each block in a randomized manner. Grain samples (F2 seeds) were collected at the R6 stage from all varieties in all blocks. In each block, five hand-pollinated ears were collected from the same variety and the shelled ears were pooled as one sample. One or two samples were collected per block per variety, so that three or six samples per variety were collected from each test site [29] and [30].
2.2 Polar metabolite extraction and derivatization
Metabolites were extracted from lyophilized ground powder of grain samples. Samples of dry weight 5.5–6.5 mg were weighed and transferred into 2 mL microfuge tubes and 0.75 mL of chloroform was added. Samples were incubated at 55 °C with rotation for 30 min, and then 0.75 mL of deionized water (containing 5 μg mL⁻¹ ribitol internal standard) was added and incubated for an additional 30 min. Samples were then centrifuged at 1500 × g for 15 min to allow phase separation. Of the upper aqueous phase, 660 μL was transferred into a 2 mL glass autosampler vial and evaporated to dryness in a CentriVap Console (Labconco, USA). Test samples from the same site were divided into batches (Table S2). One reference sample was included with each batch and analyzed three times during the batch run to monitor instrumental variation. The grain reference samples were obtained by pooling and thoroughly mixing powdered grain from all Illinois varieties, and metabolites were extracted as described above.
The dried extracts were dissolved in 120 μL of 20 mg mL⁻¹ methoxyamine hydrochloride in pyridine and incubated at 37 °C for 90 min to form methoxyamine derivatives. Then, 120 μL of N-methyl-N-(trimethylsilyl) trifluoroacetamide (MSTFA) plus 1% (v/v) TMCS was added and the extracts were incubated at 37 °C for 90 min to yield trimethylsilyl derivatives.
2.3 GC–MS analysis
Derivatized metabolite mixtures were analyzed with a 6890 gas chromatograph equipped with a 5973 mass selective detector and a 7683 series injector (Agilent Technologies, Palo Alto, CA). Helium flow was 1 mL min⁻¹. A 1 μL sample was injected with a split ratio of 1:30 and resolved on a 30 m × 0.25 mm × 0.25 μm ZB-5MSi column (Phenomenex, USA). The temperatures of the inlet, interface, and ion source were 230, 250, and 200 °C, respectively. After a 5-min solvent delay at 80 °C, the oven temperature was increased at a rate of 5 °C min⁻¹ to 310 °C, where it was held for 6 min before returning to 80 °C for the next cycle. Electron impact (70 eV) mass spectra were recorded from m/z 50 to 600 at 2.69 scans per second. The instrument was auto-tuned for mass calibration using perfluorotributylamine (PFTBA).
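As a rough check on the oven program described above, the per-sample run time can be worked out from the stated temperatures, ramp rate, and hold times. The sketch below assumes that the 5-min solvent delay coincides with an initial hold at 80 °C and that no spectra are recorded during the delay; neither detail is spelled out in the text, and the function names are illustrative.

```python
# Oven program values taken from the text.
INITIAL_TEMP_C = 80.0
FINAL_TEMP_C = 310.0
RAMP_C_PER_MIN = 5.0
INITIAL_HOLD_MIN = 5.0   # assumption: solvent delay == initial hold at 80 °C
FINAL_HOLD_MIN = 6.0
SCANS_PER_SECOND = 2.69


def run_time_min():
    """Total oven program time per injection, excluding cool-down."""
    ramp_min = (FINAL_TEMP_C - INITIAL_TEMP_C) / RAMP_C_PER_MIN
    return INITIAL_HOLD_MIN + ramp_min + FINAL_HOLD_MIN


def scans_recorded():
    """Approximate number of spectra, assuming none are acquired during the delay."""
    acquisition_min = run_time_min() - INITIAL_HOLD_MIN
    return round(acquisition_min * 60 * SCANS_PER_SECOND)


print(run_time_min())    # 57.0 (5 min hold + 46 min ramp + 6 min hold)
print(scans_recorded())  # 8393
```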
2.4 Data preprocessing
Raw data files (with suffix .d) were converted into network common data form (netCDF) and exported to the Automatic Mass spectral Deconvolution and Identification System (AMDIS_32) for spectral deconvolution [31] and database search against the NIST Mass Spectral Database (Rev.D.04.00) and Golm metabolomics library (http://csbdb.mpimp-golm.mpg.de/csbdb/gmd/msri/gmd_msri.html). A list of ion-retention time pairs (IRt) was generated. The IRt data were exported into the METabolomics Ion-based Data Extraction Algorithm (MET-IDEA) for automatic peak alignment, annotation, and integration [32]. Ions were extracted and quantified based on retention time and ion mass/charge (m/z) pairs. The output was generated in Microsoft Excel format with rows representing samples and columns representing identified metabolites.
The dataset was then interrogated manually to remove system contaminants, correct annotations if necessary, and remove uninformative data. Compounds that were identified in the negative control sample were eliminated. Compounds identified with high confidence but whose annotation was uncertain were annotated with a “?” prefix. Metabolites that were present in corn grain but with undetermined chemical identities (so-called “known unknown” metabolites) were retained and labeled as unknown. A total of 45 metabolites were identified across all grain samples (Table S3). The intensity value of each metabolite was normalized to both the ribitol internal standard signal and the sample dry weight, and then normalized to the respective reference sample within the same batch to convert each metabolite's intensity value to a ratio relative to its counterpart in the reference samples [33]. The resulting data matrix was subjected to statistical analysis.
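The two-step normalization described above can be sketched as follows. This is a minimal illustration, not the authors' spreadsheet: the function names and the example intensities are hypothetical, and it assumes normalization is a simple division by the ribitol signal and by the dry weight.

```python
def normalize_intensity(raw, ribitol, dry_weight_mg):
    """Scale a raw peak intensity by the internal standard signal and the dry weight."""
    return raw / ribitol / dry_weight_mg


def ratio_to_reference(sample_value, reference_value):
    """Express a normalized sample value relative to the batch reference sample."""
    return sample_value / reference_value


# Hypothetical intensities for one metabolite in a test sample and in the
# reference sample run in the same batch.
sample = normalize_intensity(raw=1200.0, ribitol=400.0, dry_weight_mg=6.0)
reference = normalize_intensity(raw=900.0, ribitol=450.0, dry_weight_mg=6.0)
ratio = ratio_to_reference(sample, reference)
print(round(ratio, 3))  # 1.5
```

Because every metabolite ends up as a ratio to the same pooled reference material, values from different batches sit on a common scale.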
2.5 Statistical analysis
Relative standard deviation (RSD% = standard deviation/mean × 100) for each metabolite was calculated in Microsoft Excel as a measure of data variation [34] and [35].
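The RSD formula above is straightforward to reproduce outside a spreadsheet. The sketch below uses the sample standard deviation (n − 1 denominator); the text does not say which form was used in Excel, so that choice is an assumption, and the replicate values are made up.

```python
import statistics


def rsd_percent(values):
    """Relative standard deviation: standard deviation / mean x 100."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; assumption, not stated in the text
    return sd / mean * 100


# Hypothetical normalized values of one metabolite in three biological replicates.
replicates = [0.8, 1.0, 1.2]
print(round(rsd_percent(replicates), 1))  # 20.0
```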
Pair-wise t tests were performed between metabolites and among varieties. The number of metabolites (given as % of the total number of metabolites) that were significantly different (P < 0.01, after Bonferroni correction) between each pair of varieties was tabulated (Tables 1 and S4). A value of zero (%) for a pair means that no metabolite was significantly different between the pair. Analysis of variance (ANOVA) was performed (in the R statistical package, version 3.0) for metabolites using location or genotype as treatments. The ANOVA P-values were logit transformed using logit (P) = lg (P) – lg (1–P), and a frequency histogram of logit (P) was plotted. A more equal distribution of logit (P) values indicates a more significant effect of a treatment [27].
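The logit transform of the ANOVA P-values can be sketched as below. It assumes that "lg" denotes the base-10 logarithm, which is a common convention but is not defined in the text; the function name is illustrative.

```python
import math


def logit10(p):
    """logit(P) = lg(P) - lg(1 - P), with lg taken as log base 10 (assumption)."""
    if not 0.0 < p < 1.0:
        raise ValueError("P must lie strictly between 0 and 1")
    return math.log10(p) - math.log10(1.0 - p)


print(logit10(0.5))             # 0.0
print(round(logit10(0.01), 3))  # -1.996
```

The transform stretches the region near P = 0, so small P-values that would be crowded into the first histogram bin are spread over a wide negative range.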
The corn grain metabolome data were pretreated by centering and scaling, and principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were performed on the correlation matrix [29]. The correlation matrix was preferred over the covariance matrix, owing to large differences in the values of measured variables. R was used for PCA and PLS-DA, and principal components (PC) or latent variables (LV) 1 and 2 were used to plot and visualize the scores. Key metabolites exerting the largest effect on the overall variability were identified by component loading scores [27]. Loadings indicate the proportional contribution of each variable (metabolite) to PC scores. The higher the absolute loading value, the more important was the corresponding metabolite.
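The link between centering and scaling and the correlation matrix can be illustrated with a small sketch: once each metabolite is centered and scaled to unit standard deviation, the covariance between two metabolites equals their correlation, so a metabolite with large absolute values no longer dominates. The data and function names below are hypothetical, and the sketch stops short of the PCA itself.

```python
import statistics


def autoscale(column):
    """Center one metabolite column to mean 0 and scale it to unit sample SD."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]


def covariance(a, b):
    """Sample covariance (n - 1 denominator) of two equal-length columns."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


# Hypothetical values for two metabolites on very different scales.
abundant = [120.0, 150.0, 90.0, 200.0]
trace = [0.02, 0.03, 0.01, 0.05]

raw_correlation = covariance(abundant, trace) / (
    statistics.stdev(abundant) * statistics.stdev(trace)
)
scaled_covariance = covariance(autoscale(abundant), autoscale(trace))

# The two quantities agree up to floating-point error.
print(abs(raw_correlation - scaled_covariance) < 1e-9)  # True
```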
Hierarchical clustering and heat map analysis were performed for mean-centered and standardized data in R. Replicate values were averaged where appropriate and Ward's method for Euclidean distance matrices was used for clustering [29].
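Ward's method merges, at each step, the pair of clusters whose union gives the smallest increase in within-cluster sum of squares. A minimal sketch of that merge cost is shown below; it is not the R routine used in the study (the text does not say which R function or Ward variant was called), and the points are made up.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def ward_merge_cost(cluster_a, cluster_b):
    """Increase in within-cluster sum of squares if the two clusters are merged."""
    ca, cb = centroid(cluster_a), centroid(cluster_b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    na, nb = len(cluster_a), len(cluster_b)
    return na * nb / (na + nb) * squared_distance


# Hypothetical standardized profiles (two metabolites per sample).
cluster_a = [[0.0, 0.0], [2.0, 0.0]]
cluster_b = [[5.0, 0.0]]
print(round(ward_merge_cost(cluster_a, cluster_b), 3))  # 10.667
```

The size weighting means that merging two large, distant clusters is penalized more than absorbing a single nearby sample, which tends to produce compact clusters of similar size.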
3 Results
3.1 Data quality
During instrumental analysis, the same reference sample was included within each batch run and analyzed three times, at the beginning, middle, and end. Technical variation was calculated, showing that the average metabolite RSD was in the range 2.3–14.5%, with about half of the metabolites showing average RSD below 10.0%. These findings show acceptable reproducibility within a batch. However, there was considerable variation across batches owing to batch effects [33]. To compare the data across batches, each metabolite of a test sample was normalized to its nearby reference sample from the same batch run, a normalization step found to be effective for suppressing batch effects [33]. With this adjustment, we expect that the observed variation among test samples is biological.
3.2 Polar metabolite variation among 20 corn hybrid varieties grown at the same test site
To investigate within-sample variation, biological replicates of each individual hybrid variety grown at the same test site were used, the RSD value for each individual metabolite was calculated, and the results are presented as scatter plots. Half of the metabolites from the 20 entries showed RSD below 50% and a few showed RSD above 100%. Of all the samples from Ontario, only three metabolites (IDs 19, 22, and 43) from three varieties (5, 6, and 7) showed RSD above 100%, suggesting that metabolite variation among Ontario samples was relatively small. The majority of the metabolites from the Texas test site showed RSD below 50%. Metabolites with RSD above 100% were concentrated in variety 1 samples (Fig. 1), suggesting that variety 1 was the most variable entry in Texas.
3.3 Metabolite variation across six test sites
To assess the data variation across test sites, the mean RSD from all individual metabolites was calculated by test site. Overall, the majority of metabolites had mean RSD values below 100% across six test sites, except for some outliers (Fig. 2). The scatter plot indicated that samples from Texas showed smaller RSD values for most metabolites. In contrast, samples from Illinois and Kansas showed larger mean RSD values for many metabolites (Fig. 2).
3.4 The number of growing locations affects data variability
In this study, 50 DuPont Pioneer non-GM commercial hybrids differing in genetic background and maturity group were grown at six test sites across commercial maize production regions in North America. At each test site, 20 varieties were selected based on their maturity range. This maturity-zone constraint resulted in an unbalanced experimental design. Five varieties representing broad genetic diversity were grown in all six locations, and the other 45 varieties were grown in 1–5 test sites. Thus, the 50 varieties can be divided into six groups depending on the test sites at which they were grown, regardless of their maturity range. When the mean RSD was calculated for each individual variety, the scatter plot (Fig. 3) revealed that the mean RSD of a variety had a positive correlation with the number of test sites: the more test sites at which a variety was grown, the larger was the mean RSD. For the 17 varieties that were grown at only one test site (Fig. S1), their mean RSD essentially reflected the data variability among the biological replicates. The majority (15 of 17) showed a mean RSD below 42%, suggesting that the biological variation within an individual variety was moderate. Variety 47 grown in Texas showed the lowest mean RSD (23%), while variety 31 grown in Illinois showed the highest mean RSD (57%). The mean RSDs for the other 33 varieties, grown in two or more locations, were all above 41%. This finding suggests that the grain polar metabolome varied considerably among as well as within locations.
3.5 Univariate analysis of polar metabolites of hybrid corn grain from the same test site identified genotype effects
To assess genotype effects on the polar metabolome of hybrid corn grain, the environmental influence was eliminated by comparing only samples of 20 varieties grown at the same location. The percentages of metabolites with statistically significant change (P-value < 0.01 with Bonferroni correction for multiple comparisons) between a pair of varieties were calculated for each test site. There were thus 45 comparisons for each pair of varieties and 8550 comparisons per test site. The percentage of metabolites that differed significantly (P < 0.01) between any pair of varieties was used as a measure of genotype effects between a given pair of varieties (under the influence of a specific environment or test site): the larger the percentage, the less similar were the two varieties. For the majority of paired varieties, fewer than 10% of metabolites showed significant differences (Table 1 and Table S4). For example, in Texas, varieties 2 and 3 were significantly different from most other varieties, whereas varieties 1 and 4 (Table 1, above) showed no difference in their metabolome profile. At the Illinois test site, variety 33 showed a significant difference from most other varieties (Table 1, below). However, there was no general trend, as the metabolome differences among varieties were mostly test site-dependent.
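The comparison counts quoted above follow from the design: 20 varieties give C(20, 2) = 190 pairs, and 45 metabolites per pair give 190 × 45 = 8550 tests per site. The sketch below reproduces that count and shows one way to turn a pair's P-values into a percentage of significant metabolites. Correcting over the 45 tests within a pair is an assumption, since the text does not state the number of tests used for the Bonferroni correction; the P-values are invented.

```python
import math

N_VARIETIES = 20
N_METABOLITES = 45


def pair_count(n_varieties):
    """Number of distinct variety pairs at one test site."""
    return math.comb(n_varieties, 2)


def percent_significant(p_values, alpha=0.01, n_tests=N_METABOLITES):
    """Percentage of metabolites with P below the Bonferroni threshold alpha / n_tests."""
    threshold = alpha / n_tests  # assumption: correction over 45 metabolites per pair
    hits = sum(1 for p in p_values if p < threshold)
    return 100.0 * hits / len(p_values)


print(pair_count(N_VARIETIES))                  # 190
print(pair_count(N_VARIETIES) * N_METABOLITES)  # 8550

# Hypothetical P-values for one pair: 9 clearly different metabolites, 36 not.
example = [1e-5] * 9 + [0.5] * 36
print(percent_significant(example))             # 20.0
```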
3.6 Hierarchical clustering and heat map of grain polar metabolome from same test site
The samples of 20 varieties grown at the same test sites were subjected to hierarchical clustering (HCA) and heat map analysis (Fig. 4, Fig. S2). In Texas, the majority of the metabolite contents in samples of varieties 2 and 3 were lower than those of other varieties, and these two varieties clustered together. Variety 4 formed a cluster with variety 5 instead of variety 1 (Fig. 4-left). These observations are in agreement with the univariate analysis in Table 1. For the Illinois test site, the univariate analysis showed that variety 33 was significantly differentiated from other entries (Table 1), and HCA showed that variety 33 contained a lower metabolite content than the others and was most similar to variety 34. In contrast, varieties 36 and 38 showed higher metabolite contents and clustered together (Fig. 4-right). However, no genotype clusters were clearly common to two or more test sites, except that varieties 2 and 4 clustered together at the Illinois, Kansas, and Ontario test sites. This finding indicates that either the genotypes performed differently under different environmental conditions, or that genotypic effects were weakly expressed under certain environments.
3.7 Hierarchical clustering and heat map of maize grain metabolome across six test sites reveal genotype and environment interaction
Although the variety selection for each test site was based on maturity range, varieties 1–5, which represented broad genetic diversity, were grown in all six locations, providing opportunities to evaluate genetic and environmental interactions. HCA showed that samples were clustered mostly by location, and not by variety (genotype) (Fig. 5). Illinois and Nebraska samples showed full clustering, whereas Texas samples were least clustered (Fig. 5). Given that varieties 1 to 5 represent different genetic backgrounds based on genotyping data for 3000 SNPs [28], our results suggested that environmental factors dominate in shaping the corn grain polar metabolome, especially at the Illinois and Nebraska test sites. In Texas, varieties 4 and 5 clustered together, varieties 2 and 3 also clustered together, and variety 1 constituted a separate cluster (Fig. 5). The metabolome-based cluster pattern did not exactly match the SNP-based phylogeny, in which varieties 1 and 4 were more closely related to each other [28]. These observations suggest that samples from the Texas test site were subject to larger interactions between genotype and environment. It was also clear that variety 5 in Kansas was differentiated from varieties 1 to 4 and that variety 3 in Ontario performed differently compared with the other four varieties.
All 50 varieties across the six test sites were also subjected to HCA and heat map analysis. Samples from Minnesota, Kansas, and Texas showed relatively higher concentrations of certain metabolite groups (shown in red), whereas samples from Illinois, Nebraska, and Ontario showed lower concentrations of most metabolites (Fig. 6). Interestingly, samples from Minnesota showed much higher metabolite contents than samples from Ontario despite their geographic proximity (Fig. 6). Samples from Minnesota and Ontario were all harvested in October, 2010. The 2010 weather data indicated that Ontario received 3.4 in. of rain in October, whereas Minnesota did not (Table S5). The precipitation in Ontario may have resulted in higher seed moisture, permitting higher seed respiration activity and lower polar metabolite content.
3.8 Identifying key metabolites that exert the strongest effect on overall variation
To identify the key metabolites contributing to observed sample variation, loading scores on PC1 and PC2 were examined, and the five metabolites with the highest loading scores were identified for all 50 varieties combined (Fig. 7, left), or for varieties 1–5 (Fig. 7, right). Metabolites m41 (sucrose), m36 (unknown), m30 (glucose), m27 (unknown), and m26 (fructose) showed the highest loading scores on PC1 and PC2 for all varieties, or for varieties 1 to 5 (Fig. 7), and contributed most to the variability of the samples. Sucrose is present at high levels in corn grain, but its concentration is highly variable [18]. Harrigan et al. (2007) also found that fructose showed non-interacting tester and location effects [25]. In this study we found that fructose is one of the major sources of variation influenced by both genotype and environment. In seeds, fructose is derived mainly from sucrose breakdown, so it is reasonable to observe variability for both sucrose and fructose.
4 Discussion
4.1 Polar metabolite variation affects QTL identification
Metabolite QTL identification depends on reliable capture of metabolite variation within a defined population. Large variation within biological replicates places a constraint on reliable QTL determination. In this study we found high polar metabolite variation within biological replicates (Fig. 3). The nature of the hybrid F2 seeds may contribute to this observation. Theoretically, each F2 seed from an F1 hybrid plant has a different genetic makeup, which could result in a different polar metabolome for each individual F2 seed. To minimize this problem, a pooling strategy was adopted in which five cobs from the same variety grown in the same block were combined. However, this approach generated another challenge: the biological replicates were derived from three individual blocks at the same test site and were thus subject to block-specific effects.
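One common way to express variation among biological replicates is the relative standard deviation (RSD) of each metabolite across the replicate blocks. The sketch below assumes this measure for illustration; the function name `percent_rsd` and the peak-area values are hypothetical and are not taken from the study.

```python
import statistics

def percent_rsd(values):
    """Relative standard deviation (sample SD / mean) as a percentage."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical normalized peak areas for the three block replicates
# of one variety at one test site (invented values).
replicates = {
    "sucrose": [10.0, 12.0, 14.0],
    "fructose": [2.0, 2.0, 2.0],
}
rsd_by_metabolite = {name: percent_rsd(vals) for name, vals in replicates.items()}
for name, rsd in rsd_by_metabolite.items():
    print(f"{name}: {rsd:.1f}% RSD")
```

A metabolite whose replicate RSD approaches the spread seen between varieties would be a poor candidate for QTL mapping, because block-level noise would mask the genotype signal.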
Some metabolites are strongly influenced by genotype-by-environment interactions [28]. To date, most identified metabolite QTLs have been those showing stronger genotype effects, such as oleic acid content [36]. Different improvement strategies should be adopted depending on whether genotype or environment has the larger effect on a given group of metabolites. For metabolites affected more by environment than by genotype, the potential for modifying their content could be limited. When a metabolite trait shows stronger genotype effects, it is important to select appropriate parental inbred lines for hybrid F1 seed production. If a metabolite trait is strongly affected by the interaction of genotype and environment, large-scale trials should be conducted to identify a suitable growing region for each cross combination.
4.2 Univariate analysis and HCA can detect genotype effects
Previous studies have shown that PCA has limited power to detect genotype effects [29] and [30], and may not be suited to studying complex biological traits such as metabolic response in hybrid seeds. In contrast, univariate analysis treats each metabolite individually and compares its levels across genetic backgrounds, so that the overall metabolic differences among varieties can reveal their genotypic differences. This study showed that it is possible to apply univariate analysis to detect genotype effects in hybrid seeds. HCA was also successfully applied to detect genotype effects in samples grown at the same test site, as well as interactions of genotype and environment across multiple test sites. Compared with univariate analysis, HCA has the advantage of revealing additional details such as metabolite levels and similarities among varieties.
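As an illustration of the per-metabolite approach, the sketch below computes a one-way ANOVA F statistic for a single metabolite across varieties grown at one site. One-way ANOVA is assumed here as a representative univariate test and is not necessarily the test used in this study; the function name and the values are hypothetical.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over {group name: replicate values}."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    k, n_total = len(groups), len(all_values)
    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of replicates around their own group mean
    for vals in groups.values():
        mean = sum(vals) / len(vals)
        ss_between += len(vals) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in vals)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Invented levels of one metabolite in three varieties, three replicates each.
levels = {
    "variety_1": [1.0, 2.0, 3.0],
    "variety_2": [2.0, 3.0, 4.0],
    "variety_3": [5.0, 6.0, 7.0],
}
f_stat = one_way_anova_f(levels)
print(f"F = {f_stat:.2f}")
```

Repeating the test for every metabolite gives one statistic per metabolite. Because dozens of metabolites are tested in parallel, a multiple-testing correction would normally be applied before counting significant genotype effects.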
4.3 Genetic diversity and seed metabolome prediction
The 50 hybrid corn varieties used in this study had been genotyped with approximately 3000 SNPs distributed throughout the maize genome. A dendrogram classified them into five major distinct genetic groups, which represent the broad genetic diversity of commercial maize [28]. The clustering analysis based on the metabolome dataset in this study provides an opportunity for parallel comparison. Overall, the clustering patterns based on SNP and metabolome data were discordant, likely because genotype, environment, and the interactions between genotype and environment all play a role in determining seed polar metabolite content. Thus, SNP-based clustering may not be a reliable predictor of the polar metabolome in hybrid corn seeds. This observation is in agreement with previous reports that environmental factors and interactions between genotype and environment can strongly affect the seed metabolome [18], [23], [24], [25], [26], [27] and [28]. Mean temperature, duration of day or night, average rainfall, soil characteristics, and planting and harvesting dates all vary considerably among the six test sites. In addition, the varieties grown in Texas and Kansas had a longer growing season than those at the other test sites. These factors add another layer of complexity to the metabolome variability. Generally, the genetic and physiological mechanisms underlying trait variance in hybrid plants remain unclear. Studies [37] and [38] have suggested that allelic differences in structural genes or catalytic enzymes could contribute to natural variation in plant primary metabolism. Careful selection of parental inbred lines coupled with metabolomics evaluation of F2 hybrid progeny may reveal additional mechanisms by which different genes interact to shape the polar metabolome in hybrid seeds.
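The degree of agreement between two groupings of the same varieties can be quantified. The sketch below uses the Rand index for this purpose; this measure is an assumption introduced for illustration and was not part of the study, and the group labels are invented.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree (0 to 1)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    # A pair agrees when both clusterings put the two items together,
    # or both put them apart.
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Invented group labels for four varieties (not the study's assignments).
snp_groups = [0, 0, 1, 1]
metabolome_groups = [0, 1, 0, 1]
discordant = rand_index(snp_groups, metabolome_groups)
identical = rand_index(snp_groups, snp_groups)
print(discordant, identical)
```

The plain Rand index is not corrected for chance agreement, so a chance-adjusted variant would be preferable when the number of groups is small.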
Acknowledgements
This work was financially supported by DuPont-Pioneer HiBred, which also provided the corn grain samples. We thank Dr. Vincent M. Asiago and Jan Hazebroek for valuable scientific contributions to the manuscript.
Appendix A Supplementary data
References
- [1]
- Description, Development, Structure and Composition of the Corn Kernel
- Corn: Chemistry and Technology, P.J. White, L.A. Johnson, 2003, American Association of Cereal Chemists, St. Paul, pp. 69–106
- [2]
- Quantitative trait loci influencing protein and starch concentration in the Illinois long term selection strains
- Theor. Appl. Genet., Volume 87, 1993, pp. 217–224
- [3]
- Maize selection passes the century mark: a unique resource for 21st century genomics
- Trends Plant Sci., Volume 9, 2004, pp. 358–364
- [4]
- Transposon tagging and molecular analysis of the maize regulatory locus opaque-2
- Science, Volume 238, 1987, pp. 960–963
- [6]
- A phenylalanine in DGAT is a key determinant of oil content and composition in maize
- Nat. Genet., Volume 40, 2008, pp. 367–372
- [7]
- Natural genetic variation in Lycopene Epsilon Cyclase tapped for maize biofortification
- Science, Volume 319, 2008, pp. 330–333
- [8]
- Mutant gene that changes protein composition and increases lysine content of maize endosperm
- Science, Volume 145, 1964, pp. 279–280
- [9]
- Second mutant gene affecting the amino acid pattern of maize endosperm proteins
- Science, Volume 150, 1965, pp. 1469–1470
- [10]
- Study of low phytic acid 1-7 (lpa1-7), a new ZmMRP4 mutation in maize
- J. Hered., Volume 103, 2012, pp. 598–605
- [11]
- Metabolomics Analysis of Low Phytic Acid in Maize Kernels
- Concepts in Plant Metabolomics, B.J. Nikolau, E.S. Wurtele, 2007, Springer, The Netherlands, pp. 221–238
- [12]
- Effect of genetically modified, low–phytic acid maize on absorption of iron from tortillas
- Am. J. Clin. Nutr., Volume 68, 1998, pp. 1123–1127
- [13]
- Phenotypic, genetic and molecular characterization of a maize low phytic acid mutant (lpa241)
- Theor. Appl. Genet., Volume 107, 2003, pp. 980–987
- [14]
- Origin and seed phenotype of maize low phytic acid 1-1 and low phytic acid 2-1
- Plant Physiol., Volume 124, 2000, pp. 355–368
- [15]
- Embryo-specific silencing of a transporter reduces phytic acid content of maize and soybean seeds
- Nat. Biotechnol., Volume 25, 2007, pp. 930–937
- [16]
- Phytic acid prevents oxidative stress in seeds: evidence from a maize (Zea mays L.) low phytic acid mutant
- J. Exp. Bot., Volume 60, 2009, pp. 967–978
- [19]
- Application of a metabolomic method combining one-dimensional and two-dimensional gas chromatography time of flight/mass spectrometry to metabolic phenotyping of natural variants in rice
- J. Chromatogr. B, Volume 855, 2007, pp. 71–79
- [20]
- Relationship between metabolic and genomic diversity in sesame (Sesamum indicum L.)
- BMC Genomics, Volume 9, 2008, p. 250
- [21]
- Identification of loci affecting flavor volatile emissions in tomato fruits
- J. Exp. Bot., Volume 57, 2006, pp. 887–896
- [22]
- An evaluation of the costs of making specific secondary metabolites: does the yield penalty incurred by host plant resistance to insects result from competition for resources?
- Int. J. Pest Manage., Volume 53, 2007, pp. 175–182
- [23]
- Omics technologies reveal abundant natural variation in metabolites and transcripts among conventional maize hybrids
- Food Nutr. Sci., Volume 4, 2013, pp. 335–341
- [24]
- Natural variability of metabolites in maize grain: differences due to genetic background
- J. Agric. Food Chem., Volume 53, 2005, pp. 10061–10067
- [27]
- Metabolite profiling of maize grain: differentiation due to genetics and environment
- Metabolomics, Volume 5, 2009, pp. 459–477
- [29]
- Effects of genetics and environment on the metabolome of commercial maize hybrids: a multisite study
- J. Agric. Food Chem., Volume 60, 2012, pp. 11498–11508
- [30]
- Effect of environment and genotype on commercial maize hybrids using LC/MS-based metabolomics
- J. Agric. Food Chem., Volume 62, 2014, pp. 1412–1422
- [31]
- An integrated method for spectrum extraction and compound identification from gas chromatography/mass spectrometry data
- J. Am. Soc. Mass Spectrom., Volume 10, 1999, pp. 770–781
- [33]
- A modified data normalization method for GC–MS-based metabolomics to minimize batch variation
- SpringerPlus, Volume 3, 2014, p. 439
- [34]
- Spectral relative standard deviation: a practical benchmark in metabolomics
- Analyst, Volume 134, 2009, pp. 478–485
- [35]
- Analytical precision, biological variation and mathematical normalization in high data density metabolomics
- Metabolomics, Volume 1, 2005, pp. 75–85
- [36]
- Whole genome scan detects an allelic variant of fad2 associated with increased oleic acid levels in maize
- Mol. Genet. Genomics, Volume 279, 2008, pp. 1–10
- [37]
- Overdominant quantitative trait loci for yield and fitness in tomato
- Proc. Natl. Acad. Sci. U. S. A., Volume 103, 2006, pp. 12981–12986
- [38]
- Candidate genes and quantitative trait loci affecting fruit ascorbic acid content in three tomato populations
- Plant Physiol., Volume 143, 2007, pp. 1943–1953
- Peer review under responsibility of Crop Science Society of China and Institute of Crop Science, CAAS.
- ⁎ Corresponding author at: Department of Biochemistry, Christopher S. Bond Life Science Center, University of Missouri, Columbia, MO 65211, USA. Tel.: + 1 573 884 5979; fax: + 1 573 884 9676.