Agronomy Journal
Published in Agron. J. 95:1447-1453 (2003).
© American Society of Agronomy
677 S. Segoe Rd., Madison, WI 53711 USA

MODELING

Corn (Zea mays L.) Yield Prediction Using Multispectral and Multidate Reflectance

Jiyul Chang* (a), David E. Clay (a), Kevin Dalsted (b), Sharon Clay (a), and Mary O'Neill (b)

(a) S. Clay, Plant Sci. Dep., South Dakota State Univ., Brookings, SD 57007
(b) Eng. Resour. Cent., South Dakota State Univ., Brookings, SD 57007

* Corresponding author (jiyul_chang{at}sdstate.edu).

Received for publication December 5, 2002.

    ABSTRACT
 
Yield predictive models based on multiple sampling dates may explain more yield variability than models based on a single sampling date. This study determined the influence of two approaches (multiple regression of either soil and crop multispectral and multidate reflectance or variables developed during principal-component analysis of reflectance data) on corn (Zea mays L.) yield predictions. Research was conducted in 1999, 2000, and 2001. Corn yield data from two 65-ha fields were collected with a yield monitor, and crop and soil radiance [green, red, and near-infrared (NIR) wavebands] was measured three times (April–May, July, and August–September). Relative radiance (reflectance) was determined by dividing the measured radiance by the radiance of an invariant target. Stepwise multiple regression based on reflectance or principal components was used to develop predictive equations. Multiple-regression models based on 2 yr of reflectance data were biased and provided poor estimates of yield, whereas a model based on variables developed during principal-component analysis of reflectance data measured in the spring and summer of 1999 and 2000 was unbiased and explained 45% of the corn yield variability in 2001. Differences between these models were attributed to multicollinearity of the data. Models that included data from two or three sampling dates generally explained more yield variability than models that used only one sampling date. Reflectance measured early in the season provided information about soil water and color, while reflectance measured in August and September provided information about plant conditions.

Abbreviations: GNDVI, green normalized vegetation index • NDVI, normalized difference vegetation index • NIR, near infrared


    INTRODUCTION
 
Spectral radiance collected at different dates provides different information about the system (Ryerson and Curran, 1997). In tilled fields, during the early part of the growing season, reflectance is primarily influenced by soil characteristics (Huete et al., 1985). Different soils have different spectral characteristics. For example, Barnes and Baker (2000) reported that reflectance in the visible and NIR portions of the spectrum was greater from sandy than from fine-textured soil. Elvidge and Lyon (1985) showed that normalized difference vegetation index (NDVI) values were larger for pixels containing vegetation and dark soil than for pixels containing vegetation and light-colored soil.

As the season progresses, spectral reflectance characteristics are increasingly influenced by plant factors. For example, Carter (1993) reported that dehydration stress increased reflectance in the visible wavelengths (535–640 nm and 685–700 nm) but had little or no impact on reflectance in the infrared range. Weigand et al. (1991) used the NDVI [(NIR - red)/(NIR + red)] collected in June to estimate cotton (Gossypium hirsutum L.) yields in Texas. They reported that NDVI accounted for more than 75% of the cotton yield variability. Li et al. (2000) had similar results and reported that Texas cotton yields were correlated to NDVI and soil moisture. Staggenborg and Taylor (2000) reported that the green normalized vegetation index [GNDVI = (NIR - green)/(NIR + green)] explained 40% of the corn yield variability observed in 13 Kansas fields. Weigand et al. (1999) showed that an equation based on NIR, red, and yellow/green bands collected in May explained 85% of the corn yield variability in Texas. Yang and Everitt (2000) had similar results for grain sorghum (Sorghum bicolor L.) and reported that an equation based on NIR, red, green, NIR/red, NIR/green, NDVI, and GNDVI data collected in May explained 85% of the yield variability in monitored Texas fields. Senay et al. (1998) showed that reflectance in the NIR was more strongly correlated to corn yield than reflectance measured in the visible bands and that yield was highly correlated to elevation (r = 0.92). Shanahan et al. (2001) proposed that GNDVI measured during mid–grain filling could be used to develop relative yield maps depicting spatial corn yield variability in fields before harvest.

It may be possible to improve the accuracy of predictive equations by basing the equations on several sampling dates and using principal-component analysis to develop independent variables (Schowengerdt, 1997; Johnson, 1998). Hong et al. (2001) used principal-component analysis to develop independent variables from hyperspectral data. Predictive equations based on these variables were able to explain 70% of the corn yield variability and 39% of the soybean [Glycine max (L.) Merr.] yield variability in Missouri. The studies described above did not test the impact of using multiple remote sensing sampling dates and principal-component analysis on improving yield predictions by models. The objective of this study was to determine the influence of two approaches (multiple regression of either soil and crop multispectral and multidate reflectance or variables developed during principal-component analysis of reflectance data) on corn yield predictions.


    MATERIALS AND METHODS
 
Ground Scouting Data
This research was conducted in two 65-ha corn–soybean rotation fields located in east-central South Dakota. The field identified as Moody was located at 44°10' N, 96°37' W, and the field identified as Brookings was located at 44°13' N, 96°39' W. Elevation ranged from 518 to 534 m in Moody and from 505 to 518 m in Brookings. Soil nutrients and soil series information for these sites are available in Clay et al. (2001). Corn was harvested after physiological maturity by a combine equipped with an eight-row header (4.6 m) and a calibrated yield monitor. Procedures used to remove erroneous data from the yield monitor data set included eliminating points where the combine speed was slower than 1.78 m s⁻¹ or faster than 3.05 m s⁻¹ and points where the flow rate differed from the average flow rate by more than 3 standard deviations. ArcView (ESRI, Redlands, CA) geographic information system (GIS) software was used to determine the average yield for fifty 0.1-ha grid cells located along four transects in each field. Bias in the yield monitor data set was assessed by comparing yield monitor–estimated yields in these cells with hand-harvested yields (5-m² areas).
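The yield-monitor screening rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the speed limits and the 3-standard-deviation flow rule come from the text, but the record layout (dictionaries with "speed", "flow", and "yield" keys) is hypothetical, and the sketch assumes the mean and standard deviation of flow rate are computed over all records before any are removed, which the text does not specify.

```python
from statistics import mean, stdev

# Thresholds taken from the text; everything else is an assumption.
MIN_SPEED = 1.78      # m/s, points slower than this are eliminated
MAX_SPEED = 3.05      # m/s, points faster than this are eliminated
FLOW_SD_LIMIT = 3.0   # flow rate must lie within this many SDs of the mean


def clean_yield_records(records):
    """Return the records that pass the speed and flow-rate screens.

    `records` is a list of dicts with hypothetical keys "speed" (m/s),
    "flow" (any consistent unit), and "yield".
    """
    flows = [r["flow"] for r in records]
    flow_mean = mean(flows)
    flow_sd = stdev(flows)  # sample standard deviation over all records

    kept = []
    for r in records:
        speed_ok = MIN_SPEED <= r["speed"] <= MAX_SPEED
        flow_ok = abs(r["flow"] - flow_mean) <= FLOW_SD_LIMIT * flow_sd
        if speed_ok and flow_ok:
            kept.append(r)
    return kept
```

Because a single extreme flow value inflates the standard deviation it is judged against, a real workflow might apply the speed screen first or iterate the flow screen; that choice is left open here.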

Remote Sensing
In 1999, 2000, and 2001, multispectral radiance was collected on three sampling dates corresponding to different physiological growth stages. The first sampling date was between germination and the second-leaf stage, the second sampling date was at the six- to eight-leaf stage, and the third sampling date was between R2 (blister) and R4.5 (soft dough) (McWilliams et al., 1999). In 1999 and 2000, multispectral data were collected with a digital camera mounted for nadir viewing on a plane flying at 1500 m above mean sea level between 1000 and 1400 h local time on cloud-free days. The pixel sizes were ≤1 m, and the spectral bands collected were green (557–582 nm), red (647–672 nm), and NIR (720–920 nm). In 1999, images were collected on 28 May, 27 July, and 21 September, and in 2000, images were collected on 24 May, 28 July, and 29 August. In 2001, multispectral data were obtained by the IKONOS satellite on 17 May, 14 July, and 21 August. IKONOS images had 4-m spatial resolution, and the spectral bands used in the analysis were green (520–600 nm), red (630–690 nm), and NIR (760–900 nm).

Relative radiance (reflectance) was calculated using the equation

    Reflectance = Ro / Rref

where Ro was the radiance of the object and Rref was the radiance of the invariant target (reference) in the remote sensing data (Avery and Berlin, 1992). The reference radiance for each field and sampling date was obtained from an invariant target (the center of an intersection of two gravel roads located adjacent to the study sites) that was approximately 15 by 15 m in size.
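As a small illustration of the normalization, the sketch below divides the radiance of a target pixel by the radiance of the invariant reference target, band by band, as the abstract describes. The function name, the band dictionary layout, and the numeric values are hypothetical and are not taken from the study.

```python
def relative_radiance(target, reference):
    """Divide each band of a target pixel by the same band of the reference.

    `target` and `reference` are dicts keyed by band name (hypothetical
    layout); the reference is the invariant target for that field and date.
    """
    reflectance = {}
    for band, radiance in target.items():
        ref = reference[band]
        if ref <= 0:
            raise ValueError(f"reference radiance for {band} must be positive")
        reflectance[band] = radiance / ref
    return reflectance


# Made-up digital numbers for one pixel and for the gravel-road reference.
pixel = {"green": 60.0, "red": 45.0, "nir": 150.0}
road = {"green": 120.0, "red": 90.0, "nir": 200.0}
print(relative_radiance(pixel, road))
```

Using a reference measured in the same image is what lets values from different dates and sensors be compared on a common relative scale.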

At least four control points in each field were used by IMAGINE (ERDAS, Atlanta, GA) for georegistration. ArcView and ArcView Spatial Analyst were used for mapping, and spatial analyses were used to convert yield and remote sensing data into a common grid-cell system. Remote sensing data were used to calculate NDVI and GNDVI. Yield information was overlaid on the aerial images. Correlation coefficients (r) between corn yields and reflectance or indices were determined.
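The two indices and the correlation coefficient can be written out directly. The sketch below uses the NDVI and GNDVI definitions given in the Introduction and a textbook Pearson correlation; the function names are illustrative, and the inputs are assumed to be per-cell reflectance and yield values already placed on the common grid.

```python
from math import sqrt


def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - red)/(NIR + red)."""
    return (nir - red) / (nir + red)


def gndvi(nir, green):
    """Green normalized vegetation index: (NIR - green)/(NIR + green)."""
    return (nir - green) / (nir + green)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

For example, `pearson_r(nir_values, yields)` computed separately for each sampling date would show whether the sign of the relationship changes through the season.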

Model Development
The models for predicting yields were based on remote sensing data collected at three sampling dates. The remote sensing data used in the models were green, red, NIR, NDVI, and GNDVI. Seven different combinations of remote sensing data were used to predict corn yield. The combinations were first sampling date (System 1), second sampling date (System 2), third sampling date (System 3), first and second sampling dates (System 4), first and third sampling dates (System 5), second and third sampling dates (System 6), and all sampling dates (System 7).
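The seven systems are simply every non-empty combination of the three sampling dates. A minimal sketch that enumerates them in the same order as the text, and lists the candidate predictors for each, might look like this (the date labels and the predictor naming scheme are hypothetical):

```python
from itertools import combinations

DATES = ("first", "second", "third")
VARIABLES = ("green", "red", "nir", "ndvi", "gndvi")


def model_systems(dates=DATES):
    """Map system number (1 to 7) to the sampling dates it uses."""
    systems = {}
    number = 1
    for size in range(1, len(dates) + 1):
        for combo in combinations(dates, size):
            systems[number] = combo
            number += 1
    return systems


def predictor_names(system_dates):
    """Candidate predictors for a system: five variables per sampling date."""
    return [f"{var}_{date}" for date in system_dates for var in VARIABLES]


for number, system_dates in model_systems().items():
    print(number, system_dates, len(predictor_names(system_dates)))
```

Under this layout, System 7 has 15 candidate predictors, many of them correlated with one another.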

Principal-component analysis of the reflectance data set was conducted using PROC PRINCOMP available in SAS (SAS Inst., 1995; Johnson, 1998). The correlation matrix was used to calculate eigenvalues and eigenvectors.
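For readers without SAS, the same kind of analysis can be approximated with NumPy. The sketch below is not the authors' PROC PRINCOMP run; it is a minimal illustration, assuming the input is a matrix with one row per grid cell and one column per reflectance variable, of how eigenvalues and eigenvectors of the correlation matrix yield uncorrelated component scores.

```python
import numpy as np


def pca_correlation(X):
    """Principal-component analysis based on the correlation matrix.

    X: array of shape (n_cells, n_variables).
    Returns eigenvalues (descending), eigenvectors (as columns),
    the proportion of variability explained, and component scores.
    """
    X = np.asarray(X, dtype=float)
    # Standardize so that the covariance of Z equals the correlation of X.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)

    eigenvalues, eigenvectors = np.linalg.eigh(R)  # eigh returns ascending order
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    proportion = eigenvalues / eigenvalues.sum()
    scores = Z @ eigenvectors
    return eigenvalues, eigenvectors, proportion, scores
```

The eigenvalues sum to the number of variables, and the score columns are mutually uncorrelated, which is the property that makes them useful as regression inputs. The sign of each eigenvector is arbitrary, so loadings may differ in sign from those produced by other software.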

Stepwise multiple regression of reflectance or of independent variables developed in principal-component analysis of reflectance was used to develop predictive equations (SAS Inst., 1995; Johnson, 1998). In the regression analysis, the significance level for entry (SLE) of independent variables into the model was 0.2, and the significance level for stay (SLS) was 0.05. Corn yield models and coefficients of determination (R²) were developed for each system. To test the transferability of the models, the models generated from data collected in 1999 and 2000 were used to predict corn yields in 2001.
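The transfer step (fit on earlier years, predict a later year) can be illustrated with a simplified principal-component regression in NumPy. This sketch deliberately swaps the stepwise selection used in the study for a fixed number of leading components, and it assumes that new data are standardized with the training means and standard deviations before being projected onto the training eigenvectors; the paper does not describe that detail. All function names are hypothetical.

```python
import numpy as np


def fit_pc_regression(X_train, y_train, n_components):
    """Regress yield on the leading principal-component scores."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    mean = X_train.mean(axis=0)
    sd = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / sd

    eigenvalues, eigenvectors = np.linalg.eigh(np.corrcoef(X_train, rowvar=False))
    order = np.argsort(eigenvalues)[::-1][:n_components]
    loadings = eigenvectors[:, order]

    scores = Z @ loadings
    design = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
    return {"mean": mean, "sd": sd, "loadings": loadings, "coef": coef}


def predict_pc_regression(model, X_new):
    """Apply a fitted model to reflectance data from another year."""
    Z = (np.asarray(X_new, dtype=float) - model["mean"]) / model["sd"]
    scores = Z @ model["loadings"]
    return model["coef"][0] + scores @ model["coef"][1:]
```

A model fitted to the 1999 and 2000 grid cells would be passed the 2001 reflectance matrix in `predict_pc_regression`, and the predictions compared with the 2001 yield monitor values.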


    RESULTS AND DISCUSSION
 
Research Site Characteristics
Yields were generally lower in summit/shoulder areas than in footslope areas (Clay et al., 2001). For example, Clay et al. (2001) reported that corn yields in 1999 averaged 7.1 Mg ha⁻¹ in summit areas and 11.1 Mg ha⁻¹ in footslope areas. Yield differences were attributed to the summit/shoulder areas being drier than footslope areas. For example, on 10 June and 4 August in 1999, soil water (0–60 cm) in the footslope was 0.32 and 0.28 g g⁻¹, respectively, whereas in the summit/shoulder areas, soil water at these dates was 0.24 and 0.15 g g⁻¹, respectively (Clay et al., 2001). Landscape position differences in NIR pixel values were clearly visible in images collected at Moody in May (Fig. 1). Lower NIR pixel values on 28 May in footslope than in summit/shoulder areas were attributed to higher soil water contents in footslope than in summit soils (Clay et al., 2001) and higher organic C in footslope (27.4 g kg⁻¹, ±4.5) than in summit (22.8 g kg⁻¹, ±1.8) soils. In images collected on 27 July, it was difficult to identify different landscape positions, whereas in images collected in September, NIR pixel values were higher in footslope than in summit areas. Differences in September were attributed to different crop conditions at the different landscape positions.



 
Fig. 1. Near-infrared band aerial images for Moody taken on (a) 28 May 1999, (b) 27 July 1999, and (c) 21 Sept. 1999.

 
Correlation between Yield and Remote Sensed Data
At Moody in 1999, corn yield was negatively correlated to reflectance in the green, red, and NIR bands collected in May (Table 1). At the second and third sampling dates, NIR reflectance was positively correlated to yield. The change in the sign of the correlation coefficients between yield and NIR reflectance as the season progressed indicates that the surface factors (soil and vegetation) responsible for spectral reflectance changed with sampling date. For example, at the first sampling date, a negative correlation between corn yield and NIR was attributed to the amount of soil water present in the profile. High reflectance indicated low water content that contributed to low yields (Huete et al., 1985) (Fig. 1a). The positive correlation between NIR reflectance at the third sampling date and yield was attributed to either biomass or maturity differences in these areas. Areas with high NIR reflectance at the first sampling date had low reflectance at the third sampling date (Fig. 1c). These results show that initially, remote sensing data provided information about soil moisture and/or organic C and that as the season progressed, the data were increasingly influenced by plants. Ryerson and Curran (1997) had similar results and showed that the sampling date of remote sensing data influenced the information collected.


 
Table 1. The correlation coefficients between remote sensing data collected at three sampling dates and corn yield at Moody in 1999.

 
Within a sampling date, many of the spectral bands and indexes were also correlated to each other (Table 1). For example, at Moody, green reflectance was generally positively correlated to red and NIR and negatively correlated to NDVI and GNDVI. Similar relationships between spectral bands were observed in Brookings in 2000 and Moody in 2001.

In summary, remote sensing data collected at different sampling dates provided information about different phenomena. In April and May, remote sensing data provided information about soil water and color while in August and September, remote sensing data provided information about crop conditions.

Model Development
The primary eigenvalues and eigenvectors for the Moody, Brookings, and combined analysis of Brookings and Moody are reported in Table 2. All of the eigenvectors developed during principal-component analysis are available in Chang (2002).


 
Table 2. Sample eigenvalues (proportion of variability explained in parentheses) and eigenvectors developed during principal-component (PC) analysis. PC1,1 for Moody, Brookings, and Moody and Brookings represents PCM1,1, PCB1,1, and PCMB1,1, respectively. The 2,1 indicates the first eigenvector of Model System 2.

 
At Moody in 1999, the amount of yield variability explained by the multiple regression models based on a single date of remote sensing data (Systems 1, 2, and 3) ranged from 17% (System 2) to 81% (System 3) (Table 3). When two dates of remote sensing data were used, the amount of yield variability explained by the models (Systems 4, 5, and 6) ranged from 78 to 85%. The model based on all three remote sensing dates (System 7) explained about 85% of the yield variability, which was similar to the amount of yield variability explained by Systems 5 and 6.


 
Table 3. The remote sensing principal-component (PC) yield models for data sets based on 1 yr of information (Moody or Brookings) and 2 yr of information (Moody and Brookings). The different models considered different information sources. Remote sensing data were collected at three dates (first sampling date, germination to second-leaf stage; second sampling date, six- to eight-leaf stage; and third sampling date, R2 to R4.5). The M1,2 in PCM1,2 indicates the second eigenvector in Model System 1 at Moody.

 
At Brookings in 2000, the amount of yield variability explained by the models based on either one or two dates (Systems 1 to 6) ranged from 66 to 91% (Table 3). Adding the third remote sensing date (System 7) had a small impact on improving yield predictions. These results were similar to those reported for 1999 at Moody.

A combined (1999 and 2000) model based on data from a single remote sensing sampling date explained between 44 and 75% of the yield variability observed in the data set (Table 3). When two different dates of remote sensing data were used to develop the models (System 4, 5, and 6), the amount of yield variability explained increased. Models that included data collected at the third sampling date (System 5 and 6) generally explained more yield variability than models that did not include this information. The model based on all three sampling dates (System 7) explained a similar amount of variation as the model based on data collected at the first and third sampling dates (System 5).

The models that were developed from reflectance explained similar amounts of yield variability as the models developed using principal components (Table 4). These models had similar relationships between data included in the regression and the amount of yield variability explained.


 
Table 4. The relative reflectance yield models for data sets based on 1 yr of information (Moody or Brookings) and 2 yr of information (Moody and Brookings). The different models considered different information sources. Remote sensing data was collected at three dates (first sampling date, germination to second-leaf stage; second sampling date, six- to eight-leaf stage; and third sampling date, R2 to R4.5).

 
These results suggest that combining data collected on different dates improved the models' ability to explain yield variability and that models based on multiple regression of principal components and of reflectance explained similar amounts of yield variability. The influence of sampling date was attributed to different dates providing different types of information.

Model Testing
The y-intercepts and slopes of regression equations relating predicted and observed yield for principal-component models based on a single year of data collection (Moody or Brookings) were generally different from 0 and 1, respectively (Table 5). These results indicate that predicted yields based on principal-component models developed from a single year of data were biased. For models based on 2 yr of data, slightly different results were observed. For Model Systems 4, 5, 6, and 7, the y-intercepts were not different from 0. Model Systems 4 and 6 had slopes that were not different from 1 at the 5% level. The system with the slope closest to 1 and y-intercept closest to 0 was the system where data were collected at the first and second sampling dates (Model System 4).
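The bias check described here amounts to fitting a straight line between predicted and observed yields and asking whether the intercept differs from 0 and the slope differs from 1. A minimal standard-library sketch follows. It assumes observed yield is regressed on predicted yield (the text does not say which variable is on which axis), and it compares t statistics with a caller-supplied critical value; the default of 2.0 is only a rough stand-in for the 5% two-sided critical value at moderate sample sizes, not a value from the paper.

```python
from math import sqrt


def _t_stat(estimate, target, std_error):
    """t statistic for H0: estimate == target, guarding against a perfect fit."""
    if std_error == 0:
        return 0.0 if estimate == target else float("inf")
    return (estimate - target) / std_error


def bias_check(predicted, observed, t_crit=2.0):
    """Regress observed on predicted; test intercept = 0 and slope = 1."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(predicted, observed))
    s2 = residual_ss / (n - 2)  # residual variance
    se_slope = sqrt(s2 / sxx)
    se_intercept = sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))

    t_slope = _t_stat(slope, 1.0, se_slope)
    t_intercept = _t_stat(intercept, 0.0, se_intercept)
    return {
        "slope": slope,
        "intercept": intercept,
        "t_slope": t_slope,
        "t_intercept": t_intercept,
        "unbiased": abs(t_slope) < t_crit and abs(t_intercept) < t_crit,
    }
```

A model is flagged as unbiased only when neither statistic exceeds the critical value, which mirrors the "intercept not different from 0 and slope not different from 1" criterion used in the text.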


 
Table 5. The amount of yield variability explained by the model using principal components (PC) and relative reflectance (RR) during testing and validation. In the validation process, models based on data collected in Moody (1999), Brookings (2000), and Moody and Brookings were used to predict corn yields at Moody in 2001. The models were based on remote sensing data collected at three sampling dates (first sampling date, germination to the second leaf; second sampling date, six- to eight-leaf stage; and third sampling date, R2 to R4.5).

 
The y-intercepts of models (based on reflectance) relating predicted and observed values for all data sets were greater than 0, and the slopes of these models were less than 1. Increasing the amount of data considered by the reflectance model did not have a consistent impact on reducing bias. The models based on reflectance generally overestimated corn yields.

A comparison of the two approaches to develop models is shown in Fig. 2. This comparison shows that, within the range of the measured values, the equation relating predicted and measured yields was not different from the 1:1 line for the model based on multiple regression of principal components (Model System 4) but was different from the 1:1 line for the model based on multiple regression of reflectance. Differences between the two approaches were attributed to multicollinearity. Multicollinearity in the data set can result in important variables being dropped from the regression equation. Principal-component analysis avoids this problem by developing new independent variables that are combinations of the original variables.
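One common way to see how strongly two predictors overlap is the variance inflation factor (VIF), which for a pair of predictors reduces to 1/(1 - r²). The sketch below is an illustration only: the study does not report VIF values, and the reflectance numbers used in the example are made up.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def two_predictor_vif(x1, x2):
    """Variance inflation factor when a model has exactly two predictors."""
    r_squared = pearson_r(x1, x2) ** 2
    if r_squared >= 1.0:
        return float("inf")  # perfectly collinear predictors
    return 1.0 / (1.0 - r_squared)


# Made-up green and red reflectance values that rise and fall together.
green = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
red = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
print(two_predictor_vif(green, red))
```

A VIF near 1 indicates nearly independent predictors, while large values indicate that the regression cannot separate their effects. Principal-component scores are uncorrelated by construction, so each pair of scores has a VIF of 1.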



 
Fig. 2. A comparison between measured and estimated corn yields at Moody in 2001. The predictive equations were based on data collected at the two sampling dates (Model System 4, spring and summer) over 2 yr (1999 and 2000), tested on data collected in 2001, and developed from multiple regression using (a) principal components and (b) relative reflectance.

 

    SUMMARY
 
The ability to explain yield variability was influenced by sample collection date and the number of sampling dates used to develop the models. Models based on several sampling dates generally explained more yield variability than models that relied on a single sampling date. These results were attributed to soil water and organic matter being related to remote sensing data collected in the spring, while remote sensing data collected in August and September provided information about crop conditions. The approach used in this paper for developing the models is conceptually different from many other attempts. Qi et al. (1994), Bausch (1993), Cleaves (1989), and Plant et al. (2000) proposed techniques that minimize the contribution of soil in reflectance values. The approach proposed in this paper includes this information in the predictive models. A possible advantage of our approach is that remote sensing data from bare soil provide information about soil drainage, organic matter content, and soil texture. All of these factors influence soil water, which in turn impacts yield.

Results from this study show that principal-component analysis followed by multiple regression may be used to develop remote sensing predictive models. Differences between the two approaches to develop predictive equations (multiple regression of reflectance or variables developed by principal-component analysis of reflectance data) may be related to multicollinearity.


    ACKNOWLEDGMENTS
 
Support for this project came in part from NASA, South Dakota Soybean Research and Promotion Council, United Soybean Board, North Central Soybean Research Program, and USDA-CSREES. South Dakota Agricultural Experimental Station no. 3361.






