|
|
||||||||
a Dep. of Crop and Soil Science, 116 ASI Building, The Pennsylvania State Univ., University Park, PA, 16802
b Dep. of Soil Science, North Carolina State Univ., Raleigh, NC 27695-7619
c Dep. of Statistics, North Carolina State Univ., Raleigh, NC 27695-8203
d Dep. of Crop Science, North Carolina State Univ., Raleigh, NC 27695-7620
* Corresponding author (jeff_white{at}ncsu.edu)
Received for publication May 14, 2004.
| ABSTRACT |
|---|
|
|
|---|
Abbreviations: AIC, Akaike Information Criterion ANOVA, analysis of variance CE, correlated errors CIR, color infrared aerial photography ddf, denominator df EGLS, estimated generalized least squares FA, field average GIS, geographic information systems GPS, global positioning system iid, independent and identically distributed LRT, likelihood ratio test RCB, randomized complete block RCBCE, randomized complete block with correlated errors REML, restricted maximum likelihood RYE, realistic yield expectation SSNM, site-specific nitrogen management VRT, variable rate technologies
| INTRODUCTION |
|---|
|
|
|---|
Spatial correlation in the observations and in the residuals of the statistical model used to analyze the data is also common in precision agriculture research. This is due to several factors. It has been well documented that soil physical properties (e.g., Davidoff and Selim, 1988), moisture relations (e.g., Achouri and Gifford, 1984), and soil nutrient status (e.g., Webster and Nortcliff, 1984) vary spatially across fields. Thus, it is always possible that observations such as crop yield will also be spatially correlated and have heterogeneous variances (Scharf and Alley, 1993). Precision agriculture research is often performed on a large spatial scale resulting in more spatial variability within and among blocks and plots than is typical of traditional small-plot research.
In traditional small-plot research, treatments are applied and harvest is conducted on a plot by plot basis. This is a central element in the usual statistical definition of a "plot." In precision agriculture research, plots are large compared with harvest and application equipment, and treatment application and crop harvest are not applied on a plot by plot basis. Instead, applicators and harvesters move across the field in a serpentine fashion changing application rates as they pass from plot to plot, or assigning yield data to individual points depending on their location at any given time. Consequently, the equipment may make passes through several plots before completing harvest or application for any single plot. Thus, the definition of an experimental unit as the smallest unit to which a treatment is applied does not fit exactly.
The classical randomized complete block (RCB) design, which focuses on analysis of variance (ANOVA), is commonly used in agricultural and precision agricultural research. The RCB analysis is based on the assumption that the model errors within blocks are independent and identically distributed (iid) with the same variance. When spatial variability is present at a scale that blocking cannot explain, this independence assumption is likely violated. In addition, in the classical RCB design and its traditional analysis, only one, or possibly a few, observations are taken per plot, whereas the distinctive feature of precision agriculture experiments is that the field is densely subsampled within each plot. While an RCB analysis with subsampling can still be justified based solely on the randomization of treatments within blocks (Brownie et al., 1993), the efficiency of the classical ANOVA in making treatment comparisons and in estimating treatment means is reduced (Stroup et al., 1994). In precision agriculture research where many factors can lead to spatial correlation and a large number of observations are taken within each plot, the RCBiid analysis to test treatment effects may be made more powerful by incorporating spatial correlation.
Zimmerman and Harville (1991) proposed a random field approach that can account for spatially correlated errors (CE) among spatially georeferenced observations, and thus may provide a more efficient estimation of treatment comparisons. Brownie and Gumpertz (1997) investigated the validity and robustness of CE analysis and concluded that the method could achieve substantial increases in precision compared with classical RCBiid analysis when errors are spatially correlated while still maintaining type I error rates close to the desired significance level (usually 0.05). In CE analysis, instead of blocking being used to account for spatial variability, spatial correlation is modeled using one of several possible functions. For example, as implemented in SAS PROC MIXED Version 8 (SAS Institute, Cary, NC), just considering isotropic spatial covariance models, the researcher may choose among linear, spherical, exponential, Gaussian, and power functions, all with or without the inclusion of a nugget effect. Consequently, there are 10 possible models from which to choose. Making matters more complex, some of these candidate models may result in different treatment p values and different estimates of treatment means.
Between the traditional RCBiid method and the CE analysis lies a rich land of hybrid models. These models are considered randomized complete block with correlated errors (RCBCE) because they consist of both random block and/or block x treatment effects in addition to using a spatial covariance structure to model correlated errors. The option to include up to three random effects in a model (i.e., block, or block x treatment, or both) and a choice of 10 spatial covariance structures results in 30 candidate RCBCE models, each of which could result in different conclusions concerning treatment effects and separation of treatment means.
As described above, models with correlated errors may include block effects and plot effects, or not, and quite often the block and plot effects are omitted. The idea of omitting plot effects from a model for an RCB design with subsampling contradicts traditional practice that dictates that the model for RCB designs with subsampling include terms for random block and plot effects. To explain how omitting the block and plot effects may be justified, we describe some of the foundations of the traditional analysis of variance.
The traditional analysis of variance can be derived from two very different starting points: (i) a randomization approach, and (ii) a model-based approach. The randomization approach does not require that the data be normally distributed or uncorrelated, but it does assume that the treatments have been randomly assigned to the plots. The basic unit of data, the experimental unit, in this analysis is the plot mean, because plots have been randomly assigned to treatments; within plots there is no randomization. The distributions of the test statistics in the analysis of variance are based solely on this randomization, and the statistical analysis must include block and plot effects to be valid.
The analysis of variance can be derived by a completely different approach, which does not depend on the random assignment of treatments to experimental units. It makes assumptions about the covariance structure, namely that the block effects, plot effects, and within-block errors are drawn from separate normal distributions that are independent of each other, and that the within-plot errors are independent. These assumptions imply the RCBiid correlation structure, in which all plots within a block are equally correlated, all observations within a plot are equally correlated, and observations in different blocks are uncorrelated. Under the RCBiid correlation structure, the traditional analysis of variance with random block and plot effects provides correct tests of the treatment effects. The validity of the analysis depends on the correctness of these assumptions, not on the randomization. Note, however, that random assignment of treatments is still crucial for guarding against bias caused by accidentally aligning treatments with other confounding factors.
The CE analysis replaces the RCBiid correlation structure with a spatial correlation structure estimated from the data and depends on finding an appropriate correlation structure. It often happens that the correlation pattern that best fits the data does not include random block effects and/or random plot effects. It is possible to model either plot means or individual subsamples. In this paper we model the individual subsamples.
There are some hazards to using a model-based approach because the assumptions of normality and known covariance structure are much stronger than in the randomization approach, which makes no distributional assumptions. If the model is not at least approximately adequate, the analysis can give wildly invalid results. We list three of the hazards: (i) the spatial correlation function is difficult to estimate and the significance test for treatment effects can give very different results for different correlation functions; (ii) treatments are randomly assigned to plots but not to subplots within a plot, so their effects could conceivably be confounded with some other processes, which cannot be disentangled within a plot, leading to biased estimates of the treatment effects; and (iii) there are concerns regarding the validity of using the subsample as the analytical unit, a practice that is termed pseudoreplication in the classical ANOVA, and which often leads to much higher error df (i.e., denominator df [ddf]) than the classical approach for testing treatment effects. We emphasize, however, that we are not advocating that the RCB be abandoned when designing and laying out experiments. It is the classic structure of the RCB with its inherent replication and randomization, augmented by georeferenced subsampling, that provides the framework for our analytical approach.
In the traditional RCBiid analysis of variance, the ddf for the tests of treatment effects are based on the number of plots. In the CE approach the ddf are computed based on the estimated covariance structure rather than on the randomization plan of the experiment, and are often much larger than in the RCBiid approach. A saving grace for precision agriculture trials is that if the RCBiid ddf are larger than 20 or so, increasing them may not have much impact on the critical values for F tests. For example, for testing the equality of three treatments, the critical F value is 3.32 with ddf = 30; whereas, for infinitely large ddf, the critical F value is 3.00, which is not too much different. We study the impact of the ddf on tests of treatment effects in our experiments in more detail in the subsection titled "Model Impact on P Values."
When considering a correlated errors approach to data analysis, two critical questions emerge. The first one is simply "Is this necessary, or would the traditional RCBiid method be sufficient?" The second and most daunting question is, "Given the abundance of potential CE and RCBCE models, which model or models adequately explain the covariance structure and should be used for data analysis?" It is not uncommon for agronomists to simply assume a given model is adequate. For example, Bruckler et al. (1997), analyzing soil NO3 spatial variability, and Weisz et al. (2003), analyzing variable-rate lime and P, simply assumed that the spherical covariance model they used to analyze precision agriculture data was adequate. Marx and Stroup (1993) suggested that no-nugget spatial covariance structures could generally be assumed to be adequate for correlated error analysis of agricultural data. Littell et al. (1996) suggested a general approach to covariance model selection using information criteria and variography. However, it remains a somewhat daunting task to choose among candidate models and determine which analysis may be best for a given data set.
The advent of precision agriculture research brings into high relief the question of how to make the best use of spatially correlated subsamples. In this light, our primary objective is to present a systematic, step-by-step approach that agronomists can use to select a covariance model for RCB analysis when spatial correlation is present. Our secondary objective is to demonstrate, using real examples, how the tests of treatment effects may be affected by the inclusion or exclusion of block effects, nugget effects, and spatial autocorrelation in the model. Toward these objectives, we present analyses of two separate datasets with different within-plot subsampling intensities from two trials in a site-specific N management experiment: wheat forage in 2001, and corn for grain in 2002.
| COVARIANCE MODEL SELECTION PROCEDURE |
|---|
|
|
|---|
![]() | [1] |
is an n x 1 vector of errors, where n, p, and q are the number of responses (i.e., subsamples), fixed effect parameters, and random effects, respectively. Key assumptions for this model are that u and
are uncorrelated and their expectations are zero. If we set the variance of random effects var (u) =
u, and V1 = var (Zu) = Z
uZ' and the variance of errors V2 = var (
), the variance of y is var (y) = V1 + V2. Thus, any correlation of observations can be specified in V2 and/or V1. Building on this foundation, we chose to focus on three potential methods for analyzing RCB designs in the presence of spatial correlation. Each of these methods is described in detail below.
Randomized Complete Block with iid Errors (RCBiid method)
This is the classical RCB analysis with subsampling. It assumes iid within-block errors (
) for which V1 = Z
uZ' and V2 =
2 In, where In denotes the n x n identity matrix. Consequently, any spatial correlation of observations is reflected only in V1.
Randomized Complete Block with Correlated Errors (RCBCE method)
This is a nonclassical RCB analysis with spatially correlated errors in which V1 = Z
uZ'. When a no-nugget model is used to describe the spatial covariance, V2 =
2W, where W is the n x n spatial covariance matrix whose ijth element is defined as a function of the distance (hij) between site i and j. If a nugget model is used to describe the spatial covariance, V2 = In
2g +
2W. The nugget, In
2g, is generally due to significant variation among observations within extremely short or zero separation distance and/or measurement errors (Isaaks and Srivastava, 1989). We consider three isotropic spatial covariance functions: the spherical, Gaussian, and exponential. We chose these because of their applicability in describing spatial covariance commonly encountered in agriculture, because of the form of our exploratory semivariograms, and because they are available in SAS PROC MIXED Version 8. In this approach, any spatial correlation of observations is reflected in both V1 and V2. Clearly the RCBiid model is a reduced form of the RCBCE model, because if no spatial correlation is present, W will reduce to In.
For an example of an RCBCE model, consider an RCB design with three plots in a block and two observations in a plot for which a no-nugget exponential covariance function is chosen to model the spatial covariance. (NB: This analysis can be extended to include any number of observations per plot.) Then, Cov (
i,
j) =
2exp(3hij/
) (Isaaks and Srivastava, 1989), and
![]() |
![]() |
![]() |
, which denote the covariance of two observations within the same plot. The variance components
2p and
2b denote the variance components within blocks, that is, among plots, and among blocks, respectively. Matrix V2 is an n x n symmetric matrix in which the sill
2 is the variance of the random errors. The sill is also the limit of the semivariance as distance increases; it appears at inter-observation distances large enough that observations are no longer correlated. The parameter
is the practical range defined as the value of h at which the semivariogram without nugget reaches 95% of the sill, since the correlation of two observations fit to exponential or Gaussian functions approaches zero only when h goes to infinity. If a spherical function is fit,
is the actual range (Isaaks and Srivastava, 1989). In our example where a no-nugget exponential function is used to model the spatial covariance, SAS PROC MIXED reports the value of the sill in the covariance parameter estimates table, and labels this parameter as the "residual." If a function with a nugget had been specified, SAS PROC MIXED would have labeled the nugget (i.e., microscale variation) value as the "residual," and added a new parameter to the parameter estimates table labeled "variance." In this case, the sill consists of the sum of the residual (i.e., nugget) and the component labeled variance.
Also in our example, we caution that SAS PROC MIXED defines Cov (
i,
j) =
2exp(hij/a) and labels the parameter a as "SP (EXP)." Here the parameter a does not equal the practical range
as stated previously but is one third of
. Thus, the practical range
is computed by multiplying the parameter a by 3 (SAS Institute, 1999a). If a Gaussian function is specified, Isaaks and Srivastava (1989) define Cov
=
2exp
with the parameter
as the practical range. However, PROC MIXED defines Cov
=
2exp
and labels the parameter a as "SP (GAU)," thus the practical range
is computed by multiplying the parameter a by
3 (SAS Institute, 1999a). If a spherical function is specified, SAS PROC MIXED labels the parameter a as "SP (SPH)," where this parameter is the actual range. For a more complete discussion of these functions see SAS Institute (1999a). There are many spatial covariance functions that can be specified in the RCBCE method.
Correlated Errors Analysis (CE method)
The CE method assumes that V1 = 0, which means the random block and block x treatment effects are not considered and are removed from the model in Eq. [1]. Any spatial correlation present is defined only in V2. Clearly the CE model is a reduced form of the RCBCE model. If the random effects of the RCBCE models are determined ineffective, that is, the elements of the matrix
u are all zeros, RCBCE models become CE models. Additionally, all the considerations discussed above for the RCBCE method about specifying a function to model the spatial covariance also apply to CE implementation.
Overview of Model Selection
We believe that a useful guideline for choosing between candidate covariance models is that the selected model should adequately describe the covariance structure but with the fewest covariance parameters. Toward this end, step-by-step procedures describing model selection for a given data set are outlined in Fig. 1
. The first step is to determine candidate RCBCE and CE models. This is done by finding the functions that best model the spatial correlation in the RCBCE and CE methods. Once this is complete, the field of all possible candidate covariance models has been narrowed down to only three: the RCBiid model, one candidate RCBCE, and one candidate CE model. In keeping with our guideline, we propose that analysis should begin with the simplest model (i.e., the RCBiid method) and a test of the hypothesis that the errors are not spatially correlated (Hypothesis 2 in Fig. 1). This is done by comparing the RCBiid and candidate RCBCE model. If this hypothesis is not rejected, the RCBiid method is selected and the process is complete. Conversely, if the hypothesis is rejected, then RCBCE or CE analysis is warranted and the third step is to compare the candidate CE and RCBCE models (Hypothesis 3 in Fig. 1). Also in keeping with our guideline, if blocking in the RCBCE method is not significant, then the CE approach with fewer covariance parameters is selected. If blocking is significant, then the candidate RCBCE model is chosen for data analysis.
|
(h) and of the RCBiid model-fit residuals
e(h)RCB were also computed and compared to determine qualitatively whether blocking accounted for spatial correlation among observations. All semivariograms were computed in PROC VARIOGRAM of SAS Version 8 (SAS Institute, Cary, NC).
The second tool is the Akaike Information Criterion (AIC; Akaike, 1974) that is used to compare the relative goodness-of-fit among non-nested candidate models. The AIC values are computed in SAS PROC MIXED Version 8 as:
![]() | [2] |
The third statistical tool is the likelihood ratio
2 test (LRT). Ideally, we would like to test all hypotheses concerning the relative goodness of fit of potential models with the LRT using REML. This test, however, is only applicable when comparing two nested covariance models with the same X matrix (fixed effects). Consequently, when testing a hypothesis that involves comparison of non-nested covariance models (still with the same X matrix), AIC is used (as described above). When models are nested, hypothesis testing may be possible with the LRT statistic (
) which approaches a
2 distribution, and which is defined as:
![]() | [4] |
In practice, there is another condition that must be met for the LRT to apply. The hypothesized value of the parameter being tested must not be on the boundary of the parameter space. For example, by definition a variance component must be nonnegative, therefore the value zero is on the boundary of the parameter space. In this case, the asymptotic distribution of
does not follow a
2 distribution. When the models being compared differ by such parameters, e.g., variance components or spatial ranges, the distribution of the test statistic
consists of mixtures of
2 distributions (Verbeke and Molenberghs, 2000). If the models being compared differ by only one such parameter, the appropriate p value for the LRT is one-half the reported p value from the
2 distribution with one degree of freedom (consistent with dropping one parameter from the full model), hereafter referred to as LRT01 (Littell et al., 1996). If the models being compared differ by more than one such parameter, then the actual distribution of
is unknown and we use AIC values (as described above) to determine which model has the better goodness of fit.
The use of AIC to determine which model has a better fit is the same as an LRT but uses a critical value of twice the difference in the number of random effect parameters between the two models under consideration (see Eq. [2]). Hence, if the two models have the same number of parameters, the AIC just selects the model with higher likelihood, which may not be a very reliable criterion. If the two models differ by one random effect parameter, the AIC uses a "critical value" of 2, which is less conservative than an LRT, which has a "critical value" of 3.84 (the 95th percentile of
12), or LRT01, which has a "critical value" of 2.71 (the 95th percentiles of LRT01) (Morgan and Gumpertz, 1996). Consequently, while the use of LRT or LRT01 can be seen as adding complexity to model selection compared with simply using AIC values, it also adds a degree of robustness and confidence in the selection process that are absent when using AIC values alone.
Spatial Covariance Structure Selection for the RCBCE and CE Methods
The first step in Fig. 1 is to use a semivariogram of the fixed-effect model-fit residuals to choose candidate nugget and/or no-nugget spatial covariance structures to use with the RCBCE and CE methods. In practice, the estimated sill, nugget, and range from the semivariogram will be good starting parameter values for the mixed model procedure, which will then iteratively search for its own best-fit values. Providing starting parameter values for each spatial covariance structure is important for two reasons. The first is that the mixed models analysis is very sensitive to starting parameters values and good starting values increase the likelihood that the algorithm will converge. The second reason is that SAS PROC MIXED can require a large amount of computer time with the large data sets typical of precision agriculture and yield monitoring (n > 2000). Good starting parameter values can reduce this run time considerably.
Once candidate spatial covariance structures and their associated starting parameter values are chosen (based on the semivariogram of the fixed-effect model-fit residuals), the resultant RCBCE models are run. If a tested model initially results in nonconvergence, infinite likelihood, or a nonpositive definite Hessian matrix (a multiple of the inverse of the estimated covariance matrix of the covariance parameter estimates), we recommend a "trial-and-error" approach for changing the starting parameter values. If subsequent attempts also result in one of these failures, the model is excluded (SAS Institute, 1999b).
Spatial covariance structures may or may not include a nugget. Consequently, the second step in our demonstration is to compare all candidate RCBCE models based on no-nugget functions. Because the various spatial covariance functions are not nested, we select the best no-nugget model using AIC values as described above (Fig. 1). Since the three spatial covariance functions under consideration have the same number of random effect parameters, this amounts to simply picking the covariance function with the highest likelihood value. This is then repeated for RCBCE models based on spatial covariance structures that include a nugget. Best CE models based on no-nugget and nugget spatial covariance structures are then determined in the same manner (Fig. 1).
Hypothesis Testing
Three hypotheses are tested to select a covariance model (Fig. 1). These hypotheses and the statistical tools used to test them are described below.
Hypothesis 1
The nugget effect
2g = 0, which means the no-nugget spatial covariance function is more appropriate than one with a nugget for the RCBCE or CE method. After selecting the best RCBCE nugget and no-nugget models (Fig. 1), the two are compared to determine which results in the better fit. If the two models are based on different spatial covariance structures, they are not nested and AIC values are used to determine the one with the better fit. If they are based on the same spatial covariance structure, then they are nested. However, because the hypothesis involves testing whether a variance component (the nugget) equals zero, the LRT01 is used and the p value of the test is half of the reported p value from the
2 distribution with one degree of freedom. The same procedure is used to determine whether the selected no-nugget or nugget CE model has the better fit.
Hypothesis 2
The errors are not spatially correlated; that is, the RCBiid method is more appropriate than the RCBCE method. When the best RCBCE model does not contain a nugget, the difference between the two models is only one parameter, the range. The LRT01 statistic can be used to test Hypothesis 2 using half of the reported p value from the
2 distribution with one degree of freedom.
When the best RCBCE model contains a nugget, the RCBiid model is still nested within it, but the models differ by two parameters: the range and the nugget. In this case the distribution of
is unknown, so determination of the model with the better fit is based on AIC values.
Hypothesis 3
The RCBCE block and plot effects are
2p =
2b = 0; that is, V1 = 0 and V2 =
2W, which means that blocking is not significant and the CE method is more appropriate than the RCBCE method. There is one case where Hypothesis 3 can be tested with the LRT01. If the best RCBCE and CE models are both based on the same spatial covariance structure they are nested and the first requirement for using the LRT01 is met. If the RCBCE model has dropped either the block or plot effect (i.e., the estimate for either of these parameters is zero), and this model only differs from the best CE model by the one remaining block or plot effect (a variance component), then the LRT01 can be implemented using half of the reported p value from the
2 distribution with one degree of freedom. In all other cases, we use AIC values to determine which model has the better fit.
Estimation of Treatment Significance and Mean
Once the covariance model is selected, it is used to estimate treatment means and conduct treatment comparisons using estimated generalized least squares (EGLS). The covariance parameters of the model are estimated using REML (Zimmerman and Harville, 1991).
| APPLYING THE PROCEDURE: A TWO-YEAR SITE-SPECIFIC NITROGEN MANAGEMENT STUDY |
|---|
|
|
|---|
|
All fertilizer N was applied in a serpentine fashion by an applicator tacking back and forth across the field in adjacent swaths, switching the amount of fertilizer applied whenever a plot or miniplot boundary was crossed. Thus, the treatments were not applied to a plot all at one time, but partially and sequentially to adjacent plots.
Nitrogen rates and timing for the RYE, FA, and SSNM treatments for wheat forage in 2001, and for corn for grain in 2002 are summarized in Table 1. For wheat forage, the total RYE N fertilizer rate was 44 and 34% greater than for FA and SSNM, respectively. For corn, the total FA N fertilizer rate was 95 and 51% greater than for RYE and SSNM, respectively. Nitrogen fertilizer was not applied to the 2001 soybean crop.
|
Model Selection Steps Common to Both Crops
To investigate spatial correlation within the 2001 wheat forage and the 2002 corn grain yields, the observations were fit to two models: (i) with only fixed treatment effects (i.e., N management) to compute the fixed-effect model-fit residuals, and (ii) with both fixed treatment effects and random block and block x treatment effects to compute the RCBiid model-fit residuals. The normality of the observations and these residuals was assessed by diagnostic plots (e.g., histograms) coupled with the Shapiro-Wilk test in SAS Version 8 PROC CAPABILITY (SAS Institute, Cary, NC). There was no strong evidence of nonnormality for the original observations nor for the residuals for either crop, so the data were analyzed in the original scale. An isotropic and stationary spatial process was assumed, and all tests were conducted in SAS Version 8 and determined at the 0.05 significance level.
Scaled plots of the empirical semivariograms of the fixed-effect model-fit residuals
e(h), the observations
(h), and the RCBiid model-fit residuals
e(h)RCB for the 2001 wheat forage crop, and for the 2002 corn yield are illustrated in Fig. 3
. For the wheat 2001 data, plots of
e(h) were used to estimate starting parameters for candidate RCBCE and CE models. It was assumed (based on Fig. 3) that the spatial covariance of the errors would be best modeled with either a Gaussian, exponential, or spherical function either with or without a nugget effect. Consequently, in addition to the RCBiid model, six candidate RCBCE and six CE models were evaluated in SAS PROC MIXED. The ddf of F tests for treatment effects were computed using the Kenward-Roger method for all models (Kenward and Roger, 1997). All candidate models and their related statistics and covariance parameter estimates for wheat forage are summarized in Table 2. This process was then repeated for the corn 2002 data set.
|
|
e(h)RCB was nearly a pure nugget effect, that is, a relatively constant semivariance over increasing lag distances. This suggested that blocking was effective in accounting for the spatial correlation among the observations, and that blocking was likely to be an important component of the selected covariance model. Following the flow chart (Fig. 1), the covariance model for wheat forage was selected as follows:
for the comparison of these models was computed as 8479 8468 = 11. Here
has a mixture of half
20 and half
21 distributions with one degree of freedom. The p value of the
21 with a value of 11 is 0.0009; therefore, the p value of LRT01 test is 0.0009/2
0.0005. Thus, the LRT01 was significant and the RCBCE nugget Gaussian model was selected as the covariance model to be used for wheat forage yield.
Corn 2002 Model Selection
As found in the previous year's wheat forage data, spatial correlation with an apparent nugget effect was evident in the plots of
(h) and
e(h) (Fig. 3). Furthermore, these plots were nearly identical, which was consistent with the fact that the RCBiid model indicated that treatment effects were not significant. Unlike the scaled semivariograms for 2001 wheat forage, the plot of 2002 corn
e(h)RBC continued to show spatial correlation. This suggested that blocking may not have been adequate to account for the spatial variability in these data. Additionally, all three plots indicated that functions with nugget effects might be necessary to model the spatial correlation.
Following Fig. 1, the covariance model for 2002 corn yield was performed as follows:
is unknown. The RCBCE nugget spherical model with a smaller AIC value was chosen over the RCBiid method.
for the comparison of these models was 0. Clearly, the LRT01 was not significant and the CE nugget spherical model was selected for 2002 corn yield. | INTERPRETATION AND DISCUSSION |
|---|
|
|
|---|
Semivariograms, Spatial Ranges, and Selected Covariance Models
The selection of covariance models for wheat and corn was supported by the semivariograms of the observations, the fixed-effect model-fit residuals, and the RCBiid model-fit residuals (Fig. 3). For wheat, the fixed-effect-model-fit residual semivariances were slightly lower than those of the observations, indicating that there was a small degree of variation accounted for by the fixed effects and that the data were spatially correlated. Additionally, the RCBiid model-fit residual semivariogram appeared to be essentially a pure nugget effect, indicating that blocking accounted for much of the spatial correlation (at distances greater than the minimum lag). Each of these facts is consistent with the selection of the RCBCE nugget Gaussian model (Table 2), which resulted in a significant treatment effect, included a nugget effect in the model of the spatial correlation, and incorporated blocking effects. The shape of the wheat semivariograms at the shortest lag distances (Fig. 3) was also consistent with the selection of a Gaussian spatial covariance structure.
If the wheat semivariograms had been the only criteria used to select the covariance model for the wheat analysis, one might have selected the RCBiid model based on the nearly pure nugget effect of the RCBiid model-fit residuals. The fact that the model selection procedure (Fig. 1) found the RCBCE nugget Gaussian model to be better is instructive. The covariance structure of the errors is not identical to that visualized by the semivariogram analysis of the residuals (Brownie and Gumpertz, 1997). This is the primary reason we emphasize using the semivariograms for estimating starting parameters, and not for selecting a final covariance model.
For wheat forage, the estimated range of spatial dependence of the errors from the selected model (RCBCE nugget Gaussian, Table 2) was 99 m, which was less than the maximum distance within a block (
172 m) but greater than the plot dimensions (
61 m). This indicated that spatially correlated errors were present within and among blocks, probably due to the omission of one or more unknown covariates from the model. In this case, the random block effect would not be expected to be fully effective in explaining the spatial structure. However, the random block x treatment effect was significant. Consequently, for wheat forage, both blocking and spatial covariance modeling of the errors were important.
For the corn data set, the fixed-effect-model-fit residual semivariogram was nearly identical to that of the observations, indicating that the treatment effects accounted for very little of the spatial variability. In contrast to wheat, the corn RCBiid model-fit-residual semivariogram indicated substantial spatial correlation despite the block effect being in the model. This semivariogram had about half the sill-to-nugget semivariance and half the range of the fixed-effect-model-fit residual semivariogram. This indicated that although blocking accounted for a substantial proportion of variability at longer lag distances, it was ineffective at accounting for spatial correlation within the scale of a plot. Additionally, all three corn semivariograms suggested the need for a spatial covariance structure that included a nugget effect. These facts are consistent with the model selection procedure's choice of the CE nugget spherical model for the corn analysis (Table 2), and indicated that the spatial correlation was better accounted for by ignoring the blocking and including an estimation of the spatial covariance structure of the errors in the model.
The estimated range of spatial dependence of the errors from the selected corn model was 52 m, slightly less than the plot size, indicating that spatially correlated errors were present within the scale of adjacent plots. In such a case, neither a block nor block x treatment effect could be expected to be effective in accounting for spatial heterogeneity. In this case, a spatial covariance function alone was better in accounting for the spatial correlation.
These two examples show that variography is an important tool in selecting covariance models. The semivariograms are useful in determining starting parameters. They give an indication of when spatial covariance structures may require inclusion of nugget effects, and they hint at the appropriate form the spatial covariance structure may take (i.e., spherical, exponential, etc.). However, consistent with the findings of Brownie and Gumpertz (1997), the exact model that will be selected cannot be fully predicted from the semivariograms alone.
Model Impact on P Values
In both the wheat forage and corn-for-grain trials, the errors were spatially correlated, the estimated range was less than the block size, and the RCBiid method tended to give higher estimates of the standard errors of treatment comparisons than the other models (Table 2). Consequently, for both data sets, the RCBiid and the selected covariance model resulted in different conclusions regarding treatment significance. The selected covariance model for wheat resulted in a smaller significance level for treatment comparison, p = 0.03 compared with p = 0.07 for the RCBiid model. This would have affected conclusions about treatment effects at the 0.05 level of significance. The selected covariance model for corn indicated significant treatment effects (p = 0.05), which were completely obscured by the RCBiid model (p = 0.43). This is because the RCBiid method assumes the spatial correlation persists for a longer distance (the width of a block,
172 m) than the selected CE model with an estimated range of 52 m (Table 2). Clearly, when the model errors are spatially correlated, as they were in both of these data sets, the use of the selection procedure may result in a more powerful statistical test than the classical RCBiid method.
The data in Table 2 indicate that the treatment p value associated with any given candidate covariance model is very sensitive to the form of that model. For the wheat data, all of the models that included block effects gave p values for the treatment effect in the range of 0.03 to 0.07. If block effects were omitted from the wheat models, the p values dropped to <0.0001. Clearly, arbitrary selection of a covariance model can result in dramatically different treatment conclusions. In the wheat example presented here, both the selection procedure based on AIC and LRT comparisons (Fig. 1) and the initial variography (Fig. 3) support the selection of the RCBCE nugget Gaussian model for data analysis. The LRT decisively selects the RCBCE nugget Gaussian model over the CE nugget Gaussian model (p value = 0.0009 against dropping the block effects).
In the corn data, the inclusion of block effects in the models also affected the p values of the F test for treatment effects, but the most dramatic impact was associated with the inclusion or omission of a nugget effect in the spatial covariance structures used in the RCBCE and CE methods (Table 2). The p values for models with block effects and no nugget ranged from 0.27 to 0.43. If a nugget was included in the RCBCE candidate models, the p values ranged from 0.06 to 0.15. If no nugget and no block effects were include, the p value dropped dramatically to 0.0003 for the CE no-nugget Gaussian model, which was the only one of this type that converged. The CE models that included a nugget effect had p values that ranged from 0.04 to 0.13. Consistent with the wheat trial, the corn data demonstrate that it is possible for results of tests of treatment effects to differ greatly under different spatial covariance models and that covariance models should not be selected arbitrarily. In this case, the AIC values made it obvious that models without nugget effects were not adequate, and variography supported this conclusion. The nugget spherical model provided the best fit, either with or without block effects (AIC = 41341 and AIC = 41339, respectively). The LRT for block effects showed that the block effects were not significant. Including nonsignificant blocks in the model should not have much effect one way or the other; and it did notthe p values for testing treatment effects were similar whether blocks were included or not (p = 0.06 with block effects and 0.05 without, Table 2).
The ddf of F tests for treatment effects were computed using the Kenward-Roger method provided in SAS PROC MIXED (Kenward and Roger, 1997). In the classical analysis of variance, the error df measure the number of independent pieces of information used to estimate the error variance, the idea being that observations that are correlated with each other are somewhat redundant, the more highly correlated they are, the more redundant is the information. In the correlated errors analysis the error variance is computed a different way. The Kenward-Roger method computes ddf based on the estimated covariance structure that summarizes the amount of nonredundant information available to estimate the error variance. When block and plot effects are present and have an impact on the covariance structure, the Kenward-Roger ddf for the test of treatment effects will be similar to the traditional ddf based on the number of plots. If the block and plot effects are small or nonexistent, the ddf computed by the Kenward-Roger method will be much larger. In the two trials presented here, the ddf varied considerably among candidate models (Table 2). For example, the ddf ranged from 27 to 347 and from 18 to 1033 for the wheat and corn candidate models, respectively. On first consideration, it might seem that this alone should result in the decrease in treatment p values associated with some of the CE candidate models. However, in our experience, this is not the case even in these types of experiments that involve dense subsampling within plots.
To determine how much the ddf affected the p values, we compared the F statistic for treatment effects in each of our selected models to three different critical values. The three critical values correspond to ddf computed three different ways: (i) using the "RESIDUAL" option in SAS Proc Mixed, (ii) using the "KENWARDROGER" option, and (iii) using the RCBiid method. Note that we did not change the covariance models or the F statistics; just the ddf for finding the critical values were changed. Table 3 shows the results. For the wheat data, ddf = 19, 27, or 527 computed by the Kenward-Roger method, computed as if the model were RCBiid, and using the "Residual" method, respectively. The associated p values ranged only from 0.03 to 0.01. For the corn data, when ddf ranged from 18 to 2539, the change in p values was also small, ranging only from 0.07 to 0.05. Clearly, the treatment p values of the selected covariance models were relatively insensitive to changes in ddf over the range exhibited by the candidate models. Differences in p values among candidate models cannot be accounted for simply by the changes in ddf among the RCBiid, RCBCE, and CE methods.
|
|
Other Strategies for Model Selection
In our two examples, if only AIC had been used for model selection, the same models would have been selected as we found using the LRT01 together with AIC. However, our experience with other data has shown that this is not always the case. Overall, we prefer using the LRT in all hypothesis tests if the distribution of a LRT is known. For cases where the distribution of a LRT is unknown, AIC gives a method of comparing models. As with the LRT, there are uncertainties about the behavior of the AIC statistic when parameters are on the boundary of the parameter space.
Other model selection sequences for choosing the best-fit RCBCE or CE model are possible. Within the RCBCE or CE method, we can first compare a nugget vs. no-nugget model based on the same spatial covariance structure using LRT01, and then compare the selected models across different spatial covariance structures using AIC. Additionally, we can also first compare the best-fit RCBCE and CE models to determine the blocking significance, and then compare the selected model to the RCBiid model to determine if spatial correlation is significant. We examined these procedures for our two examples, and they resulted in the selection of the same best-fit models as our original model selection sequence.
| CONCLUSIONS |
|---|
|
|
|---|
Further Considerations
Covariance model selection based on one dataset is difficult, and no single tool will necessarily select an appropriate model. For example, the candidate CE nugget exponential wheat model had AIC values that were nearly identical to the other wheat candidate CE nugget models (Table 2), but the estimated range and sill were nearly an order of magnitude larger. These estimates were not only inconsistent with the estimates from the other candidate models, but clearly did not fit the wheat data presented in Fig. 3. Our procedure rejected this candidate model, but it is still of concern that the AIC for this model gave little indication that its parameter estimates were apparently incorrect.
A further concern raised by the results presented here is the extreme sensitivity of the treatment p values to covariance model selection. Clearly, care must be taken in selecting a model for data analysis, and untested assumptions about the adequacy of a given spatial covariance structure, the presence or absence of nugget effects, or the need to include or omit block effects can lead to erroneous conclusions. We believe that a selection procedure like the one outlined in Fig. 1 combined with variography are important if adequate models are to be used for data analysis. However, given the concerns about using AIC and LRT comparisons, all conclusions about covariance model selection and the resultant treatment effect p values need to be carefully evaluated. For example, researchers need to ask if the estimated parameters are realistic, and if the conclusions regarding treatment effects are agronomically meaningful. Simulation studies would be useful in evaluating the probability of selecting an adequate covariance structure and the effect of the covariance structure on the estimation of treatment significance with or without the inclusion of block and block by treatment (plot) effects.
| APPENDIX |
|---|
|
|
|---|