Published Date
, Volume 73, Issue 3, pp 625–633
Title
Propagating uncertainty through individual tree volume model predictions to large-area volume estimates
- Author
- Ronald E. McRoberts,
- James A. Westfall
- Introduction
- 2. Materials and Methods
2.1 Study area
The study area was Minnesota Survey Unit 1 of the Forest Inventory and Analysis (FIA) program of the Northern Research Station, US Forest Service (Fig. 1). The study area includes approximately 33,353 km² (12,877 mi²) and consists of forest land dominated by aspen-birch and spruce-fir associations, agricultural land, wetlands, and water.
2.2 Data
The FIA program conducts the National Forest Inventory (NFI) of the United States of America (USA) and has established field plot centers in permanent locations using a quasi-systematic sampling design that is regarded as producing an equal probability sample (McRoberts et al. 2010). Field crews observe species and measure diameter at breast height (dbh) (1.37 m, 4.5 ft) and height (ht) for all trees with dbh of at least 12.7 cm (5 in.). Volumes (V) for individual trees are predicted using statistical models, aggregated at plot level, expressed as volume per unit area, and typically considered to be observations without error. For this study, data were used for 2178 FIA plots on forest land with 50,176 trees representing 38 species. McRoberts and Westfall (2014, Table 1) describe these data in greater detail. For future reference, these data are characterized as the estimation dataset.
Table 1
Effects on large-area volume estimates of model prediction uncertainty due to uncertainty from underlying sources

| Source of uncertainty | SRS Mean | SRS SE | STR (2 strata) Mean | STR (2 strata) SE | STR (4 strata) Mean | STR (4 strata) SE | STR (8 strata) Mean | STR (8 strata) SE |
|---|---|---|---|---|---|---|---|---|
| None | 90.17 | 1.69 | 90.17 | 1.11 | 90.17 | 0.76 | 90.17 | 0.54 |
| Diameter measurement error | 90.17 | 1.69 | 90.17 | 1.11 | 90.17 | 0.76 | 90.17 | 0.54 |
| Height measurement error (a) | 90.17 | 1.69 | 90.17 | 1.11 | 90.17 | 0.77 | 90.17 | 0.54 |
| Height measurement error (b) | 90.17 | 1.69 | 90.17 | 1.11 | 90.17 | 0.77 | 90.17 | 0.55 |
| Diameter and height measurement errors (a) | 90.17 | 1.69 | 90.17 | 1.11 | 90.17 | 0.77 | 90.17 | 0.54 |
| Diameter and height measurement errors (b) | 90.17 | 1.69 | 90.17 | 1.11 | 90.17 | 0.77 | 90.17 | 0.55 |
| Parameter uncertainty and residual variance | 90.17 | 1.71 | 90.17 | 1.14 | 90.17 | 0.82 | 90.17 | 0.61 |
| All (a) | 90.17 | 1.71 | 90.17 | 1.14 | 90.17 | 0.82 | 90.20 | 0.61 |
| All (b) | 90.17 | 1.71 | 90.17 | 1.15 | 90.17 | 0.83 | 90.17 | 0.62 |

SRS: simple random sampling estimators. STR: stratified estimators (number of strata).
(a) Within-plot height measurement error correlation: ρ = 0.00
(b) Within-plot height measurement error correlation: ρ = 0.25

The data used to calibrate the allometric volume model were originally acquired for a taper model study (Westfall and Scott 2010) encompassing 24 northeastern states of the USA (Fig. 1). For the current study, the geographic source of the calibration data was restricted to the States of Michigan, Minnesota, and Wisconsin, which span the ecological province that includes the study area. For the 2398 individual trees in the dataset, diameter measurements were obtained using a Barr and Stroud dendrometer at heights of 0.3, 0.6, 0.9, 1.4, and 1.8 m and at approximately 2.5-cm diameter taper intervals up to total tree height. Volumes of sections between height measurements were calculated using Smalian’s formula (Avery and Burkhart 2002, p. 101) as the product of mean cross-sectional area and section length, and total stem volumes for individual trees were calculated by adding volumes for all sections. McRoberts and Westfall (2014) describe the sampling procedure and protocols for individual tree measurements in detail. For future reference, these data are characterized as the calibration dataset.
2.3 Volume models
An allometric model of the relationship between V as the response variable and dbh and ht as the predictor variables was formulated as
$$V_i = \beta_0 \times \mathrm{dbh}_i^{\beta_1} \times \mathrm{ht}_i^{\beta_2} + \varepsilon_i, \tag{1}$$
where $i$ indexes trees and $\varepsilon_i$ is a residual. Before model fitting, natural logarithmic (ln) transformations of the response and predictor variables were calculated, and the model was reformulated as
$$\ln(V_i) = \alpha_0 + \alpha_1 \times \ln(\mathrm{dbh}_i) + \alpha_2 \times \ln(\mathrm{ht}_i) + \varepsilon_i. \tag{2}$$
Predictions on the original scale were calculated as
$$\hat{V}_i = \exp\left[\hat{\alpha}_0 + \hat{\alpha}_1 \times \ln(\mathrm{dbh}_i) + \hat{\alpha}_2 \times \ln(\mathrm{ht}_i) + \frac{\hat{\sigma}_\varepsilon^2}{2}\right], \tag{3}$$
where $\hat{\sigma}_\varepsilon$ is the residual standard deviation on the ln-ln scale, and the term $\hat{\sigma}_\varepsilon^2/2$ compensates for bias that accrues when transforming from the ln-ln scale back to the original scale (Baskerville 1972).

The quality of model fit on both the ln-ln and original scales was assessed in terms of the proportion of the variability explained by the model. On the ln-ln scale, the proportion was calculated and denoted R². The predictions were also transformed from the ln-ln scale back to the original scale using Eq. (3), and the proportion of variability explained by the model was denoted pseudo-R² because the assumptions underlying R² are not completely satisfied when using non-linear models (Anderson-Sprecher 1994).
2.4 Estimators
The simplest approach for estimating large-area parameters is to use the familiar simple random sampling (SRS) estimators,
$$\hat{\mu}_{SRS} = \frac{1}{n}\sum_{j=1}^{n} y_j \tag{4a}$$
and
$$\widehat{Var}(\hat{\mu}_{SRS}) = \frac{\sum_{j=1}^{n}\left(y_j - \hat{\mu}_{SRS}\right)^2}{n(n-1)}, \tag{4b}$$
where $y_j$ is the volume per unit area for the $j$th plot and $n$ is the number of plots. Although $\widehat{Var}(\hat{\mu}_{SRS})$ from Eq. (4b) may be biased when used with systematic sampling, it is usually conservative in the sense that it overestimates the variance (Särndal et al. 1992, p. 83). For this study, the finite population correction factor was ignored because of the small sampling intensity of approximately one 670 m² plot per 1200 ha of study area.

Because the uncertainty in volume model predictions is independent of the particular large-area estimators, the relative effects of model prediction uncertainty will be greater for estimators that reduce the effects of population variability than for the SRS estimators. Multiple regional FIA programs use post-stratified estimators to reduce variances of estimates with strata based on satellite spectral data. McRoberts et al. (2012) showed that stratifications derived from lidar-based maps of growing stock volume reduced the variance of mean volume per unit area by factors as great as 3.5 relative to the SRS estimators. Therefore, for this study, the effects of volume model uncertainty on large-area estimates of mean volume per unit area were also assessed using post-stratified estimation.

Post-stratified (STR) estimates of means and variances are calculated using estimators provided by Cochran (1977, pp. 134–135),
$$\hat{\mu}_{STR} = \sum_{h=1}^{H} w_h \hat{\mu}_h \tag{5a}$$
and
$$\widehat{Var}(\hat{\mu}_{STR}) = \sum_{h=1}^{H}\left[w_h \cdot \frac{\hat{\sigma}_h^2}{n} + (1 - w_h)\frac{\hat{\sigma}_h^2}{n^2}\right], \tag{5b}$$
where $h = 1, \ldots, H$ indexes strata, $w_h$ is the weight for the $h$th stratum, $n_h$ is the number of plots in the $h$th stratum, $y_{hj}$ is the observation for the $j$th plot in the $h$th stratum, and
$$\hat{\mu}_h = \frac{1}{n_h}\sum_{j=1}^{n_h} y_{hj}, \qquad \hat{\sigma}_h^2 = \frac{1}{n_h - 1}\sum_{j=1}^{n_h}\left(y_{hj} - \hat{\mu}_h\right)^2$$
are the within-stratum sample estimates of the mean and variance, respectively.

Because lidar data were not available for the entire study area, stratifications were simulated by ordering the predicted plot volumes from smallest to largest and dividing the range into intervals with approximately the same number of plots per interval. These intervals simulated strata for which strata weights were estimated as the proportions of plots in the strata. The consequences on estimates of using these simulated strata rather than strata based on an actual lidar-based volume map were twofold. First, when using the same data, the SRS and STR estimates of the study area mean will be exactly the same, regardless of the number of strata used. Operationally, however, differences between stratum weights and proportions of plots per stratum cause SRS and STR estimates of means to differ. Second, the simulated approach assumes that each plot is assigned to the correct stratum, whereas operationally map prediction and geo-location errors cause some plots to be assigned to incorrect strata. These incorrect assignments do not induce bias into the estimators, but they increase the within-stratum variances and the overall variance. The overall result is that the relative effects of volume model prediction uncertainty for this study will be slightly overestimated for the simulated stratifications.

Cochran (1977, pp. 132–134) suggests that more than six strata are usually not useful; McRoberts et al. (2012) reported that little was gained when using more than six lidar-based strata; and the FIA program uses five spectral-based strata in the study area. For this study, stratifications based on 2, 4, and 8 strata were considered as representative of the range of possibilities.
2.5 Simulating uncertainty
The study focused on the effects on estimates of large-area mean volume per unit area of model prediction uncertainty arising from four sources: dbh measurement error, height measurement error, parameter uncertainty, and model residual variance. The tolerance for dbh measurement errors specified by the FIA protocols is that 95 % of measurements are to be within 0.5 % of the true dbh (U.S. Forest Service 2012). Assuming that dbh measurement errors follow a Gaussian distribution with mean 0, the standard deviation of the distribution is $\sigma_{dbh\varepsilon} = \frac{0.005 \times \mathrm{dbh}}{1.96} \approx 0.00255 \times \mathrm{dbh}$. The tolerance for height measurement errors specified by the FIA protocols is that 90 % of measurements are to be within 10 % of the true height (U.S. Forest Service 2012). Assuming that the height measurement errors follow a Gaussian distribution with mean 0, the standard deviation of the distribution is $\sigma_{ht\varepsilon} = \frac{0.10 \times \mathrm{ht}}{1.645} \approx 0.06079 \times \mathrm{ht}$.

Uncertainty in the linear model parameter estimates on the ln-ln scale was assessed using a 3-step Monte Carlo approach: (i) the transformed calibration dataset was aggregated into 10 dbh size classes, each with approximately the same number of observations; (ii) each dbh size class was resampled with replacement until the original class sample size was achieved; and (iii) the model was fit to the resampled data and the parameters were estimated. Steps (i)–(iii) were then replicated until the means and standard deviations of the distributions of parameter estimates stabilized. The resulting multiparameter distribution of parameter estimates represented the uncertainty in estimates of the linear model parameters on the ln-ln scale.

Residual uncertainty was assessed on the original scale where the models were applied using a 4-step procedure that accommodated heteroskedasticity: (i) the pairs $(V_i, \hat{V}_i)$ were ordered with respect to the model prediction, $\hat{V}_i$; (ii) the pairs were aggregated into groups of size 25; (iii) within each group, $g$, the mean of the observations, $\bar{V}_g$, the mean of the predictions, $\bar{\hat{V}}_g$, and the standard deviation, $\hat{\sigma}_g$, of the residuals, $\varepsilon_i = V_i - \hat{V}_i$, were calculated; and (iv) the relationship between the group standard deviations, $\hat{\sigma}_g$, and the group prediction means, $\bar{\hat{V}}_g$, was represented using the model,
$$\hat{\sigma}_g = \gamma_1 \times \bar{\hat{V}}_g^{\,\gamma_2} + \varepsilon_g, \tag{6}$$
where $\gamma_1$ and $\gamma_2$ are parameters to be estimated.
2.6 Uncertainty in large-area volume estimates
A 6-step Monte Carlo simulation procedure was used to estimate the effects of model prediction uncertainty on the uncertainty of large-area estimates of mean volume per unit area.
- Step 1. For the kth replication, a set of model parameter estimates, $\hat{\beta}^k$, was randomly selected from the distribution constructed in “Section 2.5”.
- Step 2. For the ith tree on the jth plot in the estimation dataset, a random number, ε, was drawn from a Gaussian (0,1) distribution; if |ε| > 2.5, ε was redrawn. A dbh observation was then simulated as
  $$\mathrm{dbh}_{ij} = \mathrm{dbh}_{ij}^{0} + \varepsilon \times \sigma_{dbh\varepsilon},$$
  where $\mathrm{dbh}_{ij}^{0}$ is the observed dbh.
- Step 3. For the ith tree on the jth plot in the estimation dataset, a random number, ε, was drawn from a Gaussian (0,1) distribution; if |ε| > 2.5, ε was redrawn. A height observation was then simulated as
  $$\mathrm{ht}_{ij} = \mathrm{ht}_{ij}^{0} + \varepsilon \times \sigma_{ht\varepsilon},$$
  where $\mathrm{ht}_{ij}^{0}$ is the observed height.
- Step 4. For the ith tree on the jth plot in the estimation dataset, an initial volume observation was calculated using the parameter values from Step (1) and the simulated dbh and height observations from Steps (2) and (3) as
  $$V_{ij}^{k,0} = \hat{\beta}_0^k \times \mathrm{dbh}_{ij}^{\hat{\beta}_1^k} \times \mathrm{ht}_{ij}^{\hat{\beta}_2^k}.$$
  A random number, ε, was drawn from a Gaussian (0,1) distribution; if |ε| > 2.5, ε was redrawn. The residual standard deviation, $\hat{\sigma}_{ij}$, was then calculated using Eq. (6) with $V_{ij}^{k,0}$ as the value of the predictor variable. The individual tree volume was then simulated as
  $$V_{ij}^{k} = V_{ij}^{k,0} + \varepsilon \times \hat{\sigma}_{ij}.$$
- Step 5. The total volume for the jth plot in the estimation dataset was calculated as
  $$V_j^k = \sum_{i=1}^{n_j} V_{ij}^k,$$
  where $n_j$ is the number of trees on the plot.
- Step 6. The overall study area mean, $\bar{V}^k$, and variance of the mean, $\widehat{Var}(\bar{V}^k)$, for the kth replication were estimated using both the SRS and STR estimators as described in “Section 2.4.”
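Steps 2 through 5 for a single replication can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names, the toy tree records, and the numeric values of `beta` and `gamma` are hypothetical, and height errors are drawn independently (the ρ = 0.00 case only).

```python
import numpy as np

def truncated_std_normal(rng, size, bound=2.5):
    """Draw N(0,1) values, redrawing any with |eps| > bound (Steps 2-4)."""
    eps = rng.standard_normal(size)
    bad = np.abs(eps) > bound
    while bad.any():
        eps[bad] = rng.standard_normal(int(bad.sum()))
        bad = np.abs(eps) > bound
    return eps

def simulate_plot_volumes(dbh0, ht0, plot_id, beta, gamma, rng):
    """One replication of Steps 2-5 for a parameter draw `beta` (Step 1)."""
    sd_dbh = 0.005 * dbh0 / 1.96   # from the 95 % within 0.5 % dbh tolerance
    sd_ht = 0.10 * ht0 / 1.645     # from the 90 % within 10 % height tolerance
    dbh = dbh0 + truncated_std_normal(rng, dbh0.size) * sd_dbh   # Step 2
    ht = ht0 + truncated_std_normal(rng, ht0.size) * sd_ht       # Step 3
    v0 = beta[0] * dbh ** beta[1] * ht ** beta[2]                # Step 4: initial volume
    sd_res = gamma[0] * v0 ** gamma[1]                           # Eq. (6) without the error term
    v = v0 + truncated_std_normal(rng, v0.size) * sd_res         # Step 4: add residual
    return np.bincount(plot_id, weights=v)                       # Step 5: plot totals

# Toy data: five trees on two plots (all values are made up for illustration).
rng = np.random.default_rng(0)
dbh0 = np.array([15.0, 22.0, 30.0, 18.0, 41.0])   # cm
ht0 = np.array([12.0, 17.0, 21.0, 14.0, 25.0])    # m
plot_id = np.array([0, 0, 0, 1, 1])
beta = (5e-5, 1.9, 1.0)    # hypothetical parameter draw
gamma = (0.1, 0.8)         # hypothetical Eq. (6) parameters
plot_volumes = simulate_plot_volumes(dbh0, ht0, plot_id, beta, gamma, rng)
draws = truncated_std_normal(rng, 10_000)
```

Repeating the call with a fresh parameter draw for each replication, and passing the plot totals to the estimators of Section 2.4, would complete Step 6.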
Steps (1)–(6) were replicated, and the mean and variance over replications were estimated as per Rubin (1987, pp. 76–77),
$$\hat{\mu} = \frac{1}{n_{rep}}\sum_{k=1}^{n_{rep}} \bar{V}^k, \tag{7}$$
$$\widehat{Var}(\hat{\mu}) = \left(1 + \frac{1}{n_{rep}}\right) \times W_1 + W_2, \tag{8}$$
where $W_1 = \frac{1}{n_{rep}-1}\sum_{k=1}^{n_{rep}}\left(\bar{V}^k - \hat{\mu}\right)^2$ is the among-replications variance, $W_2 = \frac{1}{n_{rep}}\sum_{k=1}^{n_{rep}}\widehat{Var}(\bar{V}^k)$ is the mean within-replication variance, and $n_{rep}$ is the number of replications. The replications continued until $\hat{\mu}$ and $SE(\hat{\mu}) = \sqrt{\widehat{Var}(\hat{\mu})}$ stabilized.

In Steps (2) and (3), dbh and height measurement errors for the same tree were assumed to be independent, as were dbh errors for trees on the same plot. However, because of the greater difficulty in accurately measuring height and because plot canopy conditions tend to affect measurements of the heights of all trees on the same plot in a similar manner, height measurement errors for trees on the same plot may not be independent. Therefore, simulations were conducted separately for height correlations of ρ = 0.00 and ρ = 0.25. In Step (4), spatial correlations among residuals for trees on the same plot were ignored based on Berger et al. (2014), Breidenbach et al. (2014), and McRoberts and Westfall (2014), who all reported that the effects were negligible.

Parameter uncertainty and residual variance are highly correlated because each necessarily affects the other. This phenomenon is easily understood by considering the parametric form of the parameter covariance matrix for a linear model; in particular,
$$Var(\hat{\beta}) = \sigma^2 \times (X'X)^{-1},$$
where $X$ is the matrix of predictor variable observations and $\sigma^2$ is the residual variance, so that the parameter covariances scale directly with the residual variance.
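The combination of replication results in Eqs. (7) and (8) can be sketched as below. The function name and the three-replication inputs are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import numpy as np

def combine_replications(rep_means, rep_variances):
    """Combine per-replication means and variances following Eqs. (7) and (8)."""
    rep_means = np.asarray(rep_means, dtype=float)
    rep_variances = np.asarray(rep_variances, dtype=float)
    n_rep = rep_means.size
    mu_hat = rep_means.mean()                  # Eq. (7)
    w1 = rep_means.var(ddof=1)                 # among-replications variance
    w2 = rep_variances.mean()                  # mean within-replication variance
    var_hat = (1.0 + 1.0 / n_rep) * w1 + w2    # Eq. (8)
    return mu_hat, var_hat

# Hypothetical results from three replications.
mu_hat, var_hat = combine_replications([90.0, 90.2, 90.4], [0.5, 0.5, 0.5])
se_hat = np.sqrt(var_hat)
```

With these inputs, W1 = 0.04 and W2 = 0.5, so the combined variance is (4/3) × 0.04 + 0.5. When the replications agree closely, W1 is small and the combined variance is dominated by the sampling variance W2.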
- 3. Results
- The fit of the model to the data on the ln-ln scale produced R 2 = 0.98 with corresponding pseudo-R 2 = 0.97 on the original scale (Figs. 2 and 3). These large values justify the initial decision not to consider model misspecification for this study (“Section 1”) and also suggest that other model forms would likely not have produced more accurate predictions. However, as with most similar datasets, the number of observations for large trees is smaller than for other trees. Although this phenomenon could contribute to serious lack of fit of the model for large trees, such was not the case for this study (Figs. 2 and 3).
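The fit statistics reported here follow from the model of Section 2.3. The sketch below fits the ln-ln model by ordinary least squares, back-transforms with the Baskerville (1972) correction, and computes R² and pseudo-R². It uses synthetic data rather than the calibration dataset, and the function names and the generating parameter values (−10.0, 1.9, 1.0, residual SD 0.1) are assumptions made purely for illustration.

```python
import numpy as np

def fit_lnln(dbh, ht, vol):
    """OLS fit of ln(V) on ln(dbh) and ln(ht); returns (alpha_hat, sigma2_hat)."""
    X = np.column_stack([np.ones_like(dbh), np.log(dbh), np.log(ht)])
    y = np.log(vol)
    alpha = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ alpha
    sigma2 = resid @ resid / (y.size - X.shape[1])   # residual variance, ln-ln scale
    return alpha, sigma2

def predict_volume(alpha, sigma2, dbh, ht):
    """Back-transform to the original scale with the sigma^2 / 2 bias correction."""
    return np.exp(alpha[0] + alpha[1] * np.log(dbh) + alpha[2] * np.log(ht) + sigma2 / 2.0)

def r_squared(obs, pred):
    """Proportion of variability in `obs` explained by `pred`."""
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Synthetic calibration data (not the study data).
rng = np.random.default_rng(1)
dbh = rng.uniform(13.0, 60.0, 500)
ht = rng.uniform(8.0, 30.0, 500)
vol = np.exp(-10.0 + 1.9 * np.log(dbh) + 1.0 * np.log(ht) + rng.normal(0.0, 0.1, 500))

alpha, sigma2 = fit_lnln(dbh, ht, vol)
ln_pred = alpha[0] + alpha[1] * np.log(dbh) + alpha[2] * np.log(ht)
r2_ln = r_squared(np.log(vol), ln_pred)                                # R² on the ln-ln scale
pseudo_r2 = r_squared(vol, predict_volume(alpha, sigma2, dbh, ht))     # pseudo-R², original scale
```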
On the ln-ln scale, the distribution of simulated parameter estimates exhibited an ellipsoidal pattern as expected for linear models (Fig. 4). Parameter uncertainty was simulated by random draws from this distribution. The approach to estimating the relationship between heteroskedastic residual standard deviations and volume model predictions as described in “Section 2.5” was somewhat arbitrary, but the relationship was well estimated (Fig. 5).
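The resampling scheme behind this parameter distribution (steps (i) to (iii) of “Section 2.5”) might look like the following sketch. It is an illustration under stated assumptions: the data are synthetic, the function name is hypothetical, and a fixed number of bootstrap replications replaces the stabilization criterion used in the study.

```python
import numpy as np

def bootstrap_parameters(ln_dbh, ln_ht, ln_vol, n_boot=200, n_classes=10, seed=0):
    """Resample within dbh size classes and refit the ln-ln model each time."""
    rng = np.random.default_rng(seed)
    order = np.argsort(ln_dbh)
    classes = np.array_split(order, n_classes)   # (i) roughly equal-count dbh classes
    X = np.column_stack([np.ones_like(ln_dbh), ln_dbh, ln_ht])
    draws = np.empty((n_boot, 3))
    for b in range(n_boot):
        # (ii) resample each class with replacement to its original size
        idx = np.concatenate([rng.choice(c, size=c.size, replace=True) for c in classes])
        # (iii) refit the linear model to the resampled data
        draws[b] = np.linalg.lstsq(X[idx], ln_vol[idx], rcond=None)[0]
    return draws

# Synthetic ln-ln data (assumed generating values, not the calibration dataset).
rng = np.random.default_rng(2)
ln_dbh = np.log(rng.uniform(13.0, 60.0, 400))
ln_ht = np.log(rng.uniform(8.0, 30.0, 400))
ln_vol = -10.0 + 1.9 * ln_dbh + 1.0 * ln_ht + rng.normal(0.0, 0.1, 400)

draws = bootstrap_parameters(ln_dbh, ln_ht, ln_vol)
param_means = draws.mean(axis=0)
param_cov = np.cov(draws, rowvar=False)   # joint spread of the parameter estimates
```

A row of `draws` selected at random then plays the role of the parameter set drawn in Step 1 of the simulation procedure.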
For all combinations of dbh measurement error, height measurement error, and parameter uncertainty and residual variance, 5000 replications of the simulation procedure were sufficient for estimates of both means and SEs to stabilize (Fig. 6). Further, over the 5000 replications, no prediction for any of the more than 50,000 trees in the estimation dataset was less than 0.87 times or greater than 1.20 times the prediction obtained with the original parameter estimates. For large trees, the corresponding proportions were 0.95 and 1.05.
Means of tree-level dbh measurement errors ranged from approximately −0.12 to 0.12 cm with nearly 98 % between −0.05 and 0.05 cm, and means of tree-level height measurement errors ranged from approximately −2.4 to 2.1 m with nearly 98 % between −1.0 and 1.0 m. These relatively small tree-level errors have minimal effects at the population level.

For the STR estimators, the effects of uncertainty from the four sources on SEs increased as the number of strata increased, although nearly all the increase is attributed to the combined effects of parameter uncertainty and residual variance. For two strata, the proportional increase in SE was 0.036; for four strata, the proportional increase was 0.092; and for eight strata, the proportional increase was 0.148. Although the increase for two strata would likely be considered negligible, such may not be the case for four and eight strata. As previously noted, the simulated stratifications likely reduce SEs more than would be realized with actual stratifications with the result that the actual increases in SE may be slightly less than those reported in Table 1.
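The proportional increases quoted above can be recovered from Table 1 by dividing the SE with all sources of uncertainty (row "All" with ρ = 0.25) by the SE with no uncertainty and subtracting one. The short check below uses only values from Table 1; the dictionary keys are labels chosen for this sketch.

```python
# SEs from Table 1: no uncertainty versus all sources with rho = 0.25.
se_none = {"SRS": 1.69, "STR-2": 1.11, "STR-4": 0.76, "STR-8": 0.54}
se_all = {"SRS": 1.71, "STR-2": 1.15, "STR-4": 0.83, "STR-8": 0.62}

# Proportional increase in SE attributable to model prediction uncertainty.
increase = {key: round(se_all[key] / se_none[key] - 1.0, 3) for key in se_none}
# STR-2: 0.036, STR-4: 0.092, STR-8: 0.148; SRS: 0.012
```

The same absolute increase in SE is a larger share of the total as stratification removes more of the population variability.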
- 4. Discussion
- The simulated stratifications accomplished the intended objective by grouping plots into strata with greater homogeneity than the population as a whole and thereby reducing the variance of the estimate of the population mean relative to the variance of the SRS mean (Tables 1 and 2). As previously noted, the lack of differences among the SRS and STR estimates of the means is attributed to using proportions of plots per stratum as stratum weights rather than proportions of the population.
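Both points can be checked numerically with a sketch of the estimators in Eqs. (4a) to (5b) applied to simulated equal-count strata. The plot volumes below are synthetic draws from an arbitrary skewed distribution, and the function names are hypothetical; only the estimator formulas come from “Section 2.4”.

```python
import numpy as np

def srs_estimates(y):
    """Simple random sampling mean and variance of the mean, Eqs. (4a)-(4b)."""
    n = y.size
    mean = y.mean()
    var = np.sum((y - mean) ** 2) / (n * (n - 1))
    return mean, var

def simulated_strata(y, n_strata):
    """Order plots by volume and cut into strata with near-equal plot counts."""
    labels = np.empty(y.size, dtype=int)
    for h, idx in enumerate(np.array_split(np.argsort(y), n_strata)):
        labels[idx] = h
    return labels

def str_estimates(y, labels):
    """Post-stratified mean and variance of the mean, Eqs. (5a)-(5b)."""
    n = y.size
    mean, var = 0.0, 0.0
    for h in np.unique(labels):
        y_h = y[labels == h]
        w_h = y_h.size / n                    # weight = proportion of plots in stratum
        s2_h = y_h.var(ddof=1)                # within-stratum variance
        mean += w_h * y_h.mean()
        var += w_h * s2_h / n + (1.0 - w_h) * s2_h / n ** 2
    return mean, var

# Synthetic plot-level volumes per unit area (not the FIA data).
rng = np.random.default_rng(3)
y = rng.gamma(shape=1.5, scale=60.0, size=2178)

mean_srs, var_srs = srs_estimates(y)
mean_str, var_str = str_estimates(y, simulated_strata(y, 4))
```

Because each weight equals n_h / n, the weighted sum of stratum means collapses to the overall sample mean, which is why the two estimates of the mean coincide while the stratified variance is smaller.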
Table 2
Stratified estimates (no uncertainty incorporated)

| Stratum | Weight | Sample size | Mean | SE |
|---|---|---|---|---|
| 1 | 0.25 | 545 | 12.42 | 0.35 |
| 2 | 0.25 | 544 | 49.13 | 0.54 |
| 3 | 0.25 | 545 | 100.14 | 0.71 |
| 4 | 0.25 | 544 | 199.14 | 2.90 |
| Total | 1.00 | 2178 | 90.17 | 0.76 |

When using the SRS estimators, no source of uncertainty individually or in combination had a meaningful effect on SEs (Table 1). This result is consistent with the results reported by McRoberts et al. (2014b) for a sub-tropical dataset. For both the SRS and STR estimators, the effects of dbh and height measurement errors, including correlated height measurement errors, were negligible. This result can be attributed to the fairly large number of trees per plot (mean 23, maximum 134) which resulted in relatively small mean dbh and height measurement errors at the plot level. Berger et al. (2014) and Qi et al. (2015) reported similar results. Overall, these results suggest that as long as height measurements satisfy FIA protocols, these sources of uncertainty produce no meaningful adverse consequences. However, experience suggests that height measurements may fail to satisfy the protocols. Nevertheless, results obtained using a 20 % rather than 10 % tolerance and an 80 % rather than 90 % satisfaction rate were essentially unchanged.

The combined effects of parameter uncertainty and residual variance were greater than the combined effects of dbh and height measurement error. This result can be at least partially attributed to how parameter uncertainty affects plot-level estimates. Whereas measurement errors and prediction residuals are incorporated separately for individual trees and compensate for each other, parameter uncertainty is realized at the population level and therefore should be expected to have a greater population-level effect.

The important finding is that as the effects on SEs of population variability are reduced by using the STR estimators rather than the SRS estimators, the relative effects of underlying sources of model prediction uncertainty increase.
Model-assisted estimators, which are receiving increasing attention for inventory applications, often reduce the effects of population variability even more than do stratified estimators (McRoberts et al. 2013, 2014a). Thus, the proportional adverse effects of measurement error, parameter uncertainty, and residual variance on the uncertainty of the large-area volume estimates may be even greater when the effects of sampling variable populations are further reduced.
- 5. Conclusions
- Three conclusions may be drawn from the study. First, when using the simple random sampling estimators, the effects on large-area volume estimates of uncertainty in individual tree volume model predictions due to diameter and height measurement error, parameter uncertainty, and residual variance were negligible. Second, however, when the effects of variability in the population on uncertainty were reduced via stratified estimation, the effects of model prediction uncertainty on the large-area volume estimates increased as the number of strata increased. For four and eight strata, the proportional increases in the stratified SEs were as great as 0.092 and 0.148, respectively, which may not be negligible. Third, nearly all the effects of model prediction uncertainty can be attributed to parameter uncertainty and residual variance. Finally, all results for this study are contingent on the calibration dataset sample size and the quality of fit of the model to the data, both of which directly affect parameter uncertainty and residual variance.
- References
- Anderson-Sprecher R (1994) Model comparisons and R2. Am Stat 48:113–117
- Avery TE, Burkhart HE (2002) Forest measurements, 5th edn. McGraw-Hill, New York, p 456
- Baskerville GL (1972) Use of logarithmic regression in the estimation of plant biomass. Can J Forest Res 2:49–53
- Bates DM, Watts DG (1988) Nonlinear regression analysis and its applications. Wiley, New York, p 365
- Berger A, Gschwantner T, McRoberts RE, Schadauer K (2014) Effects of measurement errors on single tree stem volume estimates for the Austrian National Forest Inventory. For Sci 60:14–24
- Breidenbach J, Antón-Fernández C, Petersson H, Astrup P, McRoberts RE (2014) Quantifying the contribution of biomass model errors to the uncertainty of biomass stock and change estimates in Norway. For Sci 60:25–33
- Brown S, Gillespie AJR, Lugo AE (1989) Biomass estimation methods for tropical forests with application to forest inventory data. For Sci 35:881–902
- Chave J, Andalo C, Brown S, Cairns MA, Chambers JQ, Eamus D, Fölster H, Fromard F, Higuchi N, Kira T, Lescure J, Nelson BW, Ogawa H, Puig H, Riéra B, Yamakura T (2005) Tree allometry and improved estimation of carbon stocks and balance in tropical forests. Oecol 145:87–99
- Cochran WG (1977) Sampling techniques, 3rd edn. Wiley, New York, p 428
- Cunia T (1965) Some theory on reliability of volume estimates in a forest inventory sample. For Sci 11:115–128
- Cunia T (1987) On the error of continuous forest inventory estimates. Can J Forest Res 17:436–441
- Gertner GZ (1987) Approximating precision in simulation projections: an efficient alternative to Monte Carlo. For Sci 33:230–239
- Gertner GZ (1990) The sensitivity of measurement error in stand volume estimation. Can J Forest Res 20:800–804
- Gertner GZ, Dzialowy PJ (1984) Effects of measurement error on an individual tree-based growth projection system. Can J Forest Res 14:311–316
- Gertner G, Köhl M (1992) An assessment of some nonsampling errors in a national survey using an error budget. For Sci 38:525–538
- Kangas A (1996) On the bias and variance in tree volume predictions due to model and measurement errors. Scand J For Res 11:281–290
- McRoberts RE (1996) Estimating variation in field crew estimates of site index. Can J Forest Res 26:560–565
- McRoberts RE, Westfall JA (2014) The effects of uncertainty in model predictions of individual tree volume on large area volume estimates. For Sci 60:34–43
- McRoberts RE, Hahn JT, Hefty GJ, Van Cleve JR (1994) Variation in forest inventory field measurements. Can J Forest Res 24:1766–1770
- McRoberts RE, Hansen MH, Smith WB (2010) United States of America. In: Tomppo E, Gschwantner T, Lawrence M, McRoberts RE (eds) National forest inventories, pathways for common reporting. Springer, Heidelberg, pp 567–582
- McRoberts RE, Gobakken T, Næsset E (2012) Post-stratified estimation of forest area and growing stock volume using lidar-based stratifications. Remote Sens Environ 125:157–166
- McRoberts RE, Næsset E, Gobakken T (2013) Inference for lidar-assisted estimation of forest growing stock volume. Remote Sens Environ 128:268–275
- McRoberts RE, Liknes GC, Domke GM (2014a) Using a remote sensing-based, percent tree cover map to enhance forest inventory estimation. For Ecol Manage 312:2–18
- McRoberts RE, Moser P, Oliveira Z, Vibrans AC (2014b) The effects of uncertainty in individual tree volume model predictions on large area volume estimates for the Brazilian State of Santa Catarina. Can J Forest Res 45:44–51
- Qi C, Vaglio GL, Valentini R (2015) Uncertainty of remote sensed aboveground biomass over an African tropical forest: propagating errors from trees to plots to pixels. Remote Sens Environ
- Rubin DB (1987) Multiple imputation for nonresponse in surveys. Wiley, New York, p 287
- Särndal C-E, Swensson B, Wretman J (1992) Model assisted survey sampling. Springer-Verlag, Inc, New York, p 694
- Ståhl G, Heikkinen J, Petersson H, Repola J, Holm S (2014) Adapting uncertainty assessments from sample based forest inventories to include the effects of model errors. For Sci 60:3–13
- US Forest Service (2012) Forest inventory and analysis national field guide. Volume 1: field data collection procedures for phase 2 plots, version 6.0. Available at: http://www.fia.fs.fed.us/library/field-guides-methods-proc/. Accessed: December 2014
- Westfall JA, Patterson PL (2007) Measurement variability error for estimates of volume change. Can J Forest Res 37:2201–2210
- Westfall JA, Scott CT (2010) Taper models for commercial tree species in the northeastern United States. For Sci 56:515–528