Estimating and using sampling precision in surveys of trace constituents of soils

 

Authors: Michael Thompson, Michael Maguire

 

Journal: Analyst (RSC, available online 1993)
Volume/Issue: Volume 118, issue 9

Pages: 1107-1110

 

ISSN:0003-2654

 

Year: 1993

 

DOI:10.1039/AN9931801107

 

Publisher: RSC

 

Data source: RSC

 

Abstract:

ANALYST, SEPTEMBER 1993, VOL. 118, p. 1107

Estimating and Using Sampling Precision in Surveys of Trace Constituents of Soils

Michael Thompson and Michael Maguire
Department of Chemistry, Birkbeck College, Gordon House, 29 Gordon Square, London, UK WC1H 0PP

Sampling variance and analytical variance have been estimated in a survey of concentrations of trace metals in soils from public gardens in an inner London borough. Robust analysis of variance was used for this purpose and found to be appropriate. The resulting statistics were used to define criteria that identified unusual measurements for the purpose of checking or further investigation. The statistics were further used to assess the analytical and sampling protocols in respect of their fitness for the task of producing data of appropriate quality, in the light of the variation in the concentrations among the various gardens. Criteria for this latter purpose were also suggested.

Keywords: Sampling precision; analytical precision; analysis of variance; robust statistics; soil survey

It has become almost a cliché to say that sampling errors are more of a problem than analytical errors, and yet it is seldom that any attempt is made to quantify sampling errors. It is perhaps more seldom that the combined errors resulting from sampling and analysis are formally demonstrated to be of a magnitude appropriate to the task of interpreting the resulting data. This omission is surprising because the requisite information can be obtained by means of a simple experiment.¹⁻³ Moreover, the information on sampling and analytical errors is essential if reliable and economic decisions are to be made from the data. If the errors are too large they can obscure the information and lead to incorrect decisions. If the errors are very small, it may be that excessive effort (and therefore cost) is being put into the sampling and analytical methodology.
This paper describes a hierarchical randomized replicated experiment applied to the study of toxic metals in the soils of urban public gardens. The purpose of the experiment, apart from obtaining preliminary values for the concentrations of the metals, was the estimation of sampling and analytical errors, and their assessment in relation to the actual variation between gardens. This procedure can be seen as a preliminary method of validating sampling and analytical protocols before undertaking a wider survey, an undertaking known as an 'orientation survey' in applied geochemistry.⁴ Although the immediate application of the present paper lies in the environmental field, the experimental design is suitable for a wide range of applications. Nested analysis of variance (ANOVA) is the standard method of treating the data from such experiments. In this paper the use of a relatively new approach is advocated, namely, robust ANOVA.³,⁵

Theoretical

Classical ANOVA

Consider a single measurement (x), obtained by applying the sampling protocol once to a particular site and analysing the sample once; then

x = true value for site + sampling error + analytical error

Assuming for the moment that the sources of variation are independent, the variance of x is given by

$$\mathrm{var}(x) = \sigma_{site}^2 + \sigma_{sam}^2 + \sigma_{an}^2$$

where $\sigma_{site}^2$ is the variance of the true values of the sites, $\sigma_{sam}^2$ is the variance of true values of random samples taken from a site (the sampling variance), and $\sigma_{an}^2$ is the variance of analytical measurements performed on random test portions from a sample (the analytical variance). These three variances can be estimated from the results of an experiment with the design shown in Fig. 1, with duplicated sampling and analysis. The computations for the classical ANOVA are shown in Table 1, using the following conventions, where $x_{ijk}$ is the kth analysis on the jth sample taken from the ith of I sites:

$$\bar{x}_{ij} = (x_{ij1} + x_{ij2})/2$$
$$\bar{x}_{i} = (\bar{x}_{i1} + \bar{x}_{i2})/2$$
$$\bar{x} = \sum_i \bar{x}_i / I$$

Estimates ($\hat{\sigma}$) of the variances are obtained by equating the expected mean square to the calculated value, i.e.,

$$\hat{\sigma}_{an}^2 = MS_{an}$$
$$\hat{\sigma}_{sam}^2 = (MS_{sam} - MS_{an})/2$$
$$\hat{\sigma}_{site}^2 = (MS_{site} - MS_{sam})/4$$

[Fig. 1: Design of the hierarchical duplicated experiment executed at I sites; each sample receives Analysis 1 and Analysis 2.]

Table 1 ANOVA table for an experiment with duplicated sampling and analysis at I sites

Source of variation               Sum of squares                                    Degrees of freedom   Mean square   Expected mean square
Between sites                     $4\sum_i(\bar{x}_i - \bar{x})^2$                  I - 1                $MS_{site}$   $\sigma_{an}^2 + 2\sigma_{sam}^2 + 4\sigma_{site}^2$
Between samples                   $2\sum_i\sum_j(\bar{x}_{ij} - \bar{x}_i)^2$       I                    $MS_{sam}$    $\sigma_{an}^2 + 2\sigma_{sam}^2$
Within samples/between analyses   $\frac{1}{2}\sum_i\sum_j(x_{ij1} - x_{ij2})^2$    2I                   $MS_{an}$     $\sigma_{an}^2$

Robust ANOVA

A classical estimate of a variance $\sigma^2$ is based on a sum of squares of the form $\sum_i (y_i - \hat{y}_i)^2$. Such a sum can be dominated by a small proportion of discrepant values of $y_i$ because of the squared terms. This has the unfortunate effect of unduly emphasizing the effects of unusual observations and discounting the majority of 'normal' observations. This is the antithesis of what is required in orientation surveys, where the objective is to define criteria, based on normal observations, that enable the investigator to identify unusual observations in subsequent studies.

An alternative approach that achieves the desired end is to use robust methods,⁶ where the square function is replaced, yielding $\sum_i \Psi(y_i - \hat{y}_i)$. The function $\Psi$ is the square near zero, and the absolute value outside the limits of $\pm k\sigma$. A common choice is k = 1.5. The application of this method to ANOVA has been discussed in the context of interlaboratory trials,⁵ and is adapted here for studies of sampling error. The method downweights the influence of outliers in the estimation of statistics, and simultaneously corrects for the downweighting.
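The two estimation ideas above can be sketched in a few lines of Python. The first function follows the sums of squares and estimators of Table 1 for the balanced design with two samples and two analyses per site. The second is a generic Huber-type robust mean and standard deviation obtained by iterated winsorization at k = 1.5. It only illustrates the downweighting principle and is not the robust ANOVA program used in this paper. The function names, the truncation of negative variance estimates at zero, the starting values and the rescaling constants 1.483 and 1.134 are assumptions of this sketch.

```python
from statistics import mean, median, stdev


def classical_nested_anova(data):
    """Classical variance components for the duplicated design of Table 1.

    data[i][j][k] is the k-th analysis of the j-th sample from the i-th site,
    with exactly two samples per site and two analyses per sample.
    Returns (var_an, var_sam, var_site).
    """
    n_sites = len(data)
    sample_means = [[(a1 + a2) / 2 for a1, a2 in site] for site in data]
    site_means = [(m1 + m2) / 2 for m1, m2 in sample_means]
    grand_mean = mean(site_means)

    # Sums of squares as laid out in Table 1.
    ss_site = 4 * sum((xi - grand_mean) ** 2 for xi in site_means)
    ss_sam = 2 * sum((m - xi) ** 2
                     for ms, xi in zip(sample_means, site_means) for m in ms)
    ss_an = 0.5 * sum((a1 - a2) ** 2 for site in data for a1, a2 in site)

    ms_site = ss_site / (n_sites - 1)
    ms_sam = ss_sam / n_sites
    ms_an = ss_an / (2 * n_sites)

    # Equate expected and observed mean squares. Negative estimates are
    # truncated at zero (an assumption of this sketch).
    var_an = ms_an
    var_sam = max((ms_sam - ms_an) / 2, 0.0)
    var_site = max((ms_site - ms_sam) / 4, 0.0)
    return var_an, var_sam, var_site


def huber_mean_sd(values, tol=1e-6, max_iter=100):
    """Huber-type robust mean and SD by iterated winsorization at 1.5 SD.

    The factor 1.134 compensates for the variance lost by winsorizing
    normal data at 1.5 SD, so it is only valid for that cut-off.
    """
    k = 1.5
    mu = median(values)
    sd = 1.483 * median(abs(v - mu) for v in values)  # scaled MAD as start
    for _ in range(max_iter):
        lo, hi = mu - k * sd, mu + k * sd
        clipped = [min(max(v, lo), hi) for v in values]
        new_mu = mean(clipped)
        new_sd = 1.134 * stdev(clipped)
        converged = abs(new_mu - mu) < tol and abs(new_sd - sd) < tol
        mu, sd = new_mu, new_sd
        if converged:
            break
    return mu, sd


if __name__ == "__main__":
    # Hypothetical data: 3 sites x 2 samples x 2 analyses.
    demo = [[[10, 12], [14, 16]], [[20, 22], [24, 26]], [[30, 32], [34, 36]]]
    print(classical_nested_anova(demo))
    # Hypothetical values with one gross outlier.
    print(huber_mean_sd([9, 10, 11, 10, 9, 11, 10, 100]))
```

In the second example the single value of 100 pulls the ordinary mean far above the bulk of the data, whereas the winsorized estimate stays close to 10, which is the behaviour the robust approach relies on.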
According to a general principle stated by Rousseeuw,⁷ 'outliers can easily be identified by comparing data with a robust fit'. This principle can be applied to the data obtained in orientation surveys and, with caution, to data collected in subsequent surveys. As the purpose is to enable the investigator to identify unusual observations for further study, and not to conduct tests of significance, approximate criteria suffice. Moreover, it must be remembered that no simple exact frequency distributions can be attributed to robust estimates. Potentially anomalous sites are, therefore, a possibility when

$$|\bar{x}_i - \bar{\mathbf{x}}| \,/\, \sqrt{(I - 1)\,\mathbf{MS}_{site}/4I} > 2$$

where $\bar{\mathbf{x}}$ or $\mathbf{MS}$ in bold typeface indicates that the statistic is obtained by the robust method. Unusually discrepant duplicate samples are indicated when

$$|\bar{x}_{i1} - \bar{x}_{i2}| \,/\, \sqrt{\mathbf{MS}_{sam}} > 2$$

Possibly spurious duplicate determinations are indicated when

$$|x_{ij1} - x_{ij2}| \,/\, \sqrt{2\,\mathbf{MS}_{an}} > 2$$

Appropriate Magnitudes of Sampling Variance and Analytical Variance

Consider the variance of a mean result ($\bar{x}$) for a site, based on the collection of m samples each analysed n times. The value is

$$\mathrm{var}(\bar{x}) = \sigma_{site}^2 + (\sigma_{sam}^2 + \sigma_{an}^2/n)/m$$

If we define technical variance as the total variance introduced by the methodology, i.e.,

$$\sigma_{tech}^2 = (\sigma_{sam}^2 + \sigma_{an}^2/n)/m$$

then we have

$$\mathrm{var}(\bar{x}) = \sigma_{site}^2 + \sigma_{tech}^2$$

where the magnitude of $\sigma_{tech}^2$ can be adjusted by altering the values of m and n. Normally in environmental studies, one of two possibilities is required: to discriminate among background (normal) sites or to distinguish between background and anomalous sites. Both of these requirements are jeopardized if $\sigma_{tech}^2$ is relatively large. For example, if $\sigma_{tech}^2 = \sigma_{site}^2$, the information in $\bar{x}$ about variation among sites would be compromised by experimental 'noise'. Accordingly, we should seek to obtain lower values of $\sigma_{tech}^2$. However, reducing its value to below about $0.1\,\sigma_{site}^2$ is probably unproductive, as it has little effect on $\mathrm{var}(\bar{x})$, but almost certainly costs more to achieve than a higher value.
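As a small worked illustration of how m and n change the technical variance, the sketch below evaluates the ratio of technical variance to between-site variance for a few designs. The input values are the robust variance estimates for copper reported in Table 4 of this paper. The function names are hypothetical.

```python
def technical_variance(var_sam, var_an, m=1, n=1):
    """Variance added by the methodology: (var_sam + var_an / n) / m."""
    return (var_sam + var_an / n) / m


def tech_to_site_ratio(var_site, var_sam, var_an, m=1, n=1):
    """Ratio of technical variance to between-site variance."""
    return technical_variance(var_sam, var_an, m, n) / var_site


# Robust variance estimates for copper, taken from Table 4.
cu = {"var_site": 204.0, "var_sam": 110.0, "var_an": 10.9}

single = tech_to_site_ratio(**cu, m=1, n=1)              # one sample, one analysis
duplicate_sampling = tech_to_site_ratio(**cu, m=2, n=1)  # two samples per site
duplicate_analysis = tech_to_site_ratio(**cu, m=1, n=2)  # two analyses per sample

if __name__ == "__main__":
    print(round(single, 2), round(duplicate_sampling, 2), round(duplicate_analysis, 2))
```

When the sampling variance dominates, as it does for copper, replicating the analysis barely changes the ratio, whereas doubling the number of samples roughly halves it.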
Hence, an 'ideal' value of $\sigma_{tech}^2/\sigma_{site}^2 = 0.3$ can be formulated.

Consider now the relative sizes of $\sigma_{sam}^2$ and $\sigma_{an}^2$. Unless the survey material is almost homogeneous (rare in soil sampling), it is normally observed that $\sigma_{sam}^2 > \sigma_{an}^2$. However, there is little benefit to the investigator if $\sigma_{sam}^2 \gg \sigma_{an}^2$, because $\sigma_{an}^2/n$ then makes little contribution to $\sigma_{tech}^2$. Again, we suggest an 'ideal' value of $\sigma_{an}^2/n\sigma_{sam}^2 = 0.3$.

[Fig. 2: Illustration of the sampling protocol applied to a roughly rectangular site, showing the first sampling walk (bold line), the duplicate sampling walk (light line) and the points where increments were collected.]

Experimental

Sampling Sites

Sixteen sites, consisting of grassed recreational spaces, within the London Borough of Lambeth were selected for investigation, so as to provide a reasonably even coverage of the borough. Most were small gardens, typically 2000-10000 m². Some were discrete plots of comparable size at the edge of large parks. While some sites had remained virtually unmodified since urbanization in the middle of the 19th century, others were sites where houses had previously been destroyed by war damage or demolition. The latter sites consisted of rubble covered with a layer of soil, possibly from a distant source. One site had been subject to remediation after an earlier industrial pollution event. All sites were, at the time of sampling, subject to varying degrees of pollution from road traffic exhaust fumes and, in the past, from coal smoke. Sites previously built upon were possibly also contaminated with paint residues.

Sampling Protocol

Duplicate samples were collected at each site. The first such sample was obtained by aggregating 13 increments, three collected on each leg of an 'M' that was walked across the site. Fig. 2 shows how this was done for a roughly rectangular site. The duplicate sample was obtained by reversing the walk, i.e., with a 'W'. Each increment was collected to a depth of 5 cm with a 25 mm auger.
Mechanical Preparation

The aggregate samples were dried, broken down by light compression, passed through a 2 mm sieve to remove stones, roots, etc., and reduced to a manageable bulk by dividing with a riffle. The resulting laboratory sample was ground to a fine powder (-80 mesh) and dried at 105 °C for 16 h.

Analysis

Duplicate test portions (0.250 g) of each sample were weighed into test-tubes and treated with nitric acid (70% m/m, 1.0 ml) at 100-105 °C for 1 h. The resulting mixture was diluted to 25 ml with water. Determination of cadmium, copper, lead and zinc was accomplished by flame atomic absorption spectrometry under standard conditions. Analysis was conducted directly on the resulting solutions after the suspended matter had settled, or with dilution, as appropriate. The determinations for each element were carried out in a random order as a single analytical run.

Table 2 Analytical data (µg g⁻¹)

Site  Sample  Analysis  Cadmium  Copper  Lead  Zinc
1     1       1         1.90     68      539   295
1     1       2         1.30     68      511   297
1     2       1         1.28     71      522   314
1     2       2         1.07     64      467   307
2     1       1         3.23     159     863   1221
2     1       2         2.34     149     825   1070
2     2       1         1.54     137     809   508
2     2       2         2.55     144     900   557
3     1       1         1.75     49      539   260
3     1       2         1.04     47      482   241
3     2       1         2.41     87      728   407
3     2       2         1.27     85      722   419
4     1       1         2.19     67      345   242
4     1       2         2.45     66      326   246
4     2       1         3.25     72      544   420
4     2       2         2.64     71      620   379
5     1       1         1.07     32      181   154
5     1       2         1.51     36      195   149
5     2       1         2.35     51      176   294
5     2       2         1.55     47      177   297
6     1       1         4.51     64      271   229
6     1       2         7.24     69      300   239
6     2       1         5.24     77      327   286
6     2       2         5.65     65      323   268
7     1       1         2.21     116     838   643
7     1       2         2.86     134     815   671
7     2       1         1.77     116     828   455
7     2       2         1.95     122     788   462
8     1       1         1.09     97      524   178
8     1       2         1.51     82      511   177
8     2       1         0.61     58      577   183
8     2       2         1.02     56      634   185
9     1       1         1.29     64      406   269
9     1       2         1.94     65      406   220
9     2       1         1.94     77      405   249
9     2       2         1.89     76      386   250
10    1       1         1.26     51      377   306
10    1       2         0.88     50      383   287
10    2       1         1.71     48      341   685
10    2       2         0.84     47      348   735
11    1       1         1.26     54      511   252
11    1       2         0.63     57      537   248
11    2       1         1.51     82      692   246
11    2       2         1.50     79      715   408
12    1       1         1.05     51      365   242
12    1       2         0.82     71      382   229
12    2       1         1.27     54      356   238
12    2       2         1.68     54      333   148
13    1       1         1.10     47      486   164
13    1       2         1.08     49      486   168
13    2       1         1.30     47      336   204
13    2       2         1.05     45      329   144
14    1       1         2.87     65      281   256
14    1       2         2.56     65      289   558
14    2       1         2.59     49      331   206
14    2       2         2.16     52      339   210
15    1       1         4.41     50      474   287
15    1       2         3.84     77      488   382
15    2       1         3.64     79      741   391
15    2       2         4.41     81      739   401
16    1       1         1.04     80      503   284
16    1       2         0.83     82      526   290
16    2       1         1.06     93      630   354
16    2       2         1.10     93      645   292

Computing

Robust and classical estimates of the three variances were calculated by ANOVA on the unedited data in Table 2, by using a program supplied by Professor B. D. Ripley, Department of Statistics, University of Oxford.

Table 3 Statistics obtained by robust and classical ANOVA (µg g⁻¹)

Analyte  Method     $\bar{x}$  $\hat{\sigma}_{an}$  $\hat{\sigma}_{sam}$  $\hat{\sigma}_{site}$
Cd       Robust     1.81       0.45                 0.21                  0.74
Cd       Classical  2.05       0.52                 0.00                  1.24
Cu       Robust     67.5       3.3                  10.5                  14.3
Cu       Classical  72.8       5.8                  11.0                  25.7
Pb       Robust     489        18.7                 86.0                  169.5
Pb       Classical  496        22.8                 88.7                  172.0
Zn       Robust     310        22.2                 107.3                 78.9
Zn       Classical  341        48.5                 147.5                 130.4

Results and Discussion

Results

The raw results of the study are presented in Table 2, and the statistics in Table 3. It is of interest to note that the mean concentrations (in µg g⁻¹) fall well below the levels recommended by the Department of the Environment as maxima for land to be used for parks and playing fields,⁸ namely, Cd 15, Cu 1000, Pb 2000, and Zn 1000. The same comment applies to nearly all individual samples also.

Applicability of Robust Statistics

The results of robust and classical ANOVA can be compared in Table 3. No substantial difference between the respective pairs of estimates is evident in the instance of lead. Zinc, however, displays a different behaviour, with the robust estimates considerably smaller than the classical. It is interesting to analyse the differences.

For $\hat{\sigma}_{an}$, the classical and robust estimates are 48.5 and 22.2, respectively. The difference is almost entirely accounted for by a few discrepant analytical duplicates (absolute differences of 90, 95, 162, 194 and 302 µg g⁻¹).
These discrepancies greatly exceed those for any of the remaining 27 duplicate pairs (Fig. 3) and are readily identified as exceeding the criterion of 63 µg g⁻¹ derived from $\mathbf{MS}_{an}$. No immediate explanation for the discrepancies is apparent, but once they are identified, the results can be checked by re-analysis. Use of a criterion based on classical statistics (137 µg g⁻¹) would fail to identify the lower discrepant duplicates, so the robust value is preferred.

[Fig. 4: Absolute differences between duplicate sample means, $|\bar{x}_{i1} - \bar{x}_{i2}|$, for zinc (axis: Zinc/µg g⁻¹). The robust criterion for unusually high values, $2\sqrt{\mathbf{MS}_{sam}}$, is also shown as a vertical bar.]

[Fig. 5: Absolute deviations of the site means from the robust grand mean, $|\bar{x}_i - \bar{\mathbf{x}}|$, for zinc (axis: Zinc/µg g⁻¹). The robust criterion for unusually high values, $2\sqrt{(I - 1)\,\mathbf{MS}_{site}/4I}$, is also shown as a vertical bar.]

For $\hat{\sigma}_{sam}$, the difference between the estimates can be attributed largely to two occurrences of discrepant duplicate samples, yielding mean absolute differences of 413 and 591 µg g⁻¹. These values are easily visible in Fig. 4 and are identified by exceeding the criterion of 307 µg g⁻¹ derived from $\mathbf{MS}_{sam}$. In these instances, there is no doubt that the duplicate samples are genuinely different. Large sampling variations are often observed at polluted sites, because, in general, contamination is unlikely to be uniform. This is particularly true for zinc, where contamination of soil from corroded galvanized objects would be very localized. In fact, both of the sites that gave rise to discrepant samples have anomalously high average zinc contents.
Moreover, one of these samples gave rise to a suspect analytical duplicate, which is thereby explained (higher analytical variability is expected at higher concentrations). Again, for the orientation survey, the variation within the background sites is of primary concern, so the robust estimate is preferred, as the sites with higher sampling discrepancies are from anomalous sites. Moreover, a criterion based on the classical mean square (428 µg g⁻¹) fails to identify one of the suspect samples. The use of the robust estimate also helps to maintain independence between the sources of variation. At an anomalously high (polluted) site, the sampling and analytical variances are almost certain to be considerably higher than at background sites. The use of the robust estimate removes this type of dependence.

For $\hat{\sigma}_{site}$, the difference between the robust and the classical estimates is explained by the influence of two sites identified as anomalous by the criterion based on $\mathbf{MS}_{site}$, with mean concentrations of 829 and 558 µg g⁻¹ (Fig. 5). As the purpose of an extended survey would be to characterize the variation among the background sites and thereby identify sites that do not conform to the general picture, the use of the robust statistic is justified. The criterion based on classical estimation is higher and less likely to identify anomalous sites.

Observations similar to the above apply to the results for cadmium and copper.
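The screening thresholds quoted in this section can be checked with a short sketch. It rebuilds the mean squares from the standard deviations in Table 3 using the expected mean squares of Table 1, then evaluates the point at which each criterion ratio equals 2. The function name and dictionary keys are hypothetical. The factor of 2 inside the square root for analytical duplicates is an assumption of this sketch, adopted because it reproduces the 63 and 137 µg g⁻¹ thresholds quoted above.

```python
from math import sqrt


def screening_criteria(sd_an, sd_sam, sd_site, n_sites):
    """Thresholds at which each criterion ratio equals 2.

    Mean squares are rebuilt from the standard deviations through the
    expected mean squares of the duplicated nested design.
    """
    ms_an = sd_an ** 2
    ms_sam = ms_an + 2 * sd_sam ** 2
    ms_site = ms_sam + 4 * sd_site ** 2
    return {
        # |x_ij1 - x_ij2| between duplicate analyses of one sample
        "analysis": 2 * sqrt(2 * ms_an),
        # |mean of sample 1 - mean of sample 2| at one site
        "sample": 2 * sqrt(ms_sam),
        # |site mean - grand mean|
        "site": 2 * sqrt((n_sites - 1) * ms_site / (4 * n_sites)),
    }


# Zinc standard deviations from Table 3 (16 sites).
robust = screening_criteria(22.2, 107.3, 78.9, n_sites=16)
classical = screening_criteria(48.5, 147.5, 130.4, n_sites=16)

if __name__ == "__main__":
    for name in ("analysis", "sample", "site"):
        print(name, round(robust[name]), round(classical[name]))
```

For zinc the analysis and sample thresholds round to 63 and 307 µg g⁻¹ with the robust estimates and to 137 and 428 µg g⁻¹ with the classical ones, in agreement with the values given in the text.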
Suitability of Sampling and Analytical Protocols

The robust variance estimates are presented in Table 4, together with the ratios $\hat{\sigma}_{an}^2/\hat{\sigma}_{sam}^2$ and $\hat{\sigma}_{tech}^2/\hat{\sigma}_{site}^2$, calculated for single samples at each site and single determinations on each sample (n = m = 1). Only the statistics for lead fulfil the previously derived criteria, i.e., the two ratios below the 'ideal' value of 0.3.

Table 4 Variances and variance ratios obtained by robust ANOVA (basic unit: µg g⁻¹)

Analyte  $\hat{\sigma}_{an}^2$  $\hat{\sigma}_{sam}^2$  $\hat{\sigma}_{site}^2$  $\hat{\sigma}_{an}^2/\hat{\sigma}_{sam}^2$  $\hat{\sigma}_{tech}^2/\hat{\sigma}_{site}^2$
Cd       0.202                  0.044                   0.548                    4.6                                         0.45
Cu       10.9                   110                     204                      0.1                                         0.59
Pb       350                    7396                    28651                    0.05                                        0.27
Zn       493                    11513                   6225                     0.04                                        1.93

The results for cadmium show that the analysis is unduly variable compared with sampling. This is because the analytical method was being used at a concentration of cadmium not greatly above its detection limit. The only means by which the data quality could be significantly improved would be by using a different analytical method. Analytical replication (n > 1) would not help here unless an unacceptable number of replicates were performed.

For copper and zinc, the analytical precision is satisfactory, but the sampling precision falls short of requirements. For copper, the situation could be alleviated by duplicate sampling (m = 2, n = 1) or by the equivalent method of preparing the aggregate sample by combining twice the number of increments given in the original protocol. This strategy would make $\hat{\sigma}_{an}^2/n\hat{\sigma}_{sam}^2 = 0.2$ and $\hat{\sigma}_{tech}^2/\hat{\sigma}_{site}^2 = 0.3$. For zinc, the situation could be remedied only by greatly increasing the number of increments that are aggregated, or perhaps by redefining the nature or size of the sampling targets.

Conclusions

The results of this study show that an orientation survey consisting of a hierarchical replicated experiment followed by robust ANOVA provided valuable information on the magnitude of sampling and analytical errors and also on the concentrations of the analytes.
Knowledge of the robust mean squares allowed unusual measurements to be identified for checking or further investigation more certainly than did their classical counterparts. Moreover, the results provided a direct means of establishing whether the sampling and analytical protocols were satisfactory for the application. In instances where sampling precision was found to be unsatisfactory, a remedial strategy was immediately apparent. It is also noteworthy that the sampling variability (in terms of relative standard deviation) was greater for zinc than for the other analytes. This serves to emphasize that the sampling protocol may need to be validated separately for each analyte.

References

1 Meisch, A. T., in Computers in the Mineral Industry. Part 1, ed. Parks, G. A., Stanford University Publications in Geological Science, 1964, vol. 9, pp. 156-170.
2 Garrett, R. G., in Statistics and Data Analysis in Geochemical Prospecting, ed. Howarth, R. J., Elsevier, Amsterdam, 1983, pp. 83-107.
3 Ramsey, M. H., Thompson, M., and Hale, M., J. Geochem. Explor., 1992, 44, 23.
4 Rose, A. W., Hawkes, H. E., and Webb, J. S., Geochemistry in Mineral Exploration, Elsevier, Amsterdam, 1979, pp. 34-35, 32@-329.
5 Analytical Methods Committee, Analyst, 1989, 114, 1699.
6 Huber, P., Robust Statistics, Wiley, New York, 1981.
7 Rousseeuw, P. J., J. Chemometr., 1991, 5, 1.
8 Problems Arising from the Redevelopment of Gas Works and Similar Sites, Department of the Environment, London, 2nd edn., 1986.

Paper 3/01533E
Received March 17, 1993
Accepted April 21, 1993

 



