Multivariate analysis of a round-robin study on the measurement of chlorobiphenyls in fish oil

 

Authors: Raj K. Misra, John F. Uthe, Charles J. Musial

 

Journal: Analyst (RSC, available online 1992)
Volume/issue: Volume 117, issue 7

Pages: 1085-1091

 

ISSN:0003-2654

 

Year: 1992

 

DOI:10.1039/AN9921701085

 

Publisher: RSC

 

Data source: RSC

 

Abstract:

ANALYST, JULY 1992, VOL. 117, 1085

Multivariate Analysis of a Round-robin Study on the Measurement of Chlorobiphenyls in Fish Oil

Raj K. Misra and John F. Uthe
Marine Chemistry Division, Department of Fisheries and Oceans, Halifax Fisheries Research Laboratory, P.O. Box 550, Halifax, Nova Scotia, Canada B3J 2S7

Charles J. Musial
C. Musial Consulting Chemist Ltd., Halifax, Nova Scotia, Canada B3L 4J2

Participants identified and measured chlorobiphenyls (CBs) in fish oil both before and after spiking with undisclosed amounts of four CBs, following two clean-up methods: a common one and their own. Complete data were received from 18 of 30 participants, of which only 2 correctly identified IUPAC No. 86. All correctly identified the other three CBs (IUPAC Nos. 52, 101 and 153). The presence of correlations (10 of 12 coefficients of correlation were moderate to high) among concentrations of the three CBs precluded univariate analysis. Following deletion of one obvious outlier, paired difference and mean data from 17 participants were analysed by a multivariate procedure. Four data pairs were compared. No effect of clean-up on precision was found. Spike recoveries differed significantly from expected values, and the patterns of differences varied with the clean-up method employed. Multivariate analysis of paired differences and means was used to rank laboratory performance with respect to laboratory precision and bias, and to identify possible outliers. Laboratory bias (between laboratory) exceeded random (within laboratory) variations in all data sets. None of the rankings based on paired differences (random error) was correlated significantly with the equivalent rankings based on paired means (laboratory bias). However, rankings for data set 1 were correlated with rankings for data set 2, as were those of data set 3 with those of data set 4, for paired difference data and for paired mean data.
The multivariate method allows comparisons and laboratory ranking based on simultaneous analysis of correlated multiple determinands.

Keywords: Chlorobiphenyls; interlaboratory round robin; multivariate analysis; laboratory performance ranking; types of error

Earlier, we [1] described a collaborative study in which participants had identified and measured chlorobiphenyls (CBs) in a spiked and unspiked fish oil. Recoveries ranged from 24 to 294% for spikes of 63-85 ng g⁻¹ of oil. Investigation of the results (see below) showed that univariate analysis was not appropriate owing to the presence of significant correlations between CB concentrations. Univariate analysis of paired differences and paired means has been extensively considered [2, 3]. However, a multivariate approach, which analyses a number of variables ($p$; $p \geq 2$) jointly, is needed [6-9] when correlations exist among these variables, as a series of $p$ univariate analyses will ignore these correlations and will, therefore, be inappropriate. Given the nature of CB analysis, i.e., each set of CB measurements uses a single chromatogram, it is probably not surprising to find such correlations.

Experimental

Experimental details have been reported previously [1]. In summary, herring (Clupea harengus harengus) oil was spiked as follows:

IUPAC No. 52, 2,2',5,5'-tetrachlorobiphenyl = 82 ng g⁻¹ of oil;
IUPAC No. 86, 2,2',3,4,5-pentachlorobiphenyl = 77 ng g⁻¹ of oil;
IUPAC No. 153, 2,2',4,4',5,5'-hexachlorobiphenyl = 85 ng g⁻¹ of oil; and
IUPAC No. 101, 2,2',4,5,5'-pentachlorobiphenyl = 63 ng g⁻¹ of oil.

Participants also received sufficient amounts of these CBs to prepare standards. Participants were asked to identify these CBs and, inter alia, determine their concentrations in both oils using their own and the common clean-up [10] method. Seventeen (Table 1) reported complete data (one other complete set was dropped as an obvious outlier). Only 2 correctly identified IUPAC No. 86; thus its data have been dropped.
Partial results [1] were also dropped from the present study because multivariate analysis is best carried out with complete data sets [4, 11].

Statistical Considerations

The statistical methodology [2-4, 12-14] is presented in the Appendix. SYSTAT [15] was employed for various computations. All tests were carried out at the 5% probability level ($P$) unless stated otherwise. Results for the three-way (laboratory × oil × method) multivariate analysis of variance (MANOVA) showed the need for separate analysis of the differences in the methods for the separate oils and vice versa, as explained in the Appendix.

Four paired data sets were compared:

set 1, unspiked oil, common method versus laboratory clean-up method;
set 2, spiked oil, common method versus laboratory clean-up method;
set 3, common clean-up method, spiked versus unspiked oils; and
set 4, laboratory clean-up method, spiked versus unspiked oils.

Sets 1 and 2 isolate the effect of the clean-up method within each oil; sets 3 and 4 isolate the effect of spiking, i.e., recovery, within each clean-up method. Although these paired comparisons (contrasts) are not independent, Harris [8] states, 'It is convenient for descriptive purposes to choose independent contrasts, but this should never get in the way of testing those contrasts among the means which are of greatest theoretical import to the researcher'. A laboratory's two determinations are designated $X_{i1}$ and $X_{i2}$, their difference ($X_{i1} - X_{i2}$) as $D_i$, their sum ($X_{i1} + X_{i2}$) as $S_i$, and their average as $\bar{X}_i$. The importance of analysing such paired data has been emphasized [2, 3], both sources noting that the dominant factor accounting for variations in the sum (or average) values is systematic error (laboratory bias), whereas difference ($D_i$) values reflect only random variations [3] (laboratory precision). It is important to compare both precision and bias among laboratories.

Results

Analysis of Paired Differences (Intralaboratory Precision)

Means and ranges of differences are shown in Table 2. Each set was analysed the same way and only the results for data set 1 are tabulated. Pearson coefficients of correlation, $r_{uv}$ (where $u, v = 1, 2, 3$), were computed for each data set. Most of these, i.e., 10 of 12, ranged from 0.2 to 0.9 in absolute value. Although large sample sizes (17 is hardly large) are desirable for tests of significance of correlations, 6 of these were significant with $P$ ranging from <0.001 to <0.04, and 4 were significant by the Bonferroni test ($P$ from <0.001 to <0.05). Bartlett's $\chi^2$ (chi-square) test, which tests the global hypothesis concerning the significance of all of the correlations, was significant ($P$ <0.001) in data sets 2 and 3 and nearly so ($P$ = 0.06) in data set 4. These observations lead to the following considerations [4, 7-9].

(1) When the number ($p$) of variables is two or more, the covariance structure of the population is described by $p$ variances and $\tfrac{1}{2}p(p - 1)$ covariances (or, equivalently, correlations). Univariate analyses, carried out on each of the $p$ variables separately, will take account of $p$ variances, but will ignore the correlations. Therefore, in the present study, a multivariate procedure, which employs the entire covariance structure of $p + \tfrac{1}{2}p(p - 1)$ parameters, must be used.

(2) The correlation coefficient is a measure of the closeness of the linear relationship between two variables. A positive value of $r_{uv}$ shows that $D_u$ and $D_v$ increase together, i.e., if the difference value by the two methods is large for one CB, it is also likely to be large for the other. When $r_{uv}$ is negative, large values of $D_u$ are associated with small values of $D_v$. Thus, there also exists a 'third' type of error, arising from the covariation of the CBs, which induces additional disparity among the laboratories.

Table 1 Reported concentrations (µg kg⁻¹) of CBs in fish oil (columns are IUPAC sample numbers)

                  Common clean-up method     Laboratory clean-up method
Laboratory No.*     52     153     101          52     153     101
Unspiked oil
   1               34.5    51.9    47.3         80.7    83.0   106.0
   2               49      76     124           66      94     137
   3               57      69      62           80      63      79
   5               41      59.7    48.9         29.5    68.4   121.1
   6               65.1    88.1    92.9         56.8   122      91.6
   7              104     107      92.4        204      92.4    69.7
  10              137     163     130          210     159     222
  12               41      47      47           38      53      62
  17               78.4    77.8    64.1         68.9    72.7    61.4
  19               43      85      67           41     121      54
  20               51.8    62.9    93.4         53.6    71.8    95.8
  21               53.9    66.9    53.6         78.4   152.0    81.2
  22               90.5    90.3   146.2         47.0    54.0    77.4
  23              183      73      95          134      85      97
  24              154     188     154          158     163     157
  25               30      46      43           37      42      48
  26               48      64      75           49      88      90
Spiked oil
   1               76.8   105.0    77.9        145.2   140.9   132.2
   2              116     159     155          158     186     200
   3              132     136     130          184     141     160
   5              113     127.9   104.6        118.9   129.8    97.2
   6              140     168     147          127     171     135
   7              187     204     184          292     141     236
  10              227     313     308          277     229     232
  12              110     118     108          108     125     105
  17              147.7   158.5   119.3        144.4   149.5   116.6
  19               75     127      95           98     157     102
  20              130.9   135.6   146.0        127.7   148.9   156.2
  21              144     176     134          127     182     124
  22              135.5   144     211.3        100.2   109.9   137.7
  23              246     141     146          210     156     151
  24              200     238     193          220     208     190
  25               55      95      78           66      93      87
  26              102     115     124          135     146     138

* Laboratory numbers as defined in ref. 1.

Set 1, unspiked oil, common versus laboratory clean-up method

Multivariate normality of difference variables was assessed (see Appendix) [4]. For observations generated from a three-variate normal distribution, about 95% of the $d_i^2$ values should be ≤7.815, which is the 95th percentile of a $\chi^2$ distribution with degrees of freedom (DF) = 3. Sixteen of our 17 values, i.e., 94%, were ≤7.815; thus evidence against three-variate normality is insufficient [4]. A $\chi^2$ plot [4] of ordered $d_i^2$ (abscissa) and the corresponding $\chi^2$ percentile (ordinate) yielded a coefficient of linear correlation of 0.969, giving no evidence against normality. The average difference between determinations is expected to be zero for each CB if no real difference results from the two clean-up methods. In order to test this, the null hypothesis ($H_0$) of no average difference was set up by specifying the null vector, $\underline{0}$, for $\underline{C}$ in $H_0: \underline{\mu}_D = \underline{C}$ (see Appendix).
(A vector is denoted by an underscored letter and a matrix by a letter in bold print. Prime denotes a transposed vector or matrix. An unprimed vector denotes a column vector.) The observed Hotelling's $T^2$ value [4] was 3.6757 ($P$ = 0.39) and, therefore, $H_0$ was not rejected, i.e., the clean-up method had no significant effect. Squared generalized distances, $d_i^2$ (Table 3), ranged from 0.0948 to 10.3372 and determined laboratory ranking with respect to precision, yielding an over-all judgement of each laboratory's 'nearness' to the expected value of zero. Extreme laboratories, i.e., laboratories for which the squared generalized distance (see Appendix) yielded values of $P$ ≤0.05, are shown in bold in the printed table (marked * in Table 3 here), but were not excluded because there is no reason to suspect that they are 'discordant' [5] or belong to a different population. Laboratories ranked: 20 (best precision), 2, 12, 25, 26, 3, 17, 6, 24, 19, 1, 23, 5, 22, 10, 21 and 7 (worst). Table 3 also gives probabilities (expressed in per cent.) at which their $\chi_3^2$ ($\chi^2$ with 3 DF) approximations are significant.

Table 2 Mean, minimum (Min) and maximum (Max) values of differences between treatments in recoveries of three CBs

             Data set 1                Data set 2                Data set 3                Data set 4
IUPAC No.    Mean     Min     Max      Mean     Min     Max      Mean     Min     Max      Mean     Min     Max
52          -10.04  -100.0    49.0    -17.68  -105.0    36.0     63.34    25.0    90.1     70.97    29.0   104.0
153          -9.92   -85.1    36.3      2.76   -35.9    84.0     73.26    42.0   150.0     60.57    30.0    92.0
101         -12.61   -92.0    68.8     -2.28   -54.3    76.0     60.31    28.0   178.0     49.98   -23.9   166.3

Table 3 Squared generalized distances and probabilities of associated chi-square values for paired difference, data set 1

Rank   Laboratory No.    $d_i^2$     Probability (%)
  1         20            0.0948        99.20
  2          2            0.1430        98.55
  3         12            0.2160        97.39
  4         25            0.2535        96.74
  5         26            0.3258        95.40
  6          3            0.4999        91.85
  7         17            0.5499        90.77
  8          6            1.2893        73.55
  9         24            1.6170        65.98
 10         19            1.9689        58.28
 11          1            2.0463        56.66
 12         23            2.6857        55.53
 13          5            4.8745        17.95
 14         22            6.4687         8.94
 15         10            6.9112         7.35
 16         21            7.7183         5.13
 17          7*          10.3372         1.59

Table 4 Laboratory means of CBs for four data sets (in the printed table, minima are underlined and maxima are in bold print; columns are IUPAC sample numbers)

                     Data set 1                  Data set 2
Laboratory No.     52      153     101         52      153     101
   1              57.60   67.45   76.65      111.00  122.95  105.05
   2              57.50   85.00  130.50      137.00  172.50  177.50
   3              68.50   66.00   70.50      158.00  138.50  145.00
   5              35.25   64.05   85.00      115.95  128.85  100.90
   6              60.95  105.05   92.25      133.50  169.50  141.00
   7             154.00   99.70   81.05      239.50  172.50  210.00
  10             173.50  161.00  176.00      252.00  271.00  270.00
  12              39.50   50.00   54.50      109.00  121.50  106.50
  17              73.65   75.25   62.75      146.05  154.00  117.95
  19              42.00  103.00   60.50       86.50  142.00   98.50
  20              52.70   67.35   94.60      129.30  142.25  151.10
  21              66.15  109.45   67.40      135.50  179.00  129.00
  22              68.75   72.15  111.80      117.85  126.95  174.50
  23             158.50   79.00   96.00      228.00  148.50  148.50
  24             156.00  175.50  155.50      210.00  223.00  191.50
  25              33.50   44.00   45.50       60.50   94.00   82.50
  26              48.50   76.00   82.50      118.50  130.50  131.00

                     Data set 3                  Data set 4
Laboratory No.     52      153     101         52      153     101
   1              55.65   78.45   62.60      112.95  111.95  119.10
   2              82.50  117.50  139.50      112.00  140.00  168.50
   3              94.50  102.50   96.00      132.00  102.00  119.50
   5              77.00   93.80   76.75       74.20   99.10  109.15
   6             102.55  128.05  119.95       91.90  146.50  113.30
   7             145.50  155.50  138.20      248.00  116.70  152.85
  10             182.00  238.00  219.00      243.50  194.00  227.00
  12              75.50   82.50   77.50       73.00   89.00   83.50
  17             113.05  118.15   91.70      106.65  111.10   89.00
  19              59.00  106.00   81.00       69.50  139.00   78.00
  20              91.35   99.25  119.70       90.65  110.35  126.00
  21              98.95  121.45   93.80      102.70  167.00  102.60
  22             113.00  117.15  178.75       73.60   81.95  107.55
  23             214.50  107.00  120.50      172.00  120.50  124.00
  24             177.00  213.00  173.50      189.00  185.50  173.50
  25              42.50   70.50   60.50       51.50   67.50   67.50
  26              75.00   89.50   99.50       92.00  117.00  114.00

Set 2, spiked oil, common versus laboratory clean-up method

These findings were similar to data set 1, i.e., there was no significant difference between the two clean-up methods ($T^2$ = 6.3363; $P$ = 0.18). The coefficient of linear correlation of the $\chi^2$ plot was 0.946. Fifteen of the 17 $d_i^2$ values (88%) were ≤7.815, the 95th percentile of $\chi_3^2$. The range of $d_i^2$ was 0.1852-10.9729. Laboratories ranked: 5 (best), 12, 25, 6, 20, 21, 17, 3, 24, 2, 19, 26, 23, 1, 22, 10 and 7 (worst).

Set 3, common clean-up method, spiked versus unspiked oils

If all recoveries of added CBs were 100%, expected values for the means of differences between the spiked and unspiked oils would be 82 for IUPAC No. 52, 85 for IUPAC No. 153 and 63 for IUPAC No. 101. To test recoveries ($H_0: \underline{\mu}_D = \underline{C}$), vector $\underline{C}'$ was specified as (82, 85, 63). The estimated $T^2$ was 23.8699 with $P$ = 0.004, showing significant average differences in spike recoveries and/or of their linear combinations [see Appendix, eqn. (8)]. As individual variables are of interest, simultaneous 95% confidence intervals (CIs) for the three individual mean differences were estimated by the $T^2$ and also by the Bonferroni procedures. None of the three CIs included zero (in either procedure), showing, thereby, that all three CBs significantly contributed to poor recovery. The coefficient of linear correlation of the $\chi^2$ plot was 0.952. Sixteen of the 17 $d_i^2$ values (94%) were ≤7.815. The range of $d_i^2$ values was 0.1185-13.0115. Ranks of laboratories are: 23 (best), 12, 17, 24, 6, 7, 5, 1, 26, 20, 19, 22, 21, 3, 25, 2 and 10 (worst).

Set 4, laboratory clean-up method, spiked versus unspiked oils

Findings were similar to data set 3. $H_0: \underline{\mu}_D = \underline{C}$ was rejected ($T^2$ = 42.7370, yielding $P$ <0.001), showing, again, poor spike recoveries. Simultaneous 95% CIs (both $T^2$ and Bonferroni procedures) showed that the difference variable for each CB contributed significantly to the rejection of $H_0$.
The coefficient of linear correlation of the $\chi^2$ plot was 0.969. Sixteen of the 17 $d_i^2$ values (94%) were ≤7.815. The range of $d_i^2$ values was 0.4329-10.4723. Laboratories ranked: 1 (best), 23, 12, 6, 24, 17, 22, 26, 20, 10, 19, 3, 21, 2, 25, 5 and 7 (worst).

Comparison of spike recoveries by the two clean-up methods

Following the observation of no average difference between the two clean-up methods for either oil, differences in recoveries between the two methods for the two oils were compared. The difference between the two treatments is a specific case of the linear comparison among treatments [13]. Instead of eqn. (2) (in Appendix) for laboratory $i$, variable $D_i$ is given by

$$D_i = (e_{i1} - e_{i2})_{\text{unspiked}} - (e_{i1} - e_{i2})_{\text{spiked}}$$

A laboratory's systematic error drops out and $D_i$ should contain only random errors. The null hypothesis $H_0: \underline{\mu}_D = \underline{0}$ of no average difference between the two differences was rejected ($T^2$ = 31.9428, yielding $P$ = 0.001), showing that the patterns of variation for the two oils were dissimilar. No individual CB contributed significantly to the average difference, as judged by their 95% CIs (by Bonferroni and $T^2$ procedures). The coefficient of linear correlation of the $\chi^2$ plot was 0.953. Fifteen of 17 $d_i^2$ values (88%) were ≤7.815. The range of $d_i^2$ values was 0.2293-10.8385. Laboratories ranked: 24 (best), 23, 17, 6, 1, 22, 2, 3, 25, 19, 12, 26, 5, 20, 21, 7 and 10 (worst).

Analysis of Paired Means

As is usual in these studies [2, 3], widely divergent laboratory means were found for all four data sets (Table 4). Laboratory bias effects are apparent as all minima are associated with one laboratory and 9 of 12 maxima with another. Consequently, more laboratories are expected to be identified as extreme in paired mean data than in paired difference data.

Set 1, unspiked oil, common versus laboratory clean-up methods

The null hypothesis of equality of laboratory means was rejected ($P$ <0.001) by the two-way MANOVA of model (11). Squared generalized distances $T^2$ [eqn. (13), Appendix] of laboratory means ranged from 2.5500 to 54.2206 (Table 5). Laboratories ranked (Table 5): 6 (best, i.e., least deviant from the over-all mean vector), 17, 3, 1, 26, 22, 21, 20, 19, 5, 2, 12, 25, 23, 7, 10 and 24 (worst).

Table 5 Squared generalized distances by Hotelling's $T^2$ and probability levels of significance for paired mean data set 1

Rank   Laboratory No.     $T^2$      Probability (%)
  1          6            2.5500        54.62
  2         17            2.6300        53.38
  3          3            3.0196        52.33
  4          1            3.4576        41.95
  5         26            3.6622        39.52
  6         22            5.4149        23.82
  7         21            5.7241        21.83
  8         20            6.0008        20.20
  9         19            7.2962        14.16
 10          5            9.9097         7.22
 11          2           10.8434         5.75
 12         12           11.7925         4.60
 13         25           15.9671         1.85
 14         23           20.8363         0.73
 15          7           23.7602         0.44
 16         10           52.8010         0.01
 17         24           54.2206         0.01

Set 2, spiked oil, common versus laboratory clean-up methods

The $H_0$ of equal laboratory effects was rejected ($P$ <0.001). The $T^2$ values ranged from 1.3308 to 114.9904. Laboratory rankings were: 3 (best), 6, 5, 17, 2, 20, 1, 21, 26, 12, 19, 7, 23, 22, 24, 25 and 10 (worst).

Set 3, common clean-up method, spiked versus unspiked oils

The $H_0$ of equal laboratory effects was rejected ($P$ <0.001). The $T^2$ values ranged from 1.7390 to 477.4613. Laboratories ranked: 3 (best), 6, 17, 12, 21, 5, 26, 20, 7, 2, 1, 25, 19, 22, 24, 10 and 23 (worst).

Set 4, laboratory clean-up method, spiked versus unspiked oils

The $H_0$ of equal laboratory effects was rejected ($P$ <0.001). The $T^2$ values ranged from 2.0029 to 337.8852. Laboratories ranked: 1 (best), 17, 26, 20, 3, 2, 5, 12, 22, 6, 23, 21, 25, 19, 24, 10 and 7 (worst).

Investigation of Ranks for Correlations

Spearman's coefficient of rank correlation [14] analysis of ranks based on paired differences and those based on paired means showed:

(1) For each data set, ranks based on paired differences were not significantly correlated with those based on paired means ($P$ values = 0.55, 0.068, 0.605 and 0.210 for data sets 1, 2, 3 and 4, respectively).
(2) Ranks based on paired differences for data set 1 were significantly correlated with those of data set 2 ($P$ = 0.039), as were ranks for data set 3 with those of data set 4 ($P$ = 0.016).

(3) Ranks based on paired means for data set 1 were significantly correlated with those based on data set 2 ($P$ <0.001), as were ranks for data set 3 with those of data set 4 ($P$ = 0.048).

Discussion

Both Youden and Steiner [2] and Wernimont [3] have noted that analytical results generally vary far more between laboratories (laboratory bias) than within laboratories (precision); our data are no exception. Considerably larger variabilities between laboratories, relative to those within laboratory, were shown in the two-way MANOVAs of all four data sets. More outliers and larger individual deviations from the over-all means of all laboratories were found in the analyses of paired means than in those of paired differences. The need for chemists to develop and apply appropriate techniques for controlling both precision and bias is obvious.

In this study, CBs were generally mutually correlated, i.e., 10 of 12 coefficients of correlation between paired differences ranged from 0.2 to 0.9. Given that each measurement of a CB is not fully independent from the others (all CBs are determined from a chromatogram resulting from a single injection), it is not surprising that error affects all CBs similarly. Unlike the univariate procedure, the multivariate procedure analysed the entire suite of response variables simultaneously and did not ignore these correlations.

The multivariate procedure allows laboratory performance to be judged on the basis of an entire suite of determinands simultaneously. Judgements based on a series of univariate analyses on individual determinands can lead to conflicting results, which are difficult to interpret (because these univariate analyses do not account for correlations among determinands), and might unfairly penalize a laboratory in specific instances.
Undoubtedly, there are other data sets that would benefit from a multivariate analysis. A variety of newer analytical methods, e.g., inductively coupled plasma mass spectrometry for elements and gas-liquid chromatography-mass spectrometry for families of organics, suggest that future intercalibrations will require simultaneous statistical analysis of suites of determinands in order to rate laboratory performance.

The difference between a laboratory's two determinations in paired data should only contain that laboratory's random error (precision). However, analysis of data on paired differences showed that their variations could not be explained by random variation alone. Thus, other explanations are needed beyond random error. Analysis of paired difference data sets showed that additional causes are not likely to be associated with the two clean-up methods for the following reasons:

(i) mean differences between the two methods were not significant (data sets 1 and 2);
(ii) within the same clean-up method, the means of the spiked (less the added amount of each CB) and unspiked oils differed significantly (data sets 3 and 4);
(iii) the differences in measurements by the two clean-up methods in the spiked oil were not similar to those in the unspiked oil; and
(iv) the results of the ranking procedure were not consistent over the four data sets.

This should be considered in the design of future intercomparative studies.

Laboratory rankings based on paired differences were uncorrelated with rankings based on paired means, reflecting the separation of error into random (paired difference) and bias (paired mean) [2, 3] and reinforcing the necessity for analysts to focus quality control on both precision and accuracy.
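The rank comparisons behind these statements can be sketched with the rankings listed in the Results. The snippet below is only an illustration: it uses the classical no-ties formula for Spearman's coefficient, the variable names are hypothetical, and it reports the coefficients only (the $P$ values quoted in the paper are not recomputed).

```python
# Rankings copied from the Results section, best to worst (laboratory numbers).
pd_set1 = [20, 2, 12, 25, 26, 3, 17, 6, 24, 19, 1, 23, 5, 22, 10, 21, 7]  # paired differences, set 1
pd_set2 = [5, 12, 25, 6, 20, 21, 17, 3, 24, 2, 19, 26, 23, 1, 22, 10, 7]  # paired differences, set 2
pm_set1 = [6, 17, 3, 1, 26, 22, 21, 20, 19, 5, 2, 12, 25, 23, 7, 10, 24]  # paired means, set 1


def spearman_rho(order_a, order_b):
    """Spearman's rank correlation for two orderings of the same laboratories (no ties)."""
    rank_a = {lab: r for r, lab in enumerate(order_a, start=1)}
    rank_b = {lab: r for r, lab in enumerate(order_b, start=1)}
    n = len(rank_a)
    sum_d2 = sum((rank_a[lab] - rank_b[lab]) ** 2 for lab in rank_a)
    return 1.0 - 6.0 * sum_d2 / (n * (n * n - 1))


rho_precision = spearman_rho(pd_set1, pd_set2)   # precision ranks, set 1 vs set 2
rho_prec_bias = spearman_rho(pd_set1, pm_set1)   # precision vs bias ranks, set 1

print(f"paired differences, set 1 vs set 2: rho = {rho_precision:.3f}")
print(f"set 1, paired differences vs paired means: rho = {rho_prec_bias:.3f}")
```

The first coefficient is noticeably larger than the second, which is in line with the pattern reported in the text.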
However, rankings for data set 1 were correlated with rankings for data set 2, as were those of data set 3 with those of data set 4, for paired difference data and for paired mean data, reflecting an over-all consistency in the relative magnitudes of a laboratory's random error and bias.

The multivariate approach allows for two integrated rankings, one for precision and one for bias, to be assigned to a laboratory based on its performance for multiple determinands; this is a better approach, in our opinion, than a series of rankings based on analysis of the individual determinands. Identification as a possible outlier serves to identify a laboratory's performance as being significantly different from that of the other participants. Laboratories so identified might still be performing adequately with respect to other criteria, but should be aware of a need for improved performance.

Appendix

Three-way MANOVA

For value $X_{ijK}$ of a determinand reported by the $i$th laboratory in the $j$th oil by the $K$th method, the three-way MANOVA model will be

$$X_{ijK} = \mu + A_i + B_j + C_K + (AB)_{ij} + (AC)_{iK} + (BC)_{jK} + (ABC)_{ijK}$$

where $\mu$ is the over-all mean; $A_i$, $B_j$ and $C_K$ are the treatment effects for the $i$th, $j$th and $K$th groups of treatments, $A$, $B$ and $C$, respectively; $(AB)_{ij}$, $(AC)_{iK}$ and $(BC)_{jK}$ are first-order interaction effects; and $(ABC)_{ijK}$ is the second-order interaction effect. Without replication, there is no way of testing interaction $ABC$. Three-way MANOVA was carried out by treating the main effect of laboratory as 'random' and the main effects of oil and method as 'fixed'. All three first-order interactions were significant, with $P$ in the range <0.001 to <0.05. These considerations led to the use of a simpler (than three-way) layout for data analysis.
Paired Differences

At the univariate level, in the general case where $q$ determinations are made by laboratory $i$ of a determinand ($X$), the model for $X_{ij}$, the $j$th determination in laboratory $i$, is written as [15]

$$X_{ij} = \mu_i + e_{ij}, \quad j = 1, \ldots, q$$

It is assumed that the measurements $X_{ij}$ are $N(\mu_i, \sigma^2)$, i.e., normally distributed about the laboratory mean $\mu_i$, which can vary from one laboratory to another, and that the variance ($\sigma^2$) of $X_{ij}$ is the same in all laboratories. Participating laboratories are considered to be a sample of all laboratories that will use the method [3]. This model can be written as

$$X_{ij} = \mu + a_i + e_{ij} \qquad (1)$$

where $\mu$ is the over-all mean, $a_i$ is the laboratory deviation (bias) and the random elements $e_{ij}$ are $N(0, \sigma^2)$. When $q = 2$, for duplicate determinations made by $n$ laboratories, the difference ($D_i$) between determinations from laboratory $i$ is given by

$$D_i = X_{i1} - X_{i2} = e_{i1} - e_{i2} \qquad (2)$$

A laboratory's systematic error (bias) drops out of $D_i$ and the sample of $n$ measurements ($D_i$) on the difference variable ($\delta$) should contain only random errors of both determinations [2]. However, with Youden replicates, i.e., single determinations on two similar materials, values of $D_i$ are not necessarily trivial and merit investigation [1]. We consider, for example, data set 1, where the replicates are measurements from two treatments, which are the common clean-up method ($M_1$) and the laboratory's own clean-up method ($M_2$) for the unspiked oil. The $D_i$ for laboratory $i$ will be given by

$$D_i = M_1 - M_2 + e_{i1} - e_{i2} \qquad (3)$$

instead of the value shown in eqn. (2). As the number of treatments is two, data can be analysed as 'paired' data. From eqn. (3) we note the following: in addition to random errors of determinations, $D_i$ values reflect only the differential effects of treatments. The difference ($D_i$) between two members of a pair (within laboratory $i$) is an estimate of the average difference ($\mu_D$) in the effects of two treatments.
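The cancellation of laboratory bias in the paired difference can be illustrated with a small simulation of model (1). This is a sketch only: the numerical settings (`MU`, `BIAS_SD`, `ERROR_SD`) and the random seed are hypothetical choices made for illustration and are not taken from the study.

```python
import random
import statistics

# Hypothetical settings, chosen only for illustration (not from the study).
MU = 100.0        # over-all mean
BIAS_SD = 30.0    # spread of the laboratory biases a_i
ERROR_SD = 5.0    # spread of the random errors e_ij
N_LABS = 17

rng = random.Random(0)
bias = [rng.gauss(0.0, BIAS_SD) for _ in range(N_LABS)]
errors = [(rng.gauss(0.0, ERROR_SD), rng.gauss(0.0, ERROR_SD)) for _ in range(N_LABS)]

# Model (1): X_ij = mu + a_i + e_ij, with two determinations per laboratory.
x = [(MU + a + e1, MU + a + e2) for a, (e1, e2) in zip(bias, errors)]

diffs = [x1 - x2 for x1, x2 in x]   # D_i: the bias a_i cancels
sums = [x1 + x2 for x1, x2 in x]    # S_i: carries 2 * a_i

# The differences equal e_i1 - e_i2 whatever the bias happens to be.
error_only = [e1 - e2 for e1, e2 in errors]
max_gap = max(abs(d - e) for d, e in zip(diffs, error_only))

print("largest |D_i - (e_i1 - e_i2)|:", max_gap)
print("SD of differences:", statistics.stdev(diffs))
print("SD of sums       :", statistics.stdev(sums))
```

Under these assumed settings the spread of the sums is dominated by the laboratory bias, while the differences reflect the random errors only.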
To determine if a specified value ($C$, which is often specified as zero) is a plausible value for the mean difference between treatments, the null hypothesis, $H_0: \mu_D = C$, is tested against the alternative $H_1: \mu_D \neq C$. The appropriate test statistic is

$$t = \frac{\bar{D} - C}{s_D/\sqrt{n}} \qquad (4)$$

where $\bar{D}$ and $s_D^2$ are estimates of the population mean and variance, respectively, of the differences. It has a Student's $t$-distribution with $n - 1$ degrees of freedom (DF). Rejecting $H_0$ when $|t|$ is large is equivalent to rejecting it when $t^2$ is large. The variable $t^2$ is the squared distance from the sample mean $\bar{D}$ to the test value $C$ [4]. $H_0$ is rejected in favour of $H_1$ at a significance level ($\alpha$) if $t^2 > t_{n-1}^2(\alpha/2)$, where $t_{n-1}(\alpha/2)$ denotes the upper $100(\alpha/2)$th percentile of the $t$-distribution with $n - 1$ DF [4]. We note that the variance ratio $F$ with 1 and $n - 1$ DF has the same frequency distribution as $t^2$ with $n - 1$ DF. Equivalently, the $F$-test can be used, instead of $t^2$, to test $H_0$.

Deviations, $D_i - \mu_D$, are assumed to be independent and normally distributed with a population mean of zero. These deviations reflect random variations of measurements, free from systematic laboratory effects (laboratory bias), and the test on the mean difference remains fairly precise [3]. The error in the paired design is a measure of the failure of the pair differences to be identical with the mean difference. These deviations can be analysed for purposes such as ranking of laboratories and identification of laboratories that deviate markedly from the other laboratories. Barnett and Lewis [5] point out: (a) a simple visual inspection of a data set is unlikely to identify an outlier; and (b) an outlier is revealed only when the parameters of the postulated probability model are fitted to the data set and the deviations of the observed responses from the fitted values are analysed in terms of the variational properties of a random sample generated by the model.
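As a worked illustration of the univariate test, the sketch below applies the paired $t$ statistic to a single determinand, CB 52 in the unspiked oil (Table 1, data set 1). The function name `paired_t` is hypothetical, and the critical value is the tabulated two-sided 5% point of Student's $t$ with 16 DF rather than a computed quantile.

```python
import math
import statistics

# CB 52 in unspiked oil (Table 1), laboratories 1, 2, 3, 5, 6, 7, 10, 12, 17, 19-26.
common = [34.5, 49, 57, 41, 65.1, 104, 137, 41, 78.4, 43, 51.8, 53.9, 90.5, 183, 154, 30, 48]
own = [80.7, 66, 80, 29.5, 56.8, 204, 210, 38, 68.9, 41, 53.6, 78.4, 47.0, 134, 158, 37, 49]


def paired_t(first, second, c=0.0):
    """Return (mean difference, t statistic) for H0: mu_D = c."""
    d = [a - b for a, b in zip(first, second)]
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)          # sample SD, divisor n - 1
    return d_bar, (d_bar - c) / (s_d / math.sqrt(n))


d_bar, t = paired_t(common, own)
T_CRIT = 2.120   # tabulated upper 2.5% point of Student's t with 16 DF

print(f"mean difference = {d_bar:.2f}")          # Table 2 reports -10.04
print(f"t = {t:.3f}, t^2 = {t * t:.3f}")
print("reject H0" if abs(t) > T_CRIT else "do not reject H0")
```

A test of this kind looks at one CB at a time and therefore ignores the correlations among CBs, which is the limitation that motivates the multivariate treatment.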
The above presentation is concerned with univariate analysis, i.e., where a data set consists of only one response variable $X$, e.g., recovery of one CB. Data reported in this study comprise 3 (or $p$) responses ($X_K$), where $K$ = 1 for CB IUPAC No. 52, 2 for CB IUPAC No. 153 and 3 for CB IUPAC No. 101. It is also noted that measurements of the three CBs resulted from a single analysis of the oil in each case and are, generally, mutually correlated. These will generate $p$ paired difference random variables ($\delta_K$). Denoting the observation in laboratory $i$ for replicate $j$ on response variable $K$ by $X_{ijK}$, measurements on these difference variables become

$$D_{iK} = X_{i1K} - X_{i2K}, \quad K = 1, \ldots, p$$

We denote the Pearson coefficient of correlation between two difference variables, $\delta_u$ and $\delta_v$, by $r_{uv}$; $u, v = 1, \ldots, p$. When two or more of these $p$ variables are mutually correlated, multivariate analysis, which analyses $p$ variables jointly, should be employed rather than $p$ separate univariate analyses [4, 6-9].

The following is noted regarding the significance of the correlations [7, 15]. The joint distribution of the sample correlations $r_{uv}$; $u, v = 1, \ldots, p$, required for simultaneous inferences about their parameters, is not available in a closed and practical form. Multivariate normality of the distribution is realized only for large samples. SYSTAT [15] will test the null hypothesis that the population correlation matrix is an identity matrix, i.e., the global hypothesis concerning the significance of all correlations in the matrix, by the Bartlett $\chi^2$ test and give the matrix of probabilities associated with individual correlation coefficients. However, these probabilities do not reflect the number of correlations that are under consideration here.
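The correlation matrix of the difference variables and a global test of the kind described can be sketched as follows for data set 1 (Table 1, unspiked oil). The statistic used is the usual Bartlett sphericity form, $-[(n-1) - (2p+5)/6]\ln|\mathbf{R}|$ with $p(p-1)/2$ DF; that SYSTAT used exactly this form is an assumption, and the critical value is the tabulated 7.815 quoted in the text.

```python
import numpy as np

# Table 1, unspiked oil; columns are CB 52, CB 153, CB 101; rows are
# laboratories 1, 2, 3, 5, 6, 7, 10, 12, 17, 19, 20, 21, 22, 23, 24, 25, 26.
common = np.array([
    [34.5, 51.9, 47.3], [49, 76, 124], [57, 69, 62], [41, 59.7, 48.9],
    [65.1, 88.1, 92.9], [104, 107, 92.4], [137, 163, 130], [41, 47, 47],
    [78.4, 77.8, 64.1], [43, 85, 67], [51.8, 62.9, 93.4], [53.9, 66.9, 53.6],
    [90.5, 90.3, 146.2], [183, 73, 95], [154, 188, 154], [30, 46, 43], [48, 64, 75],
])
own = np.array([
    [80.7, 83.0, 106.0], [66, 94, 137], [80, 63, 79], [29.5, 68.4, 121.1],
    [56.8, 122, 91.6], [204, 92.4, 69.7], [210, 159, 222], [38, 53, 62],
    [68.9, 72.7, 61.4], [41, 121, 54], [53.6, 71.8, 95.8], [78.4, 152.0, 81.2],
    [47.0, 54.0, 77.4], [134, 85, 97], [158, 163, 157], [37, 42, 48], [49, 88, 90],
])

D = common - own                      # paired differences D_iK
n, p = D.shape
R = np.corrcoef(D, rowvar=False)      # Pearson correlations r_uv

# Bartlett's sphericity statistic (assumed form, see lead-in).
chi2 = -((n - 1) - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
df = p * (p - 1) // 2
CHI2_CRIT = 7.815                     # 95th percentile of chi-square with 3 DF

print(np.round(R, 3))
print(f"Bartlett chi-square = {chi2:.3f} with {df} DF")
print("significant at 5%" if chi2 > CHI2_CRIT else "not significant at 5%")
```

For this data set the global test does not reach significance, which agrees with the text, where significance is reported for data sets 2 and 3 and near-significance for data set 4.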
The Bonferroni criterion provides protection for multiple tests by identifying correlations as significant with the assurance that the family comparison error rate will not exceed the critical value chosen by the user. The scope of the procedure is restricted because the multiple tests and intervals for the parameters offered by it are conservative. Bonferroni-adjusted probabilities are also provided in SYSTAT [15].

A natural generalization of the univariate squared distance ($t^2$) of eqn. (4) is its multivariate analogue [4], Hotelling's $T^2$. In brief, the multivariate extension is as follows. The difference variable ($\delta$) for a CB is replaced by a vector ($\underline{\delta}$) of difference variables $\delta_1, \ldots, \delta_p$ for $p$ CBs. The multivariate analogue of the variance of one difference variable of the univariate procedure will be a $p \times p$ matrix $\mathbf{\Sigma}$ of variances $\sigma_{uv}$ ($u = v$) and covariances $\sigma_{uv}$ ($u \neq v$); $u, v = 1, \ldots, p$, of the $p$ difference variables. We denote a vector by an underscored letter and a matrix by a letter in bold print. Prime denotes a transposed vector or matrix. An unprimed vector denotes a column vector.

Let $\underline{D}_i$ be a random vector with $p$ components, $D_{i1}, \ldots, D_{ip}$, of measurements on difference variables. The $p$-variate normal distribution of $\underline{D}_i$ is $N_p(\underline{\mu}_D, \mathbf{\Sigma})$ with parameters $\underline{\mu}_D$ and $\mathbf{\Sigma}$ for the mean vector and the variance-covariance matrix, respectively. A sample of $n$ independent random vectors ($\underline{D}_i$, $i = 1, \ldots, n$) is analysed for inferences about the vector of mean differences based on the $T^2$-statistic. To test the null hypothesis, $H_0: \underline{\mu}_D = \underline{C}$ against $H_1: \underline{\mu}_D \neq \underline{C}$ (where $\underline{C}$ is a null vector $\underline{0}$ or a vector of specified values), $T^2$ is estimated as

$$T^2 = (\underline{\bar{D}} - \underline{C})'\,(\mathbf{S}/n)^{-1}\,(\underline{\bar{D}} - \underline{C}) \qquad (5)$$

where $\underline{\bar{D}}$ and $\mathbf{S}$ are the estimated (from the data) mean vector and variance-covariance matrix, respectively, of $\underline{D}_i$, $\mathbf{S}/n$ is the estimated variance-covariance matrix of the vector $\underline{\bar{D}}$ of mean differences and $(\mathbf{S}/n)^{-1}$ is its inverse. Special tables for $T^2$ percentage points are not required as we can test its significance by $F_{p,n-p}$, which is the random variable with an $F$-distribution with $p$ and $n - p$ DF, in the following manner: reject $H_0$ if the observed

$$T^2 > \frac{(n - 1)p}{n - p}\,F_{p,n-p}(\alpha)$$

where $F_{p,n-p}(\alpha)$ is the upper $(100\alpha)$th percentile of the $F$-distribution [4, 7].

A test of significance should preferably be put in the form of a confidence interval statement [3]. In the $T^2$ procedure: (i) a $100(1 - \alpha)$% confidence region for the mean of a $p$-variate normal distribution is the set of all values of $\underline{\delta}$ for which

$$n(\underline{\bar{D}} - \underline{\delta})'\,\mathbf{S}^{-1}\,(\underline{\bar{D}} - \underline{\delta}) \leq \frac{(n - 1)p}{n - p}\,F_{p,n-p}(\alpha) \qquad (6)$$

and (ii) $100(1 - \alpha)$% simultaneous confidence intervals (CIs) for the individual mean differences ($\delta_K$) are given by

$$\bar{D}_K \pm \left[\frac{(n - 1)p}{n - p}\,F_{p,n-p}(\alpha)\right]^{1/2} \left(\frac{s_{D_K}^2}{n}\right)^{1/2} \qquad (7)$$

where $\bar{D}_K$ is the $K$th element, i.e., mean of the $K$th difference variable, of $\underline{\bar{D}}$ [4, 7], and $s_{D_K}^2$ is the corresponding diagonal element of $\mathbf{S}$.

The following are noted about confidence interval statements [4, 7].

(1) It is possible that $H_0: \underline{\mu}_D = \underline{0}$ is rejected by the $T^2$ procedure, and yet each of the $p$ CIs of eqn. (7) includes zero (thereby showing that the mean difference is not significant for any individual CB). This can happen easily when variables are correlated. $T^2$ will reject $H_0$ if at least one of the several possible linear combinations,

$$\underline{w}'\underline{\delta} = w_1\delta_1 + \cdots + w_p\delta_p \qquad (8)$$

is significant. The CIs of eqn. (7) are only $p$ particular combinations, viz., those corresponding to choices where one $w_K = 1$ and every other $w_K$ is zero.
The Bonferroni approach will yield narrower (and thus, more precise) CIs than will the simultaneous $T^2$-intervals when the number of linear combinations is small. By the Bonferroni method, CIs with an over-all confidence level of at least $1 - \alpha$ for the $p$ components of individual mean differences $\delta_K$ are given by eqn. (7) following the replacement of the coefficient of $(S_{D_K}^2/n)^{1/2}$ in the equation by $t(\alpha/2p)$ with $n - 1$ DF.

(2) CIs of linear combinations estimated by $T^2$ are ideal for 'data snooping', i.e., we can choose values of $w_K$ of the linear combinations based upon examination of the data without changing the confidence coefficient, $1 - \alpha$.

If $n$ and $n - p$ are large, $T^2$ will behave approximately like a chi-square random variable, $\chi_p^2$, with $p$ DF, even if the underlying population of differences is not normally distributed.$^{4}$ When the population is normal, this approximation is acquired rapidly.$^{4}$ For all practical purposes, the distributions encountered in chemical analysis conform to the normal distribution.$^{3}$

The following are noted$^{4,5,12}$ pertaining to the checks for multivariate normality and for possible multivariate outliers.

(1) Unlike an outlier in a unidimensional data set, an outlying observation in a sample of multivariate data does not have a simple manifestation as an observation which 'sticks out at the end' of the sample, as the sample has no 'end'. Vector $\underline{D}_i$ for laboratory $i$ might be an outlier for various reasons, e.g., because of correlation distortion or a gross error in one of its components or systematic mild errors in all its components.

(2) The multivariate case is too complex to provide a unique, unambiguous form of total ordering for data and to express extremeness of observations.

(3) A reduced sub-ordering, i.e., less than total ordering, principle employs the distance measure to represent multivariate observations.
Squared generalized distances for individual laboratories from their mean vector are given$^{4}$ by

$$d_i^2 = (\underline{D}_i - \underline{\bar{D}})'\mathbf{S}^{-1}(\underline{D}_i - \underline{\bar{D}}) \qquad (9)$$

(4) For the multivariate normal model, this distance measure has broad statistical support and practical appeal in terms of constant probability density contours.

(5) Treating the generalized distances ($d_i^2$) as approximately distributed as $\chi^2$ is useful to check for multivariate normality and for possible suspect multivariate outliers. Johnson and Wichern$^{4}$ state, 'Although these distances are not independent or exactly chi-square distributed, it is helpful to plot them as if they were'. When estimated values of the $p$-vector of means and of the $p \times p$ variance-covariance matrix are used instead of their true values, this procedure 'retains a measure of informal propriety and appeal'.$^{5}$ Multivariate normality of difference variables is assessed as follows: (a) for observations generated from a three-variate normal distribution, we would expect about 95% of these $d_i^2$ values to be $\le 7.815$, which is the critical value of $\chi_3^2$ at the 5% probability level; and (b) a chi-square plot of the ordered distances (abscissa) and the corresponding chi-square percentiles (ordinate) would yield a straight line for a normal distribution.

(6) These notes indicate that the $\chi^2$ approximation of $T^2$ can be meaningfully employed for comparing laboratory performance. The chi-square procedure has the advantages of familiarity, simplicity and availability of tables of probabilities. A laboratory's ranking and distance will not change between the $T^2$ and $\chi^2$ procedures, only its outlying status. The $T^2$ procedure adds nothing of practical importance to the findings based on the $\chi^2$ procedure. Here, analysis for outliers was carried out to note a laboratory's extremity, not to discard it. This is because, although a laboratory might be extreme, there is no reason to consider it either as 'discordant'$^{5}$ or as belonging to a different population.
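As an illustration of the paired-difference procedure described in this section, the following Python sketch computes Hotelling's $T^2$ for an $n \times p$ array of paired differences, its $F$-based critical value, the simultaneous $T^2$ and Bonferroni confidence half-widths for the individual mean differences, and the squared generalized distance of each laboratory together with the $\chi^2$ cut-off. It is a minimal sketch and not the authors' SYSTAT workflow: NumPy and SciPy are assumed to be available, the function name `hotelling_paired` and its return keys are hypothetical, and the data are synthetic (17 simulated laboratories, 3 CBs) rather than the round-robin results.

```python
import numpy as np
from scipy import stats


def hotelling_paired(D, C=None, alpha=0.05):
    """Hotelling T^2 analysis of an (n, p) array of paired differences.

    Hypothetical helper for illustration; assumes p >= 2 and n > p.
    """
    D = np.asarray(D, dtype=float)
    n, p = D.shape
    C = np.zeros(p) if C is None else np.asarray(C, dtype=float)

    d_bar = D.mean(axis=0)                # vector of mean differences
    S = np.cov(D, rowvar=False)           # p x p covariance, divisor n - 1
    S_inv = np.linalg.inv(S)

    # T^2 = (d_bar - C)' (S/n)^{-1} (d_bar - C) = n (d_bar - C)' S^{-1} (d_bar - C)
    diff = d_bar - C
    T2 = n * diff @ S_inv @ diff

    # Reject H0 when T^2 > [(n - 1)p / (n - p)] * F_{p, n-p}(alpha)
    scale = (n - 1) * p / (n - p)
    T2_crit = scale * stats.f.ppf(1 - alpha, p, n - p)
    p_value = stats.f.sf(T2 / scale, p, n - p)

    # Half-widths of the CIs for each mean difference:
    # simultaneous T^2 intervals versus Bonferroni intervals
    se = np.sqrt(np.diag(S) / n)
    half_T2 = np.sqrt(T2_crit) * se
    half_bonferroni = stats.t.ppf(1 - alpha / (2 * p), n - 1) * se

    # Squared generalized distance of each laboratory from the mean vector
    R = D - d_bar
    d2 = np.einsum("ij,jk,ik->i", R, S_inv, R)
    chi2_crit = stats.chi2.ppf(1 - alpha, p)

    return {
        "mean": d_bar, "T2": T2, "T2_crit": T2_crit, "p_value": p_value,
        "half_T2": half_T2, "half_bonferroni": half_bonferroni,
        "d2": d2, "chi2_crit": chi2_crit,
    }


# Synthetic example only: 17 laboratories, 3 correlated difference variables
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.6, 0.5],
                [0.6, 1.0, 0.7],
                [0.5, 0.7, 1.0]])
D = rng.multivariate_normal(mean=np.zeros(3), cov=cov, size=17)

res = hotelling_paired(D)
ranking = np.argsort(-res["d2"])          # most extreme laboratory first
suspect = np.flatnonzero(res["d2"] > res["chi2_crit"])
print("T2 =", res["T2"], "critical value =", res["T2_crit"])
print("laboratories ranked by distance:", ranking)
print("laboratories beyond the chi-square cut-off:", suspect)
```

Sorting the laboratories by `d2` gives the ranking by extremity. Because the ranking depends only on the distances, it is unchanged by the choice of reference distribution; only the cut-off used to flag a laboratory as a possible outlier differs.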
Paired Means

With respect to the univariate analysis of a paired data set, Youden and Steiner$^{2}$ and Wernimont$^{3}$ note that the concept of blocks is very general and that a paired experimental design is a special case of a randomized complete blocks design with two treatments. Here, laboratories are considered as blocks. For the two criteria of classification, laboratories and treatments, the model for ANOVA without replication can be written as

$$X_{ij} = \mu + A_i + B_j + e_{ij}, \quad i = 1, \ldots, n \text{ and } j = 1, 2 \qquad (10)$$

where $\mu$ is the over-all mean, $A_i$ ($= \mu_i - \mu$) is the $i$th laboratory effect (bias), and $B_j$ is the $j$th treatment effect. The null hypothesis $H_0: A_1 = \ldots = A_n = 0$ (that the laboratory means are all equal, or, equivalently, the hypothesis of no laboratory effects) is tested by comparing the laboratory mean square (MSA) with the error (discrepancy) mean square (MSE). If the ratio MSA/MSE $\le F_{n-1,n-1}(\alpha)$, $H_0$ is accepted. Rejection of $H_0$ will lead to the conclusion that not all laboratory means are equal, showing the existence of systematic laboratory effects. Two-way ANOVA also provides for comparisons of treatment means. However, tests for these comparisons are also provided in the Paired Differences section by defining the null hypothesis in terms of $D_i$ values. In order to judge the over-all difference between processes, the procedure that uses differences between results within blocks is the equivalent of that which uses the interaction between blocks and treatments.$^{3}$

The need for multivariate analysis when $p$ variables are generally mutually correlated has been demonstrated.$^{4,9}$ In identifying these correlations, the MSE for model (10) provides an unbiased estimate of the error variance, $\sigma^2$, for each determination.$^{4}$ The variance of the differences, $D_i$ [eqn. (2)], is an estimate of $2\sigma^2$, as these differences include random errors of both determinations.$^{2}$ As these two variances differ only by a constant, i.e., 2, correlation matrices of $e$ values in the two-way multivariate classification of CBs will be the same as those of $D$ values.

For the $p = 3$ response variables, $X_K$, where $K = 1$ for IUPAC No. 52, $= 2$ for IUPAC No. 153, and $= 3$ for IUPAC No. 101, we denote the observation in laboratory $i$ by treatment $j$ on response variable $K$ by $X_{ijK}$. The observed mean vector for laboratory $i$ is denoted by $\underline{\bar{X}}_i$ with components $\bar{X}_{i1}$, $\bar{X}_{i2}$ and $\bar{X}_{i3}$ for means of individual CBs, and the over-all average observation vector by $\underline{\bar{X}}$. A two-way MANOVA will test the null hypothesis ($H_0$) that laboratory mean vectors are all equal against $H_1$ that at least two mean vectors are not equal. For a vector response consisting of $p$ components, the generalization of the ANOVA model (10) will be

$$\underline{X}_{ij} = \underline{\mu} + \underline{A}_i + \underline{B}_j + \underline{e}_{ij}$$

where vectors are all $p$-dimensional. Rejection of $H_0$ is followed by an examination of the pattern of variations of laboratory mean vectors.

Each laboratory can have its own systematic error, which will offset its result from the correct result.$^{2}$ At the univariate level, a specific rational way to compare laboratory means based on the $t$-statistic has been recommended.$^{3}$ For laboratory $i$ and response variable $X_K$, this $t$ is given$^{3}$ as:

$$t = (\bar{X}_{iK} - \bar{X}_K)/S_{\bar{X}_{iK}}$$

It measures the distance of the mean $\bar{X}_{iK}$, $K = 1, \ldots, p$, for laboratory $i$ from the over-all mean $\bar{X}_K$ in distance units expressed in terms of the standard error of the mean, $S_{\bar{X}_{iK}}$. In using this to rank laboratories and to identify possible outliers, we note: (i) MSE provides an unbiased estimate of error variance for each determination, so that $S_{\bar{X}_{iK}} = (\mathrm{MSE}/r)^{1/2}$, where $r$ is the number of determinations in laboratory $i$; (ii) considering that $\bar{X}_K$ includes $\bar{X}_{iK}$, $S_{\bar{X}_{iK}}$ could be adjusted by multiplying it by $[(n-1)/n]^{1/2}$.
Obviously, this adjustment can affect only the outlying status of a laboratory, not its rank; and (iii) again, the use of $t^2$ provides an equivalent (to $t$) approach. The variable $t^2$ is used as a measure of the squared distance of a laboratory mean at the univariate level.$^{4}$ The multivariate analogue of the squared distance is the squared generalized distance, given by Hotelling's $T^2$,$^{4}$ i.e.,

$$T^2 = (\underline{\bar{X}}_i - \underline{\bar{X}})'(\mathbf{S}/r)^{-1}(\underline{\bar{X}}_i - \underline{\bar{X}})$$

where $\mathbf{S}$ is the error variance-covariance matrix of the two-way MANOVA. Again, special tables of probabilities associated with $T^2$ are not required as we can test its significance using $F$.

The authors acknowledge the review and commentary on the manuscript received from J. M. Bewers, T. King and J. van der Meer.

References

1 Uthe, J. F., Musial, C. J., and Misra, R. K., J. Assoc. Off. Anal. Chem., 1988, 71, 369.
2 Youden, W. J., and Steiner, E. H., Statistical Manual of the AOAC, Association of Official Analytical Chemists, Arlington, VA, 1975.
3 Wernimont, G. T., Use of Statistics to Develop and Evaluate Analytical Methods, Association of Official Analytical Chemists, Arlington, VA, 1987.
4 Johnson, R. A., and Wichern, D. W., Applied Multivariate Statistical Analysis, Prentice Hall, Englewood Cliffs, NJ, 2nd edn., 1988.
5 Barnett, V., and Lewis, T., Outliers in Statistical Data, Wiley, New York, 2nd edn., 1984.
6 Bray, J. H., and Maxwell, S. E., Multivariate Analysis of Variance, Sage University Papers Series on Quantitative Applications in the Social Sciences 07-054, Beverly Hills, CA, 1986.
7 Morrison, D. F., Multivariate Statistical Methods, McGraw-Hill, New York, 2nd edn., 1976.
8 Harris, R. A., A Primer of Multivariate Statistics, Academic Press, New York, 1975.
9 Kshirsagar, A. M., Multivariate Analysis, Marcel Dekker, New York, 1972.
10 Reynolds, L. M., and Cooper, T., Water Quality Parameters ASTM STP 573, American Society for Testing and Materials, Philadelphia, PA, 1975.
11 Pimentel, R. A., Morphometrics, Kendall-Hunt, Dubuque, IA, 1979.
12 Gnanadesikan, R., and Kettenring, J. R., Biometrics, 1972, 28, 81.
13 Snedecor, G. W., and Cochran, W. G., Statistical Methods, Iowa State University Press, Ames, IA, 7th edn., 1980.
14 Steel, R. G. D., and Torrie, J. H., Principles and Procedures of Statistics, McGraw-Hill, New York, 1960.
15 Wilkinson, L., SYSTAT: The System for Statistics, SYSTAT, Evanston, IL, 1988.

Paper 1104279C
Received August 15, 1991
Accepted January 28, 1992

 
