|
21. |
Power Approximations to Multinomial Tests of Fit |
|
Journal of the American Statistical Association,
Volume 84,
Issue 405,
1989,
Page 130-141
F.C. Drost,
W.C.M. Kallenberg,
D.S. Moore,
J. Oosterhoff,
|
Abstract:
Multinomial tests for the fit of iid observations $X_1, \ldots, X_n$ to a specified distribution $F$ are based on the counts $N_i$ of observations falling in $k$ cells $E_1, \ldots, E_k$ that partition the range of the $X_j$. The earliest such test is based on the Pearson (1900) chi-squared statistic: $X^2 = \sum_{i=1}^{k} (N_i - np_i)^2 / (np_i)$, where $p_i = P_F(X_j \in E_i)$ are the cell probabilities under the null hypothesis. A common competing test is the likelihood ratio test based on $LR = 2 \sum_{i=1}^{k} N_i \log(N_i / (np_i))$. Cressie and Read (1984) introduced a class of multinomial goodness-of-fit statistics, $R^\lambda$, based on measures of the divergence between discrete distributions. This class includes both $X^2$ (when $\lambda = 1$) and $LR$ (when $\lambda = 0$). All of the $R^\lambda$ have the same chi-squared limiting null distribution. The power of the commonly used members of the class is usually approximated from a noncentral chi-squared distribution that is also the same for all $\lambda$. We propose new approximations to the power that vary with the statistic chosen. Both the computation and results on asymptotic error rates suggest that the new approximations are greatly superior to the traditional power approximation for statistics $R^\lambda$ other than the Pearson $X^2$. The derivation of the limiting null distribution for the Cressie-Read statistics, following that for $LR$, is based on a Taylor series expansion of $R^\lambda$, in which $X^2$ is the dominant term. The same expansion produces the traditional noncentral chi-squared power approximation by considering sequences of alternative distributions for the $X_j$ that approach the hypothesis $F$ at a suitable rate. Our power approximations are obtained from a Taylor series expansion that is valid for arbitrary sequences of alternatives. When linear and quadratic terms are retained, an accurate but computationally difficult approximation, $A^\lambda$, in terms of linear combinations of noncentral chi-squares is obtained. A second approximation, $B^\lambda$, in terms of a single noncentral chi-squared distribution results from averaging the coefficients in $A^\lambda$. This simple approximation performs well. In the important case of the statistic $LR$, $A^\lambda = B^\lambda$ and this new noncentral chi-squared approximation is very accurate. Retaining only linear terms in the expansion produces an approximation $L^\lambda$ based on a normal distribution; this is generally much inferior to $A^\lambda$ and $B^\lambda$.
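The relationship between the Cressie-Read family, the Pearson statistic, and the likelihood ratio statistic can be sketched numerically. The snippet below is a minimal illustration, assuming the usual power-divergence form $R^\lambda = \frac{2}{\lambda(\lambda+1)} \sum_i N_i [(N_i/(np_i))^\lambda - 1]$ with the $\lambda \to 0$ limit taken as $LR$; the function name `cressie_read` and the example counts are hypothetical, and the power approximations discussed in the abstract are not implemented here.

```python
import math


def cressie_read(counts, probs, lam):
    """Power-divergence statistic for observed counts and null cell probabilities.

    Assumes sum(probs) == 1. For lam < 0 every count must be positive.
    """
    n = sum(counts)
    expected = [n * p for p in probs]
    if abs(lam) < 1e-12:
        # Limit lam -> 0: likelihood ratio statistic (0 * log 0 is treated as 0).
        return 2.0 * sum(
            c * math.log(c / e) for c, e in zip(counts, expected) if c > 0
        )
    if abs(lam + 1.0) < 1e-12:
        # The lam = -1 limit needs a separate formula; it is left out of this sketch.
        raise ValueError("lam = -1 is not handled in this sketch")
    total = sum(c * ((c / e) ** lam - 1.0) for c, e in zip(counts, expected))
    return 2.0 * total / (lam * (lam + 1.0))


# Hypothetical example: four equiprobable cells, n = 100.
counts = [18, 22, 30, 30]
probs = [0.25, 0.25, 0.25, 0.25]
pearson = cressie_read(counts, probs, 1.0)    # coincides with Pearson's X^2
likelihood = cressie_read(counts, probs, 0.0)  # coincides with LR
```

Both values share the same chi-squared limiting null distribution (3 degrees of freedom in this example), which is why the abstract stresses that the statistics differ mainly in their power behavior.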
ISSN:0162-1459
DOI:10.1080/01621459.1989.10478748
Publisher: Taylor & Francis Group
Year: 1989
Data source: Taylor
|
22. |
Analysis of Sets of Two-Way Contingency Tables Using Association Models |
|
Journal of the American Statistical Association,
Volume 84,
Issue 405,
1989,
Page 142-151
Mark P. Becker,
Clifford C. Clogg,
|
Abstract:
A class of models is introduced for the analysis of group differences in the association between two discrete variables. The RC(M) association model for two-way tables is reviewed, and alternative weighting systems for identifying interaction parameters are presented. This model is generalized for the setting where a two-way contingency table is available for two or more groups. Various restricted models can be used to examine possible sources of intergroup heterogeneity in the association. These sources pertain to heterogeneity in the intrinsic association and/or in the scores for the row and column variables. The importance of weights used to identify the row and column scores is emphasized. A classical set of data previously analyzed by many authors is used to illustrate the advantages of the models and methods developed here.
ISSN:0162-1459
DOI:10.1080/01621459.1989.10478749
Publisher: Taylor & Francis Group
Year: 1989
Data source: Taylor
|
23. |
Compatible Conditional Distributions |
|
Journal of the American Statistical Association,
Volume 84,
Issue 405,
1989,
Page 152-156
Barry C. Arnold,
S. James Press,
|
Abstract:
Consider two families of candidate conditional densities (or probability mass functions), $\{f(x \mid y) : y \in S_y\}$ and $\{f(y \mid x) : x \in S_x\}$. This article investigates necessary and sufficient conditions for the existence of a joint density (or joint probability mass function) $f(x, y)$ with the given families as its associated conditional densities. This supplements previous work that has addressed the question of uniqueness of $f(x, y)$ assuming its existence.
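For finite tables with strictly positive entries, the compatibility question can be illustrated with a short check. The sketch below assumes one standard characterization that the abstract itself does not spell out: the two conditional tables are compatible exactly when their elementwise ratio factors as $u_i v_j$, that is, when the ratio matrix has rank one. The function name `compatible_joint` and the table layout are hypothetical choices for this illustration, and zero cells and continuous densities are not handled.

```python
def compatible_joint(a, b, tol=1e-9):
    """Return a joint pmf table if positive conditionals a and b are compatible, else None.

    a[i][j] = P(X=i | Y=j), so each column of a sums to 1.
    b[i][j] = P(Y=j | X=i), so each row of b sums to 1.
    All entries are assumed strictly positive (an assumption of this sketch).
    """
    rows, cols = len(a), len(a[0])
    ratio = [[a[i][j] / b[i][j] for j in range(cols)] for i in range(rows)]
    # Rank-one check: ratio[i][j] * ratio[0][0] must equal ratio[i][0] * ratio[0][j].
    for i in range(rows):
        for j in range(cols):
            lhs = ratio[i][j] * ratio[0][0]
            rhs = ratio[i][0] * ratio[0][j]
            if abs(lhs - rhs) > tol * max(abs(lhs), abs(rhs)):
                return None
    # When compatible, ratio[i][j] = P(X=i) / P(Y=j), so any column is
    # proportional to the marginal distribution of X.
    first_col = [ratio[i][0] for i in range(rows)]
    total = sum(first_col)
    marginal_x = [v / total for v in first_col]
    return [[marginal_x[i] * b[i][j] for j in range(cols)] for i in range(rows)]
```

An incompatible pair is easy to construct: if `a` says X is uniform whatever the value of Y while `b` says Y depends strongly on X, the ratio matrix fails the rank-one check and the function returns `None`.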
ISSN:0162-1459
DOI:10.1080/01621459.1989.10478750
Publisher: Taylor & Francis Group
Year: 1989
Data source: Taylor
|
24. |
Relative Entropy Measures of Multivariate Dependence |
|
Journal of the American Statistical Association,
Volume 84,
Issue 405,
1989,
Page 157-164
Harry Joe,
|
Abstract:
There has been a lot of work on measures of dependence or association for bivariate probability distributions or bivariate data. These measures usually assume that the variables are both continuous or both categorical. In comparison, there is very little work on multivariate or conditional measures of dependence. The purpose of this article is to discuss measures of multivariate dependence and measures of conditional dependence based on relative entropies. These measures are conceptually very general, as they can be used for a set of variables that can be a mixture of continuous, ordinal-categorical, and nominal-categorical variables. For continuous or ordinal-categorical variables, a certain transformation of relative entropy to the interval [0, 1] leads to generalizations of the correlation, multiple-correlation, and partial-correlation coefficients. If all variables are nominal categorical, the relative entropies are standardized to take a maximum of 1 and then transformed so that in the bivariate case, there is a "relative reduction in variability" interpretation like that for the correlation coefficient. The relative entropy measures of dependence are compared with commonly used bivariate measures of association such as Kendall's $\tau_b$ and Goodman and Kruskal's $\lambda$ and with measures of dependence based on Pearson's $\phi^2$ distance. Examples suggest that these new measures of dependence should be useful additional summary values for nonmonotonic or nonlinear dependence. Assuming that the multivariate data are a random sample, the statistical measures of dependence with estimated probability density or mass functions can be studied asymptotically. Standard errors are obtained when all variables are categorical, and an outline of what must be done in the case of all continuous variables is given.
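A small sketch may help make the relative entropy idea concrete for the bivariate case. It computes the relative entropy between a joint probability table and the product of its marginals (the mutual information) and then maps it to the unit interval. The abstract does not state which transformation is used; the map $\delta \mapsto \sqrt{1 - e^{-2\delta}}$ is assumed here because, for a bivariate normal with correlation $\rho$, the mutual information is $-\tfrac{1}{2}\log(1 - \rho^2)$ and the map then returns $|\rho|$. The function names are hypothetical.

```python
import math


def mutual_information(joint):
    """Relative entropy (in nats) between a joint pmf table and the product of its marginals."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    return sum(
        p * math.log(p / (row[i] * col[j]))
        for i, r in enumerate(joint)
        for j, p in enumerate(r)
        if p > 0  # cells with zero probability contribute nothing
    )


def to_unit_interval(delta):
    """Assumed transformation of a relative entropy delta >= 0 to [0, 1)."""
    # max() guards against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, 1.0 - math.exp(-2.0 * delta)))
```

Under independence the relative entropy is zero and so is the transformed value; stronger dependence, monotone or not, pushes the transformed value toward 1.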
ISSN:0162-1459
DOI:10.1080/01621459.1989.10478751
Publisher: Taylor & Francis Group
Year: 1989
Data source: Taylor
|
25. |
Regularized Discriminant Analysis |
|
Journal of the American Statistical Association,
Volume 84,
Issue 405,
1989,
Page 165-175
Jerome H. Friedman,
|
Abstract:
Linear and quadratic discriminant analysis are considered in the small-sample, high-dimensional setting. Alternatives to the usual maximum likelihood (plug-in) estimates for the covariance matrices are proposed. These alternatives are characterized by two parameters, the values of which are customized to individual situations by jointly minimizing a sample-based estimate of future misclassification risk. Computationally fast implementations are presented, and the efficacy of the approach is examined through simulation studies and application to data. These studies indicate that in many circumstances dramatic gains in classification accuracy can be achieved.
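The two-parameter family of covariance estimates can be sketched as a pair of shrinkage steps. This is a simplified illustration, not the paper's exact estimator: it assumes plain convex combinations of already-computed covariance matrices and omits any sample-size weighting. The parameter names `lam` and `gamma` and the function name are choices made for this sketch.

```python
def regularized_covariance(class_cov, pooled_cov, lam, gamma):
    """Shrink a class covariance toward the pooled one, then toward a scaled identity.

    class_cov and pooled_cov are square nested lists of the same size;
    lam and gamma are assumed to lie in [0, 1].
    """
    p = len(class_cov)
    # Step 1: blend class-specific and pooled estimates
    # (lam = 0 keeps the quadratic-style estimate, lam = 1 the linear-style one).
    blended = [
        [(1.0 - lam) * class_cov[i][j] + lam * pooled_cov[i][j] for j in range(p)]
        for i in range(p)
    ]
    # Step 2: shrink toward (trace / p) * I, which stabilizes small eigenvalues.
    scale = sum(blended[i][i] for i in range(p)) / p
    return [
        [
            (1.0 - gamma) * blended[i][j] + (gamma * scale if i == j else 0.0)
            for j in range(p)
        ]
        for i in range(p)
    ]
```

In the setting the abstract describes, the pair of parameters would be chosen by minimizing a sample-based estimate of misclassification risk, for example over a grid of candidate values; that selection step is not shown here.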
ISSN:0162-1459
DOI:10.1080/01621459.1989.10478752
Publisher: Taylor & Francis Group
Year: 1989
Data source: Taylor
|
26. |
A Nonmetric Approach to Linear Discriminant Analysis |
|
Journal of the American Statistical Association,
Volume 84,
Issue 405,
1989,
Page 176-183
Adi Raveh,
|
Abstract:
A new nonmetric linear discriminant analysis approach is proposed that is based on the maximization of an index of separation differing from that used by the classical method. The possibility of choosing between Fisher's classical discriminant function and the one proposed here enables us to reduce the number of misclassifications for given data. The method is exemplified on empirical data and various simulations and is compared with the classical linear discriminant analysis.
ISSN:0162-1459
DOI:10.1080/01621459.1989.10478753
Publisher: Taylor & Francis Group
Year: 1989
Data source: Taylor
|
27. |
Approximate Confidence Intervals for the Number of Clusters |
|
Journal of the American Statistical Association,
Volume 84,
Issue 405,
1989,
Page 184-191
Roger Peck,
Lloyd Fisher,
John Van Ness,
|
Abstract:
We consider clustering for the purpose of data reduction. Similar objects are grouped together in clusters so that one can then work with the few cluster descriptors instead of the many data points. The quality of any given clustering is measured by a loss function that takes into account both the parsimony of the clustering and the loss of information due to clustering. An optimal clustering can be obtained by minimizing the theoretical loss function. It is shown that a sample version of the loss function and optimal clustering converge strongly to their theoretical counterparts as the sample size tends to infinity. We then develop a bootstrap-based procedure for obtaining approximate confidence bounds on the number of clusters in the “best” clustering. The effectiveness of this procedure is evaluated in a simulation study. An application is presented.
ISSN:0162-1459
DOI:10.1080/01621459.1989.10478754
Publisher: Taylor & Francis Group
Year: 1989
Data source: Taylor
|
28. |
Uniformly More Powerful Tests for Hypotheses concerning Linear Inequalities and Normal Means |
|
Journal of the American Statistical Association,
Volume 84,
Issue 405,
1989,
Page 192-199
Roger L. Berger,
|
Abstract:
This article considers some hypothesis-testing problems regarding normal means. In these problems, the hypotheses are defined by linear inequalities on the means. We show that in certain problems the likelihood ratio test (LRT) is not very powerful. We describe a test that has the same size, $\alpha$, as the LRT and is uniformly more powerful. The test is easily implemented, since its critical values are standard normal percentiles. The increase in power with the new test can be substantial. For example, the new test's power is $1/(2\alpha)$ times bigger (10 times bigger for $\alpha = .05$) than the LRT's power for some parameter points in a simple example.
ISSN:0162-1459
DOI:10.1080/01621459.1989.10478755
Publisher: Taylor & Francis Group
Year: 1989
Data source: Taylor
|
29. |
Estimating a Product of Means: Bayesian Analysis with Reference Priors |
|
Journal of the American Statistical Association,
Volume 84,
Issue 405,
1989,
Page 200-207
James O. Berger,
José M. Bernardo,
|
Abstract:
Suppose that we observe $X \sim N(\alpha, 1)$ and, independently, $Y \sim N(\beta, 1)$, and are concerned with inference (mainly estimation and confidence statements) about the product of means $\theta = \alpha\beta$. This problem arises, most obviously, in situations of determining area based on measurements of length and width. It also arises in other practical contexts, however. For instance, in gypsy moth studies, the hatching rate of larvae per unit area can be estimated as the product of the mean of egg masses per unit area times the mean number of larvae hatching per egg mass. Approximately independent samples can be obtained for each mean (see Southwood 1978). Noninformative prior Bayesian approaches to the problem are considered, in particular the reference prior approach of Bernardo (1979). An appropriate reference prior for the problem is developed, and relatively easily implementable formulas for posterior moments (e.g., the posterior mean and variance) and credible sets are derived. Comparisons with alternative noninformative priors and with classical procedures are also given. The motivation for this work was in part the statistical importance of the problem and the difficulty in producing reasonable classical analyses, and in part to provide an interestingly complex example of a recently developed method of deriving reference priors for problems with nuisance parameters. This new method is briefly described. The problem is also of interest because of its mention by Efron (1986) as a situation for which standard noninformative prior Bayesian theories encounter difficulties.
ISSN:0162-1459
DOI:10.1080/01621459.1989.10478756
Publisher: Taylor & Francis Group
Year: 1989
Data source: Taylor
|
30. |
Pairwise Comparisons of Generally Correlated Means |
|
Journal of the American Statistical Association,
Volume 84,
Issue 405,
1989,
Page 208-213
A.J. Hayter,
|
Abstract:
A commonly occurring statistical inference problem in practice is that of making simultaneous comparisons among three or more treatment means $\mu_i$ ($1 \le i \le k$) for certain experimental designs. One way to do this, the $T$ method proposed by Tukey (1953), constructs simultaneous confidence intervals for all pairwise differences of the treatment means $\mu_i - \mu_j$ ($1 \le i, j \le k$, $i \ne j$). The joint confidence level of these confidence intervals, however, depends in a very complicated fashion on the covariance structure of the treatment mean estimates $\hat{\mu}_i$, and, therefore, for designs where the covariance structure is not "simple," the joint confidence level is not readily apparent. In general, this is the case for any unbalanced design or for designs in which the treatment mean estimates $\hat{\mu}_i$ have unequal correlations. Tukey conjectured in 1953 that whatever the correlation structure of the treatment mean estimates $\hat{\mu}_i$, the $T$ method would always provide a conservative set of confidence intervals, that is, that the actual joint confidence level of the confidence intervals would always be at least as great as the nominal joint confidence level $1 - \alpha$. In this article a discussion is undertaken of the evidence that the $T$ method in general provides a conservative set of simultaneous confidence intervals. It is shown that the coverage probability of the simultaneous confidence intervals depends on the covariance structure only through the $k(k - 1)/2$ variances of the pairwise differences of the treatment mean estimates. A set of sufficient, but not necessary, conditions on these variances is given, which ensures that the $T$-method confidence intervals are conservative. In addition, the application of the $T$ method to various common experimental designs that produce correlated treatment mean estimates is discussed. An integral expression is derived for calculating the exact joint confidence level of the $T$-method confidence intervals or for calculating confidence intervals of joint confidence level exactly equal to a given value. This expression is used to evaluate the coverage probabilities for a wide variety of covariance structures with $k = 4$. In each case considered, the $T$-method confidence intervals are conservative, and, furthermore, the amount of conservativeness is very small unless the population mean estimates have radically different variances and covariances.
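The quantities the abstract singles out, the $k(k-1)/2$ variances of the pairwise differences, follow directly from the covariance matrix of the treatment mean estimates via $\operatorname{Var}(\hat{\mu}_i - \hat{\mu}_j) = \sigma_{ii} + \sigma_{jj} - 2\sigma_{ij}$. The snippet below is a minimal sketch of that computation only; the function name and the example matrix are hypothetical, and the integral expression for the exact confidence level is not reproduced.

```python
def pairwise_difference_variances(cov):
    """Variance of each pairwise difference of estimates, for every pair i < j.

    cov is the covariance matrix of the treatment mean estimates (nested lists).
    """
    k = len(cov)
    return {
        (i, j): cov[i][i] + cov[j][j] - 2.0 * cov[i][j]
        for i in range(k)
        for j in range(i + 1, k)
    }


# Hypothetical covariance matrix for k = 4 correlated treatment mean estimates.
cov = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 3.0],
]
variances = pairwise_difference_variances(cov)  # 6 entries for k = 4

# Adding the same constant to every entry of cov changes the covariance
# structure but leaves all pairwise-difference variances unchanged.
shifted = [[value + 0.3 for value in row] for row in cov]
shifted_variances = pairwise_difference_variances(shifted)
```

The second computation illustrates why distinct covariance structures can share the same pairwise-difference variances, which is the sense in which coverage depends on the covariance matrix only through those variances.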
ISSN:0162-1459
DOI:10.1080/01621459.1989.10478757
Publisher: Taylor & Francis Group
Year: 1989
Data source: Taylor
|
|