|
1. |
Evaluating the tradeoff between bias and variance through use of prior probabilities |
|
Communications in Statistics - Simulation and Computation,
Volume 26,
Issue 2,
1997,
Page 399-421
Paul F. Pinsky,
Laurence S. Magder,
|
Abstract:
Increasing the complexity of a model generally decreases the bias but increases the variance of estimates of features of the true distribution. This finding, known as the bias-variance tradeoff, indicates that fitting an extra parameter to a model may either increase or decrease the mean squared error of estimates. In this paper we adopt a Bayesian framework from which to evaluate the bias-variance tradeoff. Using the balanced ANOVA model to illustrate the approach, we define a loss function based on the squared error of the cell mean estimates. Then, using the prior on the parameter of interest θ, we calculate an expected loss under the decisions to fit or not fit θ. Fitting θ is defined as using the conventional frequentist least squares estimate; not fitting θ means setting it equal to 0,
ISSN:0361-0918
DOI:10.1080/03610919708813388
Publisher: Marcel Dekker, Inc.
Year: 1997
Source: Taylor
|
2. |
On the conditional and unconditional distributions of the number of runs in a sample from a multisymbol alphabet |
|
Communications in Statistics - Simulation and Computation,
Volume 26,
Issue 2,
1997,
Page 423-442
Eugene F. Schuster,
Gu Xiangjun,
|
Abstract:
Let S be a randomly ordered vector of length N, where N_i is the number of symbols of type i, i = 1, ..., k, in S. In the unconditional problem, S is the outcome of N independent trials of a multinomial experiment with k classes and (N_1, ..., N_k) has the multinomial distribution. In the conditional problem, each N_i is a known fixed number and S is a random arrangement of the N symbols. The main results in this paper are new recursion formulas for the pdf of the total number of runs, say R, in S for both the unconditional and conditional problems. We also give formulas for the pdf of the number of runs of a given symbol type. Finally, we demonstrate the utility of our recursion algorithms for the pdf of R in the software system Mathematica (Mathematica is a registered trademark of Wolfram Research, Incorporated), discuss reasons that our algorithm in the conditional problem is much faster and easier to program than the 1957 algorithm of Barton and David, and correct three errors in their table.
ISSN:0361-0918
DOI:10.1080/03610919708813389
Publisher: Marcel Dekker, Inc.
Year: 1997
Source: Taylor
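To make the quantity R in the abstract above concrete, the following is a minimal Python sketch that counts runs and tabulates the conditional distribution of R by brute-force enumeration. It is not the recursion described in the paper, nor the Barton and David algorithm; the function names `count_runs` and `conditional_runs_pmf` are hypothetical, and the approach is only practical for very small N.

```python
from collections import Counter
from itertools import permutations


def count_runs(seq):
    """Return the total number of runs (maximal blocks of equal adjacent symbols)."""
    if not seq:
        return 0
    # A new run starts wherever a symbol differs from its predecessor.
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)


def conditional_runs_pmf(counts):
    """Brute-force distribution of R given fixed symbol counts, e.g. {"a": 2, "b": 2}.

    Illustration only: every distinct arrangement is enumerated, so the cost
    grows factorially with the total number of symbols.
    """
    symbols = [s for s, n in counts.items() for _ in range(n)]
    # Under a random arrangement, all distinct arrangements are equally likely.
    arrangements = set(permutations(symbols))
    tally = Counter(count_runs(arr) for arr in arrangements)
    total = len(arrangements)
    return {r: tally[r] / total for r in sorted(tally)}
```

For two symbols of each of two types there are six distinct arrangements, and calling `conditional_runs_pmf({"a": 2, "b": 2})` assigns equal probability to R = 2, 3, and 4.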
|
3. |
Lower percentage points of Hartley's extremal quotient statistic and their applications |
|
Communications in Statistics - Simulation and Computation,
Volume 26,
Issue 2,
1997,
Page 443-465
J. J. Bau,
Hubert J. Chen,
Shun-Yi Chen,
|
Abstract:
Consider k (> 2) independent populations π_1, ..., π_k such that observations obtained from π_i are independent and normally distributed with unknown mean µ_i and unknown variance θ_i, i = 1, ..., k. In this paper, we provide lower percentage points of Hartley's extremal quotient statistic for testing an interval hypothesis H0: θ[k]/θ[1] > δ vs. Ha: θ[k]/θ[1] ≤ δ, where δ ≥ 1 is a predetermined constant and θ[k] (θ[1]) is the max (min) of θ_1, ..., θ_k. The least favorable configuration (LFC) for the test under H0 is determined in order to obtain the lower percentage points. These percentage points can also be used to construct an upper confidence bound for θ[k]/θ[1].
ISSN:0361-0918
DOI:10.1080/03610919708813390
Publisher: Marcel Dekker, Inc.
Year: 1997
Source: Taylor
|
4. |
On preconditioning the data for the wavelet transform when the sample size is not a power of two |
|
Communications in Statistics - Simulation and Computation,
Volume 26,
Issue 2,
1997,
Page 467-486
R. Todd Ogden,
|
Abstract:
A powerful and efficient method for nonparametric regression involves taking the discrete wavelet transform (DWT) of data, shrinking the resulting wavelet coefficients, and then computing the inverse wavelet transform to get an estimate of the regression function. Currently, most wavelet decomposition software packages require that the original set of data have sample size n equal to a power of two in order to achieve an exact orthogonal wavelet transform. In statistical data analysis, such is rarely the case, so in an effort to broaden the applicability of such methods, various ways of preconditioning data not meeting this restriction are discussed and compared. These results illustrate the important point that wavelet coefficients resulting from preconditioned data should never be thrown blindly into a threshold selection procedure which depends on the coefficients being independent with equal variance. Such procedures can still be used, but great care must be taken to choose an appropriate preconditioning method. Also, the resulting wavelet vector can certainly be variance-corrected (with only rather light computational burden) before a thresholding procedure is applied to it. Some of the correlation can also be removed, though this is certain to be quite computationally expensive. (The coefficients can never, of course, be completely orthogonalized, however, since n < 2^J.)
ISSN:0361-0918
DOI:10.1080/03610919708813391
Publisher: Marcel Dekker, Inc.
Year: 1997
Source: Taylor
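The abstract above does not name the preconditioning methods it compares. As an illustration of what extending a series to length 2^J can look like, here is a minimal Python sketch of two common padding schemes, zero-padding and reflection of the tail. Both schemes and the function names `next_power_of_two` and `pad_to_power_of_two` are assumptions made for this example, not the paper's procedures.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def pad_to_power_of_two(data, mode="zero"):
    """Extend a sequence to the next power-of-two length.

    mode="zero"    appends zeros.
    mode="reflect" appends the tail of the data in reverse order.
    Both are illustrative choices, not the methods compared in the paper.
    """
    values = list(data)
    n = len(values)
    extra = next_power_of_two(n) - n  # always smaller than n, so reflection has enough data
    if mode == "zero":
        return values + [0.0] * extra
    if mode == "reflect":
        return values + values[::-1][:extra]
    raise ValueError("mode must be 'zero' or 'reflect'")
```

With five observations, both modes return a sequence of length eight. The appended values are not new observations, which is one way to see why coefficients computed from padded data need not be independent with equal variance.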
|
5. |
Hypothesis testing for quantiles of two-parameter binary models |
|
Communications in Statistics - Simulation and Computation,
Volume 26,
Issue 2,
1997,
Page 487-494
Barry Kurt Moser,
Mark E. Payton,
|
Abstract:
A procedure is developed to test the equality of the quantiles from k populations, assuming the responses follow a two-parameter binary model. The method utilizes the asymptotic distribution of the maximum likelihood estimators. The exact distribution of the test statistic is discussed in general. This exact distribution is generated for the logit model in order to investigate the convergence properties of the asymptotic procedure.
ISSN:0361-0918
DOI:10.1080/03610919708813392
Publisher: Marcel Dekker, Inc.
Year: 1997
Source: Taylor
|
6. |
Properties of simultaneous confidence intervals for multinomial proportions |
|
Communications in Statistics - Simulation and Computation,
Volume 26,
Issue 2,
1997,
Page 495-518
Warren L. May,
William D. Johnson,
|
Abstract:
Inversion of Pearson's chi-square statistic yields a confidence ellipsoid that can be used for simultaneous inference concerning multinomial proportions. Because the ellipsoid is difficult to interpret, methods of simultaneous confidence interval construction have been proposed by Quesenberry and Hurst, Goodman, Fitzpatrick and Scott, and Sison and Glaz. Based on simulation results, we discuss the performance of these methods in terms of empirical coverage probabilities and enclosed volume. None of the methods is uniformly better than all others, but the Goodman intervals control the empirical coverage probability with smaller volume than other methods when the sample size supports the large sample theory. If the expected cell counts are small and nearly equal across cells, we recommend the Sison and Glaz intervals.
ISSN:0361-0918
DOI:10.1080/03610919708813393
Publisher: Marcel Dekker, Inc.
Year: 1997
Source: Taylor
|
7. |
A note on combining parametric and non-parametric regression |
|
Communications in Statistics - Simulation and Computation,
Volume 26,
Issue 2,
1997,
Page 519-529
Mezbahur Rahman,
D. V. Gokhale,
Aman Ullah,
|
Abstract:
A combination of a parametric estimate and a nonparametric estimate of a model for a regression function is considered. The optimal linear combination is estimated from the data by the least squares estimate of the combining coefficient. The estimate so obtained is compared with the one proposed by Wooldridge (1992) and Burman and Chaudhuri, the latter being based on Stein (1956). A test procedure for deciding about the parametric specification is also studied.
ISSN:0361-0918
DOI:10.1080/03610919708813394
Publisher: Marcel Dekker, Inc.
Year: 1997
Source: Taylor
|
8. |
On L1 regression coefficients |
|
Communications in Statistics - Simulation and Computation,
Volume 26,
Issue 2,
1997,
Page 531-537
Chong Sun Hong,
Hyun Jip Choi,
|
Abstract:
We propose an alternative method to find the well-known L1 estimators for general regression models. We find that the L1 regression coefficients can be defined in terms of the convergent weighted medians of the slopes from each data point to the point that is assumed to be on a predicted regression line. Since the L1 estimators are obtained by the convergent weighted medians, the L1 estimation method could be regarded as finding the point on a predicted regression line. Moreover, we extend the above method to the multiple regression models. In these cases, the regression coefficients cannot be expressed as the medians of some slopes, but can be defined in terms of the convergent weighted medians of some slopes. In these situations, we make the analogous argument to find the point on the L1 regression line so that the sum of absolute residuals is minimized.
ISSN:0361-0918
DOI:10.1080/03610919708813395
Publisher: Marcel Dekker, Inc.
Year: 1997
Source: Taylor
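The weighted-median idea in the abstract above rests on a standard fact: if a line is forced through a fixed point (x0, y0), the slope that minimizes the sum of absolute residuals is a weighted median of the slopes (y_i - y0) / (x_i - x0) with weights |x_i - x0|. The Python sketch below shows only that single step, under the assumption that the fixed point is supplied by the caller. It does not reproduce the paper's convergent iteration, and the function names are hypothetical.

```python
def weighted_median(values, weights):
    """Return a lower weighted median: the smallest value whose cumulative
    weight (in sorted order) reaches half of the total weight."""
    pairs = sorted(zip(values, weights))
    half = sum(w for _, w in pairs) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("values must be non-empty")


def l1_slope_through_point(xs, ys, x0, y0):
    """Slope minimizing sum |y_i - y0 - b * (x_i - x0)| for a line through (x0, y0).

    The fixed point (x0, y0) is an input here; how the paper chooses and
    updates it is not shown.
    """
    slopes, weights = [], []
    for x, y in zip(xs, ys):
        dx = x - x0
        if dx != 0:  # a point with x == x0 contributes a residual that does not depend on b
            slopes.append((y - y0) / dx)
            weights.append(abs(dx))
    return weighted_median(slopes, weights)
```

For the points (1, 2), (2, 4), (3, 6), (4, 100) and a line through the origin, the function returns a slope of 2.0: the outlying fourth point does not pull the estimate the way it would under least squares.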
|
9. |
Model diagnostics for marginal regression analysis of correlated binary data |
|
Communications in Statistics - Simulation and Computation,
Volume 26,
Issue 2,
1997,
Page 539-558
Ming Tan,
Yingsheng Qu,
Michael H. Kutner,
|
Abstract:
We propose several diagnostic methods for checking the adequacy of marginal regression models for analyzing correlated binary data. We use a parametric marginal model based on latent variables and derive the projection (hat) matrix, Cook's distance, various residuals and Mahalanobis distance between the observed binary responses and the estimated probabilities for a cluster. Emphasized are several graphical methods including the simulated Q-Q plot, the half-normal probability plot with a simulated envelope, and the partial residual plot. The methods are illustrated with a real life example.
ISSN:0361-0918
DOI:10.1080/03610919708813396
Publisher: Marcel Dekker, Inc.
Year: 1997
Source: Taylor
|
10. |
Determination of the best significance level in forward stepwise logistic regression |
|
Communications in Statistics - Simulation and Computation,
Volume 26,
Issue 2,
1997,
Page 559-575
Kang In Lee,
John J. Koval,
|
Abstract:
The χ² test based on a fixed α level (χ²(α)) is the standard stopping criterion in forward stepwise logistic regression. SAS/IML programs were written to provide Monte Carlo simulations to determine the best α level for the χ²(α) stopping criterion. Performance was evaluated using Efron's (1986) estimated true error rate of prediction. The best α varied between 0.05 and 0.40. In all cases, it increased linearly with the number of predictor variables; in the multivariate binary case, it also depended upon the mean of the binary variables in one population and the difference between the means of the binary variables in the two populations. An overall recommendation is that 0.15 ≤ α ≤ 0.20 should be used for the χ²(α) stopping criterion.
ISSN:0361-0918
DOI:10.1080/03610919708813397
Publisher: Marcel Dekker, Inc.
Year: 1997
Source: Taylor
|
|