1. On the Quality of Reinterview Data with Application to the Current Population Survey
Journal of the American Statistical Association, Volume 87, Issue 420, 1992, Pages 915-923
Paul P. Biemer, Gösta Forsman
Abstract:
The Current Population Survey (CPS) reinterview sample consists of two subsamples: (a) a sample of CPS households is reinterviewed and the discrepancies between the reinterview responses and the original interview responses are reconciled for the purpose of obtaining more accurate responses (i.e., response bias estimates), and (b) a sample of CPS households, nonoverlapping with sample (a), is reinterviewed “independently” of the original interview for the purpose of estimating simple response variance (SRV). In this article a model and estimation procedure are proposed for obtaining estimates of SRV from subsample (a) as well as the customary estimates of SRV from subsample (b). In this way, an improved estimator of SRV that combines data from both subsamples can be computed. Additionally, under conditions that are usually satisfied in practice, several inequalities involving statistics computed from both subsamples are derived. These inequalities can be used to check the validity of the reinterview assumptions and the quality of the estimates of SRV and response bias from the reinterview program. Data from the CPS reinterview program for both subsamples (a) and (b) are analyzed both (1) to illustrate the methodology and (2) to check the validity of the CPS reinterview data. Our results indicate that data from subsample (a) are not consistent with the data from subsample (b) and provide convincing evidence that errors in subsample (a) are the source of the inconsistency.
ISSN: 0162-1459
DOI: 10.1080/01621459.1992.10476245
Publisher: Taylor & Francis Group
Year: 1992
Source: Taylor
2. Estimation Using Multiyear Rotation Design Sampling in Agricultural Surveys
Journal of the American Statistical Association, Volume 87, Issue 420, 1992, Pages 924-932
Raj S. Chhikara, Lih-Yuan Deng
Abstract:
In the annual June Enumerative Surveys (JES) of the U.S. Department of Agriculture (USDA), the area frame sampling involves multiyear rotation designs with 20% replacement of sample units each year. Currently, USDA uses the latest year sample data almost exclusively in its estimation procedure. We propose that the multiyear sample survey data should be used for estimation of crop acreages and livestock. We develop a multiyear estimation method based on an analysis of variance model that takes into account the successive sampling of units in the area frame from year to year. The proposed method is applied to estimate the 1989 hogs and soybean acreages using JES data for three years: 1987, 1988, and 1989. These estimates are compared with those obtained using the current USDA estimation method. We give relative efficiencies of the multiyear estimates compared to the single-year estimates, often showing a substantial improvement in precision. We also make an evaluation study of the proposed estimation method using simulations and show it to be fairly robust to misspecification of the model parameter.
ISSN: 0162-1459
DOI: 10.1080/01621459.1992.10476246
Publisher: Taylor & Francis Group
Year: 1992
Source: Taylor
3. The Chilean Plebiscite: Projections without Historic Data
Journal of the American Statistical Association, Volume 87, Issue 420, 1992, Pages 933-941
Eduardo Engel, Achilles Venetoulias
Abstract:
On October 5, 1988, Chileans decided by plebiscite to oust General Pinochet from power and have free presidential elections in 1989. This article describes the projections that the authors made for the results of the plebiscite from early returns. From a statistical point of view, what made these projections different from those made in other countries was the complete lack of historic data. Furthermore, the Pinochet government carried out a campaign to discredit the projection effort. Uncertainty about both the data and the unpredictable political climate on the night of the plebiscite influenced the choice of the statistical methodology. The predictions, based on a 10% sample of the first one-third of the votes counted, were within one-half a percentage point of the true outcome. The described methodology could prove useful in projections of other elections that will take place under similar conditions (e.g., in Eastern Europe).
ISSN: 0162-1459
DOI: 10.1080/01621459.1992.10476247
Publisher: Taylor & Francis Group
Year: 1992
Source: Taylor
4. Flexible Methods for Analyzing Survival Data Using Splines, with Applications to Breast Cancer Prognosis
Journal of the American Statistical Association, Volume 87, Issue 420, 1992, Pages 942-951
Robert J. Gray
Abstract:
In this article some flexible methods for modeling censored survival data using splines are applied to the problem of modeling the time to recurrence of breast cancer patients. The basic idea is to use fixed knot splines with a fairly modest number of knots to model aspects of the data, and then to use penalized partial likelihood to estimate the parameters of the model. Test statistics are proposed which are analogs of those used in traditional likelihood analysis, and approximations to the distributions of these statistics are suggested. In an analysis of a large data set taken from clinical trials conducted by the Eastern Cooperative Oncology Group, these methods are seen to give useful insight into how prognosis varies as a function of continuous covariates, and also into how covariate effects change with follow-up time.
ISSN: 0162-1459
DOI: 10.1080/01621459.1992.10476248
Publisher: Taylor & Francis Group
Year: 1992
Source: Taylor
5. Secondary Data Analysis when there are Missing Observations
Journal of the American Statistical Association, Volume 87, Issue 420, 1992, Pages 952-961
R. Wang, J. Sedransk, J. H. Jinn
Abstract:
A data set having missing observations often is completed by using imputed values. Our objective is to improve the practice of secondary data analysis by looking at the interplay of different imputation techniques and different methods that secondary data analysts use when there are both observed and imputed values. Secondary data analysts typically either treat the completed data set as if it has only observed values or ignore the imputations and analyze only the observed values. The first objective of our research is to investigate the effect on the properties of standard statistical techniques of proceeding in these ways. We assume that the missing data cannot be regarded as missing at random (MAR), and that the secondary data analyst's objectives are confidence intervals for the regression coefficients in a simple linear regression. Standard, “general purpose” imputation methods are emphasized. The second objective is to investigate the performance of confidence intervals based on multiple imputations. Obtaining moments of statistics requires averaging using a weighted distribution. Because analytical results typically cannot be obtained, we show how to obtain lower and upper bounds that can be computed easily. We also present a simple parametric function for the probability of response given variables of interest, and validate it using data from the 1987 Economic Censuses. We also summarize our findings and make recommendations to the organizations providing the imputations. Finally, we delineate the options available to secondary data analysts.
ISSN: 0162-1459
DOI: 10.1080/01621459.1992.10476249
Publisher: Taylor & Francis Group
Year: 1992
Source: Taylor
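The multiple-imputation confidence intervals discussed in this abstract are conventionally built with the standard combining rules: pool the m completed-data point estimates, then add the within- and between-imputation variance components. A minimal sketch of those rules, with hypothetical slope estimates rather than data from the paper:

```python
import math

def combine_mi(estimates, variances):
    """Standard multiple-imputation combining rules: pool m point
    estimates and their within-imputation variances into a single
    estimate and total variance."""
    m = len(estimates)
    q_bar = sum(estimates) / m                    # pooled point estimate
    u_bar = sum(variances) / m                    # average within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    t = u_bar + (1 + 1 / m) * b                   # total variance
    return q_bar, t

# Hypothetical regression-slope estimates from m = 3 completed data sets.
q, t = combine_mi([1.2, 1.0, 1.1], [0.04, 0.05, 0.045])
interval = (q - 1.96 * math.sqrt(t), q + 1.96 * math.sqrt(t))  # large-sample interval
```

The exact interval uses a t reference distribution whose degrees of freedom depend on the between-to-within variance ratio; the normal quantile above is the large-sample shortcut.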
6. Estimating the Size-selectivity of Fishing Gear by Conditioning on the Total Catch
Journal of the American Statistical Association, Volume 87, Issue 420, 1992, Pages 962-968
Russell B. Millar
Abstract:
A conditional maximum likelihood model is used to estimate the size-selectivity of trawls, gillnets, and hooks when the data are obtained by simultaneous fishing with meshes or hooks of different size and/or shape. Size-selectivity is expressed here by the selection curve, r(l), the probability that a fish of length l, if contacting the gear, will be retained (caught). In many selectivity studies r(l) is fitted either by eye, by heuristic means, or by improper application of generalized linear models. Then it is not possible to make legitimate statistical inference about r(l), or about assessments of the state of the fishery if those assessments use r(l). It is shown here that by conditioning on the total catch, selectivity data can be modeled as binary data, or polytomous data on interval scales. Application of the model to trawl and hook data demonstrates that selection curves can be fitted using generalized linear models, which may require nonstandard link functions or link functions with parameters.
ISSN: 0162-1459
DOI: 10.1080/01621459.1992.10476250
Publisher: Taylor & Francis Group
Year: 1992
Source: Taylor
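In the simplest two-gear case, the conditioning idea in this abstract reduces to a binomial regression: given the total catch at each length, the share taken by one gear is modeled as a logistic curve in length. A minimal sketch under that assumption (plain Newton-Raphson with a logit link; counts and parameter values are hypothetical, and this is the simple logit case, not the paper's nonstandard or parameterized links):

```python
import math

def fit_logistic(lengths, gear1, gear2, iters=25):
    """Fit P(fish of length l taken by gear 1 | caught) =
    1 / (1 + exp(-(a + b*l))) to binomial counts by Newton-Raphson
    on the conditional (binomial) log-likelihood."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        ga = gb = haa = hab = hbb = 0.0
        for l, k1, k2 in zip(lengths, gear1, gear2):
            n = k1 + k2                              # total catch at this length
            p = 1.0 / (1.0 + math.exp(-(a + b * l)))
            r = k1 - n * p                           # score contribution
            w = n * p * (1.0 - p)                    # Fisher information weight
            ga += r; gb += r * l
            haa += w; hab += w * l; hbb += w * l * l
        det = haa * hbb - hab * hab
        a += (hbb * ga - hab * gb) / det             # Newton step
        b += (haa * gb - hab * ga) / det
    return a, b

# Hypothetical counts by 10-cm length class for two mesh sizes.
lengths = [20, 30, 40, 50, 60]
gear1 = [1, 8, 50, 92, 99]
gear2 = [99, 92, 50, 8, 1]
a, b = fit_logistic(lengths, gear1, gear2)
```

Because the likelihood is concave and the data are not separated, Newton's method converges reliably here; the fitted logit then describes relative selectivity of the two gears as a function of length.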
7. Estimating a Multivariate Proportional Hazards Model for Clustered Data Using the EM Algorithm, with an Application to Child Survival in Guatemala
Journal of the American Statistical Association, Volume 87, Issue 420, 1992, Pages 969-976
Guang Guo, Germán Rodríguez
Abstract:
This article discusses a random-effects model for the analysis of clustered survival times, such as those reflecting the mortality experience of children in the same family. We describe parametric and nonparametric approaches to the specification of the random effect and show how the model may be fitted using an accelerated EM algorithm. We then fit two specifications of the model to child survival data from Guatemala. These data had been analyzed before using standard hazard models that ignore cluster effects.
ISSN: 0162-1459
DOI: 10.1080/01621459.1992.10476251
Publisher: Taylor & Francis Group
Year: 1992
Source: Taylor
8. Age Patterns of Marital Fertility: Revising the Coale–Trussell Method
Journal of the American Statistical Association, Volume 87, Issue 420, 1992, Pages 977-984
Yu Xie, Ellen Efron Pimentel
Abstract:
This article revises the Coale–Trussell method for analyzing data from the World Fertility Survey by proposing and testing alternative log-linear and log-multiplicative models. The models, in one form or another, represent the structural constraint underlying the Coale–Trussell method on the variation in the age pattern of human fertility. With a Poisson distribution assumption for the number of births, several parameters of the models are simultaneously estimated via maximum likelihood. It is shown that the new approach can be adopted whenever fertility limitation is compared across multiple populations or subpopulations. Future users of the Coale–Trussell method for single populations or subpopulations are advised to use the new v(a) estimates from this study in place of those of Coale and Trussell.
ISSN: 0162-1459
DOI: 10.1080/01621459.1992.10476252
Publisher: Taylor & Francis Group
Year: 1992
Source: Taylor
9. Leverage and Superleverage in Nonlinear Regression
Journal of the American Statistical Association, Volume 87, Issue 420, 1992, Pages 985-990
Roy T. St Laurent, R. Dennis Cook
Abstract:
Several measures of the leverage of an observation in a nonlinear regression model are defined and developed. In contrast to the upper bound on the leverage in a linear model, it is found that in a nonlinear model the leverage of an observation may exceed 1. Such a case is said to exhibit superleverage. Relationships between the leverage measures are explored, and several examples are developed to illustrate the proposed methodology.
ISSN: 0162-1459
DOI: 10.1080/01621459.1992.10476253
Publisher: Taylor & Francis Group
Year: 1992
Source: Taylor
10. Breakdown in Nonlinear Regression
Journal of the American Statistical Association, Volume 87, Issue 420, 1992, Pages 991-997
Arnold J. Stromberg, David Ruppert
Abstract:
The breakdown point is considered an important measure of the robustness of a linear regression estimator. This article addresses the concept of breakdown in nonlinear regression. Because it is not invariant to nonlinear reparameterization, the usual definition of the breakdown point in linear regression is inadequate for nonlinear regression. The original definition of breakdown due to Hampel is more suitable for nonlinear problems but may indicate breakdown when the fitted values change very little. We introduce breakdown functions, which measure breakdown of the fitted values. Using the breakdown functions, we introduce a new definition of the breakdown point. For the linear regression model, our definition of the breakdown point coincides with the usual definition for linear regression as well as with Hampel's definition. For most nonlinear regression functions, we show that the breakdown point of the least squares estimator is 1/n. We prove that for a large class of unbounded regression functions, the breakdown point of the least median of squares or the least trimmed sum of squares estimator is close to ½. For monotonic regression functions of the type g(α + βx), where g is bounded above and/or below, we establish upper and lower bounds for the breakdown points that depend on the data.
ISSN: 0162-1459
DOI: 10.1080/01621459.1992.10476254
Publisher: Taylor & Francis Group
Year: 1992
Source: Taylor