Secondary Data Analysis when there are Missing Observations
Authors:
R. Wang,
J. Sedransk,
J.H. Jinn,
Journal:
Journal of the American Statistical Association
(Taylor & Francis; available online 1992)
Volume/Issue:
Volume 87,
issue 420
Pages: 952-961
ISSN:0162-1459
Year: 1992
DOI:10.1080/01621459.1992.10476249
Publisher: Taylor & Francis Group
Keywords: Incomplete data; Multiple imputations; Nonignorable nonresponse; Weighted distributions
Data source: Taylor
Abstract:
A data set having missing observations often is completed by using imputed values. Our objective is to improve the practice of secondary data analysis by looking at the interplay of different imputation techniques and different methods that secondary data analysts use when there are both observed and imputed values. Secondary data analysts typically either treat the completed data set as if it has only observed values or ignore the imputations and analyze only the observed values. The first objective of our research is to investigate the effect on the properties of standard statistical techniques of proceeding in these ways. We assume that the missing data cannot be regarded as missing at random (MAR), and that the secondary data analyst's objectives are confidence intervals for the regression coefficients in a simple linear regression. Standard, “general purpose” imputation methods are emphasized. The second objective is to investigate the performance of confidence intervals based on multiple imputations. Obtaining moments of statistics requires averaging using a weighted distribution. Because analytical results typically cannot be obtained, we show how to obtain lower and upper bounds that can be computed easily. We also present a simple parametric function for the probability of response given variables of interest, and validate it using data from the 1987 Economic Censuses. We also summarize our findings and make recommendations to the organizations providing the imputations. Finally, we delineate the options available to secondary data analysts.