PRESS-related statistics: regression tools for cross-validation and case diagnostics

 

Authors: DAVID HOLIDAY, JOYCE BALLARD, BARRY MCKEOWN

 

Journal: Medicine and Science in Sports and Exercise (OVID, available online 1995)
Volume/Issue: Volume 27, Issue 4

Pages: 612-620

 

ISSN: 0195-9131

 

Year: 1995

 

Publisher: OVID

 

Data source: OVID

 

Abstract:

HOLIDAY, D. B., J. E. BALLARD, and B. C. McKEOWN. PRESS-related statistics: regression tools for cross-validation and case diagnostics. Med. Sci. Sports Exerc., Vol. 27, No. 4, pp. 612–620, 1995. In the health science literature, a common approach to validating a regression equation is data-splitting, where a portion of the data fits the model (fitting sample) and the remainder (validation sample) estimates future performance. The R² and SEE obtained by predicting the validation sample with the fitting sample equation are proper estimates of future performance, tending to correct for the natural upward bias of the R² and SEE obtained from the fitting sample alone. Data-splitting has several disadvantages, however. These include: 1) difficulty, arbitrariness, and inconvenience of matching samples; 2) the need to report two sets of statistics to determine homogeneity; and 3) the lack of equation stability due to diluted sample size. The PRESS statistic and associated residuals do not require the data to be split, yield alternative unbiased estimates of R² and SEE, and provide useful case diagnostics. This procedure is easy to use and widely available in modern statistical packages, but is rarely utilized. The two methods are contrasted here using a simulation from original data for predicting body density from anthropometric measurements of a group of 117 women. The PRESS approach is particularly appropriate for smaller datasets; methods of reporting these statistics are recommended.
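
The PRESS idea summarized in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `press_statistics`, the use of NumPy, and the synthetic data are all hypothetical. It relies on the standard least-squares identity that the leave-one-out (PRESS) residual for case i equals the ordinary residual divided by (1 - h_ii), where h_ii is the leverage of that case, so no refitting or data-splitting is needed. The formulas R²(PRESS) = 1 - PRESS/SS_total and SEE(PRESS) = sqrt(PRESS/n) are common conventions assumed here; the paper's exact reporting definitions are not given in the abstract.

```python
import numpy as np

def press_statistics(X, y):
    """Leave-one-out (PRESS) summaries for an ordinary least-squares fit.

    X: (n,) or (n, p) predictors without an intercept column.
    y: (n,) response.
    Assumes the design matrix has full column rank.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]

    # Design matrix with an intercept column.
    A = np.column_stack([np.ones(n), X])

    # Ordinary least-squares fit on all n cases.
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta

    # Leverages h_ii = diagonal of the hat matrix, via a reduced QR factorization.
    Q, _ = np.linalg.qr(A)
    leverage = np.sum(Q ** 2, axis=1)

    # PRESS residual: prediction error for case i when case i is left out.
    press_resid = resid / (1.0 - leverage)
    press = float(np.sum(press_resid ** 2))
    ss_total = float(np.sum((y - y.mean()) ** 2))

    return {
        "press": press,
        "r2_press": 1.0 - press / ss_total,       # assumed convention
        "see_press": float(np.sqrt(press / n)),   # assumed convention
        "press_residuals": press_resid,
    }

# Synthetic data for illustration only (not the body-density data from the paper).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 2))
y_demo = 1.0 + X_demo @ np.array([0.8, -0.3]) + rng.normal(scale=0.2, size=40)
demo = press_statistics(X_demo, y_demo)
```

Cases with unusually large entries in `press_residuals` are the ones the fitted equation predicts poorly when they are excluded from the fit, which is the case-diagnostic use the abstract mentions.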

 
