The effect of mislabeled samples on the performance of the linear learning machine
Authors:
Barry K. Lavine,
Anthony J. I. Ward,
Jian Hwa Han,
Roy‐Keith Smith,
Orley R. Taylor,
Journal:
Journal of Chemometrics
(WILEY Available online 1990)
Volume/Issue:
Volume 4,
issue 1
Pages: 47-50
ISSN:0886-9383
Year: 1990
DOI:10.1002/cem.1180040106
Publisher: John Wiley & Sons, Ltd.
Keywords: Classification; Pattern recognition; Preprocessing
Data source: WILEY
Abstract:
Over the past 15 years the linear learning machine has been applied to a large number of chemical problems. The learning machine approach is conceptually simple and does not require knowledge about the statistical distribution of the data. However, there are problems associated with this approach. One problem which has not been investigated is the influence of mislabeled samples on the positioning of the hyperplane in feature space. If a few samples in a data set are incorrectly tagged prior to training (i.e. the samples are labeled as members of class 2 even though they are actually members of class 1), it is still possible using the linear learning machine to achieve a classification success rate of 100% for the training set. However, unfavorable results will be obtained for the prediction set. The magnitude of this effect and its potential implications regarding the proper use of the linear learning machine are discussed.
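The effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure or data: it assumes the linear learning machine is the classic error-correction (perceptron) rule, and it uses synthetic Gaussian samples. The names make_samples, train, accuracy, and run_demo, and all parameter values (40 features, 10 training samples per class, 2 flipped labels) are hypothetical choices made for illustration. With more features than training samples, the training set stays linearly separable even after a few labels are flipped, so training can still reach 100% while the hyperplane is fitted to wrong labels.

```python
import random


def make_samples(n_per_class, n_features, shift, rng):
    """Draw Gaussian samples for two classes labeled +1 and -1.

    The class means differ only along the first feature (by 2 * shift).
    """
    X, y = [], []
    for label in (1, -1):
        for _ in range(n_per_class):
            x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            x[0] += label * shift
            X.append(x)
            y.append(label)
    return X, y


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train(X, y, max_epochs=10000):
    """Error-correction training: update the weights on every misclassified sample."""
    w = [0.0] * (len(X[0]) + 1)  # last weight acts as the bias term
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in zip(X, y):
            xa = x + [1.0]  # augment the sample with a constant 1 for the bias
            if label * dot(w, xa) <= 0:
                w = [wi + label * xi for wi, xi in zip(w, xa)]
                mistakes += 1
        if mistakes == 0:  # every training sample is on the correct side
            break
    return w


def accuracy(w, X, y):
    hits = sum(1 for x, label in zip(X, y) if label * dot(w, x + [1.0]) > 0)
    return hits / len(y)


def run_demo(seed=0, n_features=40, n_train=10, n_test=200, n_flipped=2):
    rng = random.Random(seed)
    X_train, y_true = make_samples(n_train, n_features, 3.0, rng)
    X_test, y_test = make_samples(n_test, n_features, 3.0, rng)

    # Mislabel a few class-1 samples (+1) as class 2 (-1) before training.
    y_bad = list(y_true)
    for i in range(n_flipped):
        y_bad[i] = -1

    w_clean = train(X_train, y_true)
    w_bad = train(X_train, y_bad)
    return {
        "train_clean": accuracy(w_clean, X_train, y_true),
        "test_clean": accuracy(w_clean, X_test, y_test),
        # Training accuracy is measured against the labels the machine was given.
        "train_mislabeled": accuracy(w_bad, X_train, y_bad),
        "test_mislabeled": accuracy(w_bad, X_test, y_test),
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

Comparing test_clean with test_mislabeled over several seeds gives a rough feel for how far a few wrong tags can move the hyperplane. The size of the gap depends on the assumed class separation and on the ratio of features to samples, so these numbers say nothing about the magnitudes reported in the paper.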