TY - CHAP

T1 - Baseline Methods for Active Learning

AU - Cawley, G

PY - 2011

Y1 - 2011

N2 - In many potential applications of machine learning, unlabelled data are abundantly available at low cost, but there is a paucity of labelled data, and labelling unlabelled examples is expensive and/or time-consuming. This motivates the development of active learning methods that seek to direct the collection of labelled examples such that the greatest performance gains can be achieved using the smallest quantity of labelled data. In this paper, we describe some simple pool-based active learning strategies, based on optimally regularised linear [kernel] ridge regression, providing a set of baseline submissions for the Active Learning Challenge. A simple random strategy, where unlabelled patterns are submitted to the oracle purely at random, is found to be surprisingly effective, being competitive with more complex approaches.

AB - In many potential applications of machine learning, unlabelled data are abundantly available at low cost, but there is a paucity of labelled data, and labelling unlabelled examples is expensive and/or time-consuming. This motivates the development of active learning methods that seek to direct the collection of labelled examples such that the greatest performance gains can be achieved using the smallest quantity of labelled data. In this paper, we describe some simple pool-based active learning strategies, based on optimally regularised linear [kernel] ridge regression, providing a set of baseline submissions for the Active Learning Challenge. A simple random strategy, where unlabelled patterns are submitted to the oracle purely at random, is found to be surprisingly effective, being competitive with more complex approaches.

M3 - Chapter

VL - 16

T3 - JMLR Workshop and Conference Proceedings

SP - 47

EP - 57

BT - JMLR: Workshop and Conference Proceedings 16

A2 - Guyon, I

A2 - Cawley, G

A2 - Dror, G

A2 - Lemaire, V

A2 - Statnikov, A

PB - Microtome

ER -