A simple trick for constructing Bayesian formulations of sparse kernel learning methods

Research output: Contribution to conference › Paper


Abstract

In this paper, we present a simple mathematical trick that simplifies the derivation of Bayesian treatments of a variety of sparse kernel learning methods. The incomplete Cholesky factorisation of Fine and Scheinberg (2001) is used to transform the dual parameter space, such that the covariance matrix of the Gaussian prior over model parameters becomes the identity matrix. The regularisation term is then the familiar weight-decay regulariser, allowing the Bayesian analysis to proceed straightforwardly via the methods developed by MacKay (1992). As a by-product, the incomplete Cholesky factorisation algorithm also identifies a subset of the training data forming an approximate basis for the remaining data in feature space, resulting in a sparse model. Bayesian treatments of the kernel ridge regression algorithm (Saunders et al., 1998), with both constant and input-dependent variance structures, are given as illustrative examples of the proposed technique, which we hope will be more widely applicable.
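The central manipulation described in the abstract can be sketched in a few lines. Given a kernel matrix K with incomplete Cholesky factorisation K ≈ G Gᵀ, the change of variables β = Gᵀα turns the kernel regulariser αᵀKα into the weight-decay term βᵀβ, so the model becomes an ordinary ridge regression on the "features" G with an identity-covariance Gaussian prior. The sketch below is illustrative only, assuming NumPy; the function names, the tolerance tol, and the fixed ridge parameter lam are our own choices rather than the paper's (in the paper the regularisation constant would be set by MacKay's evidence framework, not fixed by hand).

```python
import numpy as np

def incomplete_cholesky(K, tol=1e-6):
    """Pivoted incomplete Cholesky factorisation K ~= G @ G.T
    (Fine and Scheinberg, 2001). The pivot indices identify a
    subset of training points whose feature-space images form
    an approximate basis for the remaining data."""
    n = K.shape[0]
    G = np.zeros((n, n))
    d = np.diag(K).astype(float).copy()   # residual diagonal of K - G G^T
    pivots = []
    for j in range(n):
        i = int(np.argmax(d))
        if d[i] <= tol:                   # remaining residual is negligible
            break
        pivots.append(i)
        # New column: part of K[:, i] not yet explained, scaled so that
        # G[i, j] = sqrt(d[i]).
        col = K[:, i] - G[:, :j] @ G[i, :j]
        G[:, j] = col / np.sqrt(d[i])
        d -= G[:, j] ** 2                 # update residual diagonal
    return G[:, :len(pivots)], pivots

def fit_reparameterised_krr(K, y, lam=1.0):
    """Kernel ridge regression after the change of variables
    beta = G^T alpha.  Since K alpha ~= G beta and
    alpha^T K alpha ~= beta^T beta, the problem reduces to
    ordinary ridge regression with a weight-decay regulariser,
    i.e. a spherical Gaussian prior over beta."""
    G, pivots = incomplete_cholesky(K)
    m = G.shape[1]
    beta = np.linalg.solve(G.T @ G + lam * np.eye(m), G.T @ y)
    return beta, G, pivots
```

The pivot indices returned by incomplete_cholesky are the subset of training points acting as an approximate basis, which is what makes the resulting model sparse.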
Original language: English
Pages: 1425-1430
Number of pages: 6
DOIs
Publication status: Published - 2005
Event: Proceedings of the International Joint Conference on Neural Networks (IJCNN-2005)
Duration: 31 Jul 2005 – 4 Aug 2005

Conference

Conference: Proceedings of the International Joint Conference on Neural Networks (IJCNN-2005)
Period: 31/07/05 – 4/08/05
