Predicting sugar regulation in Arabidopsis thaliana using kernel learning methods

K. Saadi, Kee-Khoon Lee, G. C. Cawley, M. W. Bevan

Research output: Contribution to conference (Other)


The ability to predict the transcriptional regulation of genes, based on the composition of the upstream promoter region, would be a useful step in deciphering gene regulatory networks in eukaryotic organisms. In this paper we perform optimally regularised kernel Fisher discriminant (ORKFD) analysis of the upstream promoter sequences of genes to predict whether they are up- or down-regulated in response to glucose in the model plant Arabidopsis thaliana. Three feature selection strategies are investigated, namely the use of known promoter motifs drawn from the PLACE database, explicit enumeration of all possible k-mers, and the use of mismatch kernels (which effectively permit the construction of a linear model in the space of all possible k-mers with up to m mismatches). The leave-one-out cross-validation (LOOCV) error rate indicates that approximately two-thirds of the observed regulatory behaviour can be inferred from the presence of particular motifs in the upstream promoter sequence. The analysis has yielded novel biological insight, which has since been confirmed experimentally in vivo.
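As a rough illustration of the k-mer and mismatch-kernel features mentioned in the abstract, the following minimal Python sketch counts k-mers in a DNA sequence and evaluates a brute-force (k, m)-mismatch kernel between two sequences. It is not the authors' implementation: the function names (`kmer_counts`, `mismatch_features`, `mismatch_kernel`) and the four-letter alphabet are assumptions made for illustration, and the exhaustive enumeration is only practical for small k.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed DNA alphabet; ambiguous bases are not handled


def kmer_counts(seq, k):
    """Count each k-mer that occurs in seq (explicit k-mer enumeration)."""
    counts = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    return counts


def mismatch_features(seq, k, m):
    """Feature vector over all 4**k k-mers.

    Each window of seq adds 1 to every k-mer that differs from it
    in at most m positions (Hamming distance <= m).
    """
    all_kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    features = dict.fromkeys(all_kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        for kmer in all_kmers:
            if sum(a != b for a, b in zip(window, kmer)) <= m:
                features[kmer] += 1
    return features


def mismatch_kernel(s, t, k, m):
    """Inner product of the (k, m)-mismatch feature vectors of s and t."""
    fs = mismatch_features(s, k, m)
    ft = mismatch_features(t, k, m)
    return sum(fs[kmer] * ft[kmer] for kmer in fs)
```

With m = 0 the kernel reduces to an inner product of plain k-mer counts; allowing m > 0 lets near-matching motifs contribute to the similarity between two promoter sequences. A kernel matrix built this way could then be passed to any kernel classifier, with the classifier choice left open here.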
Original language: English
Number of pages: 6
Publication status: Published - 2005
Event: 2005 International Joint Conference on Neural Networks - Montreal, Canada
Duration: 31 Jul 2005 to 4 Aug 2005


Conference: 2005 International Joint Conference on Neural Networks
Abbreviated title: IJCNN-2005
