TY - JOUR
T1 - A comparison of variate pre-selection methods for use in partial least squares regression: A case study on NIR spectroscopy applied to monitoring beer fermentation
AU - McLeod, Georgina
AU - Clelland, Kirsty
AU - Tapp, Henri
AU - Kemsley, E. Katherine
AU - Wilson, Reginald H.
AU - Poulter, Graham
AU - Coombs, David
AU - Hewitt, Christopher J.
N1 - Funding Information: The authors thank the BBSRC for funding this work and Coors brewery Ltd. for providing the Grolsch lager wort.
PY - 2009/1
Y1 - 2009/1
AB - This work investigates four methods of selecting variates from near-infrared (NIR) spectra for use in partial least squares (PLS) regression models to predict biomass and chemical changes during beer fermentation. The fermentation parameters studied were ethanol concentration, specific gravity (SG), optical density (OD) and dry cell weight (DCW). The four selection methods investigated were: Simple, where a fingerprint region is chosen manually; CovProc, a covariance procedure where variates are introduced based on the magnitude of the first PLS vector coefficients; CovProc-SavGo, a modification to CovProc where the window size of a Savitzky-Golay filter applied to the spectra is also optimised; and genetic algorithm (GA), where variates are selected based on the frequency of appearance in 8-variate multiple linear regression models found from repeated execution of the GA routine. The analysis found that all four methods produced good predictive models. The GA approach produced the lowest standard error in prediction (SEP) based on leave-one-out cross-validation (LOO-CV), although this advantage was not reflected in the standard error in validation values, SEV, where all four models performed comparably. From this work, we would recommend using the Simple approach if a suitable fingerprint region can be identified, and using CovProc otherwise.
KW - Brewing
KW - Genetic algorithm
KW - NIR spectroscopy
KW - PLS regression
KW - Variate selection
UR - http://www.scopus.com/inward/record.url?scp=49749141422&partnerID=8YFLogxK
U2 - 10.1016/j.jfoodeng.2008.06.037
DO - 10.1016/j.jfoodeng.2008.06.037
M3 - Article
AN - SCOPUS:49749141422
VL - 90
SP - 300
EP - 307
JO - Journal of Food Engineering
JF - Journal of Food Engineering
SN - 0260-8774
IS - 2
ER -