Preprocessing, classification modeling and feature selection using flow injection electrospray mass spectrometry metabolite fingerprint data

Type: Article
Original language: English
Pages (from-to): 446-470
Number of pages: 25
Journal: Nature Protocols
Publication status: Published - 23 Feb 2008


Metabolome analysis by flow injection electrospray mass spectrometry (FIE-MS) fingerprinting generates measurements for large numbers of m/z signals. Such data sets often exhibit high variance and have few replicates, which makes data mining challenging. We describe data preprocessing and modeling methods that have proved reliable in projects involving samples from a range of organisms. The protocols use metabolomics-specific software resources provided in a Web-accessible data analysis package, FIEmspro, which is written in the R environment and requires a moderate knowledge of R command-line usage. Specific emphasis is placed on describing the outcome of modeling experiments using FIE-MS data that require further preprocessing to improve quality. The salient features of both poor and robust (i.e., highly generalizable) multivariate models are outlined, together with advice on validating classifiers and avoiding false discovery when seeking explanatory variables.
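
The point about validating classifiers and avoiding false discovery can be illustrated with a small sketch. It is not taken from the protocol and does not use FIEmspro or its R API; it is a standard-library Python illustration on synthetic noise data, and every name in it (`welch_t`, `rank_features`, `nearest_centroid_predict`, `cv_accuracy`) is hypothetical. The ranking statistic (Welch t) and the nearest-centroid classifier are stand-ins chosen for brevity, not the methods of the paper. The sketch compares selecting features once on all samples before cross-validation with re-selecting them inside each training fold, which is one common way that many m/z variables combined with few replicates can produce over-optimistic accuracy estimates.

```python
import random
import statistics


def welch_t(a, b):
    """Absolute Welch t-statistic between two lists of values."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def rank_features(X, y, k):
    """Indices of the k features with the largest |t| between classes 0 and 1."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        scores.append((welch_t(a, b), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def nearest_centroid_predict(X_train, y_train, X_test, feats):
    """Assign each test row to the class whose centroid is closest on `feats`."""
    centroids = {}
    for lab in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == lab]
        centroids[lab] = [statistics.mean(r[j] for r in rows) for j in feats]
    preds = []
    for r in X_test:
        dist = {lab: sum((r[j] - c) ** 2 for j, c in zip(feats, cen))
                for lab, cen in centroids.items()}
        preds.append(min(dist, key=dist.get))
    return preds


def cv_accuracy(X, y, k_features, n_folds, select_inside, seed=0):
    """k-fold cross-validated accuracy.

    select_inside=True  : features are re-ranked on each training fold only.
    select_inside=False : features are ranked once on ALL samples, so the
                          held-out samples influence the selection (leakage).
    """
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    global_feats = None if select_inside else rank_features(X, y, k_features)
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        feats = rank_features(X_tr, y_tr, k_features) if select_inside else global_feats
        preds = nearest_centroid_predict(X_tr, y_tr, [X[i] for i in fold], feats)
        correct += sum(p == y[i] for p, i in zip(preds, fold))
    return correct / len(X)


# Synthetic stand-in for a fingerprint matrix: 20 samples, 500 variables,
# pure Gaussian noise, so no variable truly separates the two classes.
rng = random.Random(42)
n_samples, n_features = 20, 500
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [i % 2 for i in range(n_samples)]

acc_leaky = cv_accuracy(X, y, k_features=10, n_folds=5, select_inside=False)
acc_honest = cv_accuracy(X, y, k_features=10, n_folds=5, select_inside=True)
print(f"selection on all samples: {acc_leaky:.2f}")
print(f"selection inside folds:   {acc_honest:.2f}")
```

Because the data are pure noise, an honest estimate should sit near chance (0.5), whereas selecting variables on all samples first typically yields a noticeably higher figure. The exact values depend on the random seed, so none are quoted here.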