On the interpretation of high throughput MS based metabolomics fingerprints with Random Forest

Type: Conference Proceeding (Non-Journal item)
Original language: English
Title of host publication: Second International Symposium, CompLife 2006, Cambridge, UK, September 27-29, 2006. Proceedings
Publisher: Springer Nature
Pages: 226-235
Number of pages: 10
ISBN (Electronic): 978-3-540-45768-8
ISBN (Print): 978-3-540-45767-1
DOI
Publication status: Published - 2006
Event: International Symposium, CompLife - Cambridge, United Kingdom of Great Britain and Northern Ireland
Duration: 27 Sept 2006 - 29 Sept 2006

Conference

Conference: International Symposium, CompLife
Country/Territory: United Kingdom of Great Britain and Northern Ireland
City: Cambridge
Period: 27 Sept 2006 - 29 Sept 2006

Abstract

We discuss the application of a machine learning method, Random Forest (RF), to the extraction of relevant biological knowledge from metabolomics fingerprinting experiments. The importance of RF margins and variable significance, as well as prediction accuracy, is discussed to provide insight into model generalisability and explanatory power. A method is described for detecting relevant features while conserving the redundant structure of the fingerprint data. The methodology is illustrated using two electrospray ionisation mass spectrometry datasets, from 27 Arabidopsis genotypes and from a set of transgenic potato lines.
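
The three quantities named in the abstract (prediction accuracy, RF margins and variable importance) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses a synthetic matrix in place of real fingerprint data, and the helper `rf_margins` is a hypothetical name introduced here. The margin is taken in its usual sense, the out-of-bag support for the true class minus the largest support for any other class. scikit-learn averages per-tree class probabilities rather than counting hard votes, so the values are a probability-based approximation of vote margins, and the importances shown are the impurity-based ones that scikit-learn reports by default.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def rf_margins(oob_proba, y_idx):
    """Per-sample margin: OOB support for the true class minus the
    largest OOB support for any other class (range -1 to 1)."""
    rows = np.arange(len(y_idx))
    true_support = oob_proba[rows, y_idx]
    others = oob_proba.copy()
    others[rows, y_idx] = -np.inf  # mask the true class before taking the max
    return true_support - others.max(axis=1)


# Synthetic stand-in for a fingerprint matrix (samples x m/z bins); not real data.
# Redundant features loosely mimic correlated signals in a fingerprint.
X, y = make_classification(n_samples=120, n_features=50, n_informative=5,
                           n_redundant=10, n_classes=3, random_state=0)

forest = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
forest.fit(X, y)

# Labels here are already 0..k-1, matching the column order of forest.classes_.
margins = rf_margins(forest.oob_decision_function_, y)
importances = forest.feature_importances_          # impurity-based, sums to 1
top_features = np.argsort(importances)[::-1][:10]  # ten highest-ranked features

print("OOB accuracy:", forest.oob_score_)
print("mean margin:", margins.mean())
print("top features:", top_features)
```

A sample with a negative margin is one the forest misclassifies out of bag, and a mean margin close to zero suggests the model separates the classes only weakly even when accuracy looks acceptable.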