Computational modelling of the relationship between Miscanthus genotype, phenotype and environment

Type

Student thesis: Doctoral Thesis, Doctor of Philosophy

Original language: English
Thesis sponsors
  • Biotechnology and Biological Sciences Research Council
Award date: 01 Jun 2015

Abstract

Several major global challenges are being faced in the 21st century, ranging from climate change, energy security and food security to sustainable living. Innovative solutions are needed to address these challenges. Miscanthus is a highly productive C4 grass that occurs naturally in Asia and has potential for use as a bioenergy crop. Recent advances in technologies such as genomics, phenomics, bioinformatics and modelling provide a unique opportunity to accelerate the domestication of Miscanthus. Modern breeding programmes aim to use genetic information to assist in breeding decisions. High-throughput technologies such as genotyping-by-sequencing (GBS) generate massive datasets. Conventional analysis methods cannot handle such large, multi-dimensional datasets, so new methodologies are needed.
This research aims to use machine learning to model marker-trait association and genotype-by-environment interaction in Miscanthus. Three studies were performed: 1) develop a machine learning based QTL analysis tool to detect QTL in a Miscanthus flowering time mapping population; 2) conduct marker-trait association in a GBS analysis; 3) establish a predictive model to understand drought and thermal effects on flowering time in Miscanthus. The machine learning algorithm random forest was used to develop a QTL analysis tool, referred to as RFQTL. RFQTL identified several flowering QTL that were consistent with conventional QTL analysis, with reduced computation time. Within the GBS study, machine learning detected markers that, when aligned with the Sorghum genome, revealed several homologous QTLs for the traits investigated. Using the predictive model of flowering time, we showed that drought delays flowering, whereas increased temperature leads to earlier flowering. This research has demonstrated the power of machine learning as an effective method for marker-trait association and genotype-by-environment modelling. It has great potential to play a crucial role in crop improvement and to provide further scientific insights for genetic research.
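The general idea of ranking markers with a random forest can be illustrated with a minimal sketch. This is not the RFQTL tool described in the thesis, whose implementation is not given here. It assumes synthetic biallelic markers coded 0/1/2, a simulated trait driven by a single marker, and a simplified forest of depth-one regression trees (stumps) in place of full trees. All function names and parameter values are hypothetical.

```python
import random


def simulate(n_plants=200, n_markers=30, causal=7, seed=0):
    """Synthetic data (assumption): markers coded 0/1/2, trait driven by one marker."""
    rng = random.Random(seed)
    genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_markers)]
                 for _ in range(n_plants)]
    # Simulated trait: arbitrary baseline, additive marker effect, Gaussian noise.
    trait = [150.0 + 5.0 * g[causal] + rng.gauss(0.0, 2.0) for g in genotypes]
    return genotypes, trait


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_stump(genotypes, trait, rows, markers):
    """Return (marker, gain) for the best single split among candidate markers."""
    parent = sse([trait[i] for i in rows])
    best = (None, 0.0)
    for m in markers:
        for threshold in (0, 1):  # split: genotype <= threshold vs > threshold
            left = [trait[i] for i in rows if genotypes[i][m] <= threshold]
            right = [trait[i] for i in rows if genotypes[i][m] > threshold]
            if not left or not right:
                continue
            gain = parent - sse(left) - sse(right)
            if gain > best[1]:
                best = (m, gain)
    return best


def marker_importance(genotypes, trait, n_trees=200, seed=1):
    """Forest of stumps: bootstrap plants, sample markers, sum variance reduction."""
    rng = random.Random(seed)
    n, p = len(genotypes), len(genotypes[0])
    mtry = max(1, p // 3)  # number of candidate markers per tree
    importance = [0.0] * p
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of plants
        markers = rng.sample(range(p), mtry)         # random marker subset
        m, gain = best_stump(genotypes, trait, rows, markers)
        if m is not None:
            importance[m] += gain
    total = sum(importance)
    return [v / total for v in importance] if total else importance


if __name__ == "__main__":
    genotypes, trait = simulate()
    importance = marker_importance(genotypes, trait)
    ranked = sorted(range(len(importance)), key=lambda m: -importance[m])
    print("Top markers by importance:", ranked[:5])
```

In this sketch the importance of a marker is the summed reduction in trait variance over all trees that split on it, normalised to sum to one. A practical analysis would more likely use a full random forest implementation, together with permutation-based significance thresholds.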