Recovery of gene haplotypes from a metagenome

Type: Working paper
Original language: English
Publication status: Published - 17 Oct 2019


Elucidating the population-level diversity of microbiomes is a significant step towards a complete understanding of the evolutionary, ecological and functional importance of microbial communities. Characterizing this diversity requires recovering the exact DNA sequence (haplotype) of each gene isoform from every individual present in the community. To address this, we present Hansel and Gretel: a freely available data structure and algorithm, respectively, which together form a software package that reconstructs the most likely haplotypes from metagenomes. We demonstrate recovery of haplotypes from short-read Illumina data for a bovine rumen microbiome, and verify that our predictions are 100% accurate using long-read PacBio CCS sequencing. We also show that Gretel's haplotypes can be analyzed to detect a significant difference in mutation rates between core and accessory gene families in an ovine rumen microbiome. All tools, documentation and data for evaluation are open source and available via our repository: