Towards a unified paradigm for sequence-based identification of Fungi

The nuclear ribosomal internal transcribed spacer (ITS) region is the formal fungal
barcode and in most cases the marker of choice for exploration of fungal diversity in
environmental samples. Two problems are particularly acute in the pursuit of satisfactory
taxonomic assignment of newly generated ITS sequences: (i) the lack of an inclusive, reliable
public reference dataset, and (ii) the lack of a standardized, stable means of referring to fungal
species for which no Latin name is available. Here we report on progress in these regards
through further development of the UNITE database for molecular
identification of fungi. All fungal species represented by at least two ITS sequences in the
international nucleotide sequence databases are now given a unique, stable name of the accession
number type (e.g., Hymenoscyphus pseudoalbidus|GU586904|SH133781.05FU), and their
taxonomic and ecological annotations were corrected as far as possible through a distributed,
third-party annotation effort. We introduce the term “species hypothesis” (SH) for the taxa
discovered in clustering at different similarity thresholds (97-99%). An automatically or manually
designated sequence is chosen to represent each such species hypothesis. These reference
sequences are released for use by the scientific community in,
e.g., local sequence similarity searches and in the QIIME pipeline. The system and the data will
be updated automatically as the number of public fungal ITS sequences grows. We invite
everybody in a position to improve the annotation or metadata associated with their particular
fungal lineages of expertise to do so through the new web-based sequence management system