A survey of string orderings and their application to the Burrows-Wheeler transform

Authors and organisations
  • Jacqueline Daykin (Author)
    Royal Holloway, University of London
    Normandy University
  • Richard Groult (Author)
    University of Picardie Jules Verne
    Normandy University
  • Yannick Guesnet (Author)
    Normandy University
  • Thierry Lecroq (Author)
    Normandy University
  • Arnaud Lefebvre (Author)
    Normandy University
  • Martine Léonard (Author)
    Normandy University
  • Élise Prieur-Gaston (Author)
    Normandy University
Type: Article
Original language: English
Pages (from-to): 52-65
Number of pages: 14
Journal: Theoretical Computer Science
Volume: 710
Early online date: 01 Mar 2017
Publication status: Published - 01 Feb 2018

Abstract

For over 20 years the data clustering properties and applications of the efficient Burrows–Wheeler transform have been researched. Lexicographic suffix-sorting is induced during the transformation, and more recently a new direction has considered alternative ordering strategies for suffix arrays and thus the transforms. In this survey we look at these distinctly ordered bijective and linear transforms. For arbitrary alphabets we discuss the V-BWT derived from V-order and the D-BWT based on lex-extension order. The binary case yields a pair of transforms, the binary Rouen B-BWT, defined using binary block order. Lyndon words are relevant to implementing the original transform; the new transforms are defined for analogous structures: V-words, indeterminate Lyndon words, and B-words, respectively. There is plenty of scope for further non-lexicographic transforms as indicated in the conclusion.
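As a point of reference for the abstract, the following is a minimal Python sketch of the classic lexicographic Burrows–Wheeler transform only; it is not taken from the article. It builds the transform naively by sorting all rotations of the input, which is quadratic and meant for illustration rather than as the efficient suffix-sorting construction. The function names `bwt` and `inverse_bwt` and the `$` end-of-string sentinel are assumptions made for this example. The V-BWT, D-BWT and B-BWT discussed in the survey replace the lexicographic order with other orderings and are not implemented here.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Classic BWT: last column of the lexicographically sorted rotations.

    Assumption for this sketch: `sentinel` does not occur in `text` and
    sorts before every other symbol (true for "$" against ASCII letters).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Sort every rotation of s in plain lexicographic order.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last symbol of each sorted rotation.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last: str, sentinel: str = "$") -> str:
    """Naive inversion: repeatedly prepend the last column and re-sort."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The original string is the row that ends with the sentinel.
    row = next(r for r in table if r.endswith(sentinel))
    return row[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

In the output `annb$aa`, equal symbols sit next to each other, which is a small instance of the data clustering behaviour the abstract refers to.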

Keywords

  • algorithm, bijective alphabet, block order, Burrows-Wheeler transform, B-word, data clustering, degenerate, GB-word, generic alphabet, generic block order, indeterminate Lyndon word, inverse transform, lexicographic order, linear, Lyndon word, string, suffix array, suffix-sorting, T-order, V-order, word