Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform

Standard

Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform. / Daykin, Jacqueline; Groult, Richard; Guesnet, Yannick et al.

In: Information Processing Letters, 15.03.2019.

Research output: Contribution to journalArticlepeer-review

Harvard

Daykin, J, Groult, R, Guesnet, Y, Lecroq, T, Lefebvre, A, Léonard, M, Mouchard, L, Prieur-Gaston, É & Watson, B 2019, 'Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform', Information Processing Letters. https://doi.org/10.1016/j.ipl.2019.03.003

APA

Daykin, J., Groult, R., Guesnet, Y., Lecroq, T., Lefebvre, A., Léonard, M., Mouchard, L., Prieur-Gaston, É., & Watson, B. (2019). Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform. Information Processing Letters. https://doi.org/10.1016/j.ipl.2019.03.003

Vancouver

Daykin J, Groult R, Guesnet Y, Lecroq T, Lefebvre A, Léonard M et al. Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform. Information Processing Letters. 2019 Mar 15. Epub 2019 Mar 15. doi: 10.1016/j.ipl.2019.03.003

Author

Daykin, Jacqueline ; Groult, Richard ; Guesnet, Yannick et al. / Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform. In: Information Processing Letters. 2019.

Bibtex - Download

@article{74748a11910248f19aab5d8070f76e5a,
title = "Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform",
abstract = "A degenerate or indeterminate string on an alphabet Σ is a sequence of non-empty subsets of Σ. Given a degenerate string t of length n and its Burrows–Wheeler transform we present a new method for searching for a degenerate pattern of length m in t running in O(mn) time on a constant size alphabet Σ. Furthermore, it is a hybrid pattern matching technique that works on both regular and degenerate strings. A degenerate string is said to be conservative if its number of non-solid letters is upper-bounded by a fixed positive constant q; in this case we show that the search time complexity is O(qm2) for counting the number of occurrences andO(qm2+occ) for reporting the found occurrences where occ is the number of occurrences of the pattern in t. Experimental results show that our method performs well in practice",
keywords = "algorithm, Burrows-Wheeler transform, conservative, degenerate, pattern matching, string",
author = "Jacqueline Daykin and Richard Groult and Yannick Guesnet and Thierry Lecroq and Arnaud Lefebvre and Martine L{\'e}onard and Laurent Mouchard and {\'E}lise Prieur-Gaston and Bruce Watson",
year = "2019",
month = mar,
day = "15",
doi = "10.1016/j.ipl.2019.03.003",
language = "English",
journal = "Information Processing Letters",
issn = "0020-0190",
publisher = "Elsevier",

}

RIS (suitable for import to EndNote) - Download

TY - JOUR

T1 - Efficient pattern matching in degenerate strings with the Burrows–Wheeler transform

AU - Daykin, Jacqueline

AU - Groult, Richard

AU - Guesnet, Yannick

AU - Lecroq, Thierry

AU - Lefebvre, Arnaud

AU - Léonard, Martine

AU - Mouchard, Laurent

AU - Prieur-Gaston, Élise

AU - Watson, Bruce

PY - 2019/3/15

Y1 - 2019/3/15

N2 - A degenerate or indeterminate string on an alphabet Σ is a sequence of non-empty subsets of Σ. Given a degenerate string t of length n and its Burrows–Wheeler transform we present a new method for searching for a degenerate pattern of length m in t running in O(mn) time on a constant size alphabet Σ. Furthermore, it is a hybrid pattern matching technique that works on both regular and degenerate strings. A degenerate string is said to be conservative if its number of non-solid letters is upper-bounded by a fixed positive constant q; in this case we show that the search time complexity is O(qm2) for counting the number of occurrences andO(qm2+occ) for reporting the found occurrences where occ is the number of occurrences of the pattern in t. Experimental results show that our method performs well in practice

AB - A degenerate or indeterminate string on an alphabet Σ is a sequence of non-empty subsets of Σ. Given a degenerate string t of length n and its Burrows–Wheeler transform we present a new method for searching for a degenerate pattern of length m in t running in O(mn) time on a constant size alphabet Σ. Furthermore, it is a hybrid pattern matching technique that works on both regular and degenerate strings. A degenerate string is said to be conservative if its number of non-solid letters is upper-bounded by a fixed positive constant q; in this case we show that the search time complexity is O(qm2) for counting the number of occurrences andO(qm2+occ) for reporting the found occurrences where occ is the number of occurrences of the pattern in t. Experimental results show that our method performs well in practice

KW - algorithm

KW - Burrows-Wheeler transform

KW - conservative

KW - degenerate

KW - pattern matching

KW - string

U2 - 10.1016/j.ipl.2019.03.003

DO - 10.1016/j.ipl.2019.03.003

M3 - Article

JO - Information Processing Letters

JF - Information Processing Letters

SN - 0020-0190

ER -

Show download statistics
View graph of relations
Citation formats