Skip to main navigation Skip to search Skip to main content

Hybrid indexing revisited

  • University of Helsinki

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

19 Scopus citations

Abstract

Hybrid indexing is a recent approach to text indexing that allows the space-usage of conventional text indexes (e.g., su x trees, su x arrays, FM-indexes) to scale well with the text size, n, when z, the size of the Lempel-Ziv parsing of the text, is small relative to n. The price for this improved scalability is that an upper bound M on the pattern length that can be searched for must be declared at index construction time. Because the size of the resulting index contains an O(Mz) term, M must be kept reasonably small, though it has been shown that M ? 100 leads to acceptable performance in some genomic applications. However, despite its promise, the practical performance of hybrid indexing relative to other compressed index data structures is poorly understood. This paper addresses that need, detailing experiments that show hybrid indexing — when carefully implemented — to be significantly smaller and faster than alternative approaches on a broad range of data of di erent levels of compressibility. We also describe practical extensions to hybrid indexing that obviate the restriction on M, supporting search for patterns of arbitrary length.

Original languageEnglish
Title of host publication2018 Proceedings of the 20th Workshop on Algorithm Engineering and Experiments, ALENEX 2018
EditorsSuresh Venkatasubramanian, Rasmus Pagh
PublisherSociety for Industrial and Applied Mathematics Publications
Pages1-8
Number of pages8
ISBN (Electronic)9781611975055
DOIs
StatePublished - 2018
Event20th Workshop on Algorithm Engineering and Experiments, ALENEX 2018 - New Orleans, United States
Duration: Jan 7 2018Jan 8 2018

Publication series

NameProceedings of the Workshop on Algorithm Engineering and Experiments
Volume2018-January
ISSN (Print)2164-0300

Conference

Conference20th Workshop on Algorithm Engineering and Experiments, ALENEX 2018
Country/TerritoryUnited States
CityNew Orleans
Period01/7/1801/8/18

Fingerprint

Dive into the research topics of 'Hybrid indexing revisited'. Together they form a unique fingerprint.

Cite this