TY - GEN
T1 - Explaining the Inherent Tradeoffs for Suffix Array Functionality
T2 - 37th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2026
AU - Kempa, Dominik
AU - Kociumaka, Tomasz
N1 - Publisher Copyright:
© 2026 Association for Computing Machinery. All rights reserved.
PY - 2026
Y1 - 2026
N2 - We study the fundamental question of how efficiently suffix array entries can be accessed when the array cannot be stored explicitly. The suffix array SAT [1 . . n] of a text T of length n encodes the lexicographic order of its suffixes and underlies numerous applications in pattern matching, data compression, and bioinformatics. Previous work established one-way reductions showing how suffix array queries can be answered using, for example, rank queries on the Burrows–Wheeler Transform. More recently, a new class of prefix queries was introduced, together with reductions that, among others, transform a simple tradeoff for prefix-select queries into a suffix array tradeoff matching state-of-the-art space and query-time bounds, while achieving sublinear construction time. For binary texts, the resulting data structure achieves space O(n) bits, preprocessing time O(n/√log n), preprocessing space O(n) bits, and query time O(logϵ n) for any constant ϵ > 0. However, whether these bounds could be improved using different techniques has remained open. We resolve this question by presenting the first bidirectional reduction showing that suffix array queries are, up to an additive O(log log n) term in query time, equivalent to prefix-select queries in all parameters. This result unifies prior approaches and shows that essentially all efficient suffix array representations can be expressed via prefix-select structures. Moreover, we prove analogous equivalences for inverse suffix array queries, pattern ranking, lexicographic range, and SA-interval queries, identifying six core problem pairs that connect string and prefix query models. Our framework thus provides a unified foundation for analyzing and improving the efficiency of fundamental string-processing problems through the lens of prefix queries.
AB - We study the fundamental question of how efficiently suffix array entries can be accessed when the array cannot be stored explicitly. The suffix array SAT [1 . . n] of a text T of length n encodes the lexicographic order of its suffixes and underlies numerous applications in pattern matching, data compression, and bioinformatics. Previous work established one-way reductions showing how suffix array queries can be answered using, for example, rank queries on the Burrows–Wheeler Transform. More recently, a new class of prefix queries was introduced, together with reductions that, among others, transform a simple tradeoff for prefix-select queries into a suffix array tradeoff matching state-of-the-art space and query-time bounds, while achieving sublinear construction time. For binary texts, the resulting data structure achieves space O(n) bits, preprocessing time O(n/√log n), preprocessing space O(n) bits, and query time O(logϵ n) for any constant ϵ > 0. However, whether these bounds could be improved using different techniques has remained open. We resolve this question by presenting the first bidirectional reduction showing that suffix array queries are, up to an additive O(log log n) term in query time, equivalent to prefix-select queries in all parameters. This result unifies prior approaches and shows that essentially all efficient suffix array representations can be expressed via prefix-select structures. Moreover, we prove analogous equivalences for inverse suffix array queries, pattern ranking, lexicographic range, and SA-interval queries, identifying six core problem pairs that connect string and prefix query models. Our framework thus provides a unified foundation for analyzing and improving the efficiency of fundamental string-processing problems through the lens of prefix queries.
UR - https://www.scopus.com/pages/publications/105033622349
U2 - 10.1137/1.9781611978971.66
DO - 10.1137/1.9781611978971.66
M3 - Conference contribution
AN - SCOPUS:105033622349
T3 - Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms
SP - 1841
EP - 1858
BT - Proceedings of the 2026 Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2026
A2 - Larsen, Kasper Green
A2 - Saha, Barna
PB - Association for Computing Machinery
Y2 - 11 January 2026 through 14 January 2026
ER -