Anti-persistence on persistent storage: History-independent sparse tables and dictionaries

  • Michael A. Bender
  • Jonathan W. Berry
  • Rob Johnson
  • Thomas M. Kroeger
  • Samuel McCauley
  • Cynthia A. Phillips
  • Bertrand Simon
  • Shikha Singh
  • David Zage
  • Sandia National Laboratories, New Mexico
  • Stony Brook University
  • Sandia National Laboratories
  • École normale supérieure de Lyon
  • Intel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

22 Scopus citations

Abstract

We present history-independent alternatives to a B-tree, the primary indexing data structure used in databases. A data structure is history independent (HI) if examining its bit representation reveals no information beyond what is already available through the API. We show how to build a history-independent cache-oblivious B-tree and a history-independent external-memory skip list. One of the main contributions is a data structure we build along the way: a history-independent packed-memory array (PMA). The PMA supports efficient range queries, one of the most important operations for answering database queries. Our HI PMA matches the asymptotic bounds of prior non-HI packed-memory arrays and sparse tables. Specifically, a PMA maintains a dynamic set of elements in sorted order in a linear-sized array. Inserts and deletes take amortized $O(\log^2 N)$ element moves with high probability. Simple experiments with our implementation of HI PMAs corroborate our theoretical analysis. Comparisons to regular PMAs give preliminary indications that the practical cost of adding history independence is not too large. Our HI cache-oblivious B-tree bounds match those of prior non-HI cache-oblivious B-trees. Searches take $O(\log_B N)$ I/Os; inserts and deletes take $O(\frac{\log^2 N}{B} + \log_B N)$ amortized I/Os with high probability; and range queries returning $k$ elements take $O(\log_B N + k/B)$ I/Os. Our HI external-memory skip list achieves optimal bounds with high probability, analogous to in-memory skip lists: $O(\log_B N)$ I/Os for point queries and amortized $O(\log_B N)$ I/Os for inserts and deletes. Range queries returning $k$ elements run in $O(\log_B N + k/B)$ I/Os. In contrast, the best possible high-probability bound for inserting into the folklore B-skip list, which promotes elements with probability $1/B$, is just $\Theta(\log N)$ I/Os. This is no better than the bound one gets from running an in-memory skip list in external memory.
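
The definition of history independence can be illustrated with a toy example. The sketch below is not the paper's HI PMA: it is a hypothetical `CanonicalSparseTable` (a name invented here) that keeps sorted keys in a linear-sized array with gaps and rebuilds the whole layout on every update, so each operation costs O(N) moves rather than the amortized bound stated in the abstract. It only shows the property itself: the physical layout is a function of the current key set alone, so two different operation histories that end with the same keys leave identical arrays.

```python
from bisect import insort


class CanonicalSparseTable:
    """Toy sorted array with gaps (illustrative only, not the paper's HI PMA).

    The physical layout in `slots` depends only on the current key set,
    never on the order in which keys were inserted or deleted.
    """

    def __init__(self):
        self._keys = []   # logical contents, always sorted
        self.slots = []   # physical layout; None marks an empty slot

    def _rebuild(self):
        # Linear-sized array: two slots per key, keys at even positions.
        self.slots = [None] * (2 * len(self._keys))
        for i, key in enumerate(self._keys):
            self.slots[2 * i] = key

    def insert(self, key):
        if key not in self._keys:
            insort(self._keys, key)
            self._rebuild()

    def delete(self, key):
        if key in self._keys:
            self._keys.remove(key)
            self._rebuild()

    def range_query(self, lo, hi):
        # Scan the array in order, skipping gaps.
        return [k for k in self.slots if k is not None and lo <= k <= hi]


# Two different histories that end with the same key set {1, 3, 5, 9}.
a = CanonicalSparseTable()
for k in [5, 1, 9, 3]:
    a.insert(k)

b = CanonicalSparseTable()
for k in [3, 9, 7, 1, 5]:
    b.insert(k)
b.delete(7)  # the deleted key leaves no trace in the layout

print(a.slots == b.slots)  # True
```

A fully deterministic canonical layout like this one is the simplest way to get the property, at the price of expensive updates. The abstract's bounds hold with high probability, which indicates that the paper's structures rely on randomization instead.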

Original language: English
Title of host publication: PODS 2016 - Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
Publisher: Association for Computing Machinery
Pages: 289-302
Number of pages: 14
ISBN (Electronic): 9781450341912
State: Published - Jun 15 2016
Event: 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2016 - San Francisco, United States
Duration: Jun 26 2016 to Jul 1 2016

Publication series

Name: Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems
Volume: 26-June-01-July-2016
ISSN (Print): 1055-6338

Conference

Conference: 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2016
Country/Territory: United States
City: San Francisco
Period: 06/26/16 to 07/1/16
