MIROSTAT: A NEURAL TEXT DECODING ALGORITHM THAT DIRECTLY CONTROLS PERPLEXITY

  • Sourya Basu
  • Govardana Sachitanandam Ramachandran
  • Nitish Shirish Keskar
  • Lav R. Varshney
  • University of Illinois at Urbana-Champaign
  • Salesforce Research

Research output: Contribution to conference, Paper, peer-review

35 Scopus citations

Abstract

Neural text decoding algorithms strongly influence the quality of texts generated using language models, but popular algorithms like top-k, top-p (nucleus), and temperature-based sampling may yield texts that have objectionable repetition or incoherence. Although these methods generate high-quality text after ad hoc parameter tuning that depends on the language model and the length of generated text, not much is known about the control they provide over the statistics of the output. This is important, however, since recent reports show that humans prefer when perplexity is neither too much nor too little and since we experimentally show that cross-entropy (log of perplexity) has a near-linear relation with repetition. First we provide a theoretical analysis of perplexity in top-k, top-p, and temperature sampling, under Zipfian statistics. Then, we use this analysis to design a feedback-based adaptive top-k text decoding algorithm called mirostat that generates text (of any length) with a predetermined target value of perplexity without any tuning. Experiments show that for low values of k and p, perplexity drops significantly with generated text length and leads to excessive repetitions (the boredom trap). Contrarily, for large values of k and p, perplexity increases with generated text length and leads to incoherence (confusion trap). Mirostat avoids both traps. Specifically, we show that setting target perplexity value beyond a threshold yields negligible sentence-level repetitions. Experiments with human raters for fluency, coherence, and quality further verify our findings.
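The feedback idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it omits the Zipfian analysis and the adaptive choice of k, and instead caps per-token surprise with a threshold that is nudged toward a target after each sampled token. The names `sample_with_feedback`, `zipf_probs`, `run`, and the parameters `mu`, `tau`, and `eta` are assumptions made for illustration, and a fixed toy Zipfian distribution stands in for a language model. Cross-entropy is measured in bits, so perplexity is 2 raised to the cross-entropy.

```python
import math
import random


def surprise(p):
    """Surprise, in bits, of an outcome with probability p."""
    return -math.log2(p)


def zipf_probs(n, s=1.0):
    """Toy Zipfian next-token distribution over n tokens (stand-in for a model)."""
    weights = [1.0 / (rank ** s) for rank in range(1, n + 1)]
    z = sum(weights)
    return [w / z for w in weights]


def sample_with_feedback(probs, mu, tau, eta, rng):
    """One decoding step: truncate, sample, then update the surprise cap mu.

    probs: next-token probabilities summing to 1 (hypothetical model output)
    mu:    current cap on per-token surprise, in bits
    tau:   target average surprise, in bits
    eta:   step size of the feedback update
    Returns (token_index, observed_surprise, new_mu).
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # Keep tokens whose surprise does not exceed mu; fall back to the top token.
    kept = [i for i in order if probs[i] > 0 and surprise(probs[i]) <= mu]
    if not kept:
        kept = order[:1]
    total = sum(probs[i] for i in kept)
    weights = [probs[i] / total for i in kept]
    token = rng.choices(kept, weights=weights, k=1)[0]
    # Surprise of the sampled token under the renormalized, truncated distribution.
    observed = surprise(probs[token] / total)
    # Feedback: lower the cap after a surprising token, raise it after a dull one.
    new_mu = mu - eta * (observed - tau)
    return token, observed, new_mu


def run(steps=2000, tau=3.0, eta=0.1, seed=0):
    """Return (average surprise in bits, corresponding perplexity) over a run."""
    rng = random.Random(seed)
    probs = zipf_probs(1000)
    mu = 2 * tau  # initial cap; an arbitrary starting point for this sketch
    total = 0.0
    for _ in range(steps):
        _, observed, mu = sample_with_feedback(probs, mu, tau, eta, rng)
        total += observed
    avg = total / steps
    return avg, 2 ** avg


if __name__ == "__main__":
    avg_bits, perplexity = run()
    print(f"average surprise: {avg_bits:.3f} bits, perplexity: {perplexity:.2f}")
```

Because each update subtracts `eta * (observed - tau)` from `mu`, the average observed surprise differs from `tau` by the net change in `mu` divided by `eta * steps`, so it stays near the target as long as `mu` remains bounded. In this toy setting that holds regardless of how many tokens are generated.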

Original language: English
State: Published - 2021
Event: 9th International Conference on Learning Representations, ICLR 2021 - Virtual, Online, Austria
Duration: May 3 2021 to May 7 2021

Conference

Conference: 9th International Conference on Learning Representations, ICLR 2021
Country/Territory: Austria
City: Virtual, Online
Period: 05/3/21 to 05/7/21
