Branching Processes for QuickCheck Generators

Agustín Mista, Alejandro Russo and John Hughes
ACM SIGPLAN Haskell Symposium (2018)

Abstract

In QuickCheck (or, more generally, random testing), it is challenging to control random data generators’ distributions—specially when it comes to user-defined algebraic data types (ADT). In this paper, we adapt results from an area of mathematics known as branching processes, and show how they help to analytically predict (at compile-time) the expected number of generated constructors, even in the presence of mutually recursive or composite ADTs. Using our probabilistic formulas, we design heuristics capable of automatically adjusting probabilities in order to synthesize generators which distributions are aligned with users’ demands. We provide a Haskell implementation of our mechanism in a tool called DRaGeN and perform case studies with real-world applications. When generating random values, our synthesized QuickCheck generators show improvements in code coverage when compared with those automatically derived by state-of-the-art tools.