Testing with QuickCheck

QuickCheck is a useful way to automatically test some kinds of functions. QuickCheck does property testing, i.e. it tests if some property of a function is true.

To understand what this means, here are a few examples of properties for quicksort:

  • quicksort (quicksort lst) == quicksort lst, for any list lst; this says quicksort is idempotent
  • quicksort (reverse lst) == quicksort lst, for any list lst
  • quicksort [1..n] == [1..n], for any integer n greater than 0
  • length lst == length (quicksort lst), for any list lst

By implementing each of these properties as Haskell functions (that start with prop_), QuickCheck can automatically generate and run lots of random tests:

-- quicksort_testing.hs

--
-- To use QuickCheck, you need to install it. The easiest way to do this is
-- to use the cabal package manager:
--
--   $ cabal update
--   $ cabal install QuickCheck
--
--
-- To run all tests, type runTests in the interpreter:
--
-- > runTests
--
-- Some useful references for quickCheck:
--  - http://www.cse.chalmers.se/~rjmh/QuickCheck/manual.html
--  - https://begriffs.com/posts/2017-01-14-design-use-quickcheck.html
--

import Test.QuickCheck

quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (x:xs) = smalls ++ [x] ++ bigs
                 where smalls = quicksort [n | n <- xs, n <= x]
                       bigs   = quicksort [n | n <- xs, n > x]

-- sorting is idempotent
--
-- in general, a function f is idempotent if f (f x) = f x for all x, i.e.
-- applying f twice gives the same result as applying it once
prop_qs1 :: [Int] -> Bool
prop_qs1 lst = quicksort (quicksort lst) == quicksort lst

-- sorting a list, or its reverse, returns the same result
prop_qs2 :: [Int] -> Bool
prop_qs2 lst = quicksort lst == quicksort (reverse lst)

-- sorting [1,2,..n] returns the same list
prop_qs3 :: Int -> Bool
prop_qs3 n = quicksort [1..n] == [1..n]

-- sorting doesn't change the length of a list
prop_qs4 :: [Int] -> Bool
prop_qs4 lst = length lst == length (quicksort lst)

-- this is a do environment --- the way that Haskell lets you perform actions
-- in sequence
runTests = do
  quickCheck prop_qs1
  quickCheck prop_qs2
  quickCheck prop_qs3
  quickCheck prop_qs4

Type runTests do the testing, e.g.:

> runTests
+++ OK, passed 100 tests.
+++ OK, passed 100 tests.
+++ OK, passed 100 tests.
+++ OK, passed 100 tests.

quickCheck generates 100 random inputs for each prop_ function and verifies that the property is true for each input. Of course, this does not guarantee quicksort is bug-free, but in practice random testing often does a good job of catching bugs in small functions. Plus, many programmers find property tests more interesting to create than low-level (input, output) pairs.

Lets see what happens when QuickCheck finds a bug. Consider this incorrect quicksort:

quicksortBuggy :: Ord a => [a] -> [a]
quicksortBuggy []     = []
quicksortBuggy (x:xs) = smalls ++ [x] ++ bigs
                        where smalls = quicksortBuggy [n | n <- xs, n < x] -- oops
                              bigs   = quicksortBuggy [n | n <- xs, n > x]

The bug is in the equation for smalls: n < x is written, when it is meant to be n <= x. Here’s the testing:

> runTests
+++ OK, passed 100 tests.
+++ OK, passed 100 tests.
+++ OK, passed 100 tests.
*** Failed! Falsifiable (after 10 tests and 4 shrinks):
[4,4]

The first 4 properties don’t catch the bug — QuickCheck can’t guarantee to catch all problems! It’s prop_qs4 that catches the error, i.e. prop_qs4 notices that the list output by quicksortBuggy is not always the same length as the input list.

QuickTest reports that the list [4,4] violates prop_qs4. Trying this with quicksortBuggy shows that it indeed returns the wrong result:

> quicksortBuggy [4,4]
[4]

Notice that QuickCheck said it did “4 shrinks” to get [4,4]. Humans often find randomly generated data quite messy and hard to read, and so QuickCheck uses various heuristics to “shrink” a failing test case into one that is smaller and simpler. While there is no guarantee how well these heuristics will work, in practice they often do a good job of finding small and simple tests cases that can then be used to help debug the function.

The idea of testing using properties has proven to be quite popular with some programmers, and so you can find many QuickCheck-like tools in other languages. For example, in Python the Hypothesis package offers quickcheck-like testing for Python code.

It’s also worth noting that discovering and stating properties of functions is an interesting nad useful way of thinking about, even if you don’t end up using those properties for testing.

QuickCheck Example: last item of a list

Let’s use QuickCheck to test a function called mylast that is supposed to work the same as the standard last function (which returns the last element of a list):

mylast :: [a] -> a
mylast []     = error "empty list"
mylast [x]    = x
mylast (_:xs) = mylast xs

-- mylast :: [a] -> a
-- mylast []  = error "empty list"
-- mylast lst = head (drop (n-1) lst)
--              where n = length lst

-- mylast :: [a] -> a
-- mylast []  = error "empty list"
-- mylast lst = head (reverse lst)

A couple of alternate implementations are provided in case you want to test those as well (practice makes perfect!).

First, we need to think of some properties that mylast has. Here are three properties that hold for all non-empty lists a and b:

  • last (a ++ b) == last b, where b is not empty
  • last [a] == a
  • last (reverse a) == head a
  • last (map f lst) == f (last lst)

Lets go through an example that shows that the last property holds. If lst is [a,b,c], then the left-hand side of the property reduces as follows:

  last (map f lst)

= last (map f [a,b,c])

= last [f a, f b, f c]

= f c

The right-hand side reduces to this:

  f (last lst)

= f (last [a,b,c])

= f (c)

= f c

So the left-hand side and the right-hand side reduce to the same thing.

Looking at our implementation of mylast, the third property (last [a] == a) doesn’t seem like a very useful test because the second equation in mylast explicitly handles that case. So we will not implement it as a property.

You might try writing the first property like this:

prop_mylast1_bad :: [Int] -> [Int] -> Bool
prop_mylast1_bad a b = mylast (a ++ b) == mylast b

But when you run it you get a failure very quickly:

> quickCheck prop_mylast1_bad
*** Failed! (after 1 test):
Exception:
  empty list
[]
[]

The problem is that prop_mylast1_bad generates any list of Int values, including the empty list []. But our property requires that b is non-empty.

There are two ways to fix this. The first is to re-write the test using QuickCheck’s conditional operator ==>:

prop_mylast1 :: [Int] -> [Int] -> Property
prop_mylast1 a b = b /= [] ==> mylast (a ++ b) == mylast b

This test only checks if the property holds when b is non-empty. Also, the return type of (==>) is Property, and so the return type of prop_mylast1 must also be Property (instead of Bool). Here’s a sample run:

> quickCheck prop_mylast1
+++ OK, passed 100 tests; 16 discarded.

16 of the randomly generated tests were discarded because b was empty. So be careful when using ==>: if your condition is very restrictive you may not end up doing many tests.

A second approach to writing this test is to use the special type NonEmptyList a provided by QuickCheck:

prop_mylast1b :: NonEmptyList Int -> NonEmptyList Int -> Bool
prop_mylast1b (NonEmpty a) (NonEmpty b) = mylast (a ++ b) == mylast b

This causes the random lists generated by QuickCheck to always have at least one element. Note that the return type is back to Bool again.

Finally, we also have this property:

prop_mylast3 :: NonEmptyList Int -> Bool
prop_mylast3 (NonEmpty a) = mylast (map inc a) == inc (mylast a)
                            where inc n = n + 1

We use inc as the mapping function here because it is simple; you could replace it with any other function of type Int -> Int.

Here are all the tests, plus a helper function to run them:

prop_mylast1 :: [Int] -> [Int] -> Property
prop_mylast1 a b = a /= [] && b /= [] ==> mylast (a ++ b) == mylast b

prop_mylast1b :: NonEmptyList Int -> NonEmptyList Int -> Bool
prop_mylast1b (NonEmpty a) (NonEmpty b) = mylast (a ++ b) == mylast b

prop_mylast2 :: [Int] -> Property
prop_mylast2 a = a /= [] ==> mylast (reverse a) == head a

prop_mylast2b :: NonEmptyList Int -> Bool
prop_mylast2b (NonEmpty a) = mylast (reverse a) == head a

prop_mylast3 :: NonEmptyList Int -> Bool
prop_mylast3 (NonEmpty a) = mylast (map inc a) == inc (mylast a)
                            where inc n = n + 1

runTests = do
    quickCheck prop_mylast1
    quickCheck prop_mylast1b
    quickCheck prop_mylast2
    quickCheck prop_mylast2b
    quickCheck prop_mylast3

Here’s a sample run:

> runTests
+++ OK, passed 100 tests; 22 discarded.
+++ OK, passed 100 tests.
+++ OK, passed 100 tests; 14 discarded.
+++ OK, passed 100 tests.
+++ OK, passed 100 tests.

In practice, you only need one version of each property; the ==> versions can probably be discarded.