2011/04/26

Deterministic Random Testing

I've just written an implementation of a weighted random shuffle. Because the function is random, it won't return the same results every time. This makes it hard to unit-test; I can't just run it on some given input and assert it returns some given output. Indeed, by its nature, any ordering of the results is potentially valid. One way to solve this is to run it a large number of times, counting the frequency of various output results, and asserting that roughly the right proportion of results are returned. Of course, exactly how close "roughly right" is, is hard to determine.

This isn't a very satisfactory testing method, however. Get the tolerance too tight, and spurious random differences cause tests to fail unnecessarily. Too loose, and you might miss a subtle logic bug that skews the probabilities of some case.

The weighted random shuffle algorithm calls int rand $limit a number of times, each time passing in a small integer. This effectively uses the RNG like a dice roll, randomly selecting some integer 0 .. $limit - 1. Because each call takes a small integer, and because the algorithm is entirely deterministic for any given set of random results, this suggests a better testing method. If we could instead enumerate all possible returns of random numbers, each one exactly once, we can generate all possible results from the shuffle algorithm in their ideal proportions. Because this will be an exact deterministic count, free from randomness, we can assert exact values for the results.

So this is what I did. I've created a module, for now simply called Unrandom, which exports one function, unrandomly. It is used like the following:
use Unrandom 'unrandomly';

unrandomly {
my $da = 1 + int rand 6;
my $db = 1 + int rand 6;
say "Roll 2d6: $da + $db = " . ($da+$db);
};
The unrandomly function has the effect of replacing the rand function with one under its control while it runs the block of code. It runs the block of code a number of times, enumerating the entire tree of possible return values, in a given deterministic order:
Roll 2d6: 1 + 1 = 2
Roll 2d6: 1 + 2 = 3
Roll 2d6: 1 + 3 = 4
Roll 2d6: 1 + 4 = 5
Roll 2d6: 1 + 5 = 6
Roll 2d6: 1 + 6 = 7
Roll 2d6: 2 + 1 = 3
Roll 2d6: 2 + 2 = 4
...
Roll 2d6: 6 + 5 = 11
Roll 2d6: 6 + 6 = 12
Because each possible combination is returned exactly once, we can unit test that this dice-rolling algorithm does in fact give us the right distribution of results; there will be exactly one 2, two 3s, etc... We don't have to run it a few million times, and check that we got "roughly" the right number; we can be exact.

While I'm using this code in a unit-test, there's nothing directly test-related in the code. It could be useful anywhere that statistical modelling is used, or other problems involving random integer generation.

I'm now just looking for a good name to call it, so I can extract it from the unit tests and give it a life of its own on CPAN.

Suggestions anyone?

2 comments:

  1. I don't have a better name suggestion, but it should definitely not be /Unrandom(ly)?/ :)

    ReplyDelete
  2. I have a similar problem when doing unit tests for my game, Kingdoms (plug: http://kingdoms-game.com), which uses a *lot* of randomness. I mostly address this in a similar way, but using Test::MockObject to mock any dice rolls (which are encapsulated in Games::Dice::Advanced).

    ReplyDelete