Photo by Todd Mittens on Unsplash

Although property-based testing is not one of the most “common” testing techniques, it’s been around for a while. Since it can be applied to almost any programming language, including Dart (and Flutter), it’s a tool that may come in handy. Let’s see how it works, starting with a simple example.

Initial example

Suppose we’re checking whether the user is of legal age in our program and we have the following implementation:

class AgeManager {
  // True when today is on or after the computed "legal age" date.
  bool checkLegalAge(int year, int month, int day) =>
      DateTime.now().millisecondsSinceEpoch >=
      getLegalAgeTime(DateTime(year, month, day)).millisecondsSinceEpoch;

  // Adds 18 "years" of 365 days each to the birthday.
  DateTime getLegalAgeTime(DateTime birthday) {
    return birthday.add(Duration(days: 365 * 18));
  }
}

Forget about the code structure and the “smells” it shows, like not wrapping the DateTime attributes, dealing with milliseconds everywhere or having magic numbers all over the place. Let’s focus on the time-related calculations…

Basically, we take the birthday, add some days to it and compare the resulting date with the current one to decide if someone can use our app.

Everything looks fine, but we write some unit tests to make sure, since we want to check what happens:

  • when dealing with boundary values, like 1970 or Y2K
  • if the user is of legal age
  • if the user is not of legal age

test('When user was born on boundary then check() returns true', () async {
    final mgr = AgeManager();

    final actual = mgr.checkLegalAge(1970, 1, 1);
    const expected = true;

    expect(actual, expected);
  }
);

test('When user is old enough then check() returns true', () async {
    final mgr = AgeManager();

    final actual = mgr.checkLegalAge(2004, 1, 1);
    const expected = true;

    expect(actual, expected);
  }
);

test('When user is NOT old enough then check() returns false', () async {
    final mgr = AgeManager();

    final actual = mgr.checkLegalAge(2010, 1, 1);
    const expected = false;

    expect(actual, expected);
  }
);

All tests pass and our coverage is 100%. So we can call it a day and go home … right?

Code coverage on AgeManager class

Unfortunately, testing can show that a bug exists, but never that there are none. So the only thing we know for sure is that we haven’t found any bugs yet…

Nevertheless, with property-based testing we could have stressed the previous code by running it against many random birthdays. Sooner or later we would have realised that we did not take into account… leap years! Adding 365 * 18 days ignores the extra day in every leap year, so the computed date falls a few days short of the real 18th birthday. Our implementation is a little bit buggy.
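To make the leap-year problem concrete, here is a minimal sketch that compares the 365-day arithmetic with a calendar-based alternative. The function names are made up for this illustration, and UTC dates are used only to keep daylight-saving shifts out of the picture.

```dart
// Same arithmetic as getLegalAgeTime(): 18 "years" of 365 days each.
DateTime naiveLegalAgeTime(DateTime birthday) =>
    birthday.add(const Duration(days: 365 * 18));

// Hypothetical alternative: move the year forward and keep month and day.
// Note: a 29th of February birthday rolls over to the 1st of March when the
// target year is not a leap year.
DateTime calendarLegalAgeTime(DateTime birthday) =>
    DateTime.utc(birthday.year + 18, birthday.month, birthday.day);

void main() {
  final birthday = DateTime.utc(2000, 1, 1);

  // Five leap days (2000, 2004, 2008, 2012, 2016) are missing from the sum.
  print(naiveLegalAgeTime(birthday)); // 2017-12-27 00:00:00.000Z
  print(calendarLegalAgeTime(birthday)); // 2018-01-01 00:00:00.000Z
}
```

With the naive version, someone born on the 1st of January 2000 would be let into the app five days too early.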

What is property-based testing?

When checking the behaviour of a program, it is virtually impossible to explore all testing scenarios and/or input combinations.

Let’s say we have a function that receives a number and performs some math transformation on it: if we want to be thorough, we should test the method with every integer available.

Since exhaustive input validation is not feasible, we end up picking a closed set of example-based input values for our tests and move on.

But, as we saw in the initial example, this approach may be misleading, since even when our tests pass we may have some “undercover” bugs.

What if we didn’t have to pick the inputs for our tests, but chose a feature of our program instead? Then we sit down and let the testing framework do all the heavy lifting regarding the inputs. How does this sound…?

That’s precisely the principle behind property-based testing, which allows us to exercise the program under test more intensely by automating the input generation and the execution of tests.
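As a rough idea of what “automating the input generation” means, the following sketch hand-rolls the core loop in plain Dart. It is not how any particular framework is implemented: findCounterexample, its parameters and the input range are all assumptions made for this illustration.

```dart
import 'dart:math';

/// Hypothetical helper: checks [property] against [runs] random integers in
/// [-1000, 1000] and returns the first input that falsifies it, or null when
/// every run passes.
int? findCounterexample(
  bool Function(int) property, {
  int runs = 1000,
  int seed = 42,
}) {
  final random = Random(seed);
  for (var i = 0; i < runs; i++) {
    final input = random.nextInt(2001) - 1000;
    if (!property(input)) return input;
  }
  return null;
}

void main() {
  // A property that holds for every generated integer.
  print(findCounterexample((n) => (n * 2) ~/ 2 == n)); // prints "null"

  // A deliberately wrong claim: "abs() never changes a number".
  // Prints the first negative number that was generated.
  print(findCounterexample((n) => n.abs() == n));
}
```

We only state what should be true; the loop takes care of finding inputs that prove us wrong.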

Focus on the properties, not on the inputs

In general, any application:

  • executes a given contract: when provided with valid inputs, the program will return the corresponding outputs
  • satisfies certain invariants, that is, conditions that are always true in the system.

Both contracts and invariants are often referred to as “properties”. These generic characteristics are the target of property-based testing, which leaves aside input generation and focuses on the behaviour and the assumptions we can state about our program.

In general, properties can either be implicit or explicit:

  • explicit properties usually have a direct match in our code, so they’re mapped to a method or attribute on some class.
class User {
  int age; //XXX: explicit property here

  …

  bool hasLegalAge() => …;
}
  • implicit properties may be harder to find, since they have no direct match with the underlying code. Sometimes they correspond to a group of attributes and methods that perform some operation together. In other cases, they may be derived data obtained after transforming the main data of our domain.
class WareHouse {

   …
  
  //XXX: set of methods working over the same prop 
  OrderStatus order(String itemName, int quantity) {
    if (inStock(itemName)) {
      takeFromStock(itemName, quantity);  
      return OrderStatus("ok", itemName, quantity);
    } else {
      ...
    }
  }
}

Either way, the goal of this type of testing is “breaking” the program with respect to a given property, that is, finding a set of input values that makes the property evaluate to false.
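For an implicit property, the thing we try to break is usually an invariant. The sketch below uses a simplified, hypothetical Warehouse (not the class from the snippet above, whose body is elided) and checks that stock never goes negative and that no units appear or vanish, whatever random orders we throw at it.

```dart
import 'dart:math';

/// Hypothetical, simplified warehouse used only for this illustration.
class Warehouse {
  Warehouse(Map<String, int> initial) : _stock = Map.of(initial);

  final Map<String, int> _stock;

  int stockOf(String item) => _stock[item] ?? 0;

  /// Serves the order and returns true only when enough units are in stock.
  bool order(String item, int quantity) {
    if (stockOf(item) < quantity) return false;
    _stock[item] = stockOf(item) - quantity;
    return true;
  }
}

/// Invariant under test: after any sequence of orders, the remaining stock is
/// never negative and "remaining + served" still equals the initial stock.
bool stockInvariantHolds({int runs = 200, int seed = 7}) {
  final random = Random(seed);
  for (var i = 0; i < runs; i++) {
    final initial = random.nextInt(100);
    final warehouse = Warehouse({'widget': initial});
    var served = 0;
    for (var j = 0; j < 20; j++) {
      final quantity = 1 + random.nextInt(30);
      if (warehouse.order('widget', quantity)) served += quantity;
      final remaining = warehouse.stockOf('widget');
      if (remaining < 0 || remaining + served != initial) return false;
    }
  }
  return true;
}

void main() {
  print(stockInvariantHolds()); // true
}
```

Notice that no single method “is” the property here: it only shows up when order() and stockOf() are observed together.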

Once a breaking input is found, the system modifies it automatically looking for its minimal expression: we want to have the counterexample in its simplest form, so we can analyse it easily. This simplification process is usually called “shrinking”.
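Real frameworks have their own shrinking strategies, but the idea can be sketched with a greedy loop over integers: keep trying smaller candidates and accept one whenever it still breaks the property. The shrink function below is an assumption made for this illustration, not a library API.

```dart
/// Greedy shrinker sketch. [failing] must be an input for which [property]
/// is false; the result is a smaller (closer to zero) input that still fails.
int shrink(int failing, bool Function(int) property) {
  var current = failing;
  var improved = true;
  while (improved) {
    improved = false;
    // Try a big step (halving) first, then a small one (one unit towards 0).
    for (final candidate in [current ~/ 2, current - current.sign]) {
      if (candidate != current && !property(candidate)) {
        current = candidate;
        improved = true;
        break;
      }
    }
  }
  return current;
}

void main() {
  // Wrong claim: "every number is below 100". Suppose 873 was the random
  // input that broke it; shrinking reduces it to the smallest failing value.
  print(shrink(873, (n) => n < 100)); // 100
}
```

A report saying “fails for 100” is much easier to reason about than one saying “fails for 873”.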

Using input generators

Although we don’t have to think about specific inputs for our tests, we must define their domain (that is, their generic traits). For instance, if our program works with numbers, we should ask:

  • Should the number be positive?
  • … negative?
  • Is zero allowed?
  • Should it manage numbers with decimals?
  • Which math notation do we use to represent it?

Any time we need inputs in a certain range, or even custom input models (such as instances of a custom “User” class), we must define some functions that provide those objects. Functions of this type are often called generators, and they’re invoked automatically when running our property-based tests.

For instance, in the previous birthday example, we’ll need to create random days of the month, so an integer generator that provides values in the range [1-31] will suffice.

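// Generator for a random day of the month.
Shrinkable getRandomDay(Random r, int i) {
  // nextInt(31) yields [0, 30], so adding 1 gives a day in [1, 31].
  return Shrinkable(r.nextInt(31) + 1);
}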
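One thing to keep in mind is that independent generators for day and month can produce combinations such as the 31st of February. If we only want valid dates, the day range has to depend on the month and year. The following plain-Dart sketch shows one way to do it; randomBirthday, daysInMonth and the default year range are hypothetical names and values, and the function returns a plain DateTime rather than a framework-specific wrapper.

```dart
import 'dart:math';

/// Number of days in [month] of [year]. Day 0 of the following month is
/// normalised by DateTime to the last day of the requested month.
int daysInMonth(int year, int month) => DateTime.utc(year, month + 1, 0).day;

/// Hypothetical generator: a random valid birthday between [minYear] and
/// [maxYear], both inclusive.
DateTime randomBirthday(
  Random random, {
  int minYear = 1970,
  int maxYear = 2020,
}) {
  final year = minYear + random.nextInt(maxYear - minYear + 1);
  final month = 1 + random.nextInt(12);
  final day = 1 + random.nextInt(daysInMonth(year, month));
  return DateTime.utc(year, month, day);
}

void main() {
  final random = Random(42);
  for (var i = 0; i < 3; i++) {
    print(randomBirthday(random));
  }
}
```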

Advantages and disadvantages of property-based testing

By automating the input generation and putting the focus on the properties of our system, property-based testing fills an important gap in the testing toolbox by providing:

  • large input coverage
  • high feature compliance

Since property-based tests use abstractions as inputs, they can be easier to read and maintain (as opposed to example-based tests, which rely on hand-picked inputs).

On the other hand, property-based tests may be harder to write at first, especially when we are used to writing example-based tests. Analysing a system in order to identify its properties and formulate some expectations about it is an exercise that requires effort, especially in legacy systems or programs with no clear separation of concerns. When property-based tests cannot be written “because I do not see any props in the system…”, we may have a bigger problem regarding the application architecture.

How does property-testing work?

In order to carry out property-based tests, we basically need:

  • a test harness environment that allows us to specify the input values we want to use
  • a process to slightly modify (when needed) the inputs provided to tests, so we can perform shrinking
  • some automatic mechanism to iterate over the tests applying different combinations of random inputs

Since the implementation of these features from scratch would be expensive, property-based testing frameworks may come in handy. There’s a list of available libraries at the end of this article.

Features of property-based testing frameworks

Regardless of the programming language they’re implemented in, third-party libraries for property-based testing typically:

  • generate large sets of random inputs automatically
  • execute our tests multiple times
  • programmatically shrink any set of counterexamples found
  • report the inputs that make the program fail, so we can check and fix the bug

Workflow

  1. Create a test for each property in our system we want to test
  2. If needed, create a generator function that will provide random inputs for the previous test
  3. Specify assertions and/or expectations about the property under test
  4. Run the test to check the behaviour of the program
  5. Check the test report provided
  6. If required, capture any input that made the program fail and analyse it further

Property-testing the initial example

The following snippet contains a property-based test for the birthday example using the glados library (hence some class names…):

g.Glados3(getRandomYear, getRandomMonth, getRandomDay).test(
    'When checking birthday then both values in the same month',
    (int year, int month, int day) {
  final mgr = AgeManager();
  // Build the birthday from the generated components.
  final birthday = DateTime(year, month, day);

  final futureBirthday = mgr.getLegalAgeTime(birthday);

  // The 18th anniversary must fall on the same month and day as the birthday.
  expect(futureBirthday.month, birthday.month);
  expect(futureBirthday.day, birthday.day);
});

The test uses several generators (for years, months and days of the month) and passes the random values obtained as parameters to the test.

In this case, the property under test is the “legal age” check. What do we know about it? What assumptions can we make? Well, to start with, we know for sure that:

  • the day of the month must be the same on both the birthday and the 18th anniversary
  • the same goes for the month of the year

So we can start by converting these into test assertions to check the behaviour of the program.

After trying a few iterations we bump into a counterexample that breaks the program behaviour:

Test report on our failing test

In fact, there is no need for the 2nd assertion in the test, since the 1st one alone is enough to break the program.

As expected, the framework reports back the failing inputs so we can use them to do some digging. In this case, there is no input shrinking, since the date components are already simplified.
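For readers without the library at hand, the same check can be approximated in plain Dart. This is only a hand-rolled sketch of the idea, not what glados does internally: findBrokenBirthday and its parameters are hypothetical, days are limited to [1, 28] so every generated date is valid, and UTC is used to avoid daylight-saving noise.

```dart
import 'dart:math';

// Same arithmetic as AgeManager.getLegalAgeTime().
DateTime getLegalAgeTime(DateTime birthday) =>
    birthday.add(const Duration(days: 365 * 18));

/// Hypothetical search loop: returns the first random birthday whose computed
/// "legal age" date does not keep the same month and day, or null otherwise.
DateTime? findBrokenBirthday({int runs = 100, int seed = 1}) {
  final random = Random(seed);
  for (var i = 0; i < runs; i++) {
    final year = 1970 + random.nextInt(50); // [1970, 2019]
    final month = 1 + random.nextInt(12); // [1, 12]
    final day = 1 + random.nextInt(28); // [1, 28], valid in every month
    final birthday = DateTime.utc(year, month, day);

    final anniversary = getLegalAgeTime(birthday);
    if (anniversary.month != birthday.month ||
        anniversary.day != birthday.day) {
      return birthday;
    }
  }
  return null;
}

void main() {
  final broken = findBrokenBirthday();
  print('Counterexample: $broken -> ${getLegalAgeTime(broken!)}');
}
```

Since every 18-year window in this range contains several leap days, the search stops almost immediately.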

Some final notes

  • Although property-based testing comes from the functional-programming paradigm, it can also be applied to object-oriented programming.
  • Property-based testing frameworks are “clever enough” to generate boundary values (null, 0, “”, [] and so on) and use them as inputs in the automated tests.
  • This type of testing is not a substitute for traditional unit tests. In fact, both approaches are usually used together to increase the level of confidence in our code.
  • Since the property definition carries some abstraction, literature about this topic sometimes simplifies it by saying that properties are just “parametrised tests”.
  • Any time we have a set of inputs that breaks the program, we should turn it into a specific unit test. This way we make sure the error will not appear again when doing regression testing.
  • The motto of property-based testing was coined by John Hughes: “don’t write tests… generate them!”

A few available frameworks

Sample repository

The following repository contains different examples of property-based testing:

https://github.com/begomez/warehouse_prop_testing
