Menu Home

Tests… documentation for your future self (and others) of code correctness

Picture this… you come into work and a colleague runs to you saying their is a bug in the software,  everything is wrong, HELP!

Grabbing a coffee, you settle into your chair and open up the offending bit of code:

// Random Javascript module

function adds_up_to(nums, total) {
  var status = false;	
  for ( let first of nums) {
    for ( let second of nums ) {
	var sum = first + second;
    	if (sum == total) {
	    status = true;
	}
    }
  }	
  return status;
}
module.exports = adds_up_to;

Hmmm…. it returns “status”. Umm what? Umm why? Is it supposed to return “status”, what is “status” anyway?

And there you are scratching your head and wondering what the problem is, worse you are wondering why this code exists and why it is doing what it is doing. 

But fortunately, your past self cared a little bit about you and left a little test to help you.

// Random test script

const adds_up_to = require('./adds_up_to');

test('adds up to', () => {

  expect(adds_up_to([10,15,3,7],17)).toBe(true);
  expect(adds_up_to([1,1,1,1,1,1,1], 4)).toBe(false);
  expect(adds_up_to( [ 1, 2, 3, 4, 5, 6, 7, 8, 9 ], 7 )).toBe(true);
  expect(adds_up_to([-1,2,-2],-3)).toBe(true);

});

Ok… a test great, lets run it!

$ npm test

> adds_up_to@1.0.0 test /home/lance/dev/not_real_code
> jest

 PASS  ./adds_up_to.test.js
  ✓ adds up to (3ms)

Test Suites: 1 passed, 1 total
Tests:       1 passed, 1 total
Snapshots:   0 total
Time:        0.597s, estimated 1s
Ran all test suites.

So according to this the tests are passing, so the code is doing what your past self intended. That is something… but we are still confused as to what the codes intention was and we are not sure if the code is “correct”.

And by “correct” we mean something different to “tests pass” or “the code works as designed”. As Itamar Turner-Trauring wrote recently tests are not a silver bullet that will make your software correct.

In the example Javascript code above, the code works. It does what it was written to do, the tests confirm that. But the (hypothetical) colleague is telling you it is not correct.

So what is the problem?

The problem here is that the tests confirm the method/function works on a technical level. We have tested that the method code works… which is very good and solves a bunch of problems around ensuring the code is correct on a technical implementation level.

If we look at the above code examples we do not get the context of the code, it’s not easy to determine what problem the code is solving. At the time of writing the developer understood the problem and the domain that problem existed within. But did not include that understanding in the code (meaning the function and tests) so we cannot determine if the problem is still relevant, let alone if our approach is correct.

So what might we do to make this prove correctness?

The idea here is that we want to prove that we are solving a problem. So I guess step one is identify the problem.

So the actual problem the above code was written to solve was this sourced via the very interesting “Daily Coding Problem” mailing list:

Given a list of numbers and a number k, return whether any two numbers from the list add up to k.

For example, given [10, 15, 3, 7] and k of 17, return true since 10 + 7 is 17.

So this gives some context, so a first step might be to copy and paste this text into the test as a comment and/or the function. This would at least mean your future self might understand what you were trying to do. It’d be even better if we knew “why” this mattered to our business or users.

// Random test script

const adds_up_to = require('./adds_up_to');

/*
Given a list of numbers and a number k, return whether any two numbers from the list add up to k.

For example, given [10, 15, 3, 7] and k of 17, return true since 10 + 7 is 17.
*/

test('adds_up_to', () => {
  // 17 = 10 + 7, so return true
  expect(adds_up_to([10,15,3,7],17)).toBe(true);

  // 4 != 1 +1, so return false
  expect(adds_up_to([1,1,1,1,1,1,1], 4)).toBe(false);

  // 7 = 6 + 1, so return true
  // 7 = 5 + 2, so that also would have made it return true
  expect(adds_up_to( [ 1, 2, 3, 4, 5, 6, 7, 8, 9 ], 7 )).toBe(true);

  // -3 = -1 + -2, so return true
  // Check if two negative numbers works ok
  expect(adds_up_to([-1,2,-2],-3)).toBe(true);
});

This does not actually change our test output but now we have a little more context as reminder when we revisit this code next year. We have added a bunch of comments. First we state the problem (or as much of it as we know right now) and we also explain what the test are trying to prove a little more.

Reading this we can understand more of the intention of the code. I.e. take any two numbers, add them together; do they equal the other number provided. We have also explained the final test where we test the edge case of two negative numbers.

We could and really should extend and structure these tests so that the tests themselves explain the context:

// Random test script

const adds_up_to = require('./adds_up_to');

/*
Given a list of numbers and a number k, return whether any two numbers from the list add up to k.

For example, given [10, 15, 3, 7] and k of 17, return true since 10 + 7 is 17.
*/

test('given an array of values, if any two add up to the value provided then return true', () => {
  // 17 = 10 + 7, so return true
  expect(adds_up_to([10,15,3,7],17)).toBe(true);
});

test('given an array of values, if no two add up to the value provided then return false', () => {
  // 4 != 1 +1, so return false
  expect(adds_up_to([1,1,1,1,1,1,1], 4)).toBe(false);
});

test('given an array of values, if any two add up to the value provided then return true (this time more than one pair meet the criteria)', () => {
  // 7 = 6 + 1, so return true
  // 7 = 5 + 2, so that also would have made it return true
  expect(adds_up_to( [ 1, 2, 3, 4, 5, 6, 7, 8, 9 ], 7 )).toBe(true);
});
test('given an array of values, if any two add up to the value provided then return true (even if numbers are negative)', () => {
  // -3 = -1 + -2, so return true
  // Check if two negative numbers works ok
  expect(adds_up_to([-1,2,-2],-3)).toBe(true);
});

This ise a very verbose example, and still does not explain the business requirement; we are however explaining what we intended to achieve. So for me it is a better value test if you are trying to ensure correctness. Here is the Jest output:

$ npm test

> adds_up_to@1.0.0 test /home/lancew/dev/challenges/1/js
> jest

 PASS  ./adds_up_to.test.js
  ✓ given an array of values, if any two add up to the value provided then return true (3ms)
  ✓ given an array of values, if no two add up to the value provided then return false
  ✓ given an array of values, if any two add up to the value provided then return true (this time more than one pair meet the criteria)
  ✓ given an array of values, if any two add up to the value provided then return true (even if numbers are negative)

Test Suites: 1 passed, 1 total
Tests:       4 passed, 4 total
Snapshots:   0 total
Time:        0.73s, estimated 1s
Ran all test suites.

So as you can see, the text/name of the test now explicitly says what it is trying to prove. So if one fails hopefully the message gives you context before you even read the tests themselves.

But, this is still just proving technical correctness; what we really want to prove is that this code provides the business benefit it was written for. So we need to go talk to someone and find out what this code is actually for and in this hypothetical example the answer:

Hi Lance,

In the order system we allow people to use vouchers, if they have two vouchers that sum up to the purchase value; then we don’t need to trigger the credit card payment process, we  just ship the products.

The rules are they can only use two vouchers at a time and it has to match exactly the total price.

– Carol the sales manager.

Ok… this gives us the business context so lets rewrite the tests to express this:

// Random test script

const adds_up_to = require('./adds_up_to');

/*
Given a list of numbers and a number k, return whether any two numbers from the list add up to k.

For example, given [10, 15, 3, 7] and k of 17, return true since 10 + 7 is 17.
*/

test('Return true if two voucher value add up to the total price', () => {
  // 17 = 10 + 7, so return true
  expect(adds_up_to([10,15,3,7],17)).toBe(true);
});

test('Return false if no two voucher values add up to the total price', () => {
  // 4 != 1 +1, so return false
  expect(adds_up_to([1,1,1,1,1,1,1], 4)).toBe(false);
});

test('Return true if two voucher value add up to the total price (even if more than one pair match)', () => {
  // 7 = 6 + 1, so return true
  // 7 = 5 + 2, so that also would have made it return true
  expect(adds_up_to( [ 1, 2, 3, 4, 5, 6, 7, 8, 9 ], 7 )).toBe(true);
});

test('???????????? Negative Voucher values ???????????', () => {
  // -3 = -1 + -2, so return true
  // Check if two negative numbers works ok
  expect(adds_up_to([-1,2,-2],-3)).toBe(true);
});

Hang on a second!
Suddenly our test for negative numbers no longer makes sense in terms of the correctness of the business criteria. The business does not go around giving customers vouchers worth a negative amount. Nor do we allow negative total purchase prices.

Our “code” is correct, but only on technical level. On a business level it is horribly wrong and as developers we would not know without the context we got from the sales manager and wrote into our tests.

Our past self, wrote a function that does work perfectly with positive voucher values, but it would be better perhaps if our code protected us from an invalid voucher value.

It also might make us think, what if one voucher adds up to the total? So we can ask that question of the business and ensure that our feature actually does what the business wants.

Summary:

So in this contrived example, we actually cover a problem that as a developer I actually encounter all the time. Sometimes we code things up without expressing what the intention and context of the code is.

If we had written the test to express intent and context we might have discovered the negative voucher issue last year. Now we have found it we might want to go check the transaction database and find out if some sneaky customer discovered the exploit.

As the developer meeting this code for the first time, we now have a good chance of not doing something silly as we can see what the business case for the code is. This saves us time and it decreases the chances of us getting things wrong as the tests explicitly express more of the business idea of correct behaviour.

It is hard or perhaps impossible to prove correctness, in terms of the business idea of correct unless we code our tests to prove the business idea of correct operation.

For those of you reading this you might have a sense of familiarity with the ideas in this post… why? Because this is also a core idea of Domain Driven Design (DDD) popularised by Eric Evans.

Homework:

Having read this, perhaps now is a good time to go look at one of the tests in your code base at work and decide if you think that the tests are testing business correctness or code correctness.

Let me know what you find. 

Categories: Uncategorized

Lance