How to write tests that need a lot of data?

Based on a true story.

Introduction

For 20 years I developed applications in Java. Over the years I have gained a lot of experience with writing tests (unit tests and integration tests). I have used different styles of writing tests. Some styles seemed promising at first but turned out horrible, for example when new features needed to be added to existing classes. Often the cause of tests becoming horrible was the large dataset required by the test. A couple of techniques that my team mates and I tried out helped us to write clean tests that use a lot of data. In this post I want to share these techniques with you.

The main goal of a test is to show that a specific piece of code works correctly. The piece of code that is being tested is called Subject Under Test.

A test consists of 3 parts:

  1. Arrange: sets up the test data
  2. Act: calls the code that must be tested
  3. Assert: verifies that the Subject Under Test worked correctly

Sometimes these parts are referred to as ‘given/when/then’. This post describes how to keep the arrange part clean and readable when it needs to set up a lot of data in the database.
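
As a minimal illustration of the three parts, here is a sketch of a test for Python's built-in sorted() function. The test class and its name are made up for this example only.

```python
import unittest


class TestSorted(unittest.TestCase):
    def test_sorted_returns_items_in_ascending_order(self):
        # Arrange: set up the test data
        numbers = [3, 1, 2]

        # Act: call the code that must be tested
        result = sorted(numbers)

        # Assert: verify that the Subject Under Test worked correctly
        self.assertEqual(result, [1, 2, 3])
```

With only three numbers the arrange part is a single line. The rest of this post is about keeping it that small when the data lives in many database tables.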

I present the techniques with code samples of a Warehouse Management System (WMS). As you can find on my website, I have worked for a paint factory for about 8 years and worked on a WMS, which later grew into an ERP. That application was developed in Java.

In March 2019 I started to work for a completely different company, Protix, and I have been developing in Python ever since. Since this post is a preparation for my presentation at PyCon DE & PyData Berlin, I show code samples in Python using a WMS implemented in Django.

The code samples I show you are not from the real WMS I have worked on before. Nevertheless, the code samples are realistic and complex enough to explain the techniques for writing tests that need a lot of data. That is why I say: this post is based on a true story.

Finally I want to thank John Castelijn who was one of my team mates at the factory. His input and feedback over the years helped to develop and improve these techniques.

What is a Warehouse Management System?

To explain what a Warehouse Management System (WMS) is, I first explain what a warehouse is. A warehouse is a large building, full of racks. The racks store pallets. Each pallet contains items, for example 200 cans of white paint, 1 liter each.

The following pictures give an impression of the building, racks and pallets:

Photo by Ramon Cordeiro on Unsplash

Photo by Ruchindra Gunasekara on Unsplash

Operators drive forklifts to move pallets and to pick the orders.

Photo by ELEVATE on Pexels

The purpose of a warehouse is to handle orders. Just as software architecture is about decomposing functionality into components, the warehouse can be decomposed into areas that each have their own purpose:

Everything in the warehouse has an id:

Here is an example of a barcode and text of an id of a location in the pick area:

And here is an example of a barcode and text of a pallet id:

A WMS supports the planners and operators to fulfil orders. It does so by giving planners and operators an overview of the current stock and supporting the workflows in the warehouse. The main workflows in a WMS are:

What is a lot of data in a test?

When learning Test-Driven Development many people practice implementing small algorithms or data structures, for example, a stack. This is what unit tests for a Stack class look like:

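class TestStack(TestCase):
    def setUp(self):
        # Create a fresh stack for every test; a class-level stack would be shared between tests.
        self.stack = Stack()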

    def test_empty_stack_is_empty(self):
        self.assertTrue(self.stack.is_empty())

    def test_empty_stack_count_returns_zero(self):
        self.assertEqual(self.stack.count(), 0)

    def test_pop_from_empty_stack_raises_exception(self):
        self.assertRaises(StackIsEmptyError, self.stack.pop)

    def test_stack_with_one_item_is_not_empty(self):
        self.stack.push('foo')

        self.assertFalse(self.stack.is_empty())

    def test_stack_with_one_item_has_count_one(self):
        self.stack.push('foo')

        self.assertEqual(self.stack.count(), 1)

    def test_pop_from_stack_with_one_item_returns_item(self):
        self.stack.push('foo')

        self.assertEqual(self.stack.pop(), 'foo')

    def test_multiple_items_on_stack_are_popped_in_reverse_order_of_push(self):
        self.stack.push('foo')
        self.stack.push('bar')

        self.assertEqual(self.stack.pop(), 'bar')
        self.assertEqual(self.stack.pop(), 'foo')

These tests use at most two elements of data: the strings foo and bar. Providing so few elements in a test is very easy.
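
The tests above rely on a Stack class and a StackIsEmptyError exception that are not shown in this post. A minimal sketch that would satisfy these tests could look like this (the implementation is an assumption, only the names come from the tests):

```python
class StackIsEmptyError(Exception):
    """Raised when pop() is called on an empty stack."""


class Stack:
    def __init__(self):
        # A Python list already behaves like a stack at its end.
        self._items = []

    def is_empty(self):
        return not self._items

    def count(self):
        return len(self._items)

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise StackIsEmptyError('Cannot pop from an empty stack.')
        return self._items.pop()
```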

Most developers work on applications that store data in a database. Over time, the database grows to tens or even hundreds of tables. Implementing a new feature might introduce a new table or add a few columns to an existing table. Testing the new feature requires data in those tables. If you are lucky, your test only needs data from one or two tables. If you are unlucky, you need data from tens of different tables or a hundred records in a single table.

Now we return to writing tests for a WMS. Depending on the Subject Under Test, different locations need to be created. For example, to test stocking, a relatively large bulk area is needed, for example with 100 locations. To test replenishment, a small bulk and pick area can suffice. For picking, a relatively large pick area must be used. See the following table to get an impression of the number of locations that are needed to test a process. Note that apart from this, we always need items, users and at least one forklift.

The Good, the Bad and the Ugly

For the Subject Under Test it does not matter how the data was set up, as long as the data is present. I have seen different ways to set up test data in a database:

When to fill the database?

  1. Fill database for each test case
  2. Fill database only once and use that database for all the test cases

How to fill the database?

  1. Fill database using SQL statements
  2. Fill database using Python code

What is stored in the database?

  1. The minimum set of data needed by the Subject Under Test. The data may be incomplete and non-realistic.
  2. Complete and realistic data. May be more than needed by the Subject Under Test.
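
To make the first question concrete, here is a small sketch of filling the database for each test case. It uses the sqlite3 and unittest modules from the standard library instead of Django so that it runs on its own; the two-column location table is a simplified assumption, not the real schema.

```python
import sqlite3
import unittest


def create_database():
    """Create an in-memory database with one location (simplified, hypothetical schema)."""
    connection = sqlite3.connect(':memory:')
    connection.execute('CREATE TABLE location (id TEXT PRIMARY KEY, blocked INTEGER)')
    connection.execute("INSERT INTO location VALUES ('BF145C', 0)")
    return connection


class TestWithFreshDatabase(unittest.TestCase):
    def setUp(self):
        # Fill the database for each test case: every test starts from the same state.
        self.connection = create_database()

    def tearDown(self):
        self.connection.close()

    def test_block_location(self):
        self.connection.execute("UPDATE location SET blocked = 1 WHERE id = 'BF145C'")
        blocked = self.connection.execute('SELECT blocked FROM location').fetchone()[0]
        self.assertEqual(blocked, 1)

    def test_location_is_not_blocked_initially(self):
        # Passes regardless of test order, because setUp() rebuilt the data.
        blocked = self.connection.execute('SELECT blocked FROM location').fetchone()[0]
        self.assertEqual(blocked, 0)
```

If the database were filled only once and shared, the second test could see the blocked location left behind by the first test, so the outcome would depend on the order in which the tests run.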

Imagine we want to test moving a pallet from one location to another location. We first need to create locations in the database. Here is the Django model for locations:

class LocationType(Enum):
    BULK = 'B'
    PICK = 'P'
    PICK_AND_DROP = '&'
    AUDIT = 'A'
    STAGING = 'S'
    DOCK = 'D'
    PROBLEM = '!'
    FORKLIFT = 'F'


class Location(models.Model):
    id = models.CharField(max_length=8, primary_key=True)
    type = models.CharField(max_length=1, choices=[(tag, tag.value) for tag in LocationType])
    sequence = models.SmallIntegerField()
    level = models.SmallIntegerField()
    aisle = models.CharField(max_length=8, null=True, blank=True)
    blocked = models.BooleanField()

Here is a SQL statement that could be used to create a location in the database as part of the arrange part of a test case:

INSERT INTO location (id, type, sequence, level, aisle, blocked) 
VALUES ('BF145C', 'B', 543, 0, 'BULK-BF', false);

Instead of using SQL statements I have seen variations where data was copied from a production or test database and then stored as XML files. That is just another way of representing SQL statements and I consider them just as bad and ugly as SQL statements:

<testdata>
  <location id="BF145C" type="B" sequence="543" level="0" aisle="BULK-BF" blocked="false" />
</testdata>

Django makes it very easy to create an object using Python code:

location = Location.objects.create(id='BF145C', type=LocationType.BULK, aisle='BULK-BF', 
                                  sequence=543, level=0, blocked=False)

Imagine we add a new column, rename a column or rename a table. If you use SQL statements, you have to update a lot of SQL statements. If you use Python code as above, you would have to update a lot of Python code. The rest of this post describes techniques to keep your setup code short, clean and readable even when you need to set up a lot of data. The techniques will make it way easier to deal with changes in the structure of the database than when you use SQL statements.

Technique 1: Test Data Builder

One of the features of our WMS is that when a pallet gets moved to a problem location, the pallet gets blocked. A problem location is a location with the type LocationType.PROBLEM.

Here is a test case to illustrate this scenario:

from django.test import TestCase

from wms.models import Item, ItemOnPallet, Location, LocationType, Pallet
from wms.service import Service


class TestPalletMove(TestCase):
    service = Service()

    def test_pallet_moved_to_problem_location_gets_blocked(self):
        forklift = Location.objects.create(id='FORKLIFT01', sequence=0, level=0,
                                           type=LocationType.FORKLIFT, blocked=False)
        problem_location = Location.objects.create(id='PROBLEM', sequence=0, level=0,
                                                   type=LocationType.PROBLEM, blocked=False)
        item = Item.objects.create(description='can 1 liter white paint')
        pallet = Pallet.objects.create(location=forklift, blocked=False)
        pallet.items.add(ItemOnPallet.objects.create(pallet=pallet, item=item, quantity=100, batch='2019401234'))

        self.service.move_pallet(pallet, problem_location)

        self.assertTrue(pallet.blocked)
        self.assertEqual(pallet.location, problem_location)

The arrange part is the largest part of the test. Creating a location requires filling in many parameters, most of which are not even relevant for the test.

The test becomes more readable by defining default values for sequence and level. But sometimes it is not desirable to use default values. There is a way to improve readability without using default values in models: extract the code to create the forklift, the problem location and the pallet to separate methods:

def test_pallet_moved_to_problem_location_gets_blocked(self):
    forklift = self.create_forklift()
    problem_location = self.create_problem_location()
    item = self.create_item()
    pallet = self.create_pallet(forklift, {item: 100})

    self.service.move_pallet(pallet, problem_location)

    self.assertTrue(pallet.blocked)
    self.assertEqual(pallet.location, problem_location)

def create_forklift(self):
    return Location.objects.create(id='FORKLIFT01', sequence=0, level=0,
                                   type=LocationType.FORKLIFT, blocked=False)

def create_problem_location(self):
    return Location.objects.create(id='PROBLEM', sequence=0, level=0,
                                   type=LocationType.PROBLEM, blocked=False)

def create_item(self):
    return Item.objects.create(description='can 1 liter white paint')

def create_pallet(self, forklift, items):
    pallet = Pallet.objects.create(location=forklift, blocked=False)
    for item, quantity in items.items():
        pallet.items.add(ItemOnPallet.objects.create(pallet=pallet, item=item,
                                                     quantity=quantity, batch='2019401234'))
    return pallet

See how the arrange part of the test case has improved? See how the intention of the setup becomes clear?

Most variables in the arrange part could be inlined to make the arrange part even smaller. The only variables that cannot be inlined are problem_location and pallet, because they are used more than once. But what if we changed the create methods into methods that get or create an object? For example, the method create_problem_location() can be changed into a method problem_location() that creates a problem location when it is called the first time and returns the same problem location on all subsequent calls. We can apply the same technique to the other create methods as well, except for create_pallet(). The reason for keeping create_pallet() as is, is that most test cases require pallets with specific contents or at specific locations, and sometimes need multiple pallets. Using the get-or-create trick for pallets is just not handy.
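
The get-or-create trick itself does not depend on Django. Here is a minimal sketch in plain Python; the LocationFixtures class and the simplified Location dataclass are assumptions made for this illustration only.

```python
from dataclasses import dataclass


@dataclass
class Location:
    id: str
    type: str


class LocationFixtures:
    """Hypothetical helper that creates each named location at most once."""

    def __init__(self):
        self._locations = {}

    def _get_or_create(self, location_id, location_type):
        # Create the location on the first call, return the stored one afterwards.
        if location_id not in self._locations:
            self._locations[location_id] = Location(id=location_id, type=location_type)
        return self._locations[location_id]

    def problem_location(self):
        return self._get_or_create('PROBLEM', '!')

    def forklift(self):
        return self._get_or_create('FORKLIFT01', 'F')


fixtures = LocationFixtures()
print(fixtures.problem_location() is fixtures.problem_location())  # True
```

Because every call returns the same object, a test can call problem_location() in the act part and again in the assert part without keeping a local variable.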

The test code now looks like this:

def test_pallet_moved_to_problem_location_gets_blocked(self):
    pallet = self.create_pallet(self.forklift(), {self.item1(): 100})

    self.service.move_pallet(pallet, self.problem_location())

    self.assertTrue(pallet.blocked)
    self.assertEqual(pallet.location, self.problem_location())

Wow, the arrange part is now just a single line! And notice how clearly the important values stand out: we create a pallet on a forklift and then move it to a problem location. The pallet has 100 items. (OK, I must admit that for this test the value 100xitem1 is not relevant, as long as the pallet is not empty.)

This technique of extracting methods, and using the create or get trick, is really worth the effort. You can do this in each test class. But should each class get a method create_pallet() and item1() and forklift()? That would be violating the DRY principle (DRY = Don’t Repeat Yourself). So the next step is to move these methods to a separate module. I name this module test_data_builder. The result of moving the methods to this module looks like this:

from wms.tests import test_data_builder as tdb

def test_pallet_moved_to_problem_location_gets_blocked(self):
    pallet = tdb.create_pallet(tdb.forklift(), items={tdb.item_1(): 100})

    self.service.move_pallet(pallet, tdb.problem_location())

    self.assertTrue(pallet.blocked)
    self.assertEqual(pallet.location, tdb.problem_location())

Here is the code from the test_data_builder module:

from datetime import date

from wms.models import AssignedLocation, Customer, Item, ItemOnPallet, Location, LocationType, Order, OrderLine, Pallet

_next_sequence = 123
_next_batch = 1


def item_1():
    return Item.objects.get_or_create(description='White paint (1 liter)')[0]


def item_2():
    return Item.objects.get_or_create(description='Black paint (1 liter)')[0]


def item_3():
    return Item.objects.get_or_create(description='Yellow paint (1 liter)')[0]


def forklift(id='FORKLIFT01'):
    return Location.objects.get_or_create(id=id, sequence=0, level=0, type=LocationType.FORKLIFT, blocked=False)[0]


def problem_location(id='PROBLEM'):
    return Location.objects.get_or_create(id=id, sequence=0, level=0, type=LocationType.PROBLEM, blocked=False)[0]


def audit_location(id='AUDIT', blocked=False):
    return Location.objects.get_or_create(
        id=id,
        aisle=None,
        sequence=0,
        level=0,
        type=LocationType.AUDIT,
        blocked=blocked)[0]


def bulk_location(id=None, aisle='BULK', sequence=None, level=0, blocked=False):
    sequence = _get_sequence(sequence)
    return Location.objects.get_or_create(
        id=id or f'{aisle}{sequence:03d}{chr(65 + level)}',
        aisle=aisle,
        sequence=sequence,
        level=level,
        type=LocationType.BULK,
        blocked=blocked)[0]


def create_pick_location(id=None, aisle='PICK', sequence=None, level=0, blocked=False, item=None, max_quantity=100):
    sequence = _get_sequence(sequence)
    location = Location.objects.create(
        id=id or f'{aisle}{sequence:03d}{chr(65 + level)}',
        aisle=aisle,
        sequence=sequence,
        level=level,
        type=LocationType.PICK,
        blocked=blocked)

    AssignedLocation.objects.create(
        location=location,
        item=item or item_1(),
        max_quantity=max_quantity or 100
    )

    return location


def staging_location(id=None, aisle='STAGING', sequence=0, level=0, blocked=False):
    sequence = _get_sequence(sequence)
    return Location.objects.get_or_create(
        id=id or f'{aisle}{sequence:03d}{chr(65 + level)}',
        aisle=aisle,
        sequence=sequence,
        level=level,
        type=LocationType.STAGING,
        blocked=blocked)[0]


def dock(id='DOCK01', blocked=False):
    return Location.objects.get_or_create(
        id=id,
        aisle=None,
        sequence=0,
        level=0,
        type=LocationType.DOCK,
        blocked=blocked)[0]


def _get_sequence(sequence=None):
    if sequence is None:
        global _next_sequence
        sequence = _next_sequence
        _next_sequence += 1
    return sequence


def create_pallet(location: Location, blocked=False, pick_list=None, items=None):
    global _next_batch
    pallet = Pallet.objects.create(
        location=location,
        blocked=blocked,
        pick_list=pick_list)

    if items:
        for item, quantity in items.items():
            batch = f'{_next_batch:06d}'
            _next_batch += 1
            pallet.items.add(ItemOnPallet.objects.create(pallet=pallet, item=item, quantity=quantity, batch=batch))

    return pallet


def create_customer(name='John Doe') -> Customer:
    return Customer.objects.create(name=name)


def create_order(items, customer=None, shipping_date=None):
    order = Order.objects.create(
        customer=customer or create_customer(),
        shipping_date=shipping_date or date.today())

    for item, quantity in items.items():
        if isinstance(item, str):
            item = item_by_id(item)
        order.lines.add(OrderLine.objects.create(order=order, item=item, quantity=quantity))

    return order


def item_by_id(item_id: str):
    if item_id not in globals():
        raise Exception(f'No item defined with id {item_id} in this test data builder!')
    return globals()[item_id]()

See how some functions create a thing and return the same thing the next time the function is called? That is useful for things that are constant for your tests. item_1() will always return the same item. The exact details of the item are not relevant for most of the test. The Test Data Builder makes it easy to use a small set of items.

Other functions create a new thing every time you call them. Pallets are typically created specifically for a test. The function create_pallet() takes an optional items argument, which is a dictionary that maps items to quantities.

The code in the Test Data Builder can get a bit complex, like create_pallet() shows. Note that you typically write the functions in the Test Data Builder once, and modify/extend it a couple of times, but you use the functions hundreds of times. So while writing functions for Test Data Builder, focus on ease of use and cleanliness of the test code that uses the Test Data Builder.

Functions in Test Data Builders have the following properties:

A trick to ensure the function is simple to use is to first write a call to the function in a test and then implement the function.

When using the Test Data Builder in a test, the test code should be simple and readable. However, tests should not depend on specific default values chosen by the Test Data Builder. If a test needs a full pallet, do not call create_pallet() and rely on its defaults; pass an argument that makes clear that the pallet is full: tdb.create_pallet(location, items={tdb.item_1(): tdb.item_1_full_pallet_quantity}). If you need full pallets in many tests, it is better to create a new function named create_full_pallet() with an optional parameter to override the item with which the pallet is filled.
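
A create_full_pallet() function could be sketched as follows. To keep the example runnable on its own it uses plain dataclasses instead of Django models. The constant ITEM_1_FULL_PALLET_QUANTITY, its value of 200 (taken from the warehouse example earlier in this post) and the dataclass fields are assumptions for illustration.

```python
from dataclasses import dataclass, field

# Assumed number of 1 liter cans that fills a pallet.
ITEM_1_FULL_PALLET_QUANTITY = 200


@dataclass(frozen=True)
class Item:
    description: str


@dataclass
class Pallet:
    location: str
    items: dict = field(default_factory=dict)


ITEM_1 = Item('White paint (1 liter)')


def create_pallet(location, items=None):
    # Copy the dictionary so that pallets never share their contents.
    return Pallet(location=location, items=dict(items or {}))


def create_full_pallet(location, item=ITEM_1, quantity=ITEM_1_FULL_PALLET_QUANTITY):
    """Create a full pallet; the function name states the intent of the test."""
    return create_pallet(location, items={item: quantity})


pallet = create_full_pallet('BULK001A')
print(pallet.items[ITEM_1])  # 200
```

The test that calls create_full_pallet() does not need to know how many cans fit on a pallet, and it keeps working when that number changes.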

Technique 2: Visualize the setup data

The second technique is illustrated by testing an important feature of stocking. Remember that stocking is the process of putting full pallets, which just arrived from the factory, in locations in the bulk area. The diagram below shows a side view of an aisle in the bulk area. The WMS will tell the forklift driver which location should be filled with the next pallet.

Filling the aisle level by level from bottom to top is not efficient, because the horizontal distance the forklift has to travel increases a lot; in real life there are many more than 5 locations on one level. Filling the aisle column by column from left to right is more efficient. However, forklifts move faster horizontally than vertically. So the optimal way to fill the aisle is at an angle.
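
One way to model filling at an angle is to rank the free locations by an estimated travel cost. The sketch below assumes that moving up one level costs as much as two horizontal steps and that ties go to the lower sequence. Both choices, and the function name next_stock_location(), are assumptions for illustration and not necessarily the rule used by the WMS.

```python
def next_stock_location(free_locations, vertical_cost=2):
    """Return the free (sequence, level) pair with the lowest estimated travel cost.

    free_locations: iterable of (sequence, level) tuples that can receive a pallet.
    vertical_cost: assumed cost of moving up one level, relative to one horizontal step.
    Returns None when there is no free location.
    """
    def cost(location):
        sequence, level = location
        # The second element breaks ties in favor of the lower sequence.
        return (sequence + vertical_cost * level, sequence)

    return min(free_locations, key=cost, default=None)


# An empty aisle with 5 sequences and 4 levels.
aisle = {(sequence, level) for sequence in range(1, 6) for level in range(4)}
fill_order = []
while len(fill_order) < 5:
    fill_order.append(next_stock_location(aisle - set(fill_order)))
print(fill_order)  # [(1, 0), (2, 0), (1, 1), (3, 0), (2, 1)]
```

The printed order climbs diagonally through the aisle instead of completing a level or a column first.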

Here is the code to test that the correct location is returned to stock a pallet in an empty aisle.

def test_stocking_pallet_in_empty_aisle(self):
    pallet = tdb.create_pallet(tdb.forklift(), items={tdb.item_1(): 100})
    for sequence in range(1, 6):
        for level in range(0, 5):
            tdb.bulk_location(aisle=self.AISLE, sequence=sequence, level=level)

    destination = self.service.get_stock_location(pallet, self.AISLE)

    self.assertEqual(destination, tdb.bulk_location(aisle=self.AISLE, sequence=1, level=0))

See how using a Test Data Builder in nested loops builds an empty aisle in just 3 lines of code? This way of generating data gets messy when certain locations in the aisle are already occupied by pallets.

The diagram above inspires me to represent the aisle textually as a multiline string. For example, the following string represents an aisle in which three locations already hold a pallet (o); the asterisk indicates the location where the next pallet must be stocked:

 """|     |
    |     |
    |o    |
    |oo*  |"""

This multiline string defines both the arrange and assert part of the test!

Here is the code that uses such textual representations to specify the aisle that has to be configured and indicates the expected location where the next pallet must be stocked.

# Needs "import re" and "from parameterized import parameterized" at the top of the test module.
@parameterized.expand([
    ('empty aisle',
     """|     |
        |     |
        |     |
        |*    |"""),

    ('one pallet present',
     """|     |
        |     |
        |     |
        |o*   |"""),

    ('two pallets present',
     """|     |
        |     |
        |*    |
        |oo   |"""),

    ('three pallets present',
     """|     |
        |     |
        |o    |
        |oo*  |"""),

    ('four pallets present',
     """|     |
        |     |
        |o*   |
        |ooo  |"""),

    ('no free location in aisle',
     """|ooooo|
        |ooooo|
        |ooooo|
        |ooooo|"""),

    ('first gap is filled',
     """|     |
        |     |
        | o   |
        |o*o  |"""),

    ('blocked location is skipped',
     """|     |
        |     |
        |     |
        |x*   |"""),

])
def test_stocking_pallet(self, _, bulk_aisle_map: str) -> None:
    """
    Test if a non-blocked pallet gets the correct stock location within an aisle.
    :param _: is added to the name of the test. Not used otherwise.
    :param bulk_aisle_map: two-dimensional map of the aisle. The pipes indicate the start and end of a level
    in the rack. The meaning of the characters between the pipes is:
    o: a pallet
    *: empty location, this is the expected location
    x: empty location, blocked
    """
    expected_location = self._create_locations_in_aisle(bulk_aisle_map)
    pallet = tdb.create_pallet(tdb.forklift(), items={tdb.item_1(): 100})

    destination = self.service.get_stock_location(pallet, self.AISLE)

    if expected_location:
        self.assertEqual(destination, expected_location)
    else:
        self.assertIsNone(destination)

def _create_locations_in_aisle(self, bulk_aisle_map):
    lines = re.findall('[|][^|]+[|]', bulk_aisle_map)
    lines.reverse()
    expected_location = None
    level = 0
    for line in lines:
        for sequence in range(1, len(line) - 1):
            bulk_location = tdb.bulk_location(aisle=self.AISLE, sequence=sequence, level=level)
            if line[sequence] == 'o':
                tdb.create_pallet(bulk_location, items={tdb.item_1(): 100})
            if line[sequence] == '*':
                expected_location = bulk_location
            if line[sequence] == 'x':
                bulk_location.blocked = True
                bulk_location.save()
        level += 1
    return expected_location

The logic of parsing a multiline string and generating the data is implemented by _create_locations_in_aisle(). This code is a bit complex, but once this method has been written, generating test cases becomes a piece of cake. You can pair program with your product owner to write more test cases.
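
To see what the parser extracts from a map without needing a database, here is a standalone sketch of the same parsing logic. It returns plain (sequence, level) tuples instead of Location objects; the function name parse_aisle_map() is an assumption for this illustration.

```python
import re


def parse_aisle_map(aisle_map):
    """Parse an aisle map into (occupied, blocked, expected) as (sequence, level) tuples."""
    lines = re.findall(r'[|][^|]+[|]', aisle_map)
    lines.reverse()  # the last line of the map is level 0
    occupied, blocked, expected = set(), set(), None
    for level, line in enumerate(lines):
        # Skip the pipes at index 0 and at the end of the line.
        for sequence in range(1, len(line) - 1):
            symbol = line[sequence]
            if symbol == 'o':
                occupied.add((sequence, level))
            elif symbol == 'x':
                blocked.add((sequence, level))
            elif symbol == '*':
                expected = (sequence, level)
    return occupied, blocked, expected


occupied, blocked, expected = parse_aisle_map(
    """|     |
       |     |
       |o    |
       |oo*  |""")
print(sorted(occupied), blocked, expected)  # [(1, 0), (1, 1), (2, 0)] set() (3, 0)
```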

Another situation where this technique can be applied is when you test actor-based code. You could use one string per actor to indicate at which moment in time a specific message is sent to that actor. Another string could be used to describe the expected messages sent by a specific actor:

in_1: "1        4          3     "
in_2: "  a        b      c       "
out:  "   (a,1)    (b,4)    (c,3)"
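
Such timeline strings can be turned into test data with a few lines of code. The sketch below treats the column at which a message starts as its moment in time; that convention and the function name parse_timeline() are assumptions for illustration.

```python
import re


def parse_timeline(timeline):
    """Return a list of (time, message) pairs; the start column of a message is its time."""
    return [(match.start(), match.group()) for match in re.finditer(r'\S+', timeline)]


in_1 = "1        4          3     "
in_2 = "  a        b      c       "
out = "   (a,1)    (b,4)    (c,3)"

print([message for _, message in parse_timeline(out)])  # ['(a,1)', '(b,4)', '(c,3)']
```

A test could feed the parsed events of in_1 and in_2 to the actors in time order and compare the messages they emit with the parsed events of out.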

Technique 3: Workflow

The third technique is illustrated by testing parts of the picking workflow. The following diagram describes the workflow for order handling:

The step ‘Operator picks pick list’ can be further detailed as follows:

The WMS implements the following methods that are used by the terminal during the picking process:

To properly test these methods, we need an order and a pick list and a pallet on a forklift that contains the items that have already been picked. There are many scenarios we want to test:

The more picked items are required by the arrange part of the test, the more lines of code are added to the arrange part. This reduces the readability and maintainability of these tests.

Since the picking workflow is well defined, it is quite easy to write a PickWorkflow class that uses the business logic to get the database into a specific state. The class has methods that match the steps in the workflow diagrams shown above. Each method ensures that any preceding steps of the flow have been performed.
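
Before looking at the real class, here is the core idea reduced to plain Python without a database. The OrderWorkflow class below is a hypothetical, simplified stand-in with three steps: create order, generate pick list, pick.

```python
class OrderWorkflow:
    """Hypothetical three-step workflow: create order, generate pick list, pick."""

    def __init__(self, items):
        self.order = dict(items)  # step 1 always happens in the constructor
        self.pick_list = None
        self.picked = {}

    def generate_pick_list(self):
        self.pick_list = dict(self.order)
        return self  # returning self allows chaining of steps

    def pick(self, items):
        # A test may call pick() directly; the preceding step is performed when needed.
        self._ensure_pick_list_is_generated()
        for item, quantity in items.items():
            self.picked[item] = self.picked.get(item, 0) + quantity
        return self

    def remaining(self):
        self._ensure_pick_list_is_generated()
        return {item: quantity - self.picked.get(item, 0)
                for item, quantity in self.pick_list.items()}

    def _ensure_pick_list_is_generated(self):
        if self.pick_list is None:
            self.generate_pick_list()


workflow = OrderWorkflow({'white paint': 5}).pick({'white paint': 2})
print(workflow.remaining())  # {'white paint': 3}
```

The test only names the step it cares about; the workflow object takes care of everything that must have happened before that step.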

from typing import Optional

from wms.models import Location, LocationType, Pallet, PickList
from wms.service import Service
from wms.tests import test_data_builder as tdb


class PickWorkflow:

    def __init__(self, items, customer=None, shipping_date=None, generate_pick_locations=True, forklift=None):
        self.service = Service()
        self.forklift = forklift or tdb.forklift()
        self.order = tdb.create_order(items=items, customer=customer, shipping_date=shipping_date)
        self.pick_list: Optional[PickList] = None
        self.pick_pallet: Optional[Pallet] = None
        self.picked_pallets = []

        if generate_pick_locations:
            self._create_pick_locations()

    def _create_pick_locations(self):
        for order_line in self.order.lines.all():
            location = tdb.create_pick_location(item=order_line.item, max_quantity=order_line.quantity)
            tdb.create_pallet(location, items={order_line.item: order_line.quantity})

    def generate_pick_list(self):
        if self.pick_list:
            raise Exception('A pick list has already been generated for the order.')

        self.pick_list = self.service.create_pick_list(self.order)

        return self

    def pick(self, items):
        self._ensure_pick_list_is_generated()
        self._ensure_pick_pallet_is_created()
        for item, quantity in items.items():
            location = self.service.get_next_pick_location(self.pick_list, None)
            self.service.pick(self.pick_list, location, self.pick_pallet, item, quantity)
        return self

    def put_pallet_in_staging_lane(self, staging_location: Optional[Location] = None):
        if not self.pick_pallet:
            raise Exception('The forklift carries no pallet.')

        staging_location = staging_location or tdb.staging_location()
        if not staging_location:
            raise Exception('The warehouse has no staging location.')

        self.service.move_pallet(self.pick_pallet, staging_location)
        self.pick_pallet = None

        return self

    def pick_and_put_pallet_in_staging(self, staging_location: Optional[Location] = None):
        self._ensure_pick_list_is_generated()
        location = Location.objects.filter(type=LocationType.PICK).first()
        while not self.pick_list.is_completely_picked():
            self._ensure_pick_pallet_is_created()
            location = self.service.get_next_pick_location(self.pick_list, location)
            if location:
                quantity = self.service.get_pick_quantity_for(self.pick_list, location)
                self.service.pick(self.pick_list, location, self.pick_pallet, location.assignment.item, quantity)
            else:
                raise Exception('Not enough items on stock for pick list.')

        self.put_pallet_in_staging_lane(staging_location)

        return self

    def pick_and_put_pallet_in_truck(self, dock: Optional[Location] = None):
        self.pick_and_put_pallet_in_staging()

        dock = dock or Location.objects.filter(type=LocationType.DOCK).first()
        if not dock:
            raise Exception('The warehouse has no dock location.')

        for pallet in self.picked_pallets:
            if not pallet.shipping_label:
                self.service.print_shipping_label(pallet)
            self.service.move_pallet(pallet, dock)

        return self

    def _ensure_pick_list_is_generated(self):
        if not self.pick_list:
            self.generate_pick_list()

    def _ensure_pick_pallet_is_created(self):
        if not self.pick_pallet:
            self.pick_pallet = tdb.create_pallet(self.forklift, pick_list=self.pick_list)
            self.picked_pallets.append(self.pick_pallet)

Here are examples of tests that use this PickWorkflow class:

import re

from django.test import TestCase

from wms.models import Location
from wms.service import Service
from wms.tests import test_data_builder as tdb
from wms.tests.pick_workflow import PickWorkflow


class TestPicking(TestCase):
    service = Service()

    def test_get_pick_quantity_when_nothing_picked_yet(self):
        self._create_pick_locations('P_1: 100xitem_1')
        workflow = PickWorkflow(generate_pick_locations=False, items={tdb.item_1(): 5}).generate_pick_list()

        quantity = self._get_pick_quantity('P_1', workflow)

        self.assertEqual(quantity, 5)

    def test_get_pick_quantity_for_location_that_has_less_than_remaining_quantity_to_be_picked(self):
        self._create_pick_locations('P_1: 3xitem_1')
        workflow = PickWorkflow(generate_pick_locations=False, items={tdb.item_1(): 5}).generate_pick_list()

        quantity = self._get_pick_quantity('P_1', workflow)

        self.assertEqual(quantity, 3)

    def test_get_pick_quantity_for_empty_location(self):
        self._create_pick_locations('P_1: 0xitem_1')
        workflow = PickWorkflow(generate_pick_locations=False, items={tdb.item_1(): 5}).generate_pick_list()

        quantity = self._get_pick_quantity('P_1', workflow)

        self.assertEqual(quantity, 0)

    def test_pick_part_of_pick_list_get_pick_quantity_for_item(self):
        self._create_pick_locations('P_1: 100xitem_1')
        workflow = PickWorkflow(generate_pick_locations=False, items={tdb.item_1(): 5})\
            .pick(items={tdb.item_1(): 2})

        quantity = self._get_pick_quantity('P_1', workflow)

        self.assertEqual(quantity, 3)

    def test_pick_complete_pick_list_get_pick_quantity_for_item(self):
        self._create_pick_locations('P_1: 100xitem_1')
        workflow = PickWorkflow(generate_pick_locations=False, items={tdb.item_1(): 5})\
            .pick(items={tdb.item_1(): 5})

        quantity = self._get_pick_quantity('P_1', workflow)

        self.assertEqual(quantity, 0)

    def test_pick_item_1_completely_get_pick_quantity_for_item_2(self):
        self._create_pick_locations('P_1: 100xitem_1', 'P_2: 100xitem_2')
        workflow = PickWorkflow(generate_pick_locations=False, items={tdb.item_1(): 5, tdb.item_2(): 10})\
            .pick(items={tdb.item_1(): 5})

        quantity = self._get_pick_quantity('P_2', workflow)

        self.assertEqual(quantity, 10)

    def test_pick_item_1_partly_get_pick_quantity_for_item_2(self):
        self._create_pick_locations('P_1: 100xitem_1', 'P_2: 100xitem_2')
        workflow = PickWorkflow(generate_pick_locations=False, items={tdb.item_1(): 5, tdb.item_2(): 10})\
            .pick(items={tdb.item_1(): 3})

        quantity = self._get_pick_quantity('P_2', workflow)

        self.assertEqual(quantity, 10)

    def test_get_pick_quantity_for_item_that_has_been_picked_partly_on_other_pallet(self):
        self._create_pick_locations('P_1: 100xitem_1')
        workflow = PickWorkflow(generate_pick_locations=False, items={tdb.item_1(): 5})\
            .pick(items={tdb.item_1(): 3})\
            .put_pallet_in_staging_lane()

        quantity = self._get_pick_quantity('P_1', workflow)

        self.assertEqual(quantity, 2)

    def _get_pick_quantity(self, location_id, workflow):
        location = Location.objects.get(id=location_id)
        return self.service.get_pick_quantity_for(workflow.pick_list, location)

    def _create_pick_locations(self, *args):
        for arg in args:
            match = re.match('(.+): ([0-9]+)x(.+)', arg)
            self.assertIsNotNone(match, f'The argument {arg} is invalid!')
            item = tdb.item_by_id(match.group(3))
            location = tdb.create_pick_location(id=match.group(1), item=item)
            tdb.create_pallet(location, items={item: int(match.group(2))})
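
The strings passed to _create_pick_locations() follow the pattern '<location id>: <quantity>x<item id>'. As a minimal sketch, the parsing step can be tried out on its own, without a database. The function name parse_location_spec and the ValueError are assumptions made for this illustration. The helper above uses assertIsNotNone and the tdb functions instead.

```python
import re

# Same pattern as in _create_pick_locations: '<location id>: <quantity>x<item id>'
LOCATION_SPEC = re.compile(r'(.+): ([0-9]+)x(.+)')


def parse_location_spec(spec):
    """Split a spec such as 'P_1: 3xitem_1' into (location_id, quantity, item_id).

    Hypothetical standalone helper, not part of the WMS code base.
    """
    match = LOCATION_SPEC.match(spec)
    if match is None:
        raise ValueError(f'The argument {spec} is invalid!')
    return match.group(1), int(match.group(2)), match.group(3)


print(parse_location_spec('P_1: 3xitem_1'))  # ('P_1', 3, 'item_1')
```

A compact string like this keeps the arrange part of each test down to a single readable line, at the cost of a small parser in the test class.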

The PickWorkflow is not only handy for testing picking. After pallets have been loaded into a truck, transport documentation can be generated. One type of transport documentation is a transport notice, which describes the pallets in the truck and their contents. Here is an example of a test for generating the transport notice for three pallets from three different orders:

def test_transport_notice(self):
    workflow_1 = PickWorkflow(items={tdb.item_1(): 10}).pick_and_put_pallet_in_truck(tdb.dock())
    workflow_2 = PickWorkflow(items={tdb.item_2(): 25}).pick_and_put_pallet_in_truck(tdb.dock())
    workflow_3 = PickWorkflow(items={tdb.item_3(): 45}).pick_and_put_pallet_in_truck(tdb.dock())

    transport_notice = self.service.generate_transport_notice(tdb.dock())

    self.assertEqual(transport_notice, [
        {
            'shipping_label': workflow_1.picked_pallets[0].shipping_label,
            'contents': [{'item': tdb.item_1().description, 'quantity': 10}]
        },
        {
            'shipping_label': workflow_2.picked_pallets[0].shipping_label,
            'contents': [{'item': tdb.item_2().description, 'quantity': 25}]
        },
        {
            'shipping_label': workflow_3.picked_pallets[0].shipping_label,
            'contents': [{'item': tdb.item_3().description, 'quantity': 45}]
        }
    ])

Focus on clean test code

When writing tests, it might happen that you write the same lines of code in a couple of tests. When extracting this code to a method, make sure that using the extracted method is as simple as possible.

The WMS checks each pallet that is about to be loaded into a truck. If an audit result has been registered for the pallet that indicates a difference between its actual contents and its contents according to the WMS, the pallet is not allowed to be loaded. The difference must be resolved before the pallet can be loaded.

Imagine that you want to extract the two lines calling move_pallet() and register_audit_result() from this test:

def test_move_pallet_to_dock_with_differences_found_during_audit(self):
    pallet = tdb.create_pallet(tdb.bulk_location())
    self.service.move_pallet(pallet.id, tdb.audit_location().id)
    self.service.register_audit_result(pallet.id, True)

    self.assertRaises(Exception, self.service.move_pallet, pallet.id, tdb.create_dock().id)

Don’t do it like this:

def test_move_pallet_to_dock_with_differences_found_during_audit(self):
    pallet = tdb.create_pallet(tdb.bulk_location())
    self.audit_pallet_with_differences_found(pallet.id, tdb.audit_location().id)

    self.assertRaises(Exception, self.service.move_pallet, pallet.id, tdb.create_dock().id)

But do it like this:

def test_move_pallet_to_dock_with_differences_found_during_audit(self):
    pallet = tdb.create_pallet(tdb.bulk_location())
    self.audit_pallet_with_differences_found(pallet, tdb.audit_location())

    self.assertRaises(Exception, self.service.move_pallet, pallet.id, tdb.create_dock().id)

Let the extracted method worry about the ids of the pallet and the audit location. Keep the test code clean.
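
As a minimal sketch, the extracted helper could look like the code below. Pallet, Location and RecordingService are hypothetical stand-ins so that the example runs without Django. Here the helper is a plain function that receives the service as a parameter, whereas in the test class it would be a method that uses self.service. It is also an assumption that the second argument of register_audit_result() means 'differences found'.

```python
from dataclasses import dataclass


@dataclass
class Pallet:
    """Hypothetical stand-in for the Django model."""
    id: int


@dataclass
class Location:
    """Hypothetical stand-in for the Django model."""
    id: str


class RecordingService:
    """Hypothetical stand-in that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def move_pallet(self, pallet_id, location_id):
        self.calls.append(('move_pallet', pallet_id, location_id))

    def register_audit_result(self, pallet_id, differences_found):
        self.calls.append(('register_audit_result', pallet_id, differences_found))


def audit_pallet_with_differences_found(service, pallet, audit_location):
    # The helper accepts whole objects and extracts the ids itself,
    # so the calling test does not have to deal with ids.
    service.move_pallet(pallet.id, audit_location.id)
    service.register_audit_result(pallet.id, True)  # assumed: True = differences found


service = RecordingService()
audit_pallet_with_differences_found(service, Pallet(id=1), Location(id='AUDIT'))
print(service.calls)
# [('move_pallet', 1, 'AUDIT'), ('register_audit_result', 1, True)]
```

The service still receives ids, but that detail now lives in one place instead of in every test that needs an audited pallet.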

Conclusions

Three techniques to generate test data have been explained:

Using these techniques, it is possible to set up the database

When using these techniques, focus on ease of use and on the readability of the tests.