Tim Raymond

What's new in Go 1.14: Test Cleanup

  • February 11, 2020

The process of writing a unit test usually follows a certain set of steps. First, we set up the dependencies of the unit under test. Next, we execute the unit of logic under test. We then compare the results of that execution to our expectations. Finally, we tear down any dependencies and restore the environment to the state in which we found it, so as not to affect other unit tests. Go 1.14 adds a method to the testing package, testing.(*T).Cleanup, which aims to make creating and cleaning up dependencies of tests easier.

Oftentimes, applications have some Repository-like struct that acts as the application's access to a database. Testing these structs can be challenging because working with them alters the state of the underlying database. Typically, tests will have a function to produce instances of this struct:

func NewTestTaskStore(t *testing.T) *pg.TaskStore {
	store := &pg.TaskStore{
		Config: pg.Config{
			Host:     os.Getenv("PG_HOST"),
			Port:     os.Getenv("PG_PORT"),
			Username: "postgres",
			Password: "postgres",
			DBName:   "task_test",
			TLS:      false,
		},
	}

	err := store.Open()
	if err != nil {
		t.Fatal("error opening task store: err:", err)
	}

	return store
}

This gives us a new instance of a Postgres-backed store responsible for storing different tasks in a task-tracking application. Now that we can produce instances of this store, we can write a test for it:

func Test_TaskStore_Count(t *testing.T) {
	store := NewTestTaskStore(t)

	ctx := context.Background()
	_, err := store.Create(ctx, cleanuptest.Task{
		Name: "Do Something",
	})
	if err != nil {
		t.Fatal("error creating task: err:", err)
	}

	tasks, err := store.All(ctx)
	if err != nil {
		t.Fatal("error fetching all tasks: err:", err)
	}

	exp := 1
	got := len(tasks)

	if exp != got {
		t.Error("unexpected task count returned: got:", got, "exp:", exp)
	}
}

This test's intentions are good: we want to make sure that after creating one task, only one task is returned. When we run this test, we see that it passes:

$ export PG_HOST=127.0.0.1
$ export PG_PORT=5432
$ go test -count 1 -v ./...
  ?       github.com/timraymond/cleanuptest       [no test files]
  === RUN   Test_TaskStore_LoadStore
  --- PASS: Test_TaskStore_LoadStore (0.01s)
  === RUN   Test_TaskStore_Count
  --- PASS: Test_TaskStore_Count (0.01s)
  PASS
  ok      github.com/timraymond/cleanuptest/pg    0.035s

We pass -count 1 to bypass the test cache; otherwise go test would reuse the cached passing result instead of running the tests again. When we run these tests a second time, we'll notice that they now fail:

$ go test -count 1 -v ./...
  ?       github.com/timraymond/cleanuptest       [no test files]
  === RUN   Test_TaskStore_LoadStore
  --- PASS: Test_TaskStore_LoadStore (0.01s)
  === RUN   Test_TaskStore_Count
      Test_TaskStore_Count: pg_test.go:79: unexpected task count returned: got: 2 exp: 1
  --- FAIL: Test_TaskStore_Count (0.01s)
  FAIL
  FAIL    github.com/timraymond/cleanuptest/pg    0.029s
  FAIL

Our tests aren't cleaning up after themselves, so the state left behind by one run invalidates the results of the next. The simplest fix is to defer a function that cleans up the state after the test finishes. Since every test that uses TaskStore will have to do this, it makes sense to return a cleanup function from the function manufacturing our test instances of TaskStore:

func NewTestTaskStore(t *testing.T) (*pg.TaskStore, func()) {
	store := &pg.TaskStore{
		Config: pg.Config{
			Host:     os.Getenv("PG_HOST"),
			Port:     os.Getenv("PG_PORT"),
			Username: "postgres",
			Password: "postgres",
			DBName:   "task_test",
			TLS:      false,
		},
	}

	err := store.Open()
	if err != nil {
		t.Fatal("error opening task store: err:", err)
	}

	return store, func() {
		if err := store.Reset(); err != nil {
			t.Error("unable to truncate tasks: err:", err)
		}
	}
}

In the final return statement (lines 18-22 of the listing), we return a closure that calls the Reset method on the *pg.TaskStore that we return as the first value. Within our tests, we have to remember to defer this cleanup function:

func Test_TaskStore_Count(t *testing.T) {
	store, cleanup := NewTestTaskStore(t)
	defer cleanup()

	ctx := context.Background()
	_, err := store.Create(ctx, cleanuptest.Task{
		Name: "Do Something",
	})
	if err != nil {
		t.Fatal("error creating task: err:", err)
	}

	tasks, err := store.All(ctx)
	if err != nil {
		t.Fatal("error fetching all tasks: err:", err)
	}

	exp := 1
	got := len(tasks)

	if exp != got {
		t.Error("unexpected task count returned: got:", got, "exp:", exp)
	}
}

This works, but it's awkward and becomes increasingly unwieldy as the number of cleanup functions that need to be deferred grows. Are you certain each one was called? What happens if one of them panics? Each of these concerns is a distraction from what the test is actually trying to test. Furthermore, if test writers have to be concerned with all of these moving parts, writing tests becomes increasingly difficult. If you make it easier to write tests, more of them will be written.
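
To make the "are you certain each one was called?" problem concrete, here is a small runnable sketch. The newDep factory and the dependency names are hypothetical stand-ins for helpers like NewTestTaskStore; nothing here touches a real database. It shows that a caller can discard a returned cleanup function with the blank identifier and the compiler will not complain.

```go
package main

import "fmt"

// newDep is a hypothetical factory in the style of NewTestTaskStore: it
// returns a dependency along with the function that tears it down.
func newDep(name string, log *[]string) (string, func()) {
	*log = append(*log, "setup "+name)
	return name, func() { *log = append(*log, "teardown "+name) }
}

// runTest plays the role of a test body that needs three dependencies.
// It returns the sequence of events that took place.
func runTest() []string {
	var log []string
	func() {
		_, cleanupDB := newDep("db", &log)
		defer cleanupDB()

		_, cleanupCache := newDep("cache", &log)
		defer cleanupCache()

		// The cleanup function is thrown away here. This compiles
		// without warning, and the queue is never torn down.
		_, _ = newDep("queue", &log)

		log = append(log, "test body")
	}()
	return log
}

func main() {
	for _, event := range runTest() {
		fmt.Println(event)
	}
}
```

Running this prints the three setup events, then "test body", then "teardown cache" and "teardown db". There is no "teardown queue" line, and nothing in the code flags the omission.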

Go 1.14 introduces the testing.(*T).Cleanup method to make it possible to register cleanup functions that run transparently to test authors. Let's refactor our factory function to use Cleanup:

func NewTestTaskStore(t *testing.T) *pg.TaskStore {
	store := &pg.TaskStore{
		Config: pg.Config{
			Host:     os.Getenv("PG_HOST"),
			Port:     os.Getenv("PG_PORT"),
			Username: "postgres",
			Password: "postgres",
			DBName:   "task_test",
			TLS:      false,
		},
	}

	err := store.Open()
	if err != nil {
		t.Fatal("error opening task store: err:", err)
	}

	t.Cleanup(func() {
		if err := store.Reset(); err != nil {
			t.Error("error resetting:", err)
		}
	})

	return store
}

The NewTestTaskStore function still takes a *testing.T, which remains useful for failing the test if we are unable to open a connection to Postgres. On lines 18-22, we call the Cleanup method and provide a func that invokes the Reset method on store. Unlike a deferred call, which would run as soon as NewTestTaskStore returns, this func is run by the testing package once the test and all of its subtests have completed. Let's integrate this into our test:

func Test_TaskStore_Count(t *testing.T) {
	store := NewTestTaskStore(t)

	ctx := context.Background()
	_, err := store.Create(ctx, cleanuptest.Task{
		Name: "Do Something",
	})
	if err != nil {
		t.Fatal("error creating task: err:", err)
	}

	tasks, err := store.All(ctx)
	if err != nil {
		t.Fatal("error fetching all tasks: err:", err)
	}

	exp := 1
	got := len(tasks)

	if exp != got {
		t.Error("unexpected task count returned: got:", got, "exp:", exp)
	}
}

Notice that on line 2, we only receive a *pg.TaskStore from NewTestTaskStore. The concerns of cleanup and handling errors from constructing that *pg.TaskStore have been nicely encapsulated so that our test can focus exclusively on the behavior that it's testing.
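
The timing difference between defer and Cleanup inside a helper can be illustrated with a small model. The fakeT type below is a hypothetical stand-in that imitates only the documented behavior of Cleanup (registered functions run after the test finishes, last added first); it is not how the testing package is implemented, and it exists only so the sketch can run outside of go test.

```go
package main

import "fmt"

// fakeT is a hypothetical stand-in for *testing.T that models only Cleanup.
type fakeT struct {
	cleanups []func()
	log      []string
}

// Cleanup records f so it can be called once the test body has finished.
func (t *fakeT) Cleanup(f func()) { t.cleanups = append(t.cleanups, f) }

// run executes body, then calls the registered cleanups in
// last-added, first-called order, and returns the event log.
func run(body func(t *fakeT)) []string {
	t := &fakeT{}
	body(t)
	for i := len(t.cleanups) - 1; i >= 0; i-- {
		t.cleanups[i]()
	}
	return t.log
}

// newResource is a hypothetical helper that uses both mechanisms so
// their timing can be compared.
func newResource(t *fakeT, name string) string {
	// A defer inside the helper fires as soon as the helper returns,
	// which is too early to be useful for tearing down the resource.
	defer func() { t.log = append(t.log, "defer in helper: "+name) }()

	// A registered cleanup waits until the whole test body is done.
	t.Cleanup(func() { t.log = append(t.log, "cleanup: "+name) })
	return name
}

// events runs a simulated test that creates two resources.
func events() []string {
	return run(func(t *fakeT) {
		newResource(t, "store")
		newResource(t, "cache")
		t.log = append(t.log, "test body")
	})
}

func main() {
	for _, e := range events() {
		fmt.Println(e)
	}
}
```

In this model, both "defer in helper" events are logged before "test body", while the cleanups are logged afterwards, with the cache cleaned up before the store. This is why a helper cannot simply defer its own teardown, and why it previously had to hand a cleanup function back to the caller.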

What about t.Parallel?

Tests and subtests can be run in parallel with one another by calling the testing.(*T).Parallel() method. The only requirement is that a test that calls Parallel() must be able to run safely alongside other tests that have also called it. We can modify the previous test to start multiple subtests that all do the same thing:

func Test_TaskStore_Count(t *testing.T) {
	ctx := context.Background()
	for i := 0; i < 10; i++ {
		t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
			t.Parallel()
			store := NewTestTaskStore(t)
			_, err := store.Create(ctx, cleanuptest.Task{
				Name: "Do Something",
			})
			if err != nil {
				t.Fatal("error creating task: err:", err)
			}

			tasks, err := store.All(ctx)
			if err != nil {
				t.Fatal("error fetching all tasks: err:", err)
			}

			exp := 1
			got := len(tasks)

			if exp != got {
				t.Error("unexpected task count returned: got:", got, "exp:", exp)
			}
		})
	}
}

We're starting ten new subtests within the for loop by using the t.Run() method. Because each one calls the t.Parallel() method, these subtests will run concurrently. We've also moved the creation of the store into the subtest so that the value of t is actually the subtest's *testing.T. Outside of this example, we've also added some logging to see when cleanup functions are executed. Let's run go test to see what happens:

  === CONT  Test_TaskStore_Count/3
  === CONT  Test_TaskStore_Count/8
  === CONT  Test_TaskStore_Count/9
  === CONT  Test_TaskStore_Count/2
  === CONT  Test_TaskStore_Count/4
  === CONT  Test_TaskStore_Count/1
      Test_TaskStore_Count/3: pg_test.go:77: unexpected task count returned: got: 3 exp: 1
      Test_TaskStore_Count/3: pg_test.go:31: cleanup!
      Test_TaskStore_Count/5: pg_test.go:77: unexpected task count returned: got: 4 exp: 1
      Test_TaskStore_Count/5: pg_test.go:31: cleanup!
      Test_TaskStore_Count/9: pg_test.go:77: unexpected task count returned: got: 4 exp: 1
      Test_TaskStore_Count/9: pg_test.go:31: cleanup!
      Test_TaskStore_Count/2: pg_test.go:77: unexpected task count returned: got: 4 exp: 1
      Test_TaskStore_Count/2: pg_test.go:31: cleanup!
  === CONT  Test_TaskStore_Count/7
  === CONT  Test_TaskStore_Count/6
      Test_TaskStore_Count/8: pg_test.go:77: unexpected task count returned: got: 0 exp: 1
      Test_TaskStore_Count/8: pg_test.go:31: cleanup!

As you might have expected, each cleanup function runs when its subtest completes, since we used the *testing.T value from the subtest. However, our tests still failed: all of the subtests share one database and we're not using transactions, so the effects of one subtest are observable to the others.

While t.Cleanup() has its uses with parallel subtests, it's probably best used with care in this scenario. You may find more success using a combination of cleanup functions and transactions within the bodies of tests.
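
As a rough sketch of what combining cleanup functions with transactions could look like, the example below registers a rollback as a cleanup. Everything in it is an assumption made for illustration: memStore and memTx are hypothetical in-memory stand-ins for a database and a transaction (the pg.TaskStore shown in this post is not known to expose transactions), and fakeT stands in for *testing.T so the sketch runs outside of go test. The cleaner interface is one that *testing.T satisfies as of Go 1.14.

```go
package main

import "fmt"

// cleaner is the only part of *testing.T that the helper needs.
type cleaner interface {
	Cleanup(func())
}

// memStore is a hypothetical in-memory stand-in for a database table.
type memStore struct{ committed []string }

// memTx buffers writes; Rollback throws them away without committing.
type memTx struct {
	store   *memStore
	pending []string
}

func (s *memStore) Begin() *memTx { return &memTx{store: s} }

func (tx *memTx) Create(name string) { tx.pending = append(tx.pending, name) }

// Count sees committed rows plus this transaction's own pending rows,
// but never the pending rows of other transactions.
func (tx *memTx) Count() int { return len(tx.store.committed) + len(tx.pending) }

func (tx *memTx) Rollback() { tx.pending = nil }

// newTestTx opens a transaction and registers its rollback as a cleanup.
func newTestTx(t cleaner, s *memStore) *memTx {
	tx := s.Begin()
	t.Cleanup(tx.Rollback)
	return tx
}

// fakeT is a hypothetical stand-in for *testing.T.
type fakeT struct{ cleanups []func() }

func (t *fakeT) Cleanup(f func()) { t.cleanups = append(t.cleanups, f) }

// finish runs the cleanups in last-added, first-called order.
func (t *fakeT) finish() {
	for i := len(t.cleanups) - 1; i >= 0; i-- {
		t.cleanups[i]()
	}
}

// countsSeen simulates n overlapping subtests against one shared store
// and reports the task count each of them observed.
func countsSeen(s *memStore, n int) []int {
	ts := make([]*fakeT, n)
	txs := make([]*memTx, n)
	counts := make([]int, n)
	// Every subtest writes before any of them reads, imitating the
	// overlap that parallel subtests produce.
	for i := range ts {
		ts[i] = &fakeT{}
		txs[i] = newTestTx(ts[i], s)
		txs[i].Create("Do Something")
	}
	for i, tx := range txs {
		counts[i] = tx.Count()
	}
	for _, t := range ts {
		t.finish()
	}
	return counts
}

func main() {
	s := &memStore{}
	fmt.Println(countsSeen(s, 3), len(s.committed)) // prints "[1 1 1] 0"
}
```

Each simulated subtest sees only its own task, and nothing is left behind in the shared store. With a real database, the degree of isolation depends on the transaction isolation level and on every query in the test going through the transaction, so treat this as an outline of the idea rather than a drop-in pattern.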

Conclusion

The “magical” behavior of t.Cleanup might seem too clever for what we're used to in Go, and I wouldn't like to see it in production code either. Tests differ from production code in many ways, though, so we can relax some constraints to make tests easier to write and easier to read later. Just as t.Fatal and t.Error make it trivial to handle unexpected errors in tests, t.Cleanup will hopefully make it much easier to retain cleanup logic without cluttering our tests with defers.