
Making a Python unit test that never runs in parallel

tl;dr - I want to write a Python unittest function that deletes a file, runs a test, and then restores the file. This causes race conditions, because unittest runs multiple tests in parallel, and deleting and recreating the file for one test messes up other tests that happen at the same time.

Long Specific Example:

I have a Python module named converter.py and it has associated tests in test_converter.py. If there is a file named config_custom.csv in the same directory as converter.py, then the custom configuration will be used. If there is no custom CSV config file, then there is a default configuration built into converter.py.

I wrote a unit test using unittest from the Python 2.7 standard library to validate this behavior. In setUp() the test would rename config_custom.csv to wrong_name.csv, then the tests would run (hopefully using the default config), and in tearDown() it would rename the file back the way it should be.
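A minimal sketch of that arrangement (converter.convert() is a hypothetical entry point; the module name and file names come from the question):

import os
import unittest

import converter

HERE = os.path.dirname(os.path.abspath(converter.__file__))
CONFIG = os.path.join(HERE, 'config_custom.csv')
HIDDEN = os.path.join(HERE, 'wrong_name.csv')

class TestDefaultConfig(unittest.TestCase):
    def setUp(self):
        # Hide the custom config so converter falls back to built-in defaults.
        os.rename(CONFIG, HIDDEN)

    def tearDown(self):
        # Put the custom config back for the rest of the suite.
        os.rename(HIDDEN, CONFIG)

    def test_uses_default_config(self):
        # converter.convert() is hypothetical; substitute the real API.
        self.assertIsNotNone(converter.convert('some input'))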

Problem: Python unit tests run in parallel, and I got terrible race conditions. The file config_custom.csv would get renamed in the middle of other unit tests in a non-deterministic way. It would cause at least one error or failure about 90% of the time that I ran the entire test suite.

The ideal solution would be to tell unittest: do NOT run this test in parallel with other tests; this test is special and needs complete isolation.

My work-around is to add an optional argument to the function that searches for config files. The argument is only passed by the test suite, and it makes the function ignore the config file without deleting it. But actually deleting the file is the more faithful test; that is the condition I really want to exercise.
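A sketch of that work-around (find_config and DEFAULT_CONFIG are hypothetical names, not from the original code):

import os

DEFAULT_CONFIG = {'delimiter': ','}  # hypothetical built-in defaults

def find_config(ignore_custom=False):
    """Return the custom config path if one should be used, else None.

    ignore_custom is the test-only escape hatch: the test suite passes
    True so the file on disk is never consulted (or deleted).
    """
    path = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                        'config_custom.csv')
    if not ignore_custom and os.path.exists(path):
        return path
    return None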

I frequently need to run tests with complete isolation. The only way I've found that works consistently is to put those tests in separate classes. Agreed that handling config files inside tests is still kind of a pain.

For something I'm doing right now, I may also try something like pytest-ordering for running tests in a more deterministic fashion.
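If the suite happens to run under pytest with pytest-xdist (an assumption; the question uses plain unittest), the separate-classes approach can be made concrete with xdist's --dist loadscope mode, which schedules every test in a class on the same worker. Note this only serializes the class's tests against each other, not against the rest of the suite:

# Run with: pytest -n auto --dist loadscope
class TestNeedsIsolation:
    # loadscope sends all methods of this class to one xdist worker,
    # so they never run concurrently with one another.
    def test_default_config(self):
        pass

    def test_custom_config(self):
        pass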

Firstly, most testing frameworks that support parallelism also provide a way to mark certain important tests as serial. For instance, Python Fabric supports a @serial decorator (when you launch your tasks as a parallel suite with the command-line flag -P):

from fabric.api import serial

def runs_in_parallel():
    # With fab -P, plain tasks like this one run in parallel.
    pass

@serial
def runs_serially():
    # @serial forces this task to run serially even under -P.
    pass

However, I don't think you should do that; I think you should stop and contemplate for a moment. Following the Single Responsibility Principle, this is what I would say is going on.

You are in a situation where a test has essentially identified a problem in your code; it has highlighted a little cohesion problem for you.

What you should do now is realise that converter.py has too many responsibilities: not only does it perform some complex operation under the hood, it is also responsible for managing its own creation and set-up, in a way that is probably too much for it.

Perhaps if you had an object that managed reading in that file, you could then easily mock or stub it out and produce the condition you describe wanting to test.
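For example, converter.py could accept a small config-source object instead of probing the disk itself. The names below are illustrative, not from the original code:

import csv

DEFAULT_CONFIG = {'delimiter': ','}  # hypothetical built-in defaults

def parse_csv_config(fileobj):
    # Trivial stand-in parser: one key,value pair per row.
    return dict(csv.reader(fileobj))

class FileConfigSource(object):
    """Reads the custom CSV config from disk, if present."""
    def __init__(self, path='config_custom.csv'):
        self.path = path

    def load(self):
        try:
            with open(self.path) as f:
                return parse_csv_config(f)
        except IOError:
            return None  # no custom config on disk

class Converter(object):
    def __init__(self, config_source):
        # The converter no longer knows where its config comes from.
        self.config = config_source.load() or DEFAULT_CONFIG

# In a test, no file renaming is needed at all:
class NullConfigSource(object):
    def load(self):
        return None

converter = Converter(NullConfigSource())  # always gets the defaults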

The best testing strategy would be to make sure your tests run on disjoint data sets. This will bypass any race conditions and make the code simpler. I would also mock out open, or __enter__ / __exit__ if you're using the context manager. This will allow you to fake the event that a file doesn't exist.
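A sketch of that approach, assuming converter exposes load_config() and DEFAULT_CONFIG (hypothetical names). On Python 2.7 mock is a separate package, while Python 3 bundles it as unittest.mock:

import unittest
try:
    from unittest import mock   # Python 3
except ImportError:
    import mock                 # Python 2.7: pip install mock

import converter

class TestMissingConfig(unittest.TestCase):
    def test_falls_back_to_defaults(self):
        # Any open() inside converter now raises as if the file were
        # absent, so nothing on disk has to be renamed or deleted.
        with mock.patch('converter.open',
                        side_effect=IOError('No such file'), create=True):
            self.assertEqual(converter.load_config(),
                             converter.DEFAULT_CONFIG)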

The Django 3.2 documentation contains an official solution to this problem using django.test.testcases.SerializeMixin. This will force certain tests to run in series and prevent errors arising when they try to access the same resource.

As per the documentation, you would do something like this:

import os

from django.test import TestCase
from django.test.testcases import SerializeMixin

class ImageTestCaseMixin(SerializeMixin):
    # All test classes sharing this lockfile run serially with each other.
    lockfile = __file__

    def setUp(self):
        # temp_storage_dir and create_file are placeholders from the docs.
        self.filename = os.path.join(temp_storage_dir, 'my_file.png')
        self.file = create_file(self.filename)

class RemoveImageTests(ImageTestCaseMixin, TestCase):
    def test_remove_image(self):
        os.remove(self.filename)
        self.assertFalse(os.path.exists(self.filename))

class ResizeImageTests(ImageTestCaseMixin, TestCase):
    def test_resize_image(self):
        # resize_image and get_image_size are likewise placeholder helpers.
        resize_image(self.file, (48, 48))
        self.assertEqual(get_image_size(self.file), (48, 48))

The problem is that the name of config_custom.csv should itself be a configurable parameter. Then each test can simply look for config_custom_<nonce>.csv, and any number of tests may be run in parallel.

Cleanup of the overall suite can just clear out config_custom_*.csv, since we won't be needing any of them at that point.
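A minimal sketch of the nonce idea (make_test_config is a hypothetical helper; the real converter would need to accept the generated name as its config path):

import glob
import os
import uuid

def make_test_config(contents):
    # Each test writes its own uniquely named config, so parallel tests
    # never touch each other's files.
    path = 'config_custom_%s.csv' % uuid.uuid4().hex
    with open(path, 'w') as f:
        f.write(contents)
    return path

def cleanup_all_test_configs():
    # One sweep at the end of the whole suite.
    for path in glob.glob('config_custom_*.csv'):
        os.remove(path)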
