Run pytest for each file in directory

Question

I'm trying to build a routine that runs a pytest test class for each PDF document in the current directory. Let me explain.

Let's say I have this test file:

import pytest

class TestHeader:
    #asserts...

class TestBody:
    #asserts...

These tests need to run against each PDF document in my cwd.

Here is my best attempt:

import glob
import pytest

class TestHeader:
    #asserts...

class TestBody:
    #asserts...

filelist = glob.glob('*.pdf')

for file in filelist:
    #magically call pytest for each file

How would I approach this?

EDIT: Adding to my question.

I have a huge function that extracts each document's data; let's call it extract_pdf. This function returns a tuple (header, body).

Current attempt looks like this:

import glob
import pytest

class TestHeader:
    #asserts...

class TestBody:
    #asserts...

filelist = glob.glob('*.pdf')

for file in filelist:
    header, body = extract_pdf(file)
    pytest.main(<pass header and body as args for pytest>)

I need to parse each document prior to testing. Can it be done this way?

Answer 1

The best way to do this is to parametrize the test cases dynamically.

This can be achieved using the pytest_generate_tests hook:

import glob

def pytest_generate_tests(metafunc):
    # parametrize only the tests that actually request fileName
    if "fileName" in metafunc.fixturenames:
        metafunc.parametrize("fileName", glob.glob('*.pdf'))

NOTE: fileName must be one of the arguments of your test function.

This executes the test case once for each file in the directory, and the resulting test names will look like

TestFunc[File1]
TestFunc[File2]
TestFunc[File3]
...

and so on.
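For illustration, here is a minimal sketch of how the hook and a test function could fit together in one test module. The guard on metafunc.fixturenames, the sorted() call, and the test test_pdf_is_not_empty are assumptions added here and are not part of the original answer; the size check is only a placeholder for real assertions.

```python
import glob
import os


def pytest_generate_tests(metafunc):
    # Only parametrize tests that actually declare a fileName argument.
    if "fileName" in metafunc.fixturenames:
        # sorted() keeps the test order stable between runs.
        metafunc.parametrize("fileName", sorted(glob.glob("*.pdf")))


def test_pdf_is_not_empty(fileName):
    # Placeholder check (an assumption): each PDF has at least one byte.
    assert os.path.getsize(fileName) > 0
```

Running pytest from the directory that holds the PDFs should then collect one test per file, with the file name shown in brackets after the test name.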

Answer 2

This is expanding on the existing answer by @ArunKalirajaBaskaran.

The problem is that you have different test classes that want to use the same data, but you want to parse the data only once. If it is OK for you to read all of the data at once, you could read it into global variables and use these to parametrize your tests:

import glob

# extract_pdf is the (header, body) parser from the question
def extract_data():
    filenames = []
    headers = []
    bodies = []
    for filename in glob.glob('*.pdf'):
        header, body = extract_pdf(filename)
        filenames.append(filename)
        headers.append(header)
        bodies.append(body)
    return filenames, headers, bodies

filenames, headers, bodies = extract_data()


def pytest_generate_tests(metafunc):
    if "header" in metafunc.fixturenames:
        # use the filename as ID for better test names
        metafunc.parametrize("header", headers, ids=filenames)
    elif "body" in metafunc.fixturenames:
        metafunc.parametrize("body", bodies, ids=filenames)

class TestHeader:
    def test_1(self, header):
        ...

    def test_2(self, header):
        ...

class TestBody:
    def test_1(self, body):
        ...

This is the same as using

class TestHeader:
    @pytest.mark.parametrize("header", headers, ids=filenames)
    def test_1(self, header):
        ...

    @pytest.mark.parametrize("header", headers, ids=filenames)
    def test_2(self, header):
        ...

pytest_generate_tests just adds a bit of convenience so you don't have to repeat the parametrize decorator for each test.

The downside of this is of course that you will read in all of the data at once, which may cause a problem with memory usage if there are a lot of files. Your approach with pytest.main will not work, because that is the same as calling pytest on the command line with the given arguments, so you can pass it command-line options and paths, but not test data such as header and body. Parametrization can be done at the fixture level or at the test level (like here), but both need the parameters already evaluated at load time, so I don't see a possibility to do this lazily (apart from putting it all into one test). Maybe someone else has a better idea...
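One possible direction for deferring the parsing, offered as a sketch and not as part of the original answer: parametrize on the file names only, and let each test fetch its data through a cached helper, so a file is parsed the first time a test asks for it. The names get_document and pdf_name are hypothetical, and the extract_pdf below is only a stand-in for the function from the question.

```python
import functools
import glob


def extract_pdf(filename):
    # Stand-in (assumption) for the asker's real parser; returns (header, body).
    return f"header of {filename}", f"body of {filename}"


@functools.lru_cache(maxsize=None)
def get_document(filename):
    # Parse on first request, then reuse the cached (header, body) tuple.
    return extract_pdf(filename)


def pytest_generate_tests(metafunc):
    # Only the file names are needed at collection time, not the parsed data.
    if "pdf_name" in metafunc.fixturenames:
        metafunc.parametrize("pdf_name", sorted(glob.glob("*.pdf")))


class TestHeader:
    def test_header_present(self, pdf_name):
        header, _ = get_document(pdf_name)
        assert header  # placeholder assertion


class TestBody:
    def test_body_present(self, pdf_name):
        _, body = get_document(pdf_name)
        assert body  # placeholder assertion
```

With maxsize=None every parsed document stays in memory once it has been loaded, so this defers the work but does not lower peak memory. A small maxsize would bound memory at the cost of parsing some files more than once, depending on the order in which the tests run.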


2 answers

solution1
1 2020-08-02 05:45:18

solution2
1 2020-08-02 10:09:36
