
pytest: use fixture with pandas dataframe for parametrization

I have a fixture that returns a pd.DataFrame. I need to pass the individual columns (pd.Series) into a unit test, and I would like to use parametrize.

Here's a toy example without parametrize. Every column of the DataFrame is checked individually, but inside a for-loop, so with this code only 1 test is executed. I am looking for 3 tests while getting rid of the for-loop at the same time. I also guess I can get rid of the input_series fixture, can't I?

import numpy as np
import pandas as pd
import pytest


@pytest.fixture(scope="module")
def input_df():
    return pd.DataFrame(
        data=np.random.randint(1, 10, (5, 3)), columns=["col1", "col2", "col3"]
    )


@pytest.fixture(scope="module")
def input_series(input_df):
    return [input_df[series] for series in input_df.columns]


def test_individual_column(input_series):
    for series in input_series:
        assert len(series) == 5

I am basically looking for something like this:

@pytest.mark.parametrize("series", individual_series_from_input_df)
def test_individual_column(series):
    assert len(series) == 5

If you try to generate multiple values from a fixture based on another fixture (that is, by yielding more than once), you will get the error message yield_fixture function has more than one 'yield'.

One solution is to use fixture parametrization. In your case you want to iterate over the columns, so the DataFrame columns are the parameters.

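import numpy as np
import pandas as pd
import pytest

# test data, built at import time so its columns are known during collection
input_df = pd.DataFrame(
    data=np.random.randint(1, 10, (5, 3)), columns=["col1", "col2", "col3"]
)


@pytest.fixture(
    scope="module",
    params=input_df.columns,
)
def input_series(request):
    # request.param is the current column name
    series = request.param
    yield input_df[series]


def test_individual_column(input_series):
    assert len(input_series) == 5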

This will generate one test per column of the test DataFrame.

pytest -v test_pandas.py
# test_pandas.py::test_individual_column[col1] PASSED
# test_pandas.py::test_individual_column[col2] PASSED
# test_pandas.py::test_individual_column[col3] PASSED
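
If you would rather keep input_df as a fixture, a possible variation is to parametrize over the column names only and look the column up inside the test. This is a minimal sketch, not the only way to do it: it assumes the column names are known up front, and COLUMNS and make_input_df are hypothetical names introduced here for illustration.

```python
import numpy as np
import pandas as pd
import pytest

# Assumption: the column names are known in advance, so they can be listed
# at import time without building the DataFrame first.
COLUMNS = ["col1", "col2", "col3"]


def make_input_df():
    """Hypothetical helper that builds the 5x3 test DataFrame."""
    return pd.DataFrame(
        data=np.random.randint(1, 10, (5, len(COLUMNS))), columns=COLUMNS
    )


@pytest.fixture(scope="module")
def input_df():
    return make_input_df()


@pytest.mark.parametrize("column", COLUMNS)
def test_individual_column(input_df, column):
    # The fixture is resolved when the test runs; only the column name
    # is a parameter, so pytest still collects one test per column.
    assert len(input_df[column]) == 5
```

Here only plain strings are handed to parametrize, so nothing depends on a fixture at collection time, and the DataFrame itself is still created once per module by the fixture.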
