
Python - How to check a data format in a Pandas DataFrame column?

I have a pandas DataFrame and want to test 1) whether a column contains a given value and 2) whether the values in another column are positive percentages expressed as decimal fractions.

I check that an id value is not in the DataFrame like this: assert "C50" not in metrics_df["id"].tolist()

How can I check that the values in the metric_1 column are in the right format, i.e. percentages written as decimals (for example 0.05 or 0.238)? The only valid value of

    df = self.spark.createDataFrame(
        [('A10', -0.35, '2020-01-04'),
         ('A20', -0.20, '2017-05-01'),
         ('B30', 0.59, '2018-02-08'),
         ],
        ['id', 'metric_1', 'transaction_date']
    )

If I understand it correctly, this could help you:

    assert (box_raw["metric_1"] < 1).all() and (box_raw["metric_1"] > 0).all()
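If it helps, here is a minimal, self-contained sketch of both checks on a plain pandas DataFrame. It assumes the data is already in pandas (the snippet in the question builds a Spark DataFrame, which would first need converting, e.g. with toPandas()). The helper name invalid_fraction_rows and the sample frame are made up for illustration. Instead of only failing, it returns the offending rows so you can see which ids are out of range.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical sample data mirroring the columns in the question.
metrics_df = pd.DataFrame(
    {
        "id": ["A10", "A20", "B30"],
        "metric_1": [-0.35, -0.20, 0.59],
        "transaction_date": ["2020-01-04", "2017-05-01", "2018-02-08"],
    }
)


def invalid_fraction_rows(df, column):
    """Return the rows whose value in `column` is not strictly between 0 and 1."""
    if not is_numeric_dtype(df[column]):
        raise TypeError(f"column {column!r} is not numeric")
    # NaN compares False on both sides, so missing values are flagged as invalid too.
    valid = (df[column] > 0) & (df[column] < 1)
    return df[~valid]


# 1) The id must be absent from the column.
assert not metrics_df["id"].isin(["C50"]).any()

# 2) Collect the rows that break the "positive decimal fraction" rule.
bad_rows = invalid_fraction_rows(metrics_df, "metric_1")
print(bad_rows[["id", "metric_1"]])
```

In a test you could then write assert bad_rows.empty, which fails for the sample data above because A10 and A20 are negative. Whether 0 and 1 themselves should count as valid depends on your data; the sketch treats both as invalid, matching the strict comparisons in the assert above.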
