
Suppress a warning (python, pandas, colab notebook)

I'm trying to normalize a three-character country code column in a pandas df. I found a great package called country_converter, and I'm currently running its convert function on the country column in a very large dataframe. It's returning thousands of these warnings because there are NaN values present in the column.

WARNING:root:nan not found in ISO3

I'm looking for two things:

  1. to suppress the nan warnings specifically
  2. to speed up the processing time of this function. My thought is that suppressing the warnings should speed things up; however, if you have any suggestions for a different approach in my code, that would be great!

I've tried several variations of the message text, but nothing seems to work, so I think I'm missing something...

import country_converter as coco
import pandas as pd
import numpy as np
import warnings

warnings.filterwarnings("ignore", message= "nan not found in ISO3")
warnings.filterwarnings("ignore", message= "root:nan not found in ISO3")
warnings.filterwarnings("ignore", message= "WARNING:root:nan not found in ISO3")

test = pd.DataFrame({"code":[np.nan, 'XXX', 'USA', 'GBR', "GBR",'SWE/n', "123", "abs", "ABCC", "ABC", np.nan, np.nan]})


test['code_convert'] = test["code"].apply(lambda x: coco.convert(names=x, to='ISO3', not_found=np.nan))

I expected to see no more warnings for the NaN values.

I've adjusted the data in your dataframe so that the missing values are real np.nan values and not strings.

test = pd.DataFrame(
    {
        "code": [
            np.nan,
            "XXX",
            "USA",
            "GBR",
            "GBR",
            "SWE/n",
            "123",
            "abs",
            "ABCC",
            "ABC",
            np.nan,
            np.nan,
        ]
    }
)

print(test)

     code
0     NaN
1     XXX
2     USA
3     GBR
4     GBR
5   SWE/n
6     123
7     abs
8    ABCC
9     ABC
10    NaN
11    NaN

Then all you need to do is filter out the np.nan when doing your calculation.

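# Convert only the non-missing codes. On assignment pandas aligns on the
# index, so the rows that were skipped stay NaN in the new column.
test["code_convert"] = test["code"].dropna().apply(
    lambda x: coco.convert(names=x, to="ISO3")
)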
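On the speed part of the question, one option worth trying is to convert each distinct code only once and then map the results back onto the column, since a large dataframe usually repeats the same few hundred codes. The sketch below is only an illustration of that pattern: convert_code is a hypothetical stand-in (not part of country_converter) so the example runs without the package, and in real use its body would be the coco.convert(names=code, to="ISO3") call from the question.

```python
import numpy as np
import pandas as pd


def convert_code(code: str) -> str:
    """Hypothetical stand-in for coco.convert(names=code, to="ISO3")."""
    return code.strip().upper()


test = pd.DataFrame({"code": [np.nan, "usa", "GBR", "GBR", "usa", np.nan]})

# Convert each distinct, non-missing value exactly once.
unique_codes = test["code"].dropna().unique()
lookup = {code: convert_code(code) for code in unique_codes}

# Map the results back onto the full column. Values that are not keys in
# the dict (here, the NaN rows) come back as NaN.
test["code_convert"] = test["code"].map(lookup)

print(test)
```

With this layout the converter is called once per distinct code instead of once per row, and it is never called with NaN, so the "nan not found" message should not be triggered at all.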

I don't have country_converter installed, but the same NaN-skipping pattern can be checked with a simpler function in place of coco.convert:

test["code_convert"] = test["code"].dropna().apply(
    lambda x: x + "_solution"
)

print(test)

     code    code_convert
0     NaN             NaN
1     XXX    XXX_solution
2     USA    USA_solution
3     GBR    GBR_solution
4     GBR    GBR_solution
5   SWE/n  SWE/n_solution
6     123    123_solution
7     abs    abs_solution
8    ABCC   ABCC_solution
9     ABC    ABC_solution
10    NaN             NaN
11    NaN             NaN
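
As for why the warnings.filterwarnings calls in the question have no effect: the "WARNING:root:" prefix is the default output format of Python's logging module, and filterwarnings only affects messages raised through the warnings module. If the message needs to be silenced directly, a logging filter is one way to do it. The following is a minimal sketch, assuming the message really is logged on the root logger as the prefix suggests; DropNanNotFound is a made-up name for illustration, and the two logging.warning calls merely simulate what the library prints.

```python
import logging


class DropNanNotFound(logging.Filter):
    """Drop log records whose message starts with 'nan not found'."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False discards the record; everything else passes through.
        return not record.getMessage().startswith("nan not found")


# Assumption: the records are logged on the root logger ("WARNING:root:...").
logging.getLogger().addFilter(DropNanNotFound())

logging.warning("nan not found in ISO3")  # dropped by the filter
logging.warning("XXX not found in ISO3")  # still reported
```

A filter attached to a logger only sees records logged on that logger itself, so if the installed version of the package logs under its own logger name, the filter would need to go on that logger or on the handler instead. A blunter alternative is logging.getLogger().setLevel(logging.ERROR), which hides every other warning as well.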
