How to modify a function so that it returns 2 DataFrames depending on values in Python Pandas?
I have a function in Python Pandas like below:
def my_func(df, col: str):
    if pd.isna(df[col]):
        return False
To use my function I need:
df_resul = my_func(df = my_df, col = "col1")
And a DataFrame like below, where col1 is of string data type:
col1
--------
NaN
ABC
NaN
How can I modify my function so that it returns 2 different DataFrames? To use my function I would then need:
df_nan, df_not_nan = my_func(df = my_df, col = "col1")
where df_nan is the part of df where col1 is NaN, and df_not_nan is the part of df where col1 holds a value other than NaN.
df_nan:
col1
------
NaN
NaN
df_not_nan:
col1
-----
ABC
How can I modify my function in Python Pandas?
Use boolean indexing with ~ to invert the mask, here to select the rows with non-missing values:
print (my_df)
  col1  a
0  NaN  1
1  ABC  2
2  NaN  3

def my_func(df, col: str):
    m = df[col].isna()
    return df[m], df[~m]

df_nan, df_not_nan = my_func(df = my_df, col = "col1")
print (df_nan)
  col1  a
0  NaN  1
2  NaN  3

print (df_not_nan)
  col1  a
1  ABC  2
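For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The construction of my_df is an assumption made to match the printed frame above (the original post does not show how it was built), and the reset_index call at the end is an optional extra, not part of the answer.

```python
import numpy as np
import pandas as pd

# Assumed sample data, built to match the frame printed above.
my_df = pd.DataFrame({"col1": [np.nan, "ABC", np.nan], "a": [1, 2, 3]})


def my_func(df: pd.DataFrame, col: str):
    # Boolean mask: True for rows where `col` is missing.
    m = df[col].isna()
    # First frame: rows where the mask is True; second: the inverted mask.
    return df[m], df[~m]


df_nan, df_not_nan = my_func(df=my_df, col="col1")

# Both pieces keep the row labels of the original frame.
print(df_nan.index.tolist())      # [0, 2]
print(df_not_nan.index.tolist())  # [1]

# Optional: renumber the rows from 0 if the original labels are not needed.
df_not_nan_renumbered = df_not_nan.reset_index(drop=True)
print(df_not_nan_renumbered.index.tolist())  # [0]
```

Note that the two results together always cover every row of the input exactly once, because each row is either in the mask or in its inverse.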
If you need to test whether at least one missing value exists, it is necessary to add Series.any to avoid the error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
def my_func1(df, col: str):
    if pd.isna(df[col]).any():
        return 'exist at least one missing values'
    else:
        return 'no missing values'

out = my_func1(df = my_df, col = "col1")
print (out)
exist at least one missing values
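To see why the original function from the question fails, the following sketch reproduces the error and then reduces the mask to a single boolean. The sample frame and the variable names (mask, raised, has_missing, all_missing) are illustrative assumptions, not taken from the original post.

```python
import numpy as np
import pandas as pd

# Assumed sample data with the same col1 values as in the question.
my_df = pd.DataFrame({"col1": [np.nan, "ABC", np.nan]})

# pd.isna on a column returns a boolean Series, one value per row.
mask = pd.isna(my_df["col1"])

raised = False
try:
    # Using a multi-row Series directly in `if` is ambiguous and raises.
    if mask:
        pass
except ValueError as exc:
    raised = True
    print(type(exc).__name__)  # ValueError

# Reduce the Series to one boolean before branching on it.
has_missing = bool(mask.any())  # at least one missing value
all_missing = bool(mask.all())  # every value is missing
print(has_missing, all_missing)  # True False
```

Which reduction to use depends on the question being asked: any() for "is there a missing value at all", all() for "is the whole column missing".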