How to apply a UDF to a dataframe?
I am trying to create a function that cleans up any dataframe I pass to it. The function returns a cleaned-up df, but the original df is not replaced.
How can I run a UDF on a dataframe and keep the updated dataframe saved in place?
P.S. I know I can combine these rules into one line, but the function I am creating is a lot more complex, so I don't want to combine them for this example.
import pandas as pd

df = pd.DataFrame({'Key': ['3', '9', '9', '9', '9', '34', '34', '34'],
                   'LastFour': ['2290', '0087', 'M433', 'M433', '25', '25', '25', '25'],
                   'NUM': [20120528, 20120507, 20120615, 20120629, 20120621, 20120305, 20120506, 20120506]})

def cleaner(x):
    x = x[x['Key'] == '9']
    x = x[x['LastFour'] == 'M433']
    x = x[x['NUM'] == 20120615]
    return x

cleaner(df)
Result from the UDF:
  Key LastFour       NUM
2   9     M433  20120615
But if I display df after calling the function, I still get the original dataset:
  Key LastFour       NUM
0   3     2290  20120528
1   9     0087  20120507
2   9     M433  20120615
3   9     M433  20120629
4   9       25  20120621
5  34       25  20120305
6  34       25  20120506
7  34       25  20120506
You need to assign the result of cleaner(df) back to df, like so:
df = cleaner(df)
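The reason the reassignment is needed: inside cleaner, each `x = x[...]` rebinds the local name x to a new, filtered DataFrame, so the object the caller passed in is never touched. The sketch below illustrates that, and also shows one possible way to modify the caller's object directly. The helper `cleaner_inplace` is a hypothetical name introduced here for illustration, not part of the original post; it relies on `DataFrame.drop(..., inplace=True)`.

```python
import pandas as pd

df = pd.DataFrame({'Key': ['3', '9', '9', '9', '9', '34', '34', '34'],
                   'LastFour': ['2290', '0087', 'M433', 'M433', '25', '25', '25', '25'],
                   'NUM': [20120528, 20120507, 20120615, 20120629,
                           20120621, 20120305, 20120506, 20120506]})

def cleaner(x):
    # Each assignment rebinds the local name x to a new filtered DataFrame;
    # the object the caller passed in is never modified.
    x = x[x['Key'] == '9']
    x = x[x['LastFour'] == 'M433']
    x = x[x['NUM'] == 20120615]
    return x

result = cleaner(df)
rows_before = len(df)      # df still holds all of its original rows
rows_result = len(result)  # only the returned frame is filtered

def cleaner_inplace(x):
    # Hypothetical variant (an assumption, not from the original post):
    # drop the rows that fail the rules from the caller's own object.
    keep = (x['Key'] == '9') & (x['LastFour'] == 'M433') & (x['NUM'] == 20120615)
    x.drop(index=x[~keep].index, inplace=True)

df_copy = df.copy()        # work on a copy so df stays available for comparison
cleaner_inplace(df_copy)
rows_inplace = len(df_copy)  # df_copy itself was modified
```

Returning a new frame and reassigning it is usually the simpler choice, since the caller can see exactly where df changes.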
An alternative method is to use pd.DataFrame.pipe to pass your dataframe through a function:
df = df.pipe(cleaner)
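Since the real function is described as more complex, pipe can also chain several smaller steps, because it passes the dataframe as the first argument and forwards any extra positional or keyword arguments to the function. A minimal sketch, assuming the cleaning rules are split into the hypothetical helpers `keep_key` and `keep_last_four` (names invented here for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Key': ['3', '9', '9', '9', '9', '34', '34', '34'],
                   'LastFour': ['2290', '0087', 'M433', 'M433', '25', '25', '25', '25'],
                   'NUM': [20120528, 20120507, 20120615, 20120629,
                           20120621, 20120305, 20120506, 20120506]})

def keep_key(frame, key):
    # Hypothetical step: keep rows whose 'Key' equals the given value.
    return frame[frame['Key'] == key]

def keep_last_four(frame, last_four):
    # Hypothetical step: keep rows whose 'LastFour' equals the given value.
    return frame[frame['LastFour'] == last_four]

# Each pipe call receives the result of the previous one; the final
# result still has to be assigned back to df to replace the original.
df = (df.pipe(keep_key, '9')
        .pipe(keep_last_four, last_four='M433'))
```

As with the plain function call, nothing is changed in place; the chain returns a new dataframe that is assigned back to df.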