Python: how to return a DataFrame or a list from a function?
This question is very likely a duplicate, but I haven't found an answer yet. I'm trying to apply a function to a pandas DataFrame and I want to get a DataFrame back. The following example is reproducible:
from datetime import datetime
import pandas as pd

df = pd.DataFrame({'ID': ["1", "2"],
                   'Start': datetime.strptime('20160701', '%Y%m%d'),
                   'End': datetime.strptime('20170701', '%Y%m%d'),
                   'Value': [100, 200],
                   'CreditNote': [-20, -30]})
My function:
def act_value_calc(x):
    start_delta = (x.Start.replace(day=31, month=12) - x.Start).days
    full_delta = (x.End - x.Start).days
    result1 = round((x.Value + x.CreditNote) / full_delta * start_delta, 2)
    result2 = round((x.Value + x.CreditNote) - result1, 2)
    return pd.DataFrame({'r1': [result1], 'r2': [result2]})
Why can't I run the following code:
df.apply(act_value_calc, 1)
and what should be done to make it run? I mean, how do I get a DataFrame or a list back containing result1 and result2?
apply will return some value per row, or per column, depending on the axis argument you provide (I believe you understand this already, given that you are passing an axis argument of 1).
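To make the role of the axis argument concrete, here is a minimal sketch. The frame `demo` and its columns `a` and `b` are made up for this illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical frame, used only to illustrate the axis argument
demo = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# axis=1: the function receives one row at a time, so you get one value per row
row_sums = demo.apply(lambda row: row['a'] + row['b'], axis=1)

# axis=0 (the default): the function receives one column at a time,
# so you get one value per column
col_sums = demo.apply(lambda col: col.sum(), axis=0)

print(row_sums.tolist())  # [11, 22, 33]
print(col_sums.tolist())  # [6, 60]
```

In both cases the function returns a scalar, which is the case apply handles most predictably.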
Returning a DataFrame from apply is problematic. What you probably want to do is create a new column from the values returned by the function you are applying.
Something like:
def act_value_calc1(x):
    start_delta = (x.Start.replace(day=31, month=12) - x.Start).days
    full_delta = (x.End - x.Start).days
    result1 = round((x.Value + x.CreditNote) / full_delta * start_delta, 2)
    return result1

def act_value_calc2(x):
    # Relies on the result1 column, so it must be created before this runs
    result2 = round((x.Value + x.CreditNote) - x.result1, 2)
    return result2

df['result1'] = df.apply(act_value_calc1, axis=1)
df['result2'] = df.apply(act_value_calc2, axis=1)
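For comparison, the same two columns can probably be computed without apply at all, using column-wise arithmetic. This is a different technique from the one above, offered only as a sketch; the intermediate names `year_end`, `start_delta`, `full_delta` and `net` are chosen here for illustration.

```python
from datetime import datetime
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2'],
                   'Start': datetime.strptime('20160701', '%Y%m%d'),
                   'End': datetime.strptime('20170701', '%Y%m%d'),
                   'Value': [100, 200],
                   'CreditNote': [-20, -30]})

# 31 December of each row's start year, built from the year as a string
year_end = pd.to_datetime(df['Start'].dt.year.astype(str) + '-12-31')

start_delta = (year_end - df['Start']).dt.days   # days from Start to year end
full_delta = (df['End'] - df['Start']).dt.days   # days from Start to End
net = df['Value'] + df['CreditNote']

# Whole-column arithmetic instead of a per-row function call
df['result1'] = (net / full_delta * start_delta).round(2)
df['result2'] = (net - df['result1']).round(2)

print(df[['result1', 'result2']])
```

On a frame this small the difference is negligible, but column-wise operations tend to scale better than a per-row apply.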
You can make it easier for yourself by returning a pandas.Series instead of a pandas.DataFrame:
def act_value_calc(x):
    start_delta = (x.Start.replace(day=31, month=12) - x.Start).days
    full_delta = (x.End - x.Start).days
    result1 = round((x.Value + x.CreditNote) / full_delta * start_delta, 2)
    result2 = round((x.Value + x.CreditNote) - result1, 2)
    return pd.Series({'r1': result1, 'r2': result2})

print(df.apply(act_value_calc, 1))

      r1     r2
0  40.11  39.89
1  85.23  84.77
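A related option, sketched here under the assumption that your pandas version supports the `result_type` parameter of `DataFrame.apply`, is to return a plain tuple and let apply expand it into columns. The function name `act_value_pair` and the variables `expanded` and `out` are introduced only for this example.

```python
from datetime import datetime
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2'],
                   'Start': datetime.strptime('20160701', '%Y%m%d'),
                   'End': datetime.strptime('20170701', '%Y%m%d'),
                   'Value': [100, 200],
                   'CreditNote': [-20, -30]})

def act_value_pair(x):
    """Return (result1, result2) for one row as a plain tuple."""
    start_delta = (x.Start.replace(day=31, month=12) - x.Start).days
    full_delta = (x.End - x.Start).days
    net = x.Value + x.CreditNote
    result1 = round(net / full_delta * start_delta, 2)
    result2 = round(net - result1, 2)
    return result1, result2

# result_type='expand' turns each returned tuple into one column per element
expanded = df.apply(act_value_pair, axis=1, result_type='expand')
expanded.columns = ['r1', 'r2']

# Attach the new columns to the original frame (the indexes line up)
out = df.join(expanded)
print(out)
```

The join step is optional; `expanded` on its own is already a DataFrame with one row per input row.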
You can declare a global variable inside the function and create a data frame from it. Note that every call reassigns the global, so after apply has finished it holds only the values for the last row processed:

def act_value_calc(x):
    start_delta = (x.Start.replace(day=31, month=12) - x.Start).days
    full_delta = (x.End - x.Start).days
    result1 = round((x.Value + x.CreditNote) / full_delta * start_delta, 2)
    result2 = round((x.Value + x.CreditNote) - result1, 2)
    # Declare the global variable; it is named result_df here so the input df is not overwritten
    global result_df
    result_df = pd.DataFrame({'r1': [result1], 'r2': [result2]})
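If the goal is a plain list of results, or a DataFrame built from one, the values can also be collected without any global state. The sketch below iterates over the rows with `itertuples`; the function name `act_value_pair` and the variables `rows` and `result_df` are introduced only for this example.

```python
from datetime import datetime
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2'],
                   'Start': datetime.strptime('20160701', '%Y%m%d'),
                   'End': datetime.strptime('20170701', '%Y%m%d'),
                   'Value': [100, 200],
                   'CreditNote': [-20, -30]})

def act_value_pair(x):
    """Return (result1, result2) for one row; x exposes columns as attributes."""
    start_delta = (x.Start.replace(day=31, month=12) - x.Start).days
    full_delta = (x.End - x.Start).days
    net = x.Value + x.CreditNote
    result1 = round(net / full_delta * start_delta, 2)
    result2 = round(net - result1, 2)
    return result1, result2

# One tuple per row, collected into an ordinary Python list
rows = [act_value_pair(row) for row in df.itertuples(index=False)]

# Build a DataFrame from the list, reusing the original index
result_df = pd.DataFrame(rows, columns=['r1', 'r2'], index=df.index)
print(result_df)
```

Here `rows` is the list the question asks about, and `result_df` is the DataFrame form of the same data.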