I'm trying to use .apply() with a dataframe as one of the arguments:
df.apply(func, axis=1, args=(df))
When I do, I get the following error:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Here is the function:
def func(df):
    new_val = df.loc[ \
        (df["date"] == self.date + relativedelta(years=1)) & \
        (df["indicator"] == self.indicator), "val"]
    if (len(new_val) == 1):
        new_val = list(new_val)[0]  # Extract integer from series
        self["updated_val"] = new_val - self.val
Ok, the problem here is the combination of func and apply. By default (axis=0), the apply method of a dataframe applies the given function to each COLUMN in the data frame and returns the result; with axis=1, as in your call, it applies the function to each row instead. Either way, the function you pass to apply should expect a pandas Series or an array as input, not a dataframe. It should give either a series/array or a single value as output. For example, df.apply(sum) will apply the sum function to each column and give a series containing the result for each column (normally you would do df.sum() for this but I'm just using it to illustrate the point).
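To make the column/row distinction concrete, here is a minimal sketch. The small dataframe and its column names are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical example data, not the asker's dataframe.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Default axis=0: sum receives each column as a Series.
col_sums = df.apply(sum)

# axis=1: sum receives each row as a Series.
row_sums = df.apply(sum, axis=1)

print(col_sums.tolist())  # [6, 60]
print(row_sums.tolist())  # [11, 22, 33]
```

In both cases the function is handed a Series, never the whole dataframe.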
Secondly, the args parameter in apply is only used when the function you are passing takes additional arguments (besides the series, which should be the first argument). For example, you might have a function that sums an array and then divides by some number (again a silly example):
def sum_div(array, divisor):
    return sum(array) / divisor
Then you might want to apply this to each column of a dataframe with divisor = 2. You would do
df.apply(sum_div, args=(2,))
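As a runnable sketch of the same idea (the dataframe is again invented for illustration): pandas documents args as a tuple, so the extra argument is written as (2,) with a trailing comma. Extra keyword arguments to apply are also forwarded to the function, which avoids the tuple altogether.

```python
import pandas as pd

def sum_div(array, divisor):
    # Sum the values, then divide by the extra argument.
    return sum(array) / divisor

# Hypothetical example data.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Positional extra argument: note the trailing comma, since (2) is just 2.
result = df.apply(sum_div, args=(2,))

# Same call with the extra argument passed by keyword.
result_kw = df.apply(sum_div, divisor=2)

print(result.tolist())  # [3.0, 30.0]
```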
I'm not sure what you want to do here. Is it just func(df)?
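If the goal is instead a row-wise calculation that looks values up in the whole dataframe, one possible reading is sketched below. This is a guess at the intent, not the asker's actual code: the names val_change and lookup and the sample data are hypothetical. Note also that (df) in the original call is simply df, not a tuple; a one-element tuple needs a trailing comma, (df,), and passing a bare dataframe as args may be what triggers the ValueError in some pandas versions.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

def val_change(row, lookup):
    """Next year's val for the same indicator minus this row's val.

    Returns NaN when there is not exactly one matching row.
    """
    match = lookup.loc[
        (lookup["date"] == row["date"] + relativedelta(years=1))
        & (lookup["indicator"] == row["indicator"]),
        "val",
    ]
    if len(match) == 1:
        return match.iloc[0] - row["val"]
    return float("nan")

# Hypothetical sample data with the column names used in the question.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2021-01-01", "2020-01-01"]),
    "indicator": ["x", "x", "y"],
    "val": [1, 4, 10],
})

# axis=1 passes each row as the first argument; df is the extra argument.
df["updated_val"] = df.apply(val_change, axis=1, args=(df,))
print(df)
```

The function returns a value per row rather than assigning into the row, so apply can collect the results into a new column.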