Run Python function over two DataFrame columns
I've run into a problem that I think should be simple. I have a function that I want to apply to two columns of my DataFrame, but I get an error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
To show you what I am trying to do:
import numpy as np

# Calculate the accuracy
def mape(actual, pred):
    if actual == 0:
        if pred == 0:
            return 0
        else:
            return 100
    else:
        return np.mean(np.abs((actual - pred) / actual)) * 100
Then I try to apply it to two columns (called Actuals_March and Forecast_March).
# This line runs into the ValueError above.
# I removed all NaN values before running this.
df['MAPE_Mar'] = df.apply(lambda x: mape(df.Actuals_March , df.Forecast_March), axis=1)
# This is a snapshot of my data:
df.Actuals_March df.Forecast_March
0.0 0.0
0.0 0.0
0.0 0.0
4.0 0.0
0.0 0.0
5.0 0.0
20.0 0.0
0.0 0.0
2.0 0.0
13.0 0.0
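I hope you can help me. Thanks in advance.
Replace df with x inside the lambda, so that mape receives the scalar values of each row instead of the whole columns: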
df['MAPE_Mar'] = df.apply(lambda x: mape(x.Actuals_March , x.Forecast_March), axis=1)
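The reason for the error can be reproduced on a small frame: when whole columns are passed in, `actual == 0` evaluates to a boolean Series, and `if` cannot reduce that to a single True or False. The following is only a minimal sketch; the four rows are made-up values for illustration, not the asker's data, and `mape` is a condensed version of the function from the question.

```python
import numpy as np
import pandas as pd

def mape(actual, pred):
    # Scalar version, condensed from the function in the question.
    if actual == 0:
        return 0 if pred == 0 else 100
    return np.abs((actual - pred) / actual) * 100

# Hypothetical sample data, chosen to hit every branch of mape.
df = pd.DataFrame({
    "Actuals_March": [0.0, 4.0, 0.0, 20.0],
    "Forecast_March": [0.0, 0.0, 3.0, 15.0],
})

# Passing whole columns: `actual == 0` is a boolean Series, so `if` raises.
try:
    mape(df.Actuals_March, df.Forecast_March)
    raised = False
except ValueError:
    raised = True

# Passing the row `x`: each attribute is a single float, so `if` works.
row_wise = df.apply(lambda x: mape(x.Actuals_March, x.Forecast_March), axis=1)

print(raised)
print(row_wise.tolist())
```

With axis=1, `x` is one row at a time, so `x.Actuals_March` and `x.Forecast_March` are plain numbers and the comparisons inside `mape` behave as intended.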
A vectorized alternative:
m1 = df['Actuals_March'] == 0
m2 = df['Forecast_March'] == 0
s = (np.abs(df['Actuals_March'] - df['Forecast_March']) / df['Actuals_March']) * 100
# m1 & m2: both are zero -> 0; m1 & ~m2: actual is zero but forecast is not -> 100
df['MAPE_Mar1'] = np.select([m1 & m2, m1 & ~m2], [0, 100], s)
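As a sanity check, a vectorized version can be compared against the row-wise function on a small frame. This is a sketch under stated assumptions: the sample values are made up, `mape` is a condensed version of the function from the question, and the 100 case is taken to apply where the actual is zero but the forecast is not, which is what `mape` does.

```python
import numpy as np
import pandas as pd

def mape(actual, pred):
    # Scalar reference, condensed from the function in the question.
    if actual == 0:
        return 0 if pred == 0 else 100
    return np.abs((actual - pred) / actual) * 100

# Hypothetical sample data, chosen to hit every branch of mape.
df = pd.DataFrame({
    "Actuals_March": [0.0, 4.0, 0.0, 20.0],
    "Forecast_March": [0.0, 0.0, 3.0, 15.0],
})

m1 = df["Actuals_March"] == 0
m2 = df["Forecast_March"] == 0

# Rows with a zero actual yield inf or NaN here; np.select overrides them.
s = np.abs(df["Actuals_March"] - df["Forecast_March"]) / df["Actuals_March"] * 100

# First matching condition wins; rows matching neither fall back to s.
vectorized = np.select([m1 & m2, m1 & ~m2], [0, 100], default=s)

row_wise = df.apply(lambda x: mape(x.Actuals_March, x.Forecast_March), axis=1)
print(np.allclose(vectorized, row_wise.to_numpy()))
```

The vectorized form avoids calling a Python function once per row, which tends to matter on large frames.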