How to apply a function to all the columns of a DataFrame and get the output as a DataFrame in Python
I have two DataFrames:
xx
AVERAGE_CALL_DURATION | AVERAGE_DURATION | CHANGE_OF_DETAILS |
---|---|---|
267 | 298 | 0 |
421 | 609.33 | 0.33 |
330 | 334 | 0 |
240.5 | 666.5 | 0 |
628 | 713 | 0 |
and
NoC_c
AVERAGE_CALL_DURATION | AVERAGE_DURATION | CHANGE_OF_DETAILS |
---|---|---|
-5.93 | -4.95 | 0.90 |
593.50 | 595.70 | 1.00 |
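For anyone who wants to reproduce the problem, the two tables can be built as follows. This is only a sketch: it assumes each row of xx holds exactly three values, and that the first row of NoC_c is the lower bound and the second row the upper bound of each column.

```python
import pandas as pd

# Sample data copied from the tables above (assumed: three values per row).
xx = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [267.0, 421.0, 330.0, 240.5, 628.0],
    "AVERAGE_DURATION": [298.0, 609.33, 334.0, 666.5, 713.0],
    "CHANGE_OF_DETAILS": [0.0, 0.33, 0.0, 0.0, 0.0],
})

# Row 0 is read as the lower bound, row 1 as the upper bound (assumption).
NoC_c = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [-5.93, 593.50],
    "AVERAGE_DURATION": [-4.95, 595.70],
    "CHANGE_OF_DETAILS": [0.90, 1.00],
})

print(xx)
print(NoC_c)
```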
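If a value in an xx column falls inside the range given in NoC_c (where the column names match), I want to return 1.
I can do this for one column: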
def check_between_ranges(xx, NoC_c):
    ranges = NoC_c['AVERAGE_CALL_DURATION']
    if (xx['AVERAGE_CALL_DURATION'] >= ranges.iloc[0]) and (xx['AVERAGE_CALL_DURATION'] <= ranges.iloc[1]):
        return 1
    return xx['AVERAGE_CALL_DURATION']

xx['AVERAGE_CALL_DURATION2'] = xx.apply(lambda x: check_between_ranges(x, NoC_c), axis=1)
However, I need to get rid of the manually specified column name, because the real DataFrames contain many more columns.
I tried:
a = NoC_c.columns

def check_between_ranges(xx, NoC_c):
    ranges = NoC_c[a]
    if (xx[a] >= ranges.iloc[0]) & (xx[a] <= ranges.iloc[1]):
        return 1

xx.apply(lambda x: check_between_ranges(x, NoC_c[a]), axis=1)
However, I get the error ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I tried the solution listed here, without success.
I also read this about the specific error, but it did not help with my problem.
Any help would be appreciated.
Traceback (most recent call last):
  File "<ipython-input-11-2affca771555>", line 10, in <module>
    xx.apply(lambda x: check_between_ranges(x, NoC_c[a]), axis=1)
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\frame.py", line 7552, in apply
    return op.get_result()
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\apply.py", line 185, in get_result
    return self.apply_standard()
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\apply.py", line 276, in apply_standard
    results, res_index = self.apply_series_generator()
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\apply.py", line 305, in apply_series_generator
    results[i] = self.f(v)
  File "<ipython-input-11-2affca771555>", line 10, in <lambda>
    xx.apply(lambda x: check_between_ranges(x, NoC_c[a]), axis=1)
  File "<ipython-input-11-2affca771555>", line 6, in check_between_ranges
    if (xx[a] >= ranges.iloc[0]) & (xx[a] <= ranges.iloc[1]):
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1330, in __nonzero__
    f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
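The traceback points at the `if` line: comparing several columns at once yields a boolean Series (one entry per column), and `if` needs a single True or False. A minimal reproduction, using a made-up two-column row and lower bound taken from the sample values, might look like this:

```python
import pandas as pd

# Illustrative values only: one row of xx and the lower bounds from NoC_c.
row = pd.Series({"AVERAGE_CALL_DURATION": 267.0, "AVERAGE_DURATION": 298.0})
lower = pd.Series({"AVERAGE_CALL_DURATION": -5.93, "AVERAGE_DURATION": -4.95})

mask = row >= lower  # element-wise comparison: a boolean Series, not a bool

try:
    if mask:  # pandas refuses to pick a single truth value here
        pass
    raised = False
    message = ""
except ValueError as exc:
    raised = True
    message = str(exc)

print(raised, message)
```

Reducing the Series explicitly, for example with `mask.all()` or `mask.any()`, gives `if` the single boolean it expects.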
You almost have the solution. Try adding .all() (documentation here):
def check_between_ranges(xx, NoC_c):
    ranges = NoC_c[a]
    if (xx[a] >= ranges.iloc[0]).all() & (xx[a] <= ranges.iloc[1]).all():
        return 1
Does this work for you?
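To see what `.all()` does here, the sketch below runs a row-wise check on the sample data. The helper name `row_in_all_ranges` and its `bounds` parameter are hypothetical, and it returns 0 instead of None for rows that fail, which is a small change from the snippet above.

```python
import pandas as pd

xx = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [267.0, 421.0, 330.0, 240.5, 628.0],
    "AVERAGE_DURATION": [298.0, 609.33, 334.0, 666.5, 713.0],
    "CHANGE_OF_DETAILS": [0.0, 0.33, 0.0, 0.0, 0.0],
})
NoC_c = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [-5.93, 593.50],
    "AVERAGE_DURATION": [-4.95, 595.70],
    "CHANGE_OF_DETAILS": [0.90, 1.00],
})

def row_in_all_ranges(row, bounds):
    """Hypothetical helper: 1 if every column of `row` is within its bounds, else 0."""
    lower, upper = bounds.iloc[0], bounds.iloc[1]
    inside = (row >= lower) & (row <= upper)  # boolean Series, one entry per column
    return int(inside.all())                  # collapse to a single value per row

# Extra keyword arguments to DataFrame.apply are forwarded to the function.
flags = xx.apply(row_in_all_ranges, axis=1, bounds=NoC_c)
print(flags)
```

Note that `.all()` collapses each row to one value, so the output is a single flag per row and not one per column. On this sample every flag is 0, because CHANGE_OF_DETAILS never lies between 0.90 and 1.00.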
Comparison function
def check_between_ranges(x):
    v = []
    for c in x.index:
        if (x[c] >= NoC_c.at[0,c]) & (x[c] <= NoC_c.at[1,c]):
            v += [1]
        else:
            v += [x[c]]
    return pd.Series(v, index=x.index)
Execution
xx.apply(check_between_ranges, axis=1)
Result
AVERAGE_CALL_DURATION AVERAGE_DURATION CHANGE_OF_DETAILS
0 1.0 1.00 0.00
1 1.0 609.33 0.33
2 1.0 1.00 0.00
3 1.0 666.50 0.00
4 628.0 713.00 0.00
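The same result can probably be obtained without a Python-level loop. The sketch below is an alternative to the answer above, not part of it: it compares the whole frame against the bounds at once and uses `DataFrame.mask` to write 1 where a value is in range. It assumes row 0 of NoC_c is the lower bound and row 1 the upper bound, and that both frames share the same column names.

```python
import pandas as pd

xx = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [267.0, 421.0, 330.0, 240.5, 628.0],
    "AVERAGE_DURATION": [298.0, 609.33, 334.0, 666.5, 713.0],
    "CHANGE_OF_DETAILS": [0.0, 0.33, 0.0, 0.0, 0.0],
})
NoC_c = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [-5.93, 593.50],
    "AVERAGE_DURATION": [-4.95, 595.70],
    "CHANGE_OF_DETAILS": [0.90, 1.00],
})

lower, upper = NoC_c.iloc[0], NoC_c.iloc[1]

# ge/le align the bound Series with the columns of xx by name.
in_range = xx.ge(lower) & xx.le(upper)

# Replace in-range values with 1 and keep everything else unchanged.
result = xx.mask(in_range, 1)
print(result)
```

Because the comparison is aligned on column names, this scales to any number of columns without naming them.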