
Apply a function to all columns of a data frame in Python

I have two DataFrames:

xx
AVERAGE_CALL_DURATION  AVERAGE_DURATION  CHANGE_OF_DETAILS
267                    298               0
421                    609.33            0.33
330                    334               0
240.5                  666.5             0
628                    713               0

NoC_c
AVERAGE_CALL_DURATION AVERAGE_DURATION CHANGE_OF_DETAILS
-5.93 -4.95 0.90
593.50 595.70 1.00

I want to return 1 wherever a value in a column of xx falls within the range given in NoC_c for the column of the same name.
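For anyone who wants to reproduce the problem, the two frames can be rebuilt roughly as follows. This is only a sketch: it assumes each xx row holds three values, that both frames use the default integer index, and that row 0 of NoC_c is the lower bound and row 1 the upper bound.

```python
import pandas as pd

# Values copied from the xx table; the default 0..4 index is an assumption.
xx = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [267, 421, 330, 240.5, 628],
    "AVERAGE_DURATION": [298, 609.33, 334, 666.5, 713],
    "CHANGE_OF_DETAILS": [0, 0.33, 0, 0, 0],
})

# Row 0 is read as the lower bound and row 1 as the upper bound of each column.
NoC_c = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [-5.93, 593.50],
    "AVERAGE_DURATION": [-4.95, 595.70],
    "CHANGE_OF_DETAILS": [0.90, 1.00],
})
```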

I can do this for one column:

def check_between_ranges(xx, NoC_c):
    ranges = NoC_c['AVERAGE_CALL_DURATION']
    
    if (xx['AVERAGE_CALL_DURATION'] >= ranges.iloc[0]) and (xx['AVERAGE_CALL_DURATION'] <= ranges.iloc[1]):
        return 1
    return xx['AVERAGE_CALL_DURATION']

xx['AVERAGE_CALL_DURATION2'] = xx.apply(lambda x: check_between_ranges(x, NoC_c), axis=1)
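As a side note, the single-column case can probably also be written without a row-wise apply. The sketch below uses Series.between and Series.mask; the names col, low, high and single_col_result are only illustrative, and the data is a cut-down copy of the frames above.

```python
import pandas as pd

xx = pd.DataFrame({"AVERAGE_CALL_DURATION": [267, 421, 330, 240.5, 628]})
NoC_c = pd.DataFrame({"AVERAGE_CALL_DURATION": [-5.93, 593.50]})

col = "AVERAGE_CALL_DURATION"
low, high = NoC_c[col].iloc[0], NoC_c[col].iloc[1]

# between() is inclusive on both ends by default, matching the >= and <= checks.
in_range = xx[col].between(low, high)

# Put 1 where the value is in range and keep the original value elsewhere.
single_col_result = xx[col].mask(in_range, 1)
```

This keeps the same behaviour as the function above (1 when in range, the original value otherwise) but works on the whole column at once.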

However, I need to get rid of the manually specified column name, because the actual DataFrames contain many more columns.

I tried:

a = NoC_c.columns

def check_between_ranges(xx, NoC_c):
    ranges = NoC_c[a]
    
    if (xx[a] >= ranges.iloc[0]) & (xx[a] <= ranges.iloc[1]):
        return 1

xx.apply(lambda x: check_between_ranges(x, NoC_c[a]), axis=1)

However, I get the error ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I tried the solutions listed here, but without success.

I also read this to resolve the specific error, but it did not help with my problem.

Any help would be appreciated.

Traceback (most recent call last):

  File "<ipython-input-11-2affca771555>", line 10, in <module>
    xx.apply(lambda x: check_between_ranges(x, NoC_c[a]), axis=1)

  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\frame.py", line 7552, in apply
    return op.get_result()

  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\apply.py", line 185, in get_result
    return self.apply_standard()

  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\apply.py", line 276, in apply_standard
    results, res_index = self.apply_series_generator()

  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\apply.py", line 305, in apply_series_generator
    results[i] = self.f(v)

  File "<ipython-input-11-2affca771555>", line 10, in <lambda>
    xx.apply(lambda x: check_between_ranges(x, NoC_c[a]), axis=1)

  File "<ipython-input-11-2affca771555>", line 6, in check_between_ranges
    if (xx[a] >= ranges.iloc[0]) & (xx[a] <= ranges.iloc[1]):

  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1330, in __nonzero__
    f"The truth value of a {type(self).__name__} is ambiguous. "

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

You almost have the solution. Try adding .all() (documentation here):

def check_between_ranges(xx, NoC_c):
    ranges = NoC_c[a]
    
    if (xx[a] >= ranges.iloc[0]).all() & (xx[a] <= ranges.iloc[1]).all():
        return 1

Does this work for you?
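To illustrate where the error comes from, here is a small sketch using the second row of xx. Comparing a whole row against the bounds yields one boolean per column, and Python's if cannot turn that into a single True or False; .all() and .any() each reduce it to one value. The variable names are illustrative only.

```python
import pandas as pd

cols = ["AVERAGE_CALL_DURATION", "AVERAGE_DURATION", "CHANGE_OF_DETAILS"]
row = pd.Series([421.0, 609.33, 0.33], index=cols)   # second row of xx
low = pd.Series([-5.93, -4.95, 0.90], index=cols)    # NoC_c row 0
high = pd.Series([593.50, 595.70, 1.00], index=cols) # NoC_c row 1

# One boolean per column, not a single truth value.
per_column = (row >= low) & (row <= high)

try:
    if per_column:  # this is the ambiguous truth test from the traceback
        pass
    raised = False
except ValueError:
    raised = True

all_in_range = bool(per_column.all())  # True only if every column is in range
any_in_range = bool(per_column.any())  # True if at least one column is in range
```

Note that with .all() the function gives one answer per row (1 only when every column is in range, otherwise None), which is not the same as replacing values column by column.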

Comparison function

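import pandas as pd

def check_between_ranges(x):
    # x is one row of xx; its index holds the column names.
    v = []

    for c in x.index:
        # Row 0 of NoC_c is the lower bound, row 1 the upper bound for column c.
        if NoC_c.at[0, c] <= x[c] <= NoC_c.at[1, c]:
            v.append(1)
        else:
            v.append(x[c])

    return pd.Series(v, index=x.index)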

Execution

xx.apply(check_between_ranges, axis=1)

Result

   AVERAGE_CALL_DURATION  AVERAGE_DURATION  CHANGE_OF_DETAILS
0                    1.0              1.00               0.00
1                    1.0            609.33               0.33
2                    1.0              1.00               0.00
3                    1.0            666.50               0.00
4                  628.0            713.00               0.00
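A fully vectorized variant may be worth considering when the frames have many columns. The sketch below is not part of the answer above; it assumes xx and NoC_c share the same column names and that row 0 and row 1 of NoC_c are the lower and upper bounds. It compares every column at once with DataFrame.ge / DataFrame.le and replaces in-range values with DataFrame.mask.

```python
import pandas as pd

xx = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [267, 421, 330, 240.5, 628],
    "AVERAGE_DURATION": [298, 609.33, 334, 666.5, 713],
    "CHANGE_OF_DETAILS": [0, 0.33, 0, 0, 0],
})
NoC_c = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [-5.93, 593.50],
    "AVERAGE_DURATION": [-4.95, 595.70],
    "CHANGE_OF_DETAILS": [0.90, 1.00],
})

# Each bound is a Series indexed by column name, so it aligns with xx's columns.
low, high = NoC_c.iloc[0], NoC_c.iloc[1]

# Boolean frame: True where a value lies inside its column's inclusive range.
in_range = xx.ge(low, axis="columns") & xx.le(high, axis="columns")

# Replace in-range values with 1 and leave everything else untouched.
result = xx.mask(in_range, 1)
```

No column name is hard-coded here, so the same three lines should work however many columns the real frames have.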

