I have two DataFrames:
xx
AVERAGE_CALL_DURATION | AVERAGE_DURATION | CHANGE_OF_DETAILS |
---|---|---|
267 | 298 | 0 |
421 | 609.33 | 0.33 |
330 | 334 | 0 |
240.5 | 666.5 | 0 |
628 | 713 | 0 |
and
NoC_c
AVERAGE_CALL_DURATION | AVERAGE_DURATION | CHANGE_OF_DETAILS |
---|---|---|
-5.93 | -4.95 | 0.90 |
593.50 | 595.70 | 1.00 |
I want to return 1 wherever a value in xx falls within the range given in NoC_c for the column of the same name (the first row of NoC_c is the lower bound, the second row the upper bound).
I can do this for one column:
def check_between_ranges(xx, NoC_c):
    ranges = NoC_c['AVERAGE_CALL_DURATION']
    if (xx['AVERAGE_CALL_DURATION'] >= ranges.iloc[0]) and (xx['AVERAGE_CALL_DURATION'] <= ranges.iloc[1]):
        return 1
    return xx['AVERAGE_CALL_DURATION']

xx['AVERAGE_CALL_DURATION2'] = xx.apply(lambda x: check_between_ranges(x, NoC_c), axis=1)
However, I need to avoid specifying the column name manually, because the actual DataFrames contain many more columns.
I have tried:
a = NoC_c.columns

def check_between_ranges(xx, NoC_c):
    ranges = NoC_c[a]
    if (xx[a] >= ranges.iloc[0]) & (xx[a] <= ranges.iloc[1]):
        return 1

xx.apply(lambda x: check_between_ranges(x, NoC_c[a]), axis=1)
However, I get the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I tried the solutions listed here, but they were unsuccessful.
I also read this to address the specific error, but it did not help with my issue.
Any help would be appreciated.
Traceback (most recent call last):
  File "<ipython-input-11-2affca771555>", line 10, in <module>
    xx.apply(lambda x: check_between_ranges(x, NoC_c[a]), axis=1)
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\frame.py", line 7552, in apply
    return op.get_result()
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\apply.py", line 185, in get_result
    return self.apply_standard()
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\apply.py", line 276, in apply_standard
    results, res_index = self.apply_series_generator()
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\apply.py", line 305, in apply_series_generator
    results[i] = self.f(v)
  File "<ipython-input-11-2affca771555>", line 10, in <lambda>
    xx.apply(lambda x: check_between_ranges(x, NoC_c[a]), axis=1)
  File "<ipython-input-11-2affca771555>", line 6, in check_between_ranges
    if (xx[a] >= ranges.iloc[0]) & (xx[a] <= ranges.iloc[1]):
  File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1330, in __nonzero__
    f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
You almost have the solution. Try adding .all() (docs here):
def check_between_ranges(xx, NoC_c):
    ranges = NoC_c[a]
    if (xx[a] >= ranges.iloc[0]).all() & (xx[a] <= ranges.iloc[1]).all():
        return 1
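To make the cause of the error concrete, here is a minimal sketch using a two-column slice of the sample data. The names row, lower, upper, in_range, raised and all_in_range are made up for this illustration. Comparing a row against the bounds yields one boolean per column, and Python's if cannot collapse that into a single True or False on its own:

```python
import pandas as pd

# Hypothetical two-column slice of the sample data, for illustration only.
row = pd.Series({"AVERAGE_CALL_DURATION": 267.0, "AVERAGE_DURATION": 298.0})
lower = pd.Series({"AVERAGE_CALL_DURATION": -5.93, "AVERAGE_DURATION": -4.95})
upper = pd.Series({"AVERAGE_CALL_DURATION": 593.50, "AVERAGE_DURATION": 595.70})

# Element-wise comparison: the result is a boolean Series, one entry per column.
in_range = (row >= lower) & (row <= upper)

# Using that Series directly in `if` raises the ValueError from the question.
try:
    if in_range:
        pass
    raised = False
except ValueError:
    raised = True

# .all() reduces the Series to one boolean: True only if every column is in range.
all_in_range = bool(in_range.all())
```

Note that with .all() the function returns 1 only when every column of a row is inside its range, and None otherwise. If the goal is a per-column result, as in the single-column version from the question, each column has to be checked separately.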
Would this work for you?
Comparison Function
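import pandas as pd

def check_between_ranges(x):
    # x is one row of xx; x.index holds the column names
    v = []
    for c in x.index:
        # row 0 of NoC_c is the lower bound, row 1 the upper bound
        if (x[c] >= NoC_c.at[0, c]) & (x[c] <= NoC_c.at[1, c]):
            v.append(1)
        else:
            v.append(x[c])
    return pd.Series(v, index=x.index)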
Execution
xx.apply(check_between_ranges, axis=1)
Result
   AVERAGE_CALL_DURATION  AVERAGE_DURATION  CHANGE_OF_DETAILS
0                    1.0              1.00               0.00
1                    1.0            609.33               0.33
2                    1.0              1.00               0.00
3                    1.0            666.50               0.00
4                  628.0            713.00               0.00
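As a possible alternative to the row-wise apply, the same per-column replacement can be sketched without a Python loop. This is not taken from the answers above. It assumes that every column of NoC_c also exists in xx, and that the first row of NoC_c holds the lower bounds and the second the upper bounds. The names lower, upper, cols, in_range and result are chosen for this example:

```python
import pandas as pd

# Sample data from the question.
xx = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [267, 421, 330, 240.5, 628],
    "AVERAGE_DURATION": [298, 609.33, 334, 666.5, 713],
    "CHANGE_OF_DETAILS": [0, 0.33, 0, 0, 0],
})
NoC_c = pd.DataFrame({
    "AVERAGE_CALL_DURATION": [-5.93, 593.50],
    "AVERAGE_DURATION": [-4.95, 595.70],
    "CHANGE_OF_DETAILS": [0.90, 1.00],
})

# Assumption: row 0 holds the lower bounds, row 1 the upper bounds.
lower = NoC_c.iloc[0]
upper = NoC_c.iloc[1]
cols = NoC_c.columns  # assumed to be present in xx as well

# ge/le align the bound Series with the DataFrame's columns by label,
# giving a boolean DataFrame with the same shape as xx[cols].
in_range = xx[cols].ge(lower) & xx[cols].le(upper)

# mask replaces values where the condition is True and keeps the rest.
result = xx.copy()
result[cols] = xx[cols].mask(in_range, 1)
```

On the sample data, result should match the table shown above. Because lower and upper are taken with iloc, this sketch does not depend on NoC_c having the index labels 0 and 1.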