Assign values to a dataframe by treating two columns of a different dataframe as a range
The following code explains the scenario. I have a dataframe (df_ticker) with 3 columns:
import pandas as pd
df_ticker = pd.DataFrame({'Min_val': [22382.729, 36919.205, 46735.164, 62247.61],
                          'Max_val': [36901.758, 46716.06, 62045.06, 182727.05],
                          'Ticker': ['$', '$$', '$$$', '$$$$']})
df_ticker
My second dataframe contains 2 columns:
df_values = pd.DataFrame({'Id': [1, 2, 3, 4, 5, 6],
                          'sal_val': [3098, 45639.987, 65487.4, 56784.8, 8, 736455]})
df_values
For every value in df_values['sal_val'], I want to check which range it falls into between df_ticker['Min_val'] and df_ticker['Max_val'], and assign the matching df_ticker['Ticker'] accordingly.
Sample output would be something like this: [image: sample_output]
In the sample output, sal_val=3098 is less than or equal to Max_val=36901.758 in the first row, so it was assigned Ticker=$. It is below that row's Min_val=22382.729, so values under the lowest range fall into the first ticker.
I tried the following:
df_values['ticker']=df_ticker.\
loc[((df_values['sal_val']>=df_ticker['Min_val'])| (df_values['sal_val']<=df_ticker['Max_val']))]['Ticker']
df_values
It failed with the error "ValueError: Can only compare identically-labeled Series objects".
Any solutions for this issue?
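For context on the error quoted in the question, here is a minimal sketch that reproduces it. It assumes the cause is the element-wise comparison between two Series whose indexes differ (6 rows of sal_val against 4 rows of Min_val); the variable names sal, min_val and error_message are illustrative only.

```python
import pandas as pd

# Same values as in the question, reduced to the two Series being compared.
sal = pd.Series([3098, 45639.987, 65487.4, 56784.8, 8, 736455])   # 6 rows
min_val = pd.Series([22382.729, 36919.205, 46735.164, 62247.61])  # 4 rows

try:
    # Comparison operators work label by label, so the indexes must match.
    sal >= min_val
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The comparison is attempted row against row rather than each sal_val against every range, which is why a per-value lookup is needed instead.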
One way is to define a custom mapping function and use pd.Series.apply.
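def mapper(x, t):
    # Below the smallest Min_val: use the first ticker.
    if x < t['Min_val'].min():
        index = 0
    # At or above the largest Max_val: use the last ticker.
    elif x >= t['Max_val'].max():
        index = -1
    else:
        # First row whose [Min_val, Max_val) range contains x, or None if x falls in a gap.
        index = next((idx for idx, (i, j) in enumerate(zip(t['Min_val'], t['Max_val']))
                      if i <= x < j), None)
    return t['Ticker'].iloc[index] if index is not None else None

df_values['Ticker'] = df_values['sal_val'].apply(mapper, t=df_ticker)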
Result
   Id     sal_val Ticker
0   1    3098.000      $
1   2   45639.987     $$
2   3   65487.400   $$$$
3   4   56784.800    $$$
4   5       8.000      $
5   6  736455.000   $$$$
Explanation
pd.Series.apply accepts a custom mapping function as an input.
The mapping function takes each entry in sal_val and compares it to values in df_ticker via an if / else structure.
The first two branches (if and elif) deal with the minimum and maximum boundaries.
The final else statement uses a generator, which cycles through each row in df_ticker and finds the index of the row where the input is within the range of Min_val and Max_val.
Finally, we take that index and feed it into df_ticker['Ticker'] via the .iloc integer accessor.
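As a possible vectorised alternative to the row-by-row mapper, the lookup can be sketched with numpy.searchsorted. This is not part of the original answer and rests on assumptions: Max_val is sorted in ascending order, a value equal to a Max_val belongs to that row, and a value that falls in a gap between two ranges is assigned to the next range up (the mapper returns None in that case). The name pos is illustrative.

```python
import numpy as np
import pandas as pd

df_ticker = pd.DataFrame({
    'Min_val': [22382.729, 36919.205, 46735.164, 62247.61],
    'Max_val': [36901.758, 46716.06, 62045.06, 182727.05],
    'Ticker': ['$', '$$', '$$$', '$$$$'],
})
df_values = pd.DataFrame({
    'Id': [1, 2, 3, 4, 5, 6],
    'sal_val': [3098, 45639.987, 65487.4, 56784.8, 8, 736455],
})

# Position of the first Max_val that is >= each sal_val (assumes Max_val is sorted).
pos = np.searchsorted(df_ticker['Max_val'].to_numpy(),
                      df_values['sal_val'].to_numpy(), side='left')

# Values above the largest Max_val would point past the last row; clamp them to it.
pos = np.clip(pos, 0, len(df_ticker) - 1)

# Positional lookup of the ticker for every row at once.
df_values['Ticker'] = df_ticker['Ticker'].to_numpy()[pos]
print(df_values)
```

For the sample data this yields the same tickers as the Result shown above. The trade-off is that Min_val is never consulted, so gap values are not flagged.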