Find column value based on another dataframe
I have a sample dataframe:
value id
a 1
b 2
c 5
d 8
e 11
and another dataframe:
entity start_range end_range
ABC 1 3
DEF 4 7
XYZ 8 15
How can I get the entity for each row of dataframe1, based on the range its id falls in, so that the result looks like the below?
value id entity
a 1 ABC
b 2 ABC
c 5 DEF
d 8 XYZ
e 11 XYZ
It's not a clean answer, and I don't know if there is a better way to do this, but try this; it should work:
import pandas as pd

data=pd.DataFrame({"value":["a","b","c","d","e"],"id":[1,2,5,8,11]})
df=pd.DataFrame({"entity":["ABC","DEF","XYZ"],"start_range":[1,4,8],"end_range":[3,7,15]})
# list every id covered by each range; +1 because end_range is inclusive
df["explode"]=df.apply(lambda x:list(range(x["start_range"],x["end_range"]+1)),axis=1)
# one row per covered id, indexed by that id
exploded=df.explode("explode")
exploded.index=exploded["explode"]
# look up each id's entity; ids outside every range become NaN
data["entity"]=data["id"].map(exploded["entity"].to_dict())
You can do:
ii = pd.IntervalIndex.from_arrays(df2['start_range'], df2['end_range'], closed='both')
df1['entity'] = df2.set_index(ii).loc[df1['id'], 'entity'].values
print(df1)
value id entity
0 a 1 ABC
1 b 2 ABC
2 c 5 DEF
3 d 8 XYZ
4 e 11 XYZ
So the problem to be solved in this case is how to search for 'id' in the range. Apparently loc can do that, as long as your index is an IntervalIndex. So create an IntervalIndex from df2 and use it to loc df1's id.
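Note that the loc lookup raises a KeyError if any id falls outside every range. Here is a minimal, self-contained sketch of the same IntervalIndex idea that tolerates such ids by using get_indexer, which returns -1 where nothing matches. The extra row with id 20 is made up for illustration, and the sketch assumes the ranges in df2 do not overlap:

```python
import pandas as pd

# Same data as in the question, plus one hypothetical id (20) outside all ranges
df1 = pd.DataFrame({"value": ["a", "b", "c", "d", "e", "f"],
                    "id": [1, 2, 5, 8, 11, 20]})
df2 = pd.DataFrame({"entity": ["ABC", "DEF", "XYZ"],
                    "start_range": [1, 4, 8],
                    "end_range": [3, 7, 15]})

# Closed on both sides, so start_range and end_range themselves match
ii = pd.IntervalIndex.from_arrays(df2["start_range"], df2["end_range"], closed="both")

# Position of the interval containing each id, or -1 if there is none
pos = ii.get_indexer(df1["id"].to_numpy())

# Take the entity at each position, then blank out the unmatched rows
# (position -1 would otherwise silently pick the last entity)
entity = pd.Series(df2["entity"].to_numpy()[pos], index=df1.index)
df1["entity"] = entity.where(pos >= 0)

print(df1)
```

The unmatched row ends up with a missing value in entity instead of stopping the whole lookup with an exception.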