Iterate over a column in a dataframe matching each value with a value in another column in another dataframe
I have two data frames, aa and bb. For every value in the first column of bb, I want to check whether it also appears in the first column of aa. If it does, I take the matching value from the second column of aa and put it in a new column of bb; if there is no match, I put a 0. Some code should make this clearer. I've done it using apply and a function:
aa=pd.DataFrame({'a':[1,2,3,4,5],'b':[6,7,8,9,0]})
bb=pd.DataFrame({'c':[11,2,13,4,15],'d':['f','h','j','k','l']})
   a  b
0  1  6
1  2  7
2  3  8
3  4  9
4  5  0

    c  d
0  11  f
1   2  h
2  13  j
3   4  k
4  15  l
def set_time_session(row):
    element = row['c']
    if element in aa['a'].unique():
        # Take the first matching value as a scalar, not a one-element Series
        return aa['b'][aa['a'] == element].iloc[0]
    else:
        return 0
column = bb.apply(set_time_session,axis=1)
bb['newcolumn']=column
    c  d  newcolumn
0  11  f          0
1   2  h          7
2  13  j          0
3   4  k          9
4  15  l          0
This actually works, but on a dataframe with 200000 rows it takes forever to complete. I'm sure there is a better and faster way to do it. Thanks!
Try this:
# Left merge keeps every row of bb in order; unmatched rows get NaN in 'b' (assumes aa['a'] has no duplicates)
res = pd.merge(bb, aa, left_on='c', right_on='a', how='left')
bb['newcolumn'] = res['b'].fillna(0).astype(int).to_numpy()
print(bb)
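
Another option that avoids the row-by-row apply is to turn aa into a lookup and map bb['c'] through it. The following is only a sketch using the example frames from the question. The name lookup is made up for illustration, and keeping the first occurrence when aa['a'] contains duplicates is an assumption, not something the question specifies.

```python
import pandas as pd

aa = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 0]})
bb = pd.DataFrame({'c': [11, 2, 13, 4, 15], 'd': ['f', 'h', 'j', 'k', 'l']})

# Build a lookup Series: index = values of aa['a'], values = aa['b'].
# Assumption: if aa['a'] has duplicates, only the first occurrence is kept.
lookup = aa.drop_duplicates(subset='a').set_index('a')['b']

# Map each value of bb['c'] through the lookup. Values with no match
# become NaN, which fillna(0) then replaces with 0.
bb['newcolumn'] = bb['c'].map(lookup).fillna(0).astype(int)

print(bb)
```

Because map does the lookup for the whole column in one vectorized call, it should be much faster than calling a Python function per row, though the actual speedup on 200000 rows would need to be measured.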