Add series to dataframe which is an intersection of two others
Suppose I have a dataframe like this:
column1  column2
1        8
2        9
20       1
4        2
56
6
2
I want a result like this:
column1  column2  column3
1        8        1
2        9        2
20       1
4        2
56
6
2
So I want the values that appear in both column1 and column2 to go into column3. This is what I have tried:
column1 = [1, 2, 20, 4, 56, 6, 2]
column2 = [8, 9, 1, 2]
list_1 = []
# Compare every value in column1 against every value in column2
for item1 in column1:
    for item2 in column2:
        if item1 == item2:
            list_1.append(item1)
        else:
            print("NO MATCH")
# Drop duplicate matches
z = list(set(list_1))
print(z)
Using set.intersection (here via the & operator) with pd.DataFrame.loc:
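import numpy as np
# Values present in both columns (set iteration order is not guaranteed)
L = list(set(df['column1']) & set(df['column2']))
# Write them into the first len(L) rows of a new column; the other rows stay NaN
df.loc[np.arange(len(L)), 'column3'] = L
print(df)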
   column1  column2  column3
0        1      8.0      1.0
1        2      9.0      2.0
2       20      1.0      NaN
3        4      2.0      NaN
4       56      NaN      NaN
5        6      NaN      NaN
6        2      NaN      NaN
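The snippet above assumes df already exists. For reference, here is a self-contained sketch of the same approach. The way df is built (padding column2 with NaN) is an assumption based on the printed output, and the dropna() and sorted() calls are additions that keep the NaN padding out of the set and make the order of the matches deterministic.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame: column2 is padded with NaN
# so that both columns have the same length.
df = pd.DataFrame({
    "column1": [1, 2, 20, 4, 56, 6, 2],
    "column2": [8, 9, 1, 2, np.nan, np.nan, np.nan],
})

# Intersection of the two columns as Python sets.
# dropna() excludes the NaN padding; sorted() fixes the order of the result.
common = sorted(set(df["column1"]) & set(df["column2"].dropna()))

# Label-based assignment into the first len(common) rows of a new column.
# Rows that receive no value are left as NaN.
df.loc[np.arange(len(common)), "column3"] = common

print(df)
```

Because column3 contains NaN, pandas stores it as a floating-point column, which is why the matches are shown as 1.0 and 2.0 in the output above.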
Be aware that this isn't vectorised and goes somewhat against the grain of Pandas / NumPy, which is why the solution uses regular Python objects.
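If you would rather stay within Pandas, one possible alternative (not part of the answer above, only a sketch under the same assumed df) is to test membership with Series.isin and let index alignment fill the remaining rows with NaN. Note that the matches come out in column1 order here, not in set order.

```python
import numpy as np
import pandas as pd

# Same assumed frame as in the question, with column2 padded with NaN.
df = pd.DataFrame({
    "column1": [1, 2, 20, 4, 56, 6, 2],
    "column2": [8, 9, 1, 2, np.nan, np.nan, np.nan],
})

# Boolean mask: True where a column1 value also occurs in column2.
mask = df["column1"].isin(df["column2"].dropna())

# Keep each matching value once, then renumber from 0 so the matches
# line up with the first rows of the frame.
matches = df.loc[mask, "column1"].drop_duplicates().reset_index(drop=True)

# Assigning a shorter Series aligns on the index; unmatched rows become NaN.
df["column3"] = matches

print(df)
```

This relies on df having the default 0..n-1 index. With a different index, the reset_index step would need to map onto df's own labels instead.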