What is the fastest way to populate one pandas dataframe based on values from another pandas dataframe?
I have a pandas dataframe, positions:
   row  column
1    3  Brazil
2    6     USA
3    3     USA
4    7  Canada
and another, x:
   Brazil  Canada    USA
1   False   False  False
2   False   False  False
3   False   False  False
4   False   False  False
5   False   False  False
6   False   False  False
7   False   False  False
I want to populate the second one based on the values from the first one, so the result is:
   Brazil  Canada    USA
1   False   False  False
2   False   False  False
3    True   False   True
4   False   False  False
5   False   False  False
6   False   False   True
7   False    True  False
I'm doing that using iterrows():
    for i, r in positions.iterrows():
        x.at[r['row'], r['column']] = True
Is there a faster way to do that?
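For anyone who wants to try the approaches on this page, here is a minimal sketch that rebuilds the two frames from the tables above and runs the iterrows() baseline. The constructor calls are my reconstruction of the printed data, not the asker's original code; the answers refer to the first frame as df.

```python
import pandas as pd

# Reconstructed from the printed tables (an assumption about how they were built).
positions = pd.DataFrame(
    {"row": [3, 6, 3, 7], "column": ["Brazil", "USA", "USA", "Canada"]},
    index=[1, 2, 3, 4],
)

# All-False target frame: index 1..7, one column per country.
x = pd.DataFrame(False, index=range(1, 8), columns=["Brazil", "Canada", "USA"])

# Baseline from the question: one scalar assignment per row of positions.
for _, r in positions.iterrows():
    x.at[r["row"], r["column"]] = True

print(x)
```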
I would use crosstab with update:
    # df is the positions frame from the question
    x.update(pd.crosstab(df.row, df.column).eq(1))
Output:
   Brazil  Canada    USA
1   False   False  False
2   False   False  False
3    True   False   True
4   False   False  False
5   False   False  False
6   False   False   True
7   False    True  False
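One caveat worth sketching: eq(1) is only True for pairs that occur exactly once, and update() also writes the crosstab's False cells into x. If the positions frame may contain repeated pairs, or x may already hold True values, a possible variant is to test gt(0), reindex the crosstab to the shape of x, and combine with |. This is a swapped-in variation rather than the answer's method; the duplicated pair and the pre-set flag below are made up for illustration.

```python
import pandas as pd

# Sample data from the question, plus a deliberately repeated (3, "USA") pair.
df = pd.DataFrame(
    {"row": [3, 6, 3, 7, 3], "column": ["Brazil", "USA", "USA", "Canada", "USA"]}
)
x = pd.DataFrame(False, index=range(1, 8), columns=["Brazil", "Canada", "USA"])
x.loc[1, "Canada"] = True  # hypothetical pre-existing flag that should survive

# Count each (row, column) pair, flag any pair seen at least once,
# then expand to the full shape of x, filling the missing cells with False.
hits = (
    pd.crosstab(df["row"], df["column"])
    .gt(0)
    .reindex(index=x.index, columns=x.columns, fill_value=False)
)

# Element-wise OR keeps flags already set in x.
x = x | hits
print(x)
```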
You can pivot the positions table:
    s = (df.assign(dummy=True)
           .set_index(['row', 'column'])['dummy']
           .unstack(fill_value=False))
    x |= s
Output:
   Brazil  Canada    USA
1   False   False  False
2   False   False  False
3    True   False   True
4   False   False  False
5   False   False  False
6   False   False   True
7   False    True  False
searchsorted and positional assignment
This assumes that the index and columns of x are sorted.
We'll use searchsorted to turn the labels into integer positions and assign True at those (row, column) positions:
    i = x.index.searchsorted(df.row)
    j = x.columns.searchsorted(df.column)
    # assign pointwise on a NumPy array; x.iloc[i, j] would hit every row/column combination
    values = x.to_numpy(copy=True)
    values[i, j] = True
    x = pd.DataFrame(values, index=x.index, columns=x.columns)
Output:
   Brazil  Canada    USA
1   False   False  False
2   False   False  False
3    True   False   True
4   False   False  False
5   False   False  False
6   False   False   True
7   False    True  False
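If the sorted assumption does not hold, Index.get_indexer may be a drop-in substitute for searchsorted: it looks labels up by value (the index must be unique) and returns -1 for labels it cannot find. The sketch below is an alternative I am suggesting, not part of the answer above; the column order is deliberately unsorted to show the difference.

```python
import pandas as pd

df = pd.DataFrame(
    {"row": [3, 6, 3, 7], "column": ["Brazil", "USA", "USA", "Canada"]}
)
# Columns are intentionally not in sorted order here.
x = pd.DataFrame(False, index=range(1, 8), columns=["USA", "Brazil", "Canada"])

# Map labels to integer positions; -1 marks a label that is missing from x.
i = x.index.get_indexer(df["row"])
j = x.columns.get_indexer(df["column"])
if (i < 0).any() or (j < 0).any():
    raise KeyError("some row/column labels in df are not present in x")

# Pointwise assignment on a NumPy copy, then rebuild the frame.
values = x.to_numpy(copy=True)
values[i, j] = True
x = pd.DataFrame(values, index=x.index, columns=x.columns)
print(x)
```

The explicit -1 check matters because a negative position would otherwise silently index from the end of the array.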