
Setting value on one dataframe based on the id and value from another dataframe

I've got a problem with my dataframes.

The first dataframe looks like:


id     0    1    2    3

100    0    0    0    0
101    0    0    0    0
102    0    0    0    0
103    0    0    0    0

The second dataframe looks like:

id     num

100    1
100    2
100    3
101    0
101    3
102    1
103    2
103    3

In the first dataframe I want to change zeros to ones: for each row identified by "id", the columns to set are the ones listed in the "num" column of the second dataframe for that same "id". So in the end I would like the first dataframe to become:

id     0    1    2    3

100    0    1    1    1
101    1    0    0    1
102    0    1    0    0
103    0    0    1    1

How can I do that? I know I can use a for loop (which I've already written), but my dataframes are very big and it would take about 4 hours to finish. I was thinking about mapping in pandas, but I couldn't come up with a solution.

Best regards

Use get_dummies with max by index for indicator values; if you need counts, use sum instead of max:

# dtype=int keeps 0/1 output; groupby(level=0).max() replaces max(level=0), which was removed in pandas 2.0
df = pd.get_dummies(df2.set_index('id')['num'], dtype=int).groupby(level=0).max()
print(df)
     0  1  2  3
id             
100  0  1  1  1
101  1  0  0  1
102  0  1  0  0
103  0  0  1  1
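
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the get_dummies approach using the sample data from the question. The names dummies, indicator and counts are only illustrative. It groups on the index with groupby(level=0), since the level argument of max was removed in pandas 2.0, and passes dtype=int so the result holds 0/1 rather than booleans.

```python
import pandas as pd

# Sample data from the question: one row per (id, num) pair.
df2 = pd.DataFrame({
    "id":  [100, 100, 100, 101, 101, 102, 103, 103],
    "num": [1, 2, 3, 0, 3, 1, 2, 3],
})

# One indicator column per distinct num value, still one row per pair.
dummies = pd.get_dummies(df2.set_index("id")["num"], dtype=int)

# Collapse the repeated ids: max yields a 0/1 indicator, sum yields counts.
indicator = dummies.groupby(level=0).max()
counts = dummies.groupby(level=0).sum()

print(indicator)
```

With this sample data no (id, num) pair repeats, so indicator and counts hold the same values; they would differ only if a pair appeared more than once.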

If the first DataFrame may contain more rows or columns than appear in the second, add DataFrame.reindex:

# assumes df1 is indexed by id and its column labels match the num values
df = (pd.get_dummies(df2.set_index('id')['num'], dtype=int).groupby(level=0).max()
        .reindex(index=df1.index, columns=df1.columns, fill_value=0))
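
To show what the reindex step buys you, here is a small sketch under stated assumptions: df1 is a hypothetical first frame indexed by id that has one extra row (id 104) and one extra column (4), neither of which occurs in df2. Those extras are my own additions for illustration, not part of the question's data.

```python
import pandas as pd

# Hypothetical first frame: indexed by id, with an extra id (104) and column (4).
df1 = pd.DataFrame(
    0,
    index=pd.Index([100, 101, 102, 103, 104], name="id"),
    columns=[0, 1, 2, 3, 4],
)

# Second frame from the question.
df2 = pd.DataFrame({
    "id":  [100, 100, 100, 101, 101, 102, 103, 103],
    "num": [1, 2, 3, 0, 3, 1, 2, 3],
})

result = (
    pd.get_dummies(df2.set_index("id")["num"], dtype=int)
      .groupby(level=0).max()
      # Align to df1's shape; ids and columns absent from df2 are filled with 0.
      .reindex(index=df1.index, columns=df1.columns, fill_value=0)
)

print(result)
```

The row for id 104 and the column 4 come out as all zeros, so the result has exactly the shape of df1.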

Naming the first data frame df1 and the second one df2, you can pivot df2:

df2['value'] = 1
df1 = df2.pivot_table(index='id', columns='num', values='value', fill_value=0)

Output:

num  0  1  2  3
id             
100  0  1  1  1
101  1  0  0  1
102  0  1  0  0
103  0  0  1  1
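
A runnable variant of the pivot idea, as a sketch rather than the answer's exact code: it uses assign so df2 is not modified, and it passes aggfunc="max" and a final astype(int), both of which are my own choices to make the 0/1 result explicit. The default aggregation (mean) gives the same values here because every entry in the helper column is 1.

```python
import pandas as pd

# Second frame from the question.
df2 = pd.DataFrame({
    "id":  [100, 100, 100, 101, 101, 102, 103, 103],
    "num": [1, 2, 3, 0, 3, 1, 2, 3],
})

result = (
    df2.assign(value=1)                      # helper column, df2 itself is untouched
       .pivot_table(index="id", columns="num", values="value",
                    aggfunc="max", fill_value=0)
       .astype(int)                          # make sure the result is integer 0/1
)

print(result)
```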
