Setting value on one dataframe based on the id and value from another dataframe

I have a problem with my dataframes.

The first dataframe looks like:


id     0    1    2    3

100    0    0    0    0
101    0    0    0    0
102    0    0    0    0
103    0    0    0    0

The second dataframe looks like:

id     num

100    1
100    2
100    3
101    0
101    3
102    1
103    2
103    3

I want to change zeros to ones in the first dataframe: for each row of the second dataframe, the cell in the row given by "id" and the column given by "num" should become 1. In the end, the first dataframe should look like this:

id     0    1    2    3

100    0    1    1    1
101    1    0    0    1
102    0    1    0    0
103    0    0    1    1

How can I do that? I know I can use a for loop (which I have already written), but my dataframes are very big and it would take about 4 hours to finish. I was thinking about a mapping in pandas, but I could not come up with a solution.

Best regards

Use get_dummies and take the max per index value to get indicator values; if you need counts, use sum instead of max:

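import pandas as pd

# max(level=0) was removed in pandas 2.0; group by the index level instead.
# dtype=int keeps 0/1 output (pandas 2.0+ returns booleans by default).
df = pd.get_dummies(df2.set_index('id')['num'], dtype=int).groupby(level=0).max()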
print (df)
     0  1  2  3
id             
100  0  1  1  1
101  1  0  0  1
102  0  1  0  0
103  0  0  1  1
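
To make the difference between max and sum concrete, here is a minimal sketch. The sample data is hypothetical (not from the question): the pair (100, 1) is repeated on purpose, so the two aggregations disagree for that cell.

```python
import pandas as pd

# Hypothetical sample: the pair (100, 1) appears twice on purpose
df2 = pd.DataFrame({'id': [100, 100, 100, 101], 'num': [1, 1, 3, 0]})

# One 0/1 column per distinct value of num, one row per row of df2
dummies = pd.get_dummies(df2.set_index('id')['num'], dtype=int)

indicator = dummies.groupby(level=0).max()  # 1 if the pair occurs at all
counts = dummies.groupby(level=0).sum()     # how many times the pair occurs

print(indicator)
print(counts)
```

With max, the cell for id 100 and num 1 stays at 1; with sum, it becomes 2.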

If the first DataFrame may contain more rows or columns than appear in the second one, add DataFrame.reindex:

# assumes id is the index of df1 and the source frame is df2
df = (pd.get_dummies(df2.set_index('id')['num'], dtype=int)
        .groupby(level=0).max()
        .reindex(index=df1.index, columns=df1.columns, fill_value=0))
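
A small sketch of what reindex does here, assuming df1 uses id as its index. The frames below are hypothetical: id 102 and column 2 never appear in df2, so without reindex they would be missing from the result.

```python
import pandas as pd

# Hypothetical df1: id 102 and column 2 have no matching rows in df2
df1 = pd.DataFrame(0, index=pd.Index([100, 101, 102], name='id'),
                   columns=[0, 1, 2, 3])
df2 = pd.DataFrame({'id': [100, 100, 101], 'num': [1, 3, 0]})

out = (pd.get_dummies(df2.set_index('id')['num'], dtype=int)
         .groupby(level=0).max()
         # align to the shape of df1; missing rows/columns are filled with 0
         .reindex(index=df1.index, columns=df1.columns, fill_value=0))

print(out)
```

The result has the same index and columns as df1, with all-zero entries for the row and column that df2 does not mention.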

Naming the first data frame df1 and the second one df2, you can pivot df2:

df2['value'] = 1
df1 = df2.pivot_table(index='id', columns='num', values='value', fill_value=0)

Output:

num  0  1  2  3
id             
100  0  1  1  1
101  1  0  0  1
102  0  1  0  0
103  0  0  1  1
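
Both answers above build a new frame. If df1 already exists and should only have the matching cells set, one alternative (not taken from the answers, just a sketch) is to translate the labels to positions with Index.get_indexer and assign through NumPy. This assumes id is the index of df1 and that the index and column labels are unique.

```python
import pandas as pd

df1 = pd.DataFrame(0, index=pd.Index([100, 101, 102, 103], name='id'),
                   columns=[0, 1, 2, 3])
df2 = pd.DataFrame({'id': [100, 100, 100, 101, 101, 102, 103, 103],
                    'num': [1, 2, 3, 0, 3, 1, 2, 3]})

# Positions of each (id, num) pair in df1; -1 marks labels not found in df1
rows = df1.index.get_indexer(df2['id'].to_numpy())
cols = df1.columns.get_indexer(df2['num'].to_numpy())
keep = (rows >= 0) & (cols >= 0)

# Work on a copy of the underlying array, then rebuild the frame
values = df1.to_numpy(copy=True)
values[rows[keep], cols[keep]] = 1
result = pd.DataFrame(values, index=df1.index, columns=df1.columns)

print(result)
```

Cells that df2 does not mention keep whatever value they had in df1, which matters if df1 is not all zeros to begin with.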
