Pandas: How to replace values in a DataFrame using another DataFrame's columns

Question

I want to set certain values in DataFrame A to 1, using a second DataFrame as a lookup: each row of B gives the index label and the column (month) of the cell in A that should be set. Something like this:

Original Data Frame A:

Index  201901    201902    201903
a      0         0         0
b      0         0         0
c      0         0         0
d      0         0         0

Reference Data Frame B:

Index  Month
a      201902
b      201901

Resulting Data Frame C:

Index  201901    201902    201903
a      0         1         0
b      1         0         0
c      0         0         0
d      0         0         0

I've tried using loc but haven't found a way to make it work. Any suggestions?

Answer 1

You can use iterrows() to iterate over the rows of the reference dataframe and df.at[] to set the matching cell for each row.

import pandas as pd

df = pd.DataFrame([[0,0,0], [0,0,0], [0,0,0], [0,0,0]], columns=['201901', '201902', '201903'])
df.index=['a', 'b','c', 'd']
print(df)
#    201901  201902  201903
# a       0       0       0
# b       0       0       0
# c       0       0       0
# d       0       0       0

dfb = pd.DataFrame(['201902', '201901'], columns=['month'])
dfb.index = ['a', 'b']
print(dfb)
#     month
# a  201902
# b  201901

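# i is the row label in df; row['month'] is the column name to set.
# (.at needs a scalar column label, so pass row['month'], not the whole row.)
for i, row in dfb.iterrows():
    df.at[i, row['month']] = 1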

print(df)
#    201901  201902  201903
# a       0       1       0
# b       1       0       0
# c       0       0       0
# d       0       0       0
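The loop above assumes every label in dfb exists in df. As a minimal sketch of the same idea that leaves the original frame untouched and skips labels it cannot find, something like the following could work. The helper name mark_months and its parameters are made up for illustration; they are not part of pandas.

```python
import pandas as pd

def mark_months(frame, ref, column="month", value=1):
    """Return a copy of frame with frame.at[label, ref[column][label]] set to value.

    Hypothetical helper: labels or months missing from frame are skipped.
    """
    out = frame.copy()
    # Series.items() yields (index label, cell value) pairs.
    for idx, month in ref[column].items():
        if idx in out.index and month in out.columns:
            out.at[idx, month] = value
    return out

df = pd.DataFrame(0, index=list("abcd"), columns=["201901", "201902", "201903"])
dfb = pd.DataFrame({"month": ["201902", "201901"]}, index=["a", "b"])

result = mark_months(df, dfb)
print(result)
```

Iterating over the single column with items() avoids building a full row Series on each step, which is all iterrows() is needed for here.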

Answer 2

It looks like there's no need to iterate. A simple solution uses pd.get_dummies and pd.DataFrame.update:

dfA.update(pd.get_dummies(dfB.Month.apply(str)))

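I used .apply(str) because the values in dfB are stored as integers while the column names in dfA are strings. update aligns on index and column labels, so it won't change anything if the labels don't match. Note that the updated columns come back as floats in the output below.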

Output:

       201901  201902  201903
Index                        
a         0.0     1.0       0
b         1.0     0.0       0
c         0.0     0.0       0
d         0.0     0.0       0
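If the float columns in that output are unwanted, one possible variant is to turn the dummies into a boolean mask and apply it with DataFrame.mask instead of update. This is a sketch under the assumption that dfB stores the months as integers and dfA uses string column names, as described above; the names hits and dfC are chosen here for illustration.

```python
import pandas as pd

dfA = pd.DataFrame(0, index=list("abcd"), columns=["201901", "201902", "201903"])
dfB = pd.DataFrame({"Month": [201902, 201901]}, index=["a", "b"])

# One indicator column per month present in dfB, labelled with strings to match dfA.
dummies = pd.get_dummies(dfB["Month"].astype(str))

# Align the indicators to dfA's rows and columns; cells without an entry become False.
hits = dummies.reindex(index=dfA.index, columns=dfA.columns, fill_value=False).astype(bool)

# Write 1 where the mask is True and keep dfA's own value everywhere else.
dfC = dfA.mask(hits, 1)
print(dfC)
```

Unlike update, this returns a new frame rather than modifying dfA in place.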

Answer 3

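Assign through the underlying NumPy array. This relies on df.values returning a writable view of the frame's data, which typically holds for a single-dtype frame like this one but is not guaranteed in general (for example with mixed dtypes, or with copy-on-write enabled):

df.values[df.index.get_indexer(dfb.index), df.columns.get_indexer(dfb.month)] = 1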
df
Out[1081]: 
   201901  201902  201903
a       0       1       0
b       1       0       0
c       0       0       0
d       0       0       0
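A sketch of the same positional approach that does not depend on df.values being a view: copy the data out with to_numpy, assign into the copy, and wrap it back into a DataFrame. It also drops lookups that get_indexer could not resolve, since get_indexer returns -1 for missing labels and -1 would otherwise silently address the last row or column. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(0, index=list("abcd"), columns=["201901", "201902", "201903"])
dfb = pd.DataFrame({"month": ["201902", "201901"]}, index=["a", "b"])

# Translate labels to integer positions (requires unique index and column labels).
rows = df.index.get_indexer(dfb.index)
cols = df.columns.get_indexer(dfb["month"])

# Keep only the pairs where both the row label and the column label were found.
found = (rows >= 0) & (cols >= 0)

# Work on an explicit copy so the assignment cannot depend on view semantics.
arr = df.to_numpy(copy=True)
arr[rows[found], cols[found]] = 1

result = pd.DataFrame(arr, index=df.index, columns=df.columns)
print(result)
```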

3 answers

Answer 1
4 2019-05-01 20:20:29

Answer 2
2 2019-05-01 20:34:02

Answer 3
2 ACCEPTED 2019-05-01 21:07:47
