I want to set specific cells of DataFrame A to 1, using a second DataFrame B as a lookup: each row of B gives an index label and the column (month) to set. Something like this:
Original Data Frame A:
Index 201901 201902 201903
a 0 0 0
b 0 0 0
c 0 0 0
d 0 0 0
Reference Data Frame B
Index Month
a 201902
b 201901
The result Data Frame C
Index 201901 201902 201903
a 0 1 0
b 1 0 0
c 0 0 0
d 0 0 0
I've tried with loc but haven't found a way to make it work. Any suggestions?
You can use df.iterrows() to iterate through the rows of the second DataFrame and df.at[] to set the values where you need to.
import pandas as pd
df = pd.DataFrame([[0,0,0], [0,0,0], [0,0,0], [0,0,0]], columns=['201901', '201902', '201903'])
df.index=['a', 'b','c', 'd']
print(df)
# 201901 201902 201903
# a 0 0 0
# b 0 0 0
# c 0 0 0
# d 0 0 0
dfb = pd.DataFrame(['201902', '201901'], columns=['month'])
dfb.index = ['a', 'b']
print(dfb)
# month
# a 201902
# b 201901
for i, row in dfb.iterrows():
    # row is a Series, so pick out the month value to use as the column label
    df.at[i, row['month']] = 1
print(df)
# 201901 201902 201903
# a 0 1 0
# b 1 0 0
# c 0 0 0
# d 0 0 0
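A variation on the loop above, offered as a sketch rather than as the answerer's code: iterating over the month column with Series.items() yields (label, month) pairs directly, so no row Series has to be unpacked. The helper name mark_months and its value parameter are hypothetical, introduced only for illustration, and the explicit membership check is an added assumption about how missing labels should be handled.

```python
import pandas as pd


def mark_months(target, month_by_label, value=1):
    """Return a copy of `target` with target.at[label, month] set to `value`.

    `mark_months` is a hypothetical helper name, not part of pandas.
    """
    out = target.copy()
    for label, month in month_by_label.items():
        # Fail loudly on unknown labels rather than risk enlarging the frame.
        if label not in out.index or month not in out.columns:
            raise KeyError(f"no cell for ({label!r}, {month!r})")
        out.at[label, month] = value
    return out


df = pd.DataFrame(0, index=["a", "b", "c", "d"],
                  columns=["201901", "201902", "201903"])
dfb = pd.DataFrame({"month": ["201902", "201901"]}, index=["a", "b"])

result = mark_months(df, dfb["month"])
print(result)
```

The function works on a copy, so the original df stays all zeros.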
Looks like there's no need to iterate. I have a simple solution using pd.get_dummies and pd.DataFrame.update:
dfA.update(pd.get_dummies(dfB.Month.apply(str)))
I used .apply(str) because the values in dfB are stored as integers while the column labels of dfA are strings, and update won't work if the labels don't match. Note that the updated columns come back as floats:
201901 201902 201903
Index
a 0.0 1.0 0
b 1.0 0.0 0
c 0.0 0.0 0
d 0.0 0.0 0
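If the float columns are unwanted, one possible variation (a sketch, not the answerer's method) is to turn the dummies into a boolean mask aligned to dfA and apply it with DataFrame.mask instead of update. It assumes each index label appears at most once in dfB, since reindex does not accept duplicate labels; the names hits and dfC are illustrative.

```python
import pandas as pd

dfA = pd.DataFrame(0, index=["a", "b", "c", "d"],
                   columns=["201901", "201902", "201903"])
dfB = pd.DataFrame({"Month": [201902, 201901]}, index=["a", "b"])

# One boolean column per month that occurs in dfB; cast to str so the
# labels match dfA's string column names.
hits = pd.get_dummies(dfB["Month"].astype(str), dtype=bool)

# Align to dfA's shape; rows and columns absent from dfB become False.
hits = hits.reindex(index=dfA.index, columns=dfA.columns,
                    fill_value=False).astype(bool)

# Replace values with 1 wherever the mask is True, keep the rest unchanged.
dfC = dfA.mask(hits, 1)
print(dfC)
```

Because mask returns a new frame, dfA itself is left unchanged and the integer dtype is preserved.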
Numpy assign
df.values[df.index.get_indexer(dfb.index),df.columns.get_indexer(dfb.month)]=1
df
Out[1081]:
201901 201902 201903
a 0 1 0
b 1 0 0
c 0 0 0
d 0 0 0
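Writing through df.values relies on that array being a writable view of the frame's data, which is not guaranteed (for example with mixed column dtypes, or when pandas' copy-on-write mode is enabled). A more defensive sketch of the same idea, under the assumption that building a new frame is acceptable, copies the array, checks that get_indexer found every label (it returns -1 for misses, which would otherwise index the last row or column), and rebuilds the DataFrame:

```python
import pandas as pd

df = pd.DataFrame(0, index=["a", "b", "c", "d"],
                  columns=["201901", "201902", "201903"])
dfb = pd.DataFrame({"month": ["201902", "201901"]}, index=["a", "b"])

# Positional row/column numbers for each (label, month) pair in dfb.
rows = df.index.get_indexer(dfb.index)
cols = df.columns.get_indexer(dfb["month"])

# get_indexer marks missing labels with -1; refuse to continue in that case.
if (rows < 0).any() or (cols < 0).any():
    raise KeyError("dfb refers to a label that is not present in df")

arr = df.to_numpy(copy=True)   # explicit copy, safe to modify
arr[rows, cols] = 1            # pairwise (row, col) assignment
result = pd.DataFrame(arr, index=df.index, columns=df.columns)
print(result)
```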