Replace values in a DataFrame based on another DataFrame's filter
I have two DataFrames, and I want to replace the values in one DataFrame with the values from the other, based on the columns of the first one. Examples of both are below to clarify.
DF1:
A B C D E
Date
01/01/2019 1 2 3 4 5
02/01/2019 1 2 3 4 5
03/01/2019 1 2 3 4 5
DF2:
name1 name2 name3
Date
01/01/2019 A B D
02/01/2019 B C E
03/01/2019 A D E
The result I want:
name1 name2 name3
Date
01/01/2019 1 2 4
02/01/2019 2 3 5
03/01/2019 1 4 5
Try:
# Both frames are expected to carry the dates in a regular column named "index".
result = df2.melt(id_vars="index").merge(
    df1.melt(id_vars="index"),
    left_on=["index", "value"],
    right_on=["index", "variable"],
).drop(columns=["value_x", "variable_y"]).pivot(
    index="index", columns="variable_x", values="value_y"
)
print(result)
The two melt calls transform your dataframes so that all the values sit in a single column, with an additional column holding the original column names:
df1.melt(id_vars='index')
index variable value
0 01/01/2019 A 1
1 02/01/2019 A 1
2 03/01/2019 A 1
3 01/01/2019 B 2
4 02/01/2019 B 2
5 03/01/2019 B 2
...
You can now join these on index and value / variable. The last part just removes a couple of columns and then reshapes the table back into the desired form.
The result is:
variable_x name1 name2 name3
index
01/01/2019 1 2 4
02/01/2019 2 3 5
03/01/2019 1 4 5
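For reference, here is a self-contained sketch of the same melt / merge / pivot approach run on the frames from the question. It assumes the dates live in an index named Date, so the index is first moved into a regular column called "index" to match the id_vars used above. The rename_axis / reset_index step and the names left and right are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

dates = ["01/01/2019", "02/01/2019", "03/01/2019"]
df1 = pd.DataFrame(
    [[1, 2, 3, 4, 5]] * 3,
    index=pd.Index(dates, name="Date"),
    columns=list("ABCDE"),
)
df2 = pd.DataFrame(
    [["A", "B", "D"], ["B", "C", "E"], ["A", "D", "E"]],
    index=pd.Index(dates, name="Date"),
    columns=["name1", "name2", "name3"],
)

# Assumption: turn the Date index into a plain column named "index".
left = df2.rename_axis("index").reset_index().melt(id_vars="index")
right = df1.rename_axis("index").reset_index().melt(id_vars="index")

result = (
    # Match each (date, column label) in df2 with the same (date, column) in df1.
    left.merge(right, left_on=["index", "value"], right_on=["index", "variable"])
    # Drop the label columns that are no longer needed.
    .drop(columns=["value_x", "variable_y"])
    # Reshape back to one row per date and one column per df2 column.
    .pivot(index="index", columns="variable_x", values="value_y")
)
print(result)
```

Because the merge is an inner join by default, a label in df2 that has no matching column in df1 would simply drop out of the result rather than appear as NaN.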
Use DataFrame.lookup for each column separately:
for c in df2.columns:
    df2[c] = df1.lookup(df1.index, df2[c])
print(df2)
name1 name2 name3
01/01/2019 1 2 4
02/01/2019 2 3 5
03/01/2019 1 4 5
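Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the loop above only runs on older versions. A minimal sketch of an equivalent using NumPy positional indexing follows. It assumes df1 and df2 have the same rows in the same order and that every label in df2 exists as a column of df1; the names row_pos, col_pos and result are illustrative.

```python
import numpy as np
import pandas as pd

dates = ["01/01/2019", "02/01/2019", "03/01/2019"]
df1 = pd.DataFrame([[1, 2, 3, 4, 5]] * 3, index=dates, columns=list("ABCDE"))
df2 = pd.DataFrame(
    [["A", "B", "D"], ["B", "C", "E"], ["A", "D", "E"]],
    index=dates,
    columns=["name1", "name2", "name3"],
)

values = df1.to_numpy()
row_pos = np.arange(len(df1))  # one row position per row of df2 (same order assumed)

result = df2.copy()
for c in df2.columns:
    # Column position in df1 for each label in this df2 column.
    # get_indexer returns -1 for unknown labels, which would silently pick
    # the last column, hence the assumption that all labels exist.
    col_pos = df1.columns.get_indexer(df2[c])
    result[c] = values[row_pos, col_pos]
print(result)
```

Writing into a copy rather than into df2 itself keeps the original labels available, which is a design choice of this sketch and not something the answer requires.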
A general solution that also handles index and column names that differ between the two DataFrames:
print (df1)
A B C D G
01/01/2019 1 2 3 4 5
02/01/2019 1 2 3 4 5
05/01/2019 1 2 3 4 5
print (df2)
name1 name2 name3
01/01/2019 A B D
02/01/2019 B C E
08/01/2019 A D E
df1.index = pd.to_datetime(df1.index, dayfirst=True)
df2.index = pd.to_datetime(df2.index, dayfirst=True)
cols = df2.stack().unique()
idx = df2.index
df11 = df1.reindex(columns=cols, index=idx)
print (df11)
A B D C E
2019-01-01 1.0 2.0 4.0 3.0 NaN
2019-01-02 1.0 2.0 4.0 3.0 NaN
2019-01-08 NaN NaN NaN NaN NaN
for c in df2.columns:
    df2[c] = df11.lookup(df11.index, df2[c])
print(df2)
name1 name2 name3
2019-01-01 1.0 2.0 4.0
2019-01-02 2.0 3.0 NaN
2019-01-08 NaN NaN NaN
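On pandas versions without DataFrame.lookup, the same general case can be sketched by reshaping df1 into a Series keyed by (column, date) and reindexing it with the pairs requested by df2. This is one possible substitute, not the method from the answer; the names flat, keys and result are illustrative. Pairs that are missing from df1 come back as NaN, as in the output above.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1, 2, 3, 4, 5]] * 3,
    index=["01/01/2019", "02/01/2019", "05/01/2019"],
    columns=list("ABCDG"),
)
df2 = pd.DataFrame(
    [["A", "B", "D"], ["B", "C", "E"], ["A", "D", "E"]],
    index=["01/01/2019", "02/01/2019", "08/01/2019"],
    columns=["name1", "name2", "name3"],
)
df1.index = pd.to_datetime(df1.index, dayfirst=True)
df2.index = pd.to_datetime(df2.index, dayfirst=True)

# Series with a MultiIndex of (column label, date).
flat = df1.unstack()

result = pd.DataFrame(index=df2.index)
for c in df2.columns:
    # One (label, date) pair per row of df2; pairs absent from df1 give NaN.
    keys = pd.MultiIndex.from_arrays([df2[c], df2.index])
    result[c] = flat.reindex(keys).to_numpy()
print(result)
```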