I have two data frames, df and df1, that share the same ID column but use different names for the other columns, and I want to update the values in the second data frame
I want to update df1 from df: wherever df1 has a NaN, it should be replaced with the value from df in the row with the matching ID.
ID QD QP QE ID QD QP QE
101 4 6 4 101 4 6 4
102 5 8 5 102 5 8 5
103 7 6 6 103 7 6 6
104 8 3 5 104 8 3 5
105 4 2 5 105 4 2 5
If the ID columns of both data frames are sorted and correspond one-to-one, you can use
df1[df1.isnull()] = df.values
print(df1)
ID QD QP QE
0 101 4.0 6.0 4.0
1 102 5.0 8.0 5.0
2 103 7.0 6.0 6.0
3 104 8.0 3.0 5.0
4 105 4.0 2.0 5.0
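The positional assignment above gives wrong results without any error when the rows of the two frames are not in the same order, so it may be worth checking that precondition first. A minimal sketch, using made-up sample data (the column names D, P, E for df are assumptions, since the question only says the names differ):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: df holds complete values, df1 has gaps.
df = pd.DataFrame({"ID": [101, 102, 103],
                   "D": [4, 5, 7], "P": [6, 8, 6], "E": [4, 5, 6]})
df1 = pd.DataFrame({"ID": [101, 102, 103],
                    "QD": [4, np.nan, 7],
                    "QP": [np.nan, 8, 6],
                    "QE": [4, 5, np.nan]})

# Positional filling is only safe when both frames have the same shape
# and the ID columns match row by row.
same_shape = df.shape == df1.shape
same_ids = df["ID"].tolist() == df1["ID"].tolist()

if same_shape and same_ids:
    # Fill the NaN cells of df1 with the cell at the same position in df.
    df1[df1.isnull()] = df.values
else:
    raise ValueError("rows are not aligned; fill by ID instead")
```

If the check fails, an index-based method that matches rows by ID is the safer choice.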
If the IDs are not sorted or do not correspond one-to-one, it is better to set the ID column as the index and use one of the fillna, combine_first, or update methods, which fill the columns by matching on the index.
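# Use only ONE of the three options below; each expects df1 indexed by ID.
df1 = df1.set_index('ID')
# Option 1: fillna, fills only the NaN cells of df1
df1 = df1.fillna(df.set_index('ID').set_axis(df1.columns, axis=1)).reset_index()
# Option 2: combine_first, also fills only the NaN cells; if df has rows
# or columns that df1 lacks, they are added to the result
df1 = df1.combine_first(df.set_index('ID').set_axis(df1.columns, axis=1)).reset_index()
# Option 3: update, modifies df1 in place and returns None, so the index
# has to be reset in a separate step. By default it overwrites every cell
# where df has a non-NaN value; pass overwrite=False to fill only NaNs.
df1.update(df.set_index('ID').set_axis(df1.columns, axis=1))
df1 = df1.reset_index()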
print(df1)
ID QD QP QE
0 101 4.0 6.0 4.0
1 102 5.0 8.0 5.0
2 103 7.0 6.0 6.0
3 104 8.0 3.0 5.0
4 105 4.0 2.0 5.0
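For a self-contained illustration of the index-based approach, here is a minimal sketch using fillna. The sample data is made up, and the column names D, P, E for df are assumptions; df is deliberately given in a different row order to show that the values are matched by ID rather than by position:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; df's rows are in a different order than df1's.
df = pd.DataFrame({"ID": [103, 101, 102],
                   "D": [7, 4, 5], "P": [6, 6, 8], "E": [6, 4, 5]})
df1 = pd.DataFrame({"ID": [101, 102, 103],
                    "QD": [4, np.nan, 7],
                    "QP": [np.nan, 8, 6],
                    "QE": [4, 5, np.nan]})

# Index both frames by ID so values are matched by ID, not by row position.
target = df1.set_index("ID")
# Rename df's value columns to df1's names so that fillna can align them.
source = df.set_index("ID").set_axis(target.columns, axis=1)

# fillna only touches cells that are NaN in df1; existing values are kept.
result = target.fillna(source).reset_index()
print(result)
```

Keeping the intermediate frames in separate variables (target, source, result) avoids the problem of running a second method on a df1 whose index has already been reset.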