Replace pandas dataframe columns with another dataframe based on a specific column

Question

I have two dataframes with many columns, df1 and df2, and I want to replace all df1 values (except the time column) with the data from the df2 columns where the time value is the same:

df1:

index time   x y   ... many other columns (the same as df2)
0       1    1 1
1       1.1  2 2
2       1.1  3 3
3       1.1  4 4
4       1.4  5 5
5       1.5  6 6
6       1.5  7 7


df2:

index time  x   y   ... many other columns (the same as df1)
0       1   10  10
1       1.1 11  11
2       1.2 12  12
3       1.3 13  13
4       1.4 14  14
5       1.5 15  15
6       1.6 16  16



The result for df1 should be:

index time  x   y   ... many other columns
0       1    10 10
1       1.1  11 11
2       1.1  11 11
3       1.1  11 11
4       1.4  14 14
5       1.5  15 15
6       1.5  15 15

Answer 1

You need to merge:

df1 = df1.merge(df2, left_index = True, right_index = True)

Then you need to remove the columns you do not need.
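To make the "remove the columns you do not need" step concrete, here is a minimal sketch on a three-row cut of the sample data (the shortened frames and the names merged and cleaned are assumptions for illustration). Because both frames share column names, pandas adds the suffixes _x (from df1) and _y (from df2). Note that an index merge pairs rows by position, not by time, so it only gives the desired result when the two frames happen to line up row for row.

```python
import pandas as pd

# Three-row cut of the question's sample data (assumed for illustration)
df1 = pd.DataFrame({"time": [1, 1.1, 1.1], "x": [1, 2, 3], "y": [1, 2, 3]})
df2 = pd.DataFrame({"time": [1, 1.1, 1.2], "x": [10, 11, 12], "y": [10, 11, 12]})

# Merge on the index: overlapping column names get _x (df1) and _y (df2)
merged = df1.merge(df2, left_index=True, right_index=True)
print(merged.columns.tolist())

# Keep df1's time and df2's data columns, then restore the original names
cleaned = merged[["time_x", "x_y", "y_y"]].rename(
    columns={"time_x": "time", "x_y": "x", "y_y": "y"}
)
print(cleaned)
```

In this sketch the last row keeps time 1.1 but receives the values from df2's row for time 1.2, which shows why matching on the time column matters for the question as asked.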

Answer 2

Edit: I misread the question the first time. This should help:

df1[['time']].merge(df2, on='time')
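As a runnable check of the one-liner above, here is a sketch on the sample data from the question. The how="left" argument is an addition not in the original answer; it is assumed here so that every df1 row is kept, in df1's order, even if some time value were missing from df2. Merging on float keys relies on exact equality, so values that differ by rounding error would not match.

```python
import pandas as pd

# Sample data from the question
df1 = pd.DataFrame({
    "time": [1, 1.1, 1.1, 1.1, 1.4, 1.5, 1.5],
    "x": [1, 2, 3, 4, 5, 6, 7],
    "y": [1, 2, 3, 4, 5, 6, 7],
})
df2 = pd.DataFrame({
    "time": [1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6],
    "x": [10, 11, 12, 13, 14, 15, 16],
    "y": [10, 11, 12, 13, 14, 15, 16],
})

# Keep only df1's time column, then pull every other column from df2
# for the matching time value.
result = df1[["time"]].merge(df2, on="time", how="left")
print(result)
```

Since only the time column is taken from df1, there are no overlapping names and no suffixed columns to clean up afterwards.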

Answer 3

I think I was able to get my thinking in order and hopefully have reached a solution that will work for you.

Try this: you can get your answer by using combine_first and doing some tweaking:

combine_first fills null values from another dataframe, so first you can replace all values (except in the 'time' column) with np.nan. Note that I use the 'time' column as the index.
As combine_first returns the union of the two dataframes, you can use isin to keep only the time values from df1 in your final output.

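import numpy as np
import pandas as pd

# Blank out every df1 column except 'time' so combine_first can fill them from df2
df1[df1.columns.difference(['time'])] = np.nan
# Align both frames on 'time'; the result holds the union of both sets of time values
res = df1.set_index('time').combine_first(df2.set_index('time')).reset_index()

# Keep only the rows whose time value occurs in df1
final = res[res['time'].isin(df1['time'].unique())]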

Which will get you:

   time     x     y
0   1.0  10.0  10.0
1   1.1  11.0  11.0
2   1.1  11.0  11.0
3   1.1  11.0  11.0
6   1.4  14.0  14.0
7   1.5  15.0  15.0
8   1.5  15.0  15.0

Try it on your actual dataset, and let me know if it works.

3 answers

Answer 1: 0, 2021-01-18 15:32:55

Answer 2: 0, 2021-01-18 15:44:27

Answer 3: 0, accepted, 2021-01-18 18:50:08