Pandas combine_first with a particular index column?

Question

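I'm trying to join two dataframes in pandas with the following behavior: I want to join on a specified column, but in such a way that redundant columns are not added to the dataframe. This is similar to combine_first, except that combine_first does not seem to take an optional argument for the index column. Example: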

# combine df1 and df2 based on "id" column
df1 = pandas.merge(df1, df2, how="outer", on=["id"])

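The problem with the above is that every column common to df1 and df2, other than "id", is added to df1 twice (with _x and _y suffixes). How can I do something like this instead: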

# Do outer join from df2 to df1, matching items by "id" but not adding
# columns that are redundant (df1 takes precedence if the values disagree)
df1.combine_first(df2, on=["id"])

How can this be done?
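
For reference, here is a minimal sketch that reproduces the duplicated columns. The sample data and the "value" and "extra" column names are made up for illustration; only "id" comes from the real frames.

```python
import pandas as pd

# Hypothetical sample frames; every column except "id" is invented for this example
df1 = pd.DataFrame({"id": [1, 2], "value": [10.0, 20.0]})
df2 = pd.DataFrame({"id": [2, 3], "value": [99.0, 30.0], "extra": ["a", "b"]})

# Outer join on "id": the shared "value" column comes back twice
merged = pd.merge(df1, df2, how="outer", on=["id"])
print(list(merged.columns))  # ['id', 'value_x', 'value_y', 'extra']
```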

Answer 1

If you are trying to merge the columns of df2 into df1 while excluding any redundant columns, the following should work.

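# Index both frames by "id" so the join aligns on it
df1.set_index("id", inplace=True)
df2.set_index("id", inplace=True)
# Bring in only the df2 columns that df1 does not already have
df3 = df1.merge(df2[df2.columns.difference(df1.columns)], left_index=True, right_index=True, how="outer")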

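However, this obviously won't update any df1 values with df2 values, since it only brings in the non-redundant columns. But since you said df1 takes precedence wherever the values disagree, maybe this does the trick?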
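
If rows that exist only in df2 should also keep their values in the shared columns, one possible alternative is to set "id" as the index on both frames and call combine_first directly. This is only a sketch with made-up sample data (the "value" and "extra" columns are hypothetical), and it assumes "id" is unique within each frame.

```python
import pandas as pd

# Hypothetical sample frames; every column except "id" is invented for this example
df1 = pd.DataFrame({"id": [1, 2], "value": [10.0, 20.0]})
df2 = pd.DataFrame({"id": [2, 3], "value": [99.0, 30.0], "extra": ["a", "b"]})

# combine_first aligns on the index, so use "id" as the index on both sides.
# Non-null df1 values win; df2 only fills gaps and adds new rows and columns.
combined = (
    df1.set_index("id")
    .combine_first(df2.set_index("id"))
    .reset_index()
)
print(combined)
```

Note that combine_first treats a missing value in df1 as a gap, so a NaN in df1 is replaced by the df2 value for the same "id" and column rather than being kept.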

Solution 1
1 accepted 2013-03-28 01:43:55
