pandas - merge two data frames, overwrite, and specify which columns to keep

I'm trying to merge two pandas DataFrames, although what I want may not actually be a merge.

The two frames have two columns in common. One holds unique values that can be used to join. The other is empty in the first frame and populated in the second.

I want to overwrite the empty fields while matching on the unique fields, but keep only the column that's overwritten. I do not want the rest of the columns from the second DataFrame.

Hopefully the example below explains it a little further:

>>> import pandas as pd
>>> animals = [{"animal" : "dog", "name" : "freddy", "food" : ""},{"animal" : "cat", "name" : "dexter", "food" : ""},{"animal" : "dog", "name" : "lou lou", "food" : ""}]
>>> foods = [{"name" : "freddy", "food" : "dog mix", "brand" : "doggys dog"},{"name" : "dexter", "food" : "fussy cat mix", "brand" : "fish fishy"},{"name" : "lou lou", "food" : "bones", "brand" : "i was a cow"}]
>>> a_pd = pd.DataFrame(animals)
>>> a_pd
  animal food     name
0    dog        freddy
1    cat        dexter
2    dog       lou lou
>>> f_pd = pd.DataFrame(foods)
>>> f_pd
         brand           food     name
0   doggys dog        dog mix   freddy
1   fish fishy  fussy cat mix   dexter
2  i was a cow          bones  lou lou
>>>
>>>
>>> animal_data = a_pd.merge(f_pd, on='name', how='left')
>>> animal_data
  animal food_x     name        brand         food_y
0    dog          freddy   doggys dog        dog mix
1    cat          dexter   fish fishy  fussy cat mix
2    dog         lou lou  i was a cow          bones
>>>

I should end up with just food, and I don't want brand. (Also note that this is sample data; the live data has a lot more columns.)

Desired results:

>>> animal_data
  animal        name            food
0    dog      freddy         dog mix
1    cat      dexter   fussy cat mix
2    dog     lou lou           bones

Use:

animal_data = a_pd.merge(f_pd, on='name', how='left', suffixes=('_x','')).drop('food_x', axis=1)

Output:

  animal     name        brand           food
0    dog   freddy   doggys dog        dog mix
1    cat   dexter   fish fishy  fussy cat mix
2    dog  lou lou  i was a cow          bones
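
As a minimal self-contained sketch of what the `suffixes` argument does in the line above (the frames are rebuilt from the question's sample data): an empty right-hand suffix leaves `f_pd`'s `food` column under its plain name, so only the empty left-hand copy, `food_x`, has to be dropped. Note that `brand` is still carried over, as the output above shows.

```python
import pandas as pd

# Sample frames rebuilt from the question.
a_pd = pd.DataFrame({
    "animal": ["dog", "cat", "dog"],
    "name": ["freddy", "dexter", "lou lou"],
    "food": ["", "", ""],
})
f_pd = pd.DataFrame({
    "name": ["freddy", "dexter", "lou lou"],
    "food": ["dog mix", "fussy cat mix", "bones"],
    "brand": ["doggys dog", "fish fishy", "i was a cow"],
})

# Left 'food' becomes 'food_x'; right 'food' keeps its name (empty suffix).
merged = a_pd.merge(f_pd, on="name", how="left", suffixes=("_x", ""))

# Drop the empty left-hand copy; 'brand' is still present at this point.
animal_data = merged.drop(columns="food_x")
print(animal_data)
```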

Or:

a_pd[['animal','name']].merge(f_pd, how='left')

Output:

  animal     name        brand           food
0    dog   freddy   doggys dog        dog mix
1    cat   dexter   fish fishy  fussy cat mix
2    dog  lou lou  i was a cow          bones

You can use update:

a_pd.set_index('name',inplace=True)
a_pd.update(f_pd.set_index('name'))
a_pd
Out[68]: 
        animal           food
name                         
freddy     dog        dog mix
dexter     cat  fussy cat mix
lou lou    dog          bones
a_pd.reset_index()
Out[69]: 
      name animal           food
0   freddy    dog        dog mix
1   dexter    cat  fussy cat mix
2  lou lou    dog          bones
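
A rough sketch of the same idea that avoids `inplace=True` on the original frame (sample data rebuilt from the question; the variable name `result` is only for illustration). `DataFrame.update` aligns on the index and only touches columns that already exist in the calling frame, which is why `brand` never shows up. By default it also skips NaN values in the other frame, so missing foods would not overwrite anything.

```python
import pandas as pd

a_pd = pd.DataFrame({
    "animal": ["dog", "cat", "dog"],
    "name": ["freddy", "dexter", "lou lou"],
    "food": ["", "", ""],
})
f_pd = pd.DataFrame({
    "name": ["freddy", "dexter", "lou lou"],
    "food": ["dog mix", "fussy cat mix", "bones"],
    "brand": ["doggys dog", "fish fishy", "i was a cow"],
})

# set_index returns a new frame, so a_pd itself is left untouched.
result = a_pd.set_index("name")

# update() works in place and returns None; rows are matched on the 'name' index.
result.update(f_pd.set_index("name"))

# Turn 'name' back into an ordinary column.
result = result.reset_index()
print(result)
```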

Or use map:

a_pd.food=a_pd.name.map(f_pd.set_index('name').food)
a_pd
Out[74]: 
  animal           food     name
0    dog        dog mix   freddy
1    cat  fussy cat mix   dexter
2    dog          bones  lou lou
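
One caveat with `map`, sketched below under two assumptions: names in `f_pd` are unique (the lookup Series needs a unique index), and some animals may have no entry in `f_pd`. The extra `rex` row is made up for illustration and is not part of the question's data. Unmatched names come back as NaN, so `fillna` is used here to fall back to the existing value.

```python
import pandas as pd

a_pd = pd.DataFrame({
    "animal": ["dog", "cat", "dog", "dog"],
    "name": ["freddy", "dexter", "lou lou", "rex"],  # 'rex' is a hypothetical unmatched row
    "food": ["", "", "", ""],
})
f_pd = pd.DataFrame({
    "name": ["freddy", "dexter", "lou lou"],
    "food": ["dog mix", "fussy cat mix", "bones"],
    "brand": ["doggys dog", "fish fishy", "i was a cow"],
})

# Series indexed by name, used as a lookup table for food.
lookup = f_pd.set_index("name")["food"]

# Names without an entry in the lookup map to NaN.
mapped = a_pd["name"].map(lookup)

# Keep the original value wherever no match was found.
a_pd["food"] = mapped.fillna(a_pd["food"])
print(a_pd)
```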

I'd either try drop or just select the columns you want to keep:

animal_data.drop(['food_x', 'brand'], axis=1, inplace=True)

or

animal_data = animal_data[['animal', 'name', 'food']]
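
Assuming `animal_data` comes from the plain merge in the question (so it has `food_x` and `food_y` columns), the surviving column is still called `food_y` and needs a rename before it can be selected as `food`. A minimal sketch of both variants with that step added; the `dropped` and `selected` names are only for illustration:

```python
import pandas as pd

a_pd = pd.DataFrame({
    "animal": ["dog", "cat", "dog"],
    "name": ["freddy", "dexter", "lou lou"],
    "food": ["", "", ""],
})
f_pd = pd.DataFrame({
    "name": ["freddy", "dexter", "lou lou"],
    "food": ["dog mix", "fussy cat mix", "bones"],
    "brand": ["doggys dog", "fish fishy", "i was a cow"],
})

# Plain merge from the question: overlapping 'food' becomes food_x / food_y.
animal_data = a_pd.merge(f_pd, on="name", how="left")

# Variant 1: drop the unwanted columns, then rename the populated one.
dropped = (
    animal_data
    .drop(columns=["food_x", "brand"])
    .rename(columns={"food_y": "food"})
)

# Variant 2: rename first, then select only the columns to keep.
selected = animal_data.rename(columns={"food_y": "food"})[["animal", "name", "food"]]
print(selected)
```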

It might be best to merge column subsets of the dataframes that leave out the columns you don't want in the merged dataframe. For example:

a_cols = ['animal', 'name']
f_cols = ['food', 'name']
a_pd[a_cols].merge(f_pd[f_cols], on='name', how='left')

This may be faster and may save some memory when working with extremely large dataframes, since only the relevant columns are carried into the merge.
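
A self-contained version of the column-subset merge, using the question's sample data. The `validate="many_to_one"` argument is an optional extra that is not part of the answer above: it makes pandas raise a `MergeError` if `name` turns out not to be unique in the food frame, which guards against silently duplicated rows.

```python
import pandas as pd

a_pd = pd.DataFrame({
    "animal": ["dog", "cat", "dog"],
    "name": ["freddy", "dexter", "lou lou"],
    "food": ["", "", ""],
})
f_pd = pd.DataFrame({
    "name": ["freddy", "dexter", "lou lou"],
    "food": ["dog mix", "fussy cat mix", "bones"],
    "brand": ["doggys dog", "fish fishy", "i was a cow"],
})

a_cols = ["animal", "name"]   # leave out the empty 'food' column
f_cols = ["food", "name"]     # leave out 'brand' and anything else

# Only the selected columns take part in the merge, so there is no
# food_x / food_y clash and nothing to drop afterwards.
animal_data = a_pd[a_cols].merge(
    f_pd[f_cols], on="name", how="left", validate="many_to_one"
)
print(animal_data)
```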
