
Merge three different dataframes in Python

I want to merge three data frames in Python, but the code I have now gives me wrong output.

This is the first data frame:

 df_1
   Year    Month        X_1    Y_1
 0  2021    January     $90     $100
 1  2021    February    NaN     $120
 2  2021    March       $100    $130
 3  2021    April       $110    $140
 4  2021    May         NaN     $150
 5  2019    June        $120    $160
   

This is the second data frame:

  df_2
    Year    Month       X_2     Y_2
  0 2021    January     NaN     $120
  1 2021    February    NaN     $130
  2 2021    March       $80     $140
  3 2021    April       $90     $150
  4 2021    May         NaN     $150
  5 2021    June        $120    $170

This is the third data frame:

   df_3
     Year    Month      X_3     Y_3
  0 2021    January     $110    $150
  1 2021    February    $140    $160
  2 2021    March       $97     $170
  3 2021    April       $90     $180
  4 2021    May         NaN     $190
  5 2021    June        $120    $200

The idea is to combine them into one data frame like this:

 df_combined
    Year    Month       X_1    Y_1    X_2    Y_2    X_3    Y_3
0   2019    January     $90     $100  NaN    $120   $110    $150 
1   2019    February    NaN     $120  NaN    $130   $140    $160
2   2019    March       $100    $130  $80    $140   $97     $170
3   2019    April       $110    $140  $90    $150   $90     $180
4   2019    May         NaN     $150  NaN    $150   NaN     $190
5   2019    June        $120    $160  $120   $170   $120    $200

The code I have so far does not give me the correct result; only the df_3 columns have the correct numbers.

 # compile the list of data frames you want to merge
 import pandas as pd
 from functools import reduce

 data_frames = [df_1, df_2, df_3]
 df_merged = reduce(lambda cross, right: pd.merge(cross, right, on=['Year'], how='outer'),
                    data_frames)
 # remove superfluous columns
 df_merged.drop(['Month_x', 'Month_y'], axis=1, inplace=True)

You can try:

df_1.merge(df_2, how='left', on=['Year', 'Month']).merge(df_3, how='left', on=['Year', 'Month'])
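
To see why the join keys matter, here is a small sketch that uses shortened three-month copies of the question's frames (the shortening is my assumption, made only to keep the example brief). Merging on `Year` alone pairs every row with every other row of the same year, whereas merging on both `Year` and `Month` keeps one row per month.

```python
import pandas as pd

months = ["January", "February", "March"]

# Shortened copies of the question's frames (first three months only).
df_1 = pd.DataFrame({"Year": [2021] * 3, "Month": months,
                     "X_1": ["$90", None, "$100"],
                     "Y_1": ["$100", "$120", "$130"]})
df_2 = pd.DataFrame({"Year": [2021] * 3, "Month": months,
                     "X_2": [None, None, "$80"],
                     "Y_2": ["$120", "$130", "$140"]})
df_3 = pd.DataFrame({"Year": [2021] * 3, "Month": months,
                     "X_3": ["$110", "$140", "$97"],
                     "Y_3": ["$150", "$160", "$170"]})

# Joining on Year only: each of the 3 rows matches all 3 rows of the same year.
by_year = df_1.merge(df_2, on="Year", how="outer")
print(by_year.shape)   # (9, 7), with Month_x and Month_y columns

# Joining on both keys: one row per (Year, Month) pair.
keys = ["Year", "Month"]
combined = df_1.merge(df_2, how="left", on=keys).merge(df_3, how="left", on=keys)
print(combined.shape)  # (3, 8)
```

Because `how='left'` is used, only the `(Year, Month)` pairs present in `df_1` appear in the result.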

One option, of probably many, is:

from functools import reduce
import pandas as pd

dataframes = [df_1, df_2, df_3]  # the three frames from the question
idx = ["Year", "Month"]
new_df = reduce(pd.DataFrame.join, (i.set_index(idx) for i in dataframes)).reset_index()

or

reduce(lambda x, y: pd.merge(x, y, how="outer", on=["Year", "Month"]), dataframes)
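
The two variants are not quite equivalent: `DataFrame.join` defaults to a left join, while the `pd.merge` call asks for an outer join. The sketch below uses two-row stand-ins for the question's frames (my assumption, keeping the 2019 `June` row that appears in `df_1`) to show the difference. The `validate="one_to_one"` argument is an optional addition of mine, not part of the answer above; it makes pandas raise an error if a `(Year, Month)` pair is duplicated in either frame.

```python
from functools import reduce

import pandas as pd

keys = ["Year", "Month"]

# Two-row stand-ins; df_1's June row is dated 2019, as in the question.
df_1 = pd.DataFrame({"Year": [2021, 2019], "Month": ["May", "June"],
                     "X_1": [None, "$120"], "Y_1": ["$150", "$160"]})
df_2 = pd.DataFrame({"Year": [2021, 2021], "Month": ["May", "June"],
                     "X_2": [None, "$120"], "Y_2": ["$150", "$170"]})
df_3 = pd.DataFrame({"Year": [2021, 2021], "Month": ["May", "June"],
                     "X_3": [None, "$120"], "Y_3": ["$190", "$200"]})
dataframes = [df_1, df_2, df_3]

# Left join on the index: only key pairs present in df_1 survive.
joined = reduce(pd.DataFrame.join,
                (df.set_index(keys) for df in dataframes)).reset_index()

# Outer merge: every key pair from any frame is kept.
merged = reduce(
    lambda left, right: pd.merge(left, right, how="outer", on=keys,
                                 validate="one_to_one"),
    dataframes,
)

print(len(joined), len(merged))  # 2 3
```

With the left join, the 2019 `June` row has no partner in `df_2` or `df_3`, so its `X_2`, `Y_2`, `X_3` and `Y_3` values are missing. With the outer merge, the 2019 and 2021 `June` rows both appear as separate rows.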

