
Combine Three DataFrames Using Pandas in Python

I am trying to combine three pandas DataFrames in Python. Below are the three DataFrames and my desired output (where NaN is null). I know that joining the tables with a left merge did not work. What is the correct sequence of two merges to achieve the desired output? (It does not have to be one line of code if that is not possible.) Thanks!

df1
    +--------+
    | x      |
    +--------+
    | 1      |
    | 2      |
    | 3      |
    +--------+

df2
    +--------+---+
    | x      | b |
    +--------+---+
    | 1      | A |
    | 1      | B |
    | 1      | C |
    | 2      | D |
    | 2      | E |
    | 2      | F |
    | 3      | G |
    +--------+---+

df3
    +--------+---+
    | x      | c |
    +--------+---+
    | 1      | L |
    | 1      | M |
    | 2      | N |
    | 3      | O |
    | 3      | P |
    | 3      | Q |
    +--------+---+

df_result
    +----------------+-----+-----+
    | x              |  b  |  c  |
    +----------------+-----+-----+
    | 1              | A   | NaN |
    | 1              | B   | NaN |
    | 1              | C   | NaN |
    | 1              | NaN | L   |
    | 1              | NaN | M   |
    | 2              | D   | NaN |
    | 2              | E   | NaN |
    | 2              | F   | NaN |
    | 2              | NaN | N   |
    | 3              | G   | NaN |
    | 3              | NaN | O   |
    | 3              | NaN | P   |
    | 3              | NaN | Q   |
    +----------------+-----+-----+

The following attempts do not produce the df_result DataFrame shown above:

attempt1:
df_step1 = df1.merge(df2, on='x', how='left')
df_result = df_step1.merge(df3, on='x', how='left')
df_result

I have tried the above with various combinations of left, right, outer, and inner joins/merges.
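One way to see why a chain of merges falls short of the 13 desired rows: a merge pairs every matching row on the left with every matching row on the right, so each value of x contributes (rows in df2) times (rows in df3) rows rather than their sum. A minimal sketch using the frames from the question (the name `merged` is only for illustration):

```python
import pandas as pd

df1 = pd.DataFrame({'x': [1, 2, 3]})
df2 = pd.DataFrame({'x': [1, 1, 1, 2, 2, 2, 3],
                    'b': ['A', 'B', 'C', 'D', 'E', 'F', 'G']})
df3 = pd.DataFrame({'x': [1, 1, 2, 3, 3, 3],
                    'c': ['L', 'M', 'N', 'O', 'P', 'Q']})

# Chain two left merges on the shared key, as in attempt1
merged = df1.merge(df2, on='x', how='left').merge(df3, on='x', how='left')

# Rows per key: 3*2 for x=1, 3*1 for x=2, 1*3 for x=3
print(len(merged))                # 12, not the 13 rows wanted
print(merged.isna().sum().sum())  # 0: every row has both b and c filled
```

Because every row ends up with both b and c populated, the layout in df_result, where each row carries only one of the two, is closer to stacking rows than to joining them.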

attempt2:
df_result = pd.concat([df1, df2, df3], axis=1, sort=False)
df_result

This also does not produce the desired df_result DataFrame.

Perhaps I need a combination of a concat and a merge? Or, because each entry basically becomes a new row, I could just write a for loop that adds each of these entries as a new row in the result. Something like this:

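rows = []
for x in df1['x']:
    # collect the df2 rows for this x, then the df3 rows
    rows.extend(df2[df2['x'] == x].to_dict('records'))
    rows.extend(df3[df3['x'] == x].to_dict('records'))

# keys missing from a row (b or c) become NaN
df_result = pd.DataFrame(rows, columns=['x', 'b', 'c'])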

I found that concatenating the second and third DataFrames and then sorting by the x column produces a DataFrame that matches the expected output defined above in df_result:

import pandas as pd

df1 = pd.DataFrame({'x': [1,2,3]})
df2 = pd.DataFrame({'x': [1,1,1,2,2,2,3],
                    'b': ['A', 'B', 'C', 'D', 'E', 'F', 'G']})
df3 = pd.DataFrame({'x': [1,1,2,3,3,3],
                    'c': ['L', 'M', 'N', 'O', 'P', 'Q']})


pd.concat([df2, df3], sort=False).sort_values('x').set_index('x', drop=True)

    b   c
x       
1   A   NaN
1   B   NaN
1   C   NaN
1   NaN L
1   NaN M
2   D   NaN
2   E   NaN
2   F   NaN
2   NaN N
3   G   NaN
3   NaN O
3   NaN P
3   NaN Q
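If x should stay an ordinary column, as in df_result, and the original row order within each x matters, a variation on the concat approach might look like the sketch below. Two things here are assumptions rather than part of the answer: the explicit `kind='mergesort'` (chosen because it is a stable sort, so ties on x keep their concat order) and the `isin` filter, which is an optional extra that drops any x values absent from df1.

```python
import pandas as pd

df1 = pd.DataFrame({'x': [1, 2, 3]})
df2 = pd.DataFrame({'x': [1, 1, 1, 2, 2, 2, 3],
                    'b': ['A', 'B', 'C', 'D', 'E', 'F', 'G']})
df3 = pd.DataFrame({'x': [1, 1, 2, 3, 3, 3],
                    'c': ['L', 'M', 'N', 'O', 'P', 'Q']})

# Stack df2 on top of df3; columns missing from one frame are filled with NaN
combined = pd.concat([df2, df3], ignore_index=True)

# Optional: keep only the x values that appear in df1
combined = combined[combined['x'].isin(df1['x'])]

# Stable sort so df2 rows stay ahead of df3 rows within each x
df_result = combined.sort_values('x', kind='mergesort').reset_index(drop=True)
print(df_result)
```

With the question's data the filter removes nothing, since every x in df2 and df3 also occurs in df1.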

Is this what you need?

import pandas as pd
df2 = pd.DataFrame(data=[(1, 'A'),
 (1, 'B'),
 (1, 'C'),
 (2, 'D'),
 (2, 'E'),
 (3, 'F'),
 (3, 'G')], columns = ("x","b"))

df3 = pd.DataFrame(data=[(1, 'L'),
 (1, 'M'),
 (2, 'N'),
 (3, 'L'),
 (3, 'O'),
 (3, 'P'),
 (3, 'Q')], columns= ("x","c"))

df2["c"] = float('nan')
df3["b"] = float('nan')

df_result=pd.concat((df2,df3), sort=True)[["x","b","c"]]
df_result.sort_values("x")

I got:

   x     b     c
0  1    A  NaN
1  1    B  NaN
2  1    C  NaN
0  1  NaN    L
1  1  NaN    M
3  2    D  NaN
4  2    E  NaN
2  2  NaN    N
5  3    F  NaN
6  3    G  NaN
3  3  NaN    L
4  3  NaN    O
5  3  NaN    P
6  3  NaN    Q
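Adding the empty b and c columns by hand is probably not required: `pd.concat` aligns frames on column names and fills any column a frame lacks with NaN. A small sketch with made-up frames (`left` and `right` are illustrative names, not from the post):

```python
import pandas as pd

left = pd.DataFrame({'x': [1, 2], 'b': ['A', 'D']})
right = pd.DataFrame({'x': [1], 'c': ['L']})

# No placeholder columns are created beforehand; concat aligns by name
combined = pd.concat([left, right], ignore_index=True)

print(combined.columns.tolist())  # ['x', 'b', 'c']
print(combined)
```

The rows from `left` have NaN in c and the row from `right` has NaN in b, which is the same pattern the explicit `float('nan')` assignments produce.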
