
Pandas Merging on two columns

I am new to pandas and want to merge two tables on two columns. A row can only be identified by both columns in combination.

Example:
Table 1                Table 2
Index A B C D          Index A B C
1     a a c d          1     a b j
2     a b e f          2     a c k
3     a c g h


Result:

Table
Index A B C D E
1     a a c d NaN
2     a b e f j
3     a c g h k


I tried something like:

df_new = df_1.merge(df_2, on=['A', 'B'])

But I got the error "B is not unique".

(In the real case, the tables contain every value in A and B multiple times, but the combination is unique.)

Many thanks in advance.

Take the columns you wish to work with first, and then use this code as an example.

a_dataframe["AB"] = a_dataframe["A"] + a_dataframe["B"]

Then add the rest of the columns. There could be a simpler solution.
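A minimal sketch of this combined-key idea, reusing the example data from the question. The separator "|" and the way column C is carried over as E are assumptions made here for illustration. The separator keeps pairs such as "ab" + "c" and "a" + "bc" from producing the same key, so it must not occur in the data itself.

```python
import pandas as pd

df_1 = pd.DataFrame({"A": ["a", "a", "a"],
                     "B": ["a", "b", "c"],
                     "C": ["c", "e", "g"],
                     "D": ["d", "f", "h"]})
df_2 = pd.DataFrame({"A": ["a", "a"],
                     "B": ["b", "c"],
                     "C": ["j", "k"]})

# Build one combined key per table (assumes A and B hold strings).
df_1["AB"] = df_1["A"] + "|" + df_1["B"]
df_2["AB"] = df_2["A"] + "|" + df_2["B"]

# Bring over only the key and the value column from the second table,
# renamed to E so it does not clash with the existing column C.
right = df_2[["AB", "C"]].rename(columns={"C": "E"})

# Left merge keeps every row of df_1; the helper key is dropped afterwards.
df_new = df_1.merge(right, on="AB", how="left").drop(columns="AB")
print(df_new)
```

Merging directly with on=["A", "B"] avoids the helper column altogether, so this is mainly useful when a single key column is needed for other reasons.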

In my case it works:

import pandas as pd

df1 = pd.DataFrame({"A":["a","a","a"], 
                    "B":["a", "b", "c"], 
                    "C":["c", "e", "g"],
                    "D":["d", "f", "h"]})

df2 = pd.DataFrame({"A":["a", "a"], 
                    "B":["b", "c"], 
                    "C":["j", "k"]})

Output:

pd.merge(df1, df2, on=["A", "B"], how="left").rename(columns={"C_x":"C", "C_y":"E"})

    A   B   C   D   E
0   a   a   c   d   NaN
1   a   b   e   f   j
2   a   c   g   h   k
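A possible variation on the same call, sketched under the assumption that the (A, B) combination really is unique in both tables: renaming the clashing column before the merge avoids the C_x/C_y suffixes, and validate="one_to_one" makes pandas raise a MergeError if a key pair turns out to be duplicated. The name result is chosen here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["a", "a", "a"],
                    "B": ["a", "b", "c"],
                    "C": ["c", "e", "g"],
                    "D": ["d", "f", "h"]})
df2 = pd.DataFrame({"A": ["a", "a"],
                    "B": ["b", "c"],
                    "C": ["j", "k"]})

# Quick check that no (A, B) pair is repeated in either table.
print(df1.duplicated(subset=["A", "B"]).any())
print(df2.duplicated(subset=["A", "B"]).any())

# Rename the clashing column up front so no _x/_y suffixes are produced.
result = pd.merge(
    df1,
    df2.rename(columns={"C": "E"}),
    on=["A", "B"],
    how="left",
    validate="one_to_one",  # raises pandas.errors.MergeError on duplicate (A, B) pairs
)
print(result)
```

If the validation fails on real data, the duplicated() check above shows which table holds the repeated pairs.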
