
Pandas merge doesn't retain as many rows as I would think

Consider the following two data frames:

import pandas as pd

df1 = pd.DataFrame({'a': ['foo', 'bar'], 'b': [1, 2]})
df2 = pd.DataFrame({'a': ['foo', 'baz'], 'c': [3, 4]})

Running

df3 = pd.merge(df1, df2, on='a')

yields

     a  b  c
0  foo  1  3

But why not the following?

     a  b  c
0  foo  1  3
1  bar  2  -
2  baz  -  4

What do I need to tell Python to make it output the unmatched rows as well?

If you are familiar with database joins, a pandas merge performs an inner join by default. That means it returns only the rows whose key has a matching entry in both the left and the right data frame. For you, that is just 'foo'.
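One way to check this, as a minimal sketch using the two frames from the question (the variable names default_merge and inner_merge are only for illustration): the default call and an explicit how='inner' should give the same single row.

```python
import pandas as pd

df1 = pd.DataFrame({'a': ['foo', 'bar'], 'b': [1, 2]})
df2 = pd.DataFrame({'a': ['foo', 'baz'], 'c': [3, 4]})

# Merge without specifying `how`, then with an explicit inner join.
default_merge = pd.merge(df1, df2, on='a')
inner_merge = pd.merge(df1, df2, on='a', how='inner')

# Both frames hold only the 'foo' row, the one key present in df1 and df2.
print(default_merge.equals(inner_merge))
print(default_merge)
```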

You can change that by setting the how argument. If you want all rows from both the left and the right frame, set it to 'outer'. If you want to keep all rows from the left frame, set it to 'left', and if you want to keep all rows from the right frame, set it to 'right'.

pd.merge(df1, df2, on='a', how='outer') will join on matching keys, with every non-matching key returned as a new row and NaN filling in the blanks.
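As a rough sketch of how these options differ on the frames from the question, the loop below runs each variant and prints the result. The results dictionary is just an illustrative name. Passing indicator=True adds a _merge column that marks each row as 'both', 'left_only', or 'right_only', which makes it easier to see where the NaN values come from.

```python
import pandas as pd

df1 = pd.DataFrame({'a': ['foo', 'bar'], 'b': [1, 2]})
df2 = pd.DataFrame({'a': ['foo', 'baz'], 'c': [3, 4]})

results = {}
for how in ['left', 'right', 'outer']:
    # indicator=True records which frame(s) each output row came from.
    results[how] = pd.merge(df1, df2, on='a', how=how, indicator=True)
    print(f"how={how!r}")
    print(results[how], end="\n\n")
```

With how='outer' the result has three rows, one per distinct key. Note that the columns containing NaN are converted to float, so 1 is shown as 1.0.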

See here for an overview of the different types of SQL-style joins that merge uses as its basis.
