Merging two dataframes with the same rows and indexes in pandas

I'm trying to merge two pandas dataframes that share the same row index and the same values in columns 0 and 1, but differ in their last column, so that the resulting dataframe has the value columns from both:

First dataframe:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 817 entries, 0 to 816
Data columns (total 3 columns):
0    817 non-null int64
1    817 non-null int64
2    817 non-null float64
dtypes: float64(1), int64(2)
memory usage: 19.2 KB


      0  1          2
0  1950  1  -0.060310
1  1950  2   0.626810
2  1950  3  -0.008128
3  1950  4   0.555100
4  1950  5   0.071577

Second dataframe:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 817 entries, 0 to 816
Data columns (total 3 columns):
0    817 non-null int64
1    817 non-null int64
2    817 non-null float64
dtypes: float64(1), int64(2)
memory usage: 19.2 KB

      0  1      2
0  1950  1   0.92
1  1950  2   0.40
2  1950  3  -0.36
3  1950  4   0.73
4  1950  5  -0.59

So far I have tried merge:

pd.merge(df, df2, left_index=True, right_index=True, how='outer')

But the result is not what I expect:

    0_x  1_x        2_x   0_y  1_y    2_y
0  1950    1  -0.060310  1950    1   0.92
1  1950    2   0.626810  1950    2   0.40
2  1950    3  -0.008128  1950    3  -0.36
3  1950    4   0.555100  1950    4   0.73
4  1950    5   0.071577  1950    5  -0.59

And with concat:

pd.concat([df, df2], axis=1, ignore_index=True).head()


      0  1          2     3  4      5
0  1950  1  -0.060310  1950  1   0.92
1  1950  2   0.626810  1950  2   0.40
2  1950  3  -0.008128  1950  3  -0.36
3  1950  4   0.555100  1950  4   0.73
4  1950  5   0.071577  1950  5  -0.59

I'm expecting something like:

      0  1          2      3
0  1950  1  -0.060310   0.92
1  1950  2   0.626810   0.40
2  1950  3  -0.008128  -0.36
3  1950  4   0.555100   0.73
4  1950  5   0.071577  -0.59

EDIT: Maybe I was unclear, and I apologize if so. I'm trying to add the last column from the second dataset to the result, so that I end up with the same year and month columns, followed by value1 and value2.

I would try this if the column labels are strings:

pd.merge(df, df2, on=['0', '1'])

or, if the labels are integers:

pd.merge(df, df2, on=[0, 1])
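
A minimal sketch of merging on the two key columns, using the five sample rows from the question. It assumes the column labels are the integers 0, 1 and 2 (as the info() output suggests); the names df and df2 follow the question, and relabelling the result as 0 to 3 is only there to match the layout the question asks for.

```python
import pandas as pd

# Sample rows copied from the question; integer column labels are an assumption.
df = pd.DataFrame({0: [1950] * 5, 1: [1, 2, 3, 4, 5],
                   2: [-0.060310, 0.626810, -0.008128, 0.555100, 0.071577]})
df2 = pd.DataFrame({0: [1950] * 5, 1: [1, 2, 3, 4, 5],
                    2: [0.92, 0.40, -0.36, 0.73, -0.59]})

# Join on year (0) and month (1); the clashing value column 2 gets
# the default _x/_y suffixes because it appears in both frames.
merged = pd.merge(df, df2, on=[0, 1])

# Relabel the columns 0..3 to match the layout asked for in the question.
merged.columns = range(merged.shape[1])
print(merged)
```

The key columns appear only once in the result because they are the join keys; only the non-key column is duplicated.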

Just do:

df.merge(df2, on=[0, 1])

You don't need to join on the index as well, since both frames already share it, and the default inner join is sufficient here.

Your mistake was merging on the index alone: merge then has no way of knowing that columns 0 and 1 hold the same values in both frames, so it keeps both copies and adds the _x and _y suffixes.
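
To illustrate why the choice of key columns matters once the data spans more than one year, here is a small sketch. The frames left and right and all their values are invented for illustration and are not taken from the question. Joining on the month column alone pairs each month with the same month of every year, while joining on both year and month keeps one row per period.

```python
import pandas as pd

# Invented illustration data: two years, two months each.
left = pd.DataFrame({0: [1950, 1950, 1951, 1951], 1: [1, 2, 1, 2],
                     2: [0.1, 0.2, 0.3, 0.4]})
right = pd.DataFrame({0: [1950, 1950, 1951, 1951], 1: [1, 2, 1, 2],
                      2: [1.0, 2.0, 3.0, 4.0]})

by_month = left.merge(right, on=1)         # month only: rows multiply
by_period = left.merge(right, on=[0, 1])   # year and month: one row per period

print(len(by_month), len(by_period))  # 8 4
```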
