Pandas Merge multiple dataframes on index and column

I am trying to merge multiple dataframes into one main dataframe, using the datetime index and id column of the main dataframe and the DateTime and id columns of the other dataframes.

Main dataframe:

DateTime | id | data
(Df.Index)
---------|----|------
2017-9-8 |  1 |  a
2017-9-9 |  2 |  b

df1:

id | data1 | data2 | DateTime
---|-------|-------|---------
1  |  a    |   c   | 2017-9-8
2  |  b    |   d   | 2017-9-9
5  |  a    |   e   | 2017-9-20

df2:

id | data3 | data4 | DateTime
---|-------|-------|---------
1  |  d    |   c   | 2017-9-8
2  |  e    |   a   | 2017-9-9
4  |  f    |   h   | 2017-9-20

The main dataframe and the other dataframes are stored in different dictionaries. I want to read from each dictionary and merge wherever the join condition (datetime, id) is met.

for sleep in dictOfSleep:  # main dataframes
    for sensorDevice in dictOfSensor:  # other dataframes
        try:
            dictOfSleep[sleep] = pd.merge(dictOfSleep[sleep], dictOfSensor[sensorDevice], how='outer', on=['DateTime', 'id'])
        except:
            print('Join could not be done')

Desired output:

DateTime | id | data | data1 | data2 | data3 | data4
(Df.Index)
---------|----|------|-------|-------|-------|------
2017-9-8 |  1 |  a   |  a    |   c   |   d   |   c
2017-9-9 |  2 |  b   |  b    |   d   |   e   |   a

I'm not sure how your dictionaries are set up, so you will most likely need to modify this, but I'd try something like:

for sensorDevice in dictOfSensor:
    df = dictOfSensor[sensorDevice]
    # set df index to match the main_df index; drop id so that join
    # does not fail on the overlapping column name
    df = df.set_index('DateTime').drop(columns='id')
    # use join (not merge) when combining on the index
    main_df = main_df.join(df, how='outer')
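
For reference, here is a self-contained sketch of the index-based join, built from the sample tables in the question. The dictionary keys (dev1, dev2), the result name joined, and the zero-padded date strings are my own assumptions. I also switched to how='left', since that keeps only the rows of the main dataframe, which is what the desired output shows. The id column of each sensor frame is dropped before joining, because join raises an error when both frames contain a column with the same name and no suffix is given.

```python
import pandas as pd

# Main dataframe from the question: DateTime is the index.
main_df = pd.DataFrame(
    {"id": [1, 2], "data": ["a", "b"]},
    index=pd.to_datetime(["2017-09-08", "2017-09-09"]).rename("DateTime"),
)

# Sensor dataframes from the question; the keys "dev1"/"dev2" are hypothetical.
sensor_dates = pd.to_datetime(["2017-09-08", "2017-09-09", "2017-09-20"])
dictOfSensor = {
    "dev1": pd.DataFrame({
        "id": [1, 2, 5],
        "data1": ["a", "b", "a"],
        "data2": ["c", "d", "e"],
        "DateTime": sensor_dates,
    }),
    "dev2": pd.DataFrame({
        "id": [1, 2, 4],
        "data3": ["d", "e", "f"],
        "data4": ["c", "a", "h"],
        "DateTime": sensor_dates,
    }),
}

joined = main_df
for sensorDevice, df in dictOfSensor.items():
    # Index by DateTime and drop id to avoid the overlapping-column error.
    df = df.set_index("DateTime").drop(columns="id")
    # how="left" keeps only the rows already present in the main dataframe.
    joined = joined.join(df, how="left")

print(joined)
```

Note that this matches rows on DateTime alone, so it is only safe when each sensor frame has at most one row per timestamp.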

Alternatively, if the id column is important, you can first reset the index of main_df and then merge on both columns.

# move the DateTime index into a regular column (the index must be named 'DateTime')
main_df = main_df.reset_index()
for sensorDevice in dictOfSensor:
    df = dictOfSensor[sensorDevice]
    # merge on both key columns
    main_df = main_df.merge(df, how='outer', on=['DateTime', 'id'])
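
To see how the choice of how affects the result, here is a small runnable sketch of the two-column merge on the question's sample data. The helper merge_all and the list sensor_frames are hypothetical names I made up for illustration, and the dates are written zero-padded. It compares how='left' with how='outer'.

```python
import pandas as pd

dates = pd.to_datetime(["2017-09-08", "2017-09-09", "2017-09-20"])

# Main dataframe from the question, indexed by DateTime.
main_df = pd.DataFrame(
    {"id": [1, 2], "data": ["a", "b"]},
    index=pd.Index(dates[:2], name="DateTime"),
)

# The two sensor dataframes from the question.
sensor_frames = [
    pd.DataFrame({"id": [1, 2, 5], "data1": ["a", "b", "a"],
                  "data2": ["c", "d", "e"], "DateTime": dates}),
    pd.DataFrame({"id": [1, 2, 4], "data3": ["d", "e", "f"],
                  "data4": ["c", "a", "h"], "DateTime": dates}),
]


def merge_all(main, frames, how):
    """Merge every frame into main on (DateTime, id), then restore the index."""
    merged = main.reset_index()  # DateTime becomes a regular column
    for df in frames:
        merged = merged.merge(df, how=how, on=["DateTime", "id"])
    return merged.set_index("DateTime")


left = merge_all(main_df, sensor_frames, "left")
outer = merge_all(main_df, sensor_frames, "outer")

print(left)
print(len(left), len(outer))
```

With how='left' only the two rows of the main dataframe survive, as in the desired output. With how='outer' the unmatched sensor rows (id 5 and id 4 on 2017-9-20) are kept as well, giving four rows with NaN in the columns that have no match.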

