
Conditional merge of Pandas multi-index dataframes

I have two dataframes.

The first is a dataframe of customers with an accompanying month within which a shipment must be fulfilled.

The second is a dataframe that contains every combination of customer and date within a horizon. For example, a three-day horizon starting '2020-01-01' with one customer, 'ABC', would look like this:

Date        Customer
2020-01-01  'ABC'
2020-01-02  'ABC'
2020-01-03  'ABC'

I am trying to join the two dataframes below so that I get customer/date combinations in which each date falls WITHIN that customer's delivery month.

df_a.head(5)

>>> month,    client
    2020-01   'ABC'
              'DEF'
    2020-02   'GHI'
              'JKL'
              'MNO'
    2020-03   'PQR'


df_b.head(5)

>>> dates           client
    '2020-01-01'    'ABC'
    '2020-01-01'    'DEF'
    '2020-01-02'    'ABC'
    '2020-01-02'    'DEF'
    '2020-01-03'    'ABC'
    '2020-01-03'    'DEF'

Desired output:

df_joined.head(5)

customer     dates
'ABC'        2020-01-01
'ABC'        2020-01-02
'ABC'        2020-01-03
'DEF'        2020-01-01
'DEF'        2020-01-02
'DEF'        2020-01-03
'GHI'        2020-02-01
'GHI'        2020-02-02
'GHI'        2020-02-03
'JKL'        2020-02-01
'JKL'        2020-02-02
'JKL'        2020-02-03

I have attempted to accomplish this with merge and query, i.e.:

ship_dates = df1.merge(df2, left_on='dates', right_on='client')\
                .query('dates >= month')\
                .set_index(['customer','dates'])

but I am receiving a KeyError for dates.

All help greatly appreciated!

Managed to find a solution.

I created a month:year column in each dataframe:

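import pandas as pd

# The key on df1 is named mnth_yr so that it matches the name used in .query() below
df1['mnth_yr'] = pd.to_datetime(df1['dates']).dt.strftime('%B-%Y')
df2['month_year'] = pd.to_datetime(df2['month']).dt.strftime('%B-%Y')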

then merged the two on the customer column and used .query() to keep only the rows where mnth_yr equals month_year:

dates = df1.merge(df2, how='inner', left_on='customers', 
                             right_on='customer')\
           .query('mnth_yr == month_year')\
           .set_index(['customer', 'dates'])
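
For anyone who wants to try the idea end to end, here is a minimal, self-contained sketch. It is a variation on the approach above rather than the exact code: it merges on the client and a month key together, so no separate .query() step is needed. The sample data, the four-day horizon, and the assumption that month is stored as a 'YYYY-MM' string are all made up for illustration and are modelled on df_a and df_b from the question.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's df_a:
# each client has one delivery month, stored as a 'YYYY-MM' string (assumption).
df_a = pd.DataFrame({
    "month": ["2020-01", "2020-01", "2020-02", "2020-02"],
    "client": ["ABC", "DEF", "GHI", "JKL"],
})

# Hypothetical df_b: every date in a short horizon paired with every client.
# The horizon straddles a month boundary so the filter has something to do.
horizon = pd.date_range("2020-01-30", "2020-02-02", freq="D")
df_b = pd.MultiIndex.from_product(
    [horizon, df_a["client"]], names=["dates", "client"]
).to_frame(index=False)

# Derive a month key on df_b in the same 'YYYY-MM' format as df_a['month'].
df_b["month"] = df_b["dates"].dt.strftime("%Y-%m")

# Merging on client AND month keeps only the dates inside each client's month.
df_joined = (
    df_b.merge(df_a, on=["client", "month"], how="inner")
        .sort_values(["client", "dates"])
        .set_index(["client", "dates"])
)

print(df_joined)
```

Merging on both keys avoids building the full client-by-client product first and filtering it afterwards, which may matter if the horizon or the customer list is large.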

