简体   繁体   中英

Join pandas data frames based on columns and column of lists

I'm trying to join two data frames based on multiple columns. However, one of the conditions is not straight forward, because one column in one data frame exists in column of lists in the other data frame. As following

df_a:

在此处输入图像描述

df_b:

在此处输入图像描述

I want to join both data frames based on node,channel,value. However, value is inside a list in column trail

I have tried to create a dictionary from df_a to have the node and channel as keys, but that won't be unique, I added the value as a key but then I couldn't look that value up. Any tip to solve this issue is much appreciated

Update per comment by @JonClements if not always the first element try:

(df_b.assign(value=df_b['trail'].str.split(','))
     .explode('value')
     .merge(df_a, on=['node', 'channel', 'value']))

Try, if value is always the first element in trail:

import pandas as pd

df_a = pd.DataFrame({'node':['a']*4, 
                     'channel':[1,2,1,2],   
                     'wavelength':[555, 666, 555, 666], 
                     'value':['1-111-5', '1-222-6', '1-111-7', '1-222-8']})

df_b = pd.DataFrame({'node':['a']*4,
                     'channel':[1,2,1,2],
                     'trail':['1-111-5,1-a-5,2-r-3', 
                             '1-222-6,2-4-5,1-d-3', 
                             '1-111-7,1-a-2,1-f-t', 
                             '1-222-8,1-e-4'],
                    'power':[10,12,13,12]})

df_out = (df_b.assign(value=df_b['trail'].str.split(',').str[0])
              .merge(df_a, on=['node', 'channel', 'value']))
print(df_out)

Output:

  node  channel                trail  power    value  wavelength
0    a        1  1-111-5,1-a-5,2-r-3     10  1-111-5         555
1    a        2  1-222-6,2-4-5,1-d-3     12  1-222-6         666
2    a        1  1-111-7,1-a-2,1-f-t     13  1-111-7         555
3    a        2        1-222-8,1-e-4     12  1-222-8         666

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM