How to combine a regular DataFrame column with a set column in pandas?

I have a dataframe df:

item        Space   rem_spc     nxt_item
Pineapple   0.5     0.5         {Mango, Grape}

I need to combine df['item'] and df['nxt_item'] into a single column df['com_item'], as shown below:

item        Space   rem_spc     nxt_item        com_item
Pineapple   0.5     0.5         {Mango, Grape}  Pineapple,Mango,Grape

Thanks!

Use Series.str.join to join the values of the set column, then prepend the item values:

df['com_item'] = df['item'] + ',' + df['nxt_item'].str.join(',')
print (df)
        item  Space  rem_spc        nxt_item               com_item
0  Pineapple    0.5      0.5  {Grape, Mango}  Pineapple,Grape,Mango
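
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt by hand from the question's sample row, on the assumption that nxt_item holds real Python sets rather than text. Set iteration order is not guaranteed, so Grape and Mango may come out in either order.

```python
import pandas as pd

# Rebuild the sample row from the question.
# Assumption: nxt_item holds a real Python set, not a string.
df = pd.DataFrame({
    "item": ["Pineapple"],
    "Space": [0.5],
    "rem_spc": [0.5],
    "nxt_item": [{"Mango", "Grape"}],
})

# str.join joins the members of each set with the separator;
# the item value is then prepended with ordinary string concatenation.
df["com_item"] = df["item"] + "," + df["nxt_item"].str.join(",")
print(df["com_item"].iloc[0])
```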

Or use Series.str.cat:

df['com_item'] = df['item'].str.cat(df['nxt_item'].str.join(','), sep=',')

If com_item should contain only unique values, add the item value to the set first and then join the result in a list comprehension:

df['com_item'] = [','.join(b.union({a})) for a, b in zip(df['item'],df['nxt_item'])]

Sample data to show the difference between the solutions:

print (df)
        item  Space  rem_spc           nxt_item
0  Pineapple    0.5      0.5  {'Mango','Grape'}
1      Mango    0.5      0.5  {'Mango','Grape'}

import ast

# nxt_item is stored as text in this sample, so parse each string into a real set first
df['nxt_item'] = df['nxt_item'].apply(ast.literal_eval)

df['com_item1'] = [','.join(b.union({a})) for a, b in zip(df['item'],df['nxt_item'])]
df['com_item2'] = df['item'] + ',' + df['nxt_item'].str.join(',')

print (df)
        item  Space  rem_spc        nxt_item              com_item1  \
0  Pineapple    0.5      0.5  {Grape, Mango}  Grape,Mango,Pineapple   
1      Mango    0.5      0.5  {Grape, Mango}            Grape,Mango   

               com_item2  
0  Pineapple,Grape,Mango  
1      Mango,Grape,Mango  
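
Because sets are unordered, the order of the joined names can change from run to run, and the union version does not keep item in first place. If a reproducible order matters, one option (an addition here, not part of the answer above) is to put item first and sort the remaining members. The helper name combine_items is hypothetical.

```python
import pandas as pd

# Two-row sample in the spirit of the one above; nxt_item is assumed to hold sets.
df = pd.DataFrame({
    "item": ["Pineapple", "Mango"],
    "nxt_item": [{"Mango", "Grape"}, {"Mango", "Grape"}],
})

def combine_items(item, nxt_items):
    # Hypothetical helper: item first, then the other members in sorted order.
    # Removing item from the set keeps it from being repeated.
    rest = sorted(nxt_items - {item})
    return ",".join([item] + rest)

df["com_item"] = [combine_items(a, b) for a, b in zip(df["item"], df["nxt_item"])]
print(df["com_item"].tolist())
# ['Pineapple,Grape,Mango', 'Mango,Grape']
```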

If you want com_item as a list:

df['com_item'] = df.apply(lambda row: list(row['nxt_item']) + [row['item']], axis=1)

Output:

        item  Space  rem_spc        nxt_item                   com_item
0  pineapple    0.5      0.5  {Grape, Mango}  [Grape, Mango, pineapple]

If you want it as a string:

df['com_item'] = df.apply(lambda row: ' '.join(list(row['nxt_item']) + [row['item']]), axis=1)

Output:

        item  Space  rem_spc        nxt_item               com_item
0  pineapple    0.5      0.5  {Grape, Mango}  Grape Mango pineapple
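
The question asks for item first and a comma separator. The same apply call can be rearranged to give that. This is a sketch under the assumption that nxt_item holds sets; the order of the set members is still arbitrary.

```python
import pandas as pd

# Assumption: nxt_item holds a real Python set.
df = pd.DataFrame({"item": ["pineapple"], "nxt_item": [{"Mango", "Grape"}]})

# Put item first, then the set members, and join everything with commas.
df["com_item"] = df.apply(
    lambda row: ",".join([row["item"]] + list(row["nxt_item"])), axis=1
)
print(df["com_item"].iloc[0])
```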

With pandas.Series.str.strip (this assumes nxt_item is stored as text such as "{Mango, Grape}" rather than as Python sets):

df["com_item"] = df["item"] + "," + df["nxt_item"].astype(str).str.strip("{}")

Output:

print(df)

        item  Space  rem_spc        nxt_item                 com_item
0  Pineapple    0.5      0.5  {Mango, Grape}  Pineapple, Mango, Grape
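
Whether stripping the braces gives clean names depends on how nxt_item is stored. The sketch below compares the two cases on small hand-made Series (the names as_text and as_set are illustrative only). A one-element set is used in the second case so the result does not depend on set ordering.

```python
import pandas as pd

# Case 1: the column holds plain text (assumed), e.g. after reading a CSV.
as_text = pd.Series(["{Mango, Grape}"])
print(as_text.str.strip("{}").iloc[0])   # Mango, Grape

# Case 2: the column holds real sets. astype(str) produces the set's repr,
# so the quotes around each name survive the strip.
as_set = pd.Series([{"Mango"}])
print(as_set.astype(str).str.strip("{}").iloc[0])   # 'Mango'
```

With real sets, the str.join approach avoids the leftover quotes.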
