
MOVING column values based on values from another data frame

I have two dataframes such as the following.

df1:

  Category                   Keywords
0    Fruit            ['apple', 'pear', 'plum', 'grape']
1    Color            ['red', 'purple', 'green']

df2:

              Items
0        plum
1        purple
2        pear
3        orange
4        apple
5        rainbow

Whenever a value in df2 matches an entry in one of the keyword lists of df1, I want to MOVE that value into a new list or dataframe; that is, the value is removed from df2 and placed in df3. The result should look like this:

df2:

              Items
0        orange
1        rainbow

df3:

              Items
0        plum
1        purple
2        pear
3        apple

or a list of items such as [plum, purple, pear, apple]

A similar but not exact question would be: Use keywords from dataframe to detect if any present in another dataframe or string

EDIT: items such as "pears" or "pearl" should still be identified for the keyword "pear"

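# Flatten the keyword lists in df1 into a single list
items_list = df1['Keywords'].tolist()
items_list = [item for sub_list in items_list for item in sub_list]

# Exact matches move to df3; everything else stays in df2
df3 = df2.loc[df2['Items'].isin(items_list)]
df2 = df2.loc[~df2['Items'].isin(items_list)]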
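
For reference, here is a self-contained sketch of the exact-match approach on the sample data from the question. The names mask and moved are introduced only for illustration. Note that isin() compares whole strings, so on its own it would not pick up items such as "pears" or "pearl".

```python
import pandas as pd

df1 = pd.DataFrame({
    "Category": ["Fruit", "Color"],
    "Keywords": [["apple", "pear", "plum", "grape"], ["red", "purple", "green"]],
})
df2 = pd.DataFrame({"Items": ["plum", "purple", "pear", "orange", "apple", "rainbow"]})

# Flatten the per-category keyword lists into one list
keywords = [kw for sub_list in df1["Keywords"] for kw in sub_list]

# Build the mask once so both frames are split from the same original df2
mask = df2["Items"].isin(keywords)
df3 = df2[mask].reset_index(drop=True)   # items that were moved
df2 = df2[~mask].reset_index(drop=True)  # items that remain

# The moved items as a plain Python list
moved = df3["Items"].tolist()
```

Computing the mask before reassigning df2 avoids filtering an already-filtered frame.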

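You can use str.contains() with a regex that joins the keywords with | (alternation), so an item matches when it contains any keyword as a substring. I am also using explode() to flatten the Keywords column, which holds lists, into a single list of keywords.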

import pandas as pd

c = ['Category','Keywords']
d = [['Fruit',['apple', 'pear', 'plum', 'grape']],
     ['Color',['red', 'purple', 'green']]]
df1 = pd.DataFrame(d,columns=c)


df2 = pd.DataFrame({'Items':['plum','purple','pear','orange',
                            'apple','rainbow','pearl','pears',
                            'peary','pineapple','plumber']})

print (df1)
print (df2)

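# A single explode() turns each keyword list into one row per keyword
keywords = df1.Keywords.explode().to_list()
# Non-capturing group, so str.contains() does not warn about match groups
pattern = r'(?:{})'.format('|'.join(keywords))
mask = df2.Items.str.contains(pattern)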

df3 = df2[mask]
df2 = df2[~mask]

print (df2)
print (df3)

This will give you:

Original df1:

  Category                    Keywords
0    Fruit  [apple, pear, plum, grape]
1    Color        [red, purple, green]

Original df2:

        Items
0        plum
1      purple
2        pear
3      orange
4       apple
5     rainbow
6       pearl
7       pears
8       peary
9   pineapple
10    plumber

New df3: contains every item that has one of the keywords as a substring

        Items
0        plum
1      purple
2        pear
4       apple
6       pearl
7       pears
8       peary
9   pineapple
10    plumber

Updated df2:

     Items
3   orange
5  rainbow
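
Substring matching also moves items such as "pineapple" (it contains "apple"). If that is not wanted, one possible variation is to require the keyword to appear at the start of a word. This is only a sketch under that assumption, which the question does not state; the \b anchor and the use of re.escape are additions here, not part of the answer above.

```python
import re
import pandas as pd

keywords = ["apple", "pear", "plum", "grape", "red", "purple", "green"]
df2 = pd.DataFrame({"Items": ["plum", "purple", "pear", "orange", "apple", "rainbow",
                              "pearl", "pears", "peary", "pineapple", "plumber"]})

# \b requires a word boundary before the keyword, so "pineapple" no longer
# matches "apple"; re.escape protects against regex metacharacters in keywords
pattern = r"\b(?:" + "|".join(map(re.escape, keywords)) + r")"
mask = df2["Items"].str.contains(pattern)

df3 = df2[mask]   # items that start a word with some keyword
df2 = df2[~mask]  # everything else
```

With this pattern "pears", "pearl" and "peary" are still moved, and so is "plumber", because it begins with "plum"; excluding such items would need an explicit rule for which suffixes are allowed.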

