Update cell in dataframe1 based on value from list in dataframe2

I have a dataframe1 with a column whose cells each hold a set of comma-separated words, like so: "are, boy, cat, dog, ear, far, gone".

Dataframe2 has a column whose cells each hold a single letter or word, like so: "are" or "boy" or "gone".

I want to add a column to dataframe1 holding a boolean that says whether any word in that row's cell appears in dataframe2. For example,

DF1 = (are, boy, cat, dog, ear, far, gone), (home, guy, tall, egg), (cat, done, roof, grass), etc.....

DF2 = (are), (boy), (gone), etc....

New column cell value in dataframe1 = (1), (0), (1), etc....

Assuming these are your inputs,

import pandas as pd

df1 = pd.DataFrame(
    {'A': [
        'are, boy, cat, dog, ear, far, gone',
        'home, guy, tall, egg',
        'cat, done, roof, grass',
    ]}
)

df2 = pd.DataFrame({'A': ['are', 'boy', 'gone']})

print('%s\n%s' % (df1, df2))
#                                     A
# 0  are, boy, cat, dog, ear, far, gone
# 1                home, guy, tall, egg
# 2              cat, done, roof, grass
#       A
# 0   are
# 1   boy
# 2  gone

We can use Series.str.split() to convert the comma-separated strings to lists:

df1['A'].str.split(r'\s*,\s*')
# 0    [are, boy, cat, dog, ear, far, gone]
# 1                  [home, guy, tall, egg]
# 2                [cat, done, roof, grass]
# Name: A, dtype: object

The argument to split, r'\s*,\s*', is a RegEx pattern that matches a comma together with any whitespace on either side of it. The r prefix makes the string literal a raw string, so the backslashes reach the RegEx engine unchanged.
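To see what the pattern does outside pandas, here is a minimal sketch using the standard-library re module. The irregularly spaced sample string is made up for illustration and is not from the question.

```python
import re

pattern = r'\s*,\s*'

# Hypothetical cell with uneven spacing around the commas.
cell = 'are ,boy,  cat , dog'

# Each comma is consumed together with the whitespace around it,
# so the resulting words carry no stray spaces.
words = re.split(pattern, cell)
print(words)  # ['are', 'boy', 'cat', 'dog']

# The raw-string prefix keeps the backslash as a literal character:
# r'\s' is the two characters backslash and 's'.
raw_equals_escaped = (r'\s' == '\\s')
print(raw_equals_escaped)  # True
```

Note that the pattern only trims whitespace next to a comma; a leading or trailing space on the whole cell would stay attached to the first or last word.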

Then we apply set.isdisjoint() to each list to find the cells that contain none of the values in df2:

df1['A'].str.split(r'\s*,\s*').apply(set(df2['A']).isdisjoint)
# 0    False
# 1     True
# 2     True
# Name: A, dtype: bool
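As a plain-Python illustration of what apply is calling on each row: set(df2['A']).isdisjoint is a bound method that accepts any iterable and returns True only when it shares no element with the set. The variable names below are chosen for this sketch.

```python
# The set built from df2['A'] in the answer.
wanted = {'are', 'boy', 'gone'}

# Bound method: calling check(x) is the same as wanted.isdisjoint(x).
check = wanted.isdisjoint

no_overlap = check(['home', 'guy', 'tall', 'egg'])
overlap = check(['are', 'boy', 'cat', 'dog', 'ear', 'far', 'gone'])

print(no_overlap)  # True: no word in common
print(overlap)     # False: 'are', 'boy' and 'gone' are shared
```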

Finally, negate the result with ~ and assign it to a new column 'B':

df1['B'] = ~df1['A'].str.split(r'\s*,\s*').apply(set(df2['A']).isdisjoint)
print(df1)
#                                     A      B
# 0  are, boy, cat, dog, ear, far, gone   True
# 1                home, guy, tall, egg  False
# 2              cat, done, roof, grass  False
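The question shows the new column as 1 and 0 rather than True and False. If integers are wanted, one possible approach is to cast the boolean column with astype(int). The sketch below repeats the inputs so it runs on its own; the names wanted and has_match are introduced here for readability.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [
    'are, boy, cat, dog, ear, far, gone',
    'home, guy, tall, egg',
    'cat, done, roof, grass',
]})
df2 = pd.DataFrame({'A': ['are', 'boy', 'gone']})

# Build the lookup set once instead of once per row.
wanted = set(df2['A'])

# True where a cell shares at least one word with df2.
has_match = ~df1['A'].str.split(r'\s*,\s*').apply(wanted.isdisjoint)

# Cast the booleans to 1/0 as in the question's example.
df1['B'] = has_match.astype(int)
print(df1['B'].tolist())  # [1, 0, 0]
```

With these inputs the third row is 0, because none of "cat", "done", "roof" or "grass" appears in df2.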
