Update cell in dataframe1 based on value from list in dataframe2

I have a dataframe1 with a column whose cells look like this (each cell is a set of comma-separated words): "are, boy, cat, dog, ear, far, gone".

Dataframe2 has a column whose cells look like this (each cell is a single letter or word): "are" or "boy" or "gone".

I want to add a column to dataframe1 that holds a boolean entry indicating whether any word in the corresponding cell of dataframe1 appears in dataframe2. For example,

DF1 = (are, boy, cat, dog, ear, far, gone), (home, guy, tall, egg), (cat, done, roof, grass), etc.

DF2 = (are), (boy), (gone), etc.

New column cell value in dataframe1 = (1), (0), (1), etc.

Assuming these are your inputs:

import pandas as pd

df1 = pd.DataFrame(
    {'A': [
        'are, boy, cat, dog, ear, far, gone',
        'home, guy, tall, egg',
        'cat, done, roof, grass',
    ]}
)

df2 = pd.DataFrame({'A': ['are', 'boy', 'gone']})

print('%s\n%s' % (df1, df2))
#                                     A
# 0  are, boy, cat, dog, ear, far, gone
# 1                home, guy, tall, egg
# 2              cat, done, roof, grass
#       A
# 0   are
# 1   boy
# 2  gone

We can use Series.str.split() to convert the comma-separated strings to lists:

df1['A'].str.split(r'\s*,\s*')
# 0    [are, boy, cat, dog, ear, far, gone]
# 1                  [home, guy, tall, egg]
# 2                [cat, done, roof, grass]
# Name: A, dtype: object

The argument to split, r'\s*,\s*', is a RegEx pattern matching a comma together with any whitespace on either side of it. The r prefix means to treat the string literal as a raw string, so the backslashes reach the regex engine unchanged.
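To see what the pattern does on its own, here is a small sketch using the standard re module on a single string. The messy value is a made-up cell with irregular spacing, not data from the question.

```python
import re

# Same pattern as the one passed to Series.str.split() above
pattern = r'\s*,\s*'

# Hypothetical cell with uneven whitespace around the commas
messy = 'are ,boy,  cat , dog'

# Splitting on the pattern swallows the whitespace next to each comma
tokens = re.split(pattern, messy)
print(tokens)  # ['are', 'boy', 'cat', 'dog']

# A plain split on ',' leaves that whitespace attached to the words
naive = messy.split(',')
print(naive)  # ['are ', 'boy', '  cat ', ' dog']
```

Note that the pattern only consumes whitespace adjacent to a comma; whitespace at the very start or end of the whole string would still need a strip().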

Then we apply set.isdisjoint() to check which cells do not contain any of the values in df2:

df1['A'].str.split(r'\s*,\s*').apply(set(df2['A']).isdisjoint)
# 0    False
# 1     True
# 2     True
# Name: A, dtype: bool
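If set.isdisjoint() is unfamiliar, this plain-Python sketch shows what it returns for two of the rows above. The variable names are only for illustration.

```python
# The lookup words, i.e. what set(df2['A']) would contain
words = {'are', 'boy', 'gone'}

# Two rows as they look after the split step
row0 = ['are', 'boy', 'cat', 'dog', 'ear', 'far', 'gone']
row1 = ['home', 'guy', 'tall', 'egg']

# isdisjoint() accepts any iterable and returns True only when
# the set and the iterable have no element in common
disjoint0 = words.isdisjoint(row0)
disjoint1 = words.isdisjoint(row1)

print(disjoint0)  # False, row0 shares 'are', 'boy' and 'gone'
print(disjoint1)  # True, row1 shares nothing
```

Because isdisjoint() takes any iterable, the lists produced by the split do not need to be converted to sets first.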

And then finally negate this and assign to a new column 'B':

df1['B'] = ~df1['A'].str.split(r'\s*,\s*').apply(set(df2['A']).isdisjoint)
print(df1)
#                                     A      B
# 0  are, boy, cat, dog, ear, far, gone   True
# 1                home, guy, tall, egg  False
# 2              cat, done, roof, grass  False
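The question shows the new column as 1/0 rather than True/False. A minimal sketch of one way to get that, assuming the same inputs as above, is to cast the boolean result with astype(int). The names lookup and has_match are only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame(
    {'A': [
        'are, boy, cat, dog, ear, far, gone',
        'home, guy, tall, egg',
        'cat, done, roof, grass',
    ]}
)
df2 = pd.DataFrame({'A': ['are', 'boy', 'gone']})

# Build the lookup set once instead of inside the expression
lookup = set(df2['A'])

# True where a row shares at least one word with df2
has_match = ~df1['A'].str.split(r'\s*,\s*').apply(lookup.isdisjoint)

# Cast the booleans to integers: True -> 1, False -> 0
df1['B'] = has_match.astype(int)
print(df1['B'].tolist())  # [1, 0, 0]
```

With these inputs the third row comes out as 0, because none of 'cat', 'done', 'roof', 'grass' appears in df2.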
