How to add a specific word to a new column when it is a value in a list within a column

Suppose my data set looks like this:

name what
A    apple[red]
B    cucumber[green]
C    dog
C    orange
D    banana
D    monkey
E    cat
F    carrot
.
.

I want to define several lists, and if a value in the what column contains a word from one of those lists, I want to put that list's name into a new column.

The lists:

fruit = ['apple', 'banana', 'orange']
animal = ['dog', 'monkey', 'cat']
vegetable = ['cucumber', 'carrot']

The result I want:

name what     class
A    apple    fruit
B    cucumber vegetable
C    dog      animal
C    orange   fruit
D    banana   fruit
D    monkey   animal
E    cat      animal
F    carrot   vegetable

The column values do not match the list values exactly; a list value only has to be contained in the column value.

Thank you for reading.

Use Series.map with a dictionary built from the lists and then flattened, with keys and values swapped so that each word maps to its category name:

fruit = ['apple', 'banana', 'orange']
animal = ['dog', 'monkey', 'cat']
vegetable = ['cucumber', 'carrot']

d = {'fruit':fruit, 'animal':animal,'vegetable':vegetable}
#http://stackoverflow.com/a/31674731/2901002
d1 = {k: oldk for oldk, oldv in d.items() for k in oldv}

Loop alternative to the dictionary comprehension:

d1 = {}
for oldk, oldv in d.items():
    for k in oldv:
        d1[k] = oldk
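
One property of this inversion worth keeping in mind: if the same word appears in more than one list, the category that comes last in d overwrites the earlier one. A minimal sketch of that behavior, where the color list is a hypothetical addition that is not part of the question:

```python
fruit = ['apple', 'banana', 'orange']
color = ['orange', 'red']  # hypothetical extra list, only for illustration

d = {'fruit': fruit, 'color': color}

# Invert: each word becomes a key, its category name becomes the value
d1 = {word: category for category, words in d.items() for word in words}

print(d1['apple'])   # only in fruit
print(d1['orange'])  # in both lists, so the later category ('color') wins
```

If overlapping words are possible in your data, decide the priority by ordering the keys of d accordingly.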

And then:

df['class'] = df['what'].map(d1)
#if need values before first [
#df['class'] = df['what'].str.split('[').str[0].map(d1)
print (df)
  name      what      class
0    A     apple      fruit
1    B  cucumber  vegetable
2    C       dog     animal
3    C    orange      fruit
4    D    banana      fruit
5    D    monkey     animal
6    E       cat     animal
7    F    carrot  vegetable
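
For the data exactly as posted in the question, where some values carry a bracketed suffix such as apple[red], the commented-out variant is the one that applies. A self-contained sketch, assuming the sample frame is rebuilt by hand from the first eight rows of the question:

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be representative)
df = pd.DataFrame({
    'name': ['A', 'B', 'C', 'C', 'D', 'D', 'E', 'F'],
    'what': ['apple[red]', 'cucumber[green]', 'dog', 'orange',
             'banana', 'monkey', 'cat', 'carrot'],
})

d = {
    'fruit': ['apple', 'banana', 'orange'],
    'animal': ['dog', 'monkey', 'cat'],
    'vegetable': ['cucumber', 'carrot'],
}
d1 = {word: category for category, words in d.items() for word in words}

# Keep only the text before the first '[' and look it up in d1;
# values without a match would become NaN
df['class'] = df['what'].str.split('[').str[0].map(d1)
print(df)
```

This only works when the word to look up is always the part before the first bracket; for words that can appear anywhere in the value, substring matching is needed instead.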

EDIT: To match by substrings, loop over the dictionary d, build a mask with Series.str.contains, and set the new values where the mask is True:

d = {'fruit':fruit, 'animal':animal,'vegetable':vegetable}

for k, v in d.items():
    mask = df['what'].str.contains('|'.join(v))
    df.loc[mask, 'class'] = k
print (df)
  name             what      class
0    A       apple[red]      fruit
1    B  cucumber[green]  vegetable
2    C              dog     animal
3    C           orange      fruit
4    D           banana      fruit
5    D           monkey     animal
6    E              cat     animal
7    F           carrot  vegetable
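
Because Series.str.contains treats the pattern as a regular expression by default, list values containing characters such as . or + would be interpreted as regex syntax. A hedged variant of the loop that escapes each word with re.escape first; the row with 'stone' is a hypothetical value added only to show what happens when nothing matches:

```python
import re
import pandas as pd

df = pd.DataFrame({
    'name': ['A', 'B', 'C', 'G'],
    'what': ['apple[red]', 'cucumber[green]', 'dog', 'stone'],  # 'stone' is made up
})

d = {
    'fruit': ['apple', 'banana', 'orange'],
    'animal': ['dog', 'monkey', 'cat'],
    'vegetable': ['cucumber', 'carrot'],
}

for category, words in d.items():
    # Escape each word so it is matched literally, then join with regex OR
    pattern = '|'.join(re.escape(w) for w in words)
    mask = df['what'].str.contains(pattern)
    df.loc[mask, 'class'] = category

print(df)
```

Rows that match no list keep a missing value in the class column, which can be filled afterwards with fillna if a default label is wanted.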

If the column values may contain several words, use word boundaries so that only whole words match:

for k, v in d.items():
    pat = '|'.join(r"\b{}\b".format(x) for x in v)
    df.loc[ df['what'].str.contains(pat), 'class'] = k
print (df)
  name             what      class
0    A       apple[red]      fruit
1    B  cucumber[green]  vegetable
2    C              dog     animal
3    C           orange      fruit
4    D           banana      fruit
5    D           monkey     animal
6    E              cat     animal
7    F           carrot  vegetable
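
The sample data gives the same result with and without word boundaries, so the difference is easy to miss. A small sketch with made-up values (not from the question) that shows where the two patterns diverge:

```python
import re
import pandas as pd

# Hypothetical values chosen so that 'cat' appears inside another word
what = pd.Series(['cat', 'scatter plot', 'hot dog'])
animal = ['dog', 'monkey', 'cat']

plain = '|'.join(re.escape(w) for w in animal)
bounded = '|'.join(r'\b{}\b'.format(re.escape(w)) for w in animal)

plain_hits = what.str.contains(plain).tolist()
bounded_hits = what.str.contains(bounded).tolist()

print(plain_hits)    # [True, True, True]: 'scatter' contains 'cat'
print(bounded_hits)  # [True, False, True]: only whole words match
```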
