Filter out a keyword from a DataFrame column (case-insensitive) - Pandas

Question

I want to create a new column based on a word found in an existing column. The new column should contain either Lobby, UPS, Electric, or a blank.

Name                                SubUnitName
Lobby Area                          Lobby
Sensor - Bank lobby                 Lobby
Temperature - UPS Room              UPS
Sensor - Electric Room              Electric
Sensor - electrical Room            Electric
Temperature - electric Room         Electric
Sensor

As seen above, the search should be case-insensitive, and if 'Electrical' or 'Electric' is found, the result should be 'Electric'.

Answer 1

This sets up the list of words to look for in the "Name" column, then applies the function find_match to create the new "SubUnitName" column.

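from typing import Optional

search_list = ["Lobby", "UPS", "Electric"]


def find_match(name_str: str) -> Optional[str]:
    # Return the first keyword found in name_str (case-insensitive), else None.
    name_lc = name_str.lower()
    for item in search_list:
        if item.lower() in name_lc:
            return item
    return None


# Assumes df already exists and has a "Name" column.
df["SubUnitName"] = df["Name"].apply(find_match)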

Replace None with an empty string for rows with no match (here, the last row):

df["SubUnitName"] = df["SubUnitName"].fillna("")
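As an alternative to applying a Python function row by row, the same lookup can probably be done with pandas' vectorized string methods. The following is a minimal sketch and not part of the original answer: the sample DataFrame is rebuilt from the table in the question, and the names canonical and pattern are introduced here purely for illustration.

```python
import re

import pandas as pd

# Sample data rebuilt from the table in the question (an assumption).
df = pd.DataFrame(
    {
        "Name": [
            "Lobby Area",
            "Sensor - Bank lobby",
            "Temperature - UPS Room",
            "Sensor - Electric Room",
            "Sensor - electrical Room",
            "Temperature - electric Room",
            "Sensor",
        ]
    }
)

search_list = ["Lobby", "UPS", "Electric"]

# Map each lower-cased keyword back to its canonical spelling.
canonical = {item.lower(): item for item in search_list}

# One capture group that matches any of the keywords.
pattern = "(" + "|".join(re.escape(item) for item in search_list) + ")"

# Extract the first case-insensitive match per row (NaN when nothing matches).
matched = df["Name"].str.extract(pattern, flags=re.IGNORECASE, expand=False)

# Normalize the casing of the match and blank out rows without a match.
df["SubUnitName"] = matched.str.lower().map(canonical).fillna("")

print(df)
```

One difference to keep in mind: str.extract returns the leftmost match in the text, whereas find_match returns the first keyword in search_list order. The two agree on the sample data, but they could differ for a name that contains more than one keyword.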

Answer 2

Here is a solution. It checks each name for a match against the keywords and, if one is found, adds it to a list that becomes the new column.

import pandas as pd

d = {"Name": ["Sensor - Bank lobby ", "Sensor - Bank Lobby ", "Temperature - UPS Room", "Sensor - Electric Room ", "Sensor - electrical Room", "Sensor"]}

df = pd.DataFrame(data=d)

list_sub_units = []

list_matches = ["Lobby", "UPS", "Electric"]

for entry in df["Name"]:
    matched = False

    for match in list_matches:
        # Use `in` rather than find() > 0, which would miss a match at index 0.
        if match.lower() in entry.lower():
            list_sub_units.append(match)
            matched = True
            break  # keep one value per row so the list length matches the frame

    if not matched:
        list_sub_units.append("")

df["SubUnitName"] = list_sub_units

print(df)
