Filtering out keywords (case insensitive) from a column of a dataframe - Pandas
I want to create a new column based on a word found in an existing column. The new column should contain either Lobby, UPS, Electric, or a blank string.
Name                         SubUnitName
Lobby Area                   Lobby
Sensor - Bank lobby          Lobby
Temperature - UPS Room       UPS
Sensor - Electric Room       Electric
Sensor - electrical Room     Electric
Temperature - electric Room  Electric
Sensor
As seen above, the search should be case insensitive, and if 'Electrical' or 'Electric' is found, the result should be 'Electric'.
This defines the list of words to look for in the "Name" column, then applies the function "find_match" to create the new "SubUnitName" column.
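search_list = ["Lobby", "UPS", "Electric"]

def find_match(name_str: str) -> str:
    # Return the first keyword (in search_list order) found in the name,
    # ignoring case; falls through and returns None when nothing matches.
    for item in search_list:
        item_lc = item.lower()
        if item_lc in name_str.lower():
            return item

df.loc[:, "SubUnitName"] = df["Name"].apply(find_match)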
Replace None with an empty string for the last row:
df["SubUnitName"] = df["SubUnitName"].fillna("")
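For larger frames, the same idea can be sketched without a Python-level loop by using a case-insensitive regular expression. This is only an illustration under assumptions: the sample frame and the `canonical` and `pattern` names are made up here, and the behaviour differs slightly from `find_match`, because the regex returns the leftmost keyword in the string rather than the first keyword in `search_list` order.

```python
import re

import pandas as pd

# Hypothetical sample data modelled on the table in the question.
df = pd.DataFrame({
    "Name": [
        "Lobby Area",
        "Sensor - Bank lobby",
        "Temperature - UPS Room",
        "Sensor - electrical Room",
        "Sensor",
    ]
})

search_list = ["Lobby", "UPS", "Electric"]

# Map each lowercase keyword back to its canonical spelling.
canonical = {word.lower(): word for word in search_list}

# One capture group with all keywords as alternatives, e.g. "(Lobby|UPS|Electric)".
pattern = "(" + "|".join(re.escape(word) for word in search_list) + ")"

# Extract the matched text (whatever its case), then normalise it.
found = df["Name"].str.extract(pattern, flags=re.IGNORECASE, expand=False)
df["SubUnitName"] = found.str.lower().map(canonical).fillna("")

print(df)
```

Because "electrical" contains "electric", both spellings end up as "Electric", and rows with no keyword get an empty string.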
Here is a solution. It checks each name for a match against the keywords and, if one is found, appends it to a list that becomes the new column.
import pandas as pd

d = {"Name": ["Sensor - Bank lobby ", "Sensor - Bank Lobby ", "Temperature - UPS Room", "Sensor - Electric Room ", "Sensor - electrical Room", "Sensor"]}
df = pd.DataFrame(data=d)

list_sub_units = []
list_matches = ["Lobby", "UPS", "Electric"]

for entry in df["Name"]:
    matched = False
    for match in list_matches:
        # str.find returns -1 when absent and 0 for a match at the very start
        if entry.lower().find(match.lower()) != -1:
            list_sub_units.append(match)
            matched = True
            break  # one value per row keeps the list aligned with df
    if not matched:
        list_sub_units.append("")

df["SubUnitName"] = list_sub_units
print(df)
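One detail worth keeping in mind with `str.find`: it returns the index of the first occurrence, which is 0 when the keyword sits at the very start of the string (as in "Lobby Area" from the question), and -1 when there is no match. The small sketch below illustrates this; `contains_keyword` is a hypothetical helper written for this example, not part of the answer.

```python
def contains_keyword(name: str, keyword: str) -> bool:
    """Case-insensitive substring test (hypothetical helper for illustration)."""
    return keyword.lower() in name.lower()


# A match at the start of the string is reported at index 0, not a positive index.
position_at_start = "Lobby Area".lower().find("lobby")

# No match at all is reported as -1.
position_missing = "Sensor".lower().find("lobby")

print(position_at_start, position_missing)
print(contains_keyword("Lobby Area", "lobby"), contains_keyword("Sensor", "UPS"))
```

Using the `in` operator avoids having to remember which sentinel value means "not found".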