Python Pandas Dataframe drop columns if string contains special character
I have a dataframe:
Product | | Storage | Price
---|---|---|---
Azure | (2.4% | Server | £540
AWS | | Server | £640
GCP | | Server | £540
I would like to remove the column that contains the string '(2.4%'. However, I only want Pandas to drop the column if a regex finds either a bracket or a percent sign ('(' or '%') in any string in that column; in that case the column should be dropped entirely.
Please can you help me find a way to use regex to search for special characters within a string and drop the column if that condition is met?
I've searched on Stack Overflow and Google. I've used the following so far:
df = df.drop([col for col in df.columns if df[col].eq('(%').any()], axis=1)
chars = '(%'
regex = f'[{"".join(map(re.escape, chars))}]'
df = df.loc[:, ~df.apply(lambda c: c.str.contains(regex).any())]
However, neither of these worked.
Any help would be greatly appreciated. :)
Thank you *Insert Smiley*
I would do something like this:
import pandas as pd
from io import StringIO
text = """
Product,Perc,Storage,Price
Azure,(2.4%,Server,£540
AWS,,Server,£640
GCP,,Server,£540
"""
data = pd.read_csv(StringIO(text))
print(data)
drop_columns = []
for col_name in data.columns:
    # na=False treats missing values as "no match"
    has_special_characters = data[col_name].str.contains(r"[(%]", na=False)
    if has_special_characters.any():
        drop_columns.append(col_name)
print(f"Dropping {drop_columns}")
data.drop(drop_columns, axis=1, inplace=True)
print(data)
Output of the script is:
Product Perc Storage Price
0 Azure (2.4% Server £540
1 AWS NaN Server £640
2 GCP NaN Server £540
Dropping ['Perc']
Product Storage Price
0 Azure Server £540
1 AWS Server £640
2 GCP Server £540
Process finished with exit code 0
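The loop above can also be wrapped in a small reusable helper. This is only a sketch: the function name drop_columns_with_chars and its chars parameter are made up for illustration, and it assumes column names are unique. Values are converted to text before matching so that numeric columns do not trip up the .str accessor.

```python
import re

import pandas as pd


def drop_columns_with_chars(df, chars="(%"):
    """Return df without the columns in which any cell contains one of chars."""
    # Escape every character so that "(" is matched literally.
    pattern = "[" + "".join(re.escape(ch) for ch in chars) + "]"

    def column_matches(col):
        # Ignore missing values and compare everything else as text,
        # so numeric columns do not break the .str accessor.
        return bool(col.dropna().astype(str).str.contains(pattern).any())

    to_drop = [name for name in df.columns if column_matches(df[name])]
    return df.drop(columns=to_drop)


# Same data as in the answer above, built directly as a DataFrame.
data = pd.DataFrame(
    {
        "Product": ["Azure", "AWS", "GCP"],
        "Perc": ["(2.4%", None, None],
        "Storage": ["Server", "Server", "Server"],
        "Price": ["£540", "£640", "£540"],
    }
)
cleaned = drop_columns_with_chars(data)
print(list(cleaned.columns))
```

Returning a new DataFrame instead of using inplace=True leaves the original data untouched, which makes it easier to compare before and after.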
You are using eq, which checks whether a value in the column is exactly equal to '(%', so it never matches a value such as '(2.4%'. Instead of eq, check each value for either character:
df.drop([col for col in df.columns if df[col].apply(lambda x: any(ch in str(x) for ch in '(%')).any()], axis=1, inplace=True)
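To see why the exact comparison fails, here is a small sketch that runs three different tests on the same values. The Series col is a stand-in for the unnamed column in the question.

```python
import pandas as pd

# Stand-in for the unnamed column from the question.
col = pd.Series(["(2.4%", None, None])

# Whole-cell equality: is any cell exactly "(%"?
exact = col.eq("(%").any()

# Literal substring: does any cell contain the two characters "(%" side by side?
substring = col.str.contains("(%", regex=False, na=False).any()

# Character class: does any cell contain "(" or "%" anywhere?
either = col.str.contains(r"[(%]", na=False).any()

print(exact, substring, either)
```

Only the character-class test matches '(2.4%', because the bracket and the percent sign are not adjacent in that value.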
You can try this (I guess the name of the column you want to drop is ""):
import re

change_col = False
for elem in df[""]:
    # str() guards against missing values (NaN), which re.search cannot handle
    if re.search(r'[(%]', str(elem)):
        change_col = True
        break
if change_col:
    df = df.drop("", axis=1)