Python Pandas DataFrame: drop a column if a string in it contains special characters

Question

I have a dataframe:

Product          Storage  Price
Azure    (2.4%   Server   £540
AWS              Server   £640
GCP              Server   £540

I would like to remove the column that contains the string '(2.4%'. However, I only want to drop a column if a regex finds either a bracket or a percent sign ('(' or '%') in a string in that column; when it does, pandas should drop that column entirely.

Please can you help me find a way to use regex to search for special characters within a string and drop the column if that condition is met?

I've searched on Stack Overflow and Google. I've used the following so far:

df = df.drop([col for col in df.columns if df[col].eq('(%').any()], axis=1)

chars = '(%'
regex = f'[{"".join(map(re.escape, chars))}]'

df = df.loc[:, ~df.apply(lambda c: c.str.contains(regex).any())]

However, neither of these worked.

Any help would be greatly appreciated. :)

Thank you *insert smiley*

Answer 1

I would do something like this:

import pandas as pd
from io import StringIO

text = """
Product,Perc,Storage,Price
Azure,(2.4%,Server,£540
AWS,,Server,£640
GCP,,Server,£540
"""
data = pd.read_csv(StringIO(text))
print(data)

drop_columns = list()
for col_name in data.columns:
    # Raw string avoids the invalid "\(" escape; na=False treats empty cells as non-matches
    has_special_characters = data[col_name].str.contains(r"[(%]", na=False)
    if has_special_characters.any():
        drop_columns.append(col_name)

print(f"Dropping {drop_columns}")
data.drop(drop_columns, axis=1, inplace=True)
print(data)

Output of the script is:

  Product   Perc Storage Price
0   Azure  (2.4%  Server  £540
1     AWS    NaN  Server  £640
2     GCP    NaN  Server  £540
Dropping ['Perc']
  Product Storage Price
0   Azure  Server  £540
1     AWS  Server  £640
2     GCP  Server  £540

Process finished with exit code 0
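
The loop above relies on every column being text, because the .str accessor raises on numeric columns. A minimal sketch of the same idea that converts cells to text first might look like the following. The helper name columns_with_chars, the column name "Perc", and the extra numeric "Units" column are assumptions made for illustration, not part of the original question.

```python
import pandas as pd

# Sample frame mirroring the question; "Perc" and "Units" are assumed names.
df = pd.DataFrame({
    "Product": ["Azure", "AWS", "GCP"],
    "Perc": ["(2.4%", None, None],
    "Storage": ["Server", "Server", "Server"],
    "Price": ["£540", "£640", "£540"],
    "Units": [1, 2, 3],  # numeric column, included to show it is handled safely
})


def columns_with_chars(frame, pattern=r"[(%]"):
    """Return names of columns where any non-missing cell matches the pattern."""
    flagged = []
    for name in frame.columns:
        # Drop missing cells, then convert the rest to text so that
        # numeric columns can be searched without raising.
        as_text = frame[name].dropna().astype(str)
        if as_text.str.contains(pattern, regex=True).any():
            flagged.append(name)
    return flagged


to_drop = columns_with_chars(df)
result = df.drop(columns=to_drop)
print(to_drop)
print(list(result.columns))
```

Returning the list of names before dropping keeps the check and the mutation separate, which makes it easy to inspect what would be removed.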

Answer 2

You are using the eq function, which checks whether a value in the column is exactly equal to the given string. Instead of eq, do this:

df.drop([col for col in df.columns if df[col].apply(lambda x: any(ch in str(x) for ch in '(%')).any()], axis=1, inplace=True)
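
To make the difference between these kinds of test concrete, here is a small sketch on a single column holding the question's values. The variable names are illustrative only.

```python
import pandas as pd

s = pd.Series(["(2.4%", None, None])

# eq compares each whole cell to the string, so nothing matches here.
exact = s.eq("(%").any()

# A plain substring test for "(%" also finds nothing, because "(" and "%"
# are not adjacent in "(2.4%".
substring = s.apply(lambda x: "(%" in str(x)).any()

# A regex character class matches either character anywhere in the cell.
either_char = s.str.contains(r"[(%]", na=False).any()

print(exact, substring, either_char)
```

Only the last test flags the column, since it looks for "(" or "%" individually rather than for the two-character sequence.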

Answer 3

You can try this (I guess the name of the column you want to drop is ""):

import re

change_col = False
for elem in df[""]:
    # str() guards against empty (NaN) cells, which re.search cannot handle
    if re.search(r'[(%]', str(elem)):
        change_col = True

if change_col:
    df = df.drop("", axis=1)
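
The column name "" is a guess. If the frame was loaded with pd.read_csv from a file whose header cell is empty, pandas names that column "Unnamed: 1" instead. A hedged sketch that picks the column by position rather than by name, assuming a CSV layout like the question's table:

```python
import pandas as pd
from io import StringIO

# Assumed CSV layout: the second header cell is empty, as in the question's table.
text = """Product,,Storage,Price
Azure,(2.4%,Server,£540
AWS,,Server,£640
GCP,,Server,£540
"""
df = pd.read_csv(StringIO(text))
original_names = list(df.columns)
print(original_names)

# Select the second column by position, whatever name pandas gave it.
target = df.columns[1]

# Convert to text so missing cells do not raise, then test for "(" or "%".
if df[target].astype(str).str.contains(r"[(%]", na=False).any():
    df = df.drop(columns=target)

print(list(df.columns))
```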

3 answers

Answer 1    1    2022-09-14 15:23:51
Answer 2    1    accepted    2022-09-14 15:24:22
Answer 3    1    2022-09-14 15:26:06
