Python Pandas Dataframe drop columns if string contains special character

I have a dataframe:

Product           Storage  Price
Azure    (2.4%    Server   £540
AWS               Server   £640
GCP               Server   £540

I would like to remove the column that contains the string '(2.4%'. However, I only want to drop it in Pandas when a regex finds either a bracket or a percent sign ('(' or '%') in one of that column's strings; in that case pandas should drop the column entirely.

Please can you help me find a way to use regex to search for special characters within a string and drop the column if that condition is met?

I've searched on Stack Overflow and Google. I've used the following so far:

df = df.drop([col for col in df.columns if df[col].eq('(%').any()], axis=1)

chars = '(%'
regex = f'[{"".join(map(re.escape, chars))}]'

df = df.loc[:, ~df.apply(lambda c: c.str.contains(regex).any())]

However, neither of these worked.

Any help would be greatly appreciated. :)

Thank you *insert smiley*

I would do something like this:

import pandas as pd
from io import StringIO

text = """
Product,Perc,Storage,Price
Azure,(2.4%,Server,£540
AWS,,Server,£640
GCP,,Server,£540
"""
data = pd.read_csv(StringIO(text))
print(data)

drop_columns = list()
for col_name in data.columns:
    # Raw string avoids the invalid "\(" escape; na=False treats empty cells as "no match"
    has_special_characters = data[col_name].str.contains(r"[(%]", na=False)
    if has_special_characters.any():
        drop_columns.append(col_name)

print(f"Dropping {drop_columns}")
data.drop(drop_columns, axis=1, inplace=True)
print(data)

Output of the script is:

  Product   Perc Storage Price
0   Azure  (2.4%  Server  £540
1     AWS    NaN  Server  £640
2     GCP    NaN  Server  £540
Dropping ['Perc']
  Product Storage Price
0   Azure  Server  £540
1     AWS  Server  £640
2     GCP  Server  £540

Process finished with exit code 0
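
The loop above can also be folded into a small helper. This is only a sketch under a few assumptions: the function name `drop_columns_with_chars` and its `chars` parameter are hypothetical, the sample data is the same as in the script above, and cells are converted to strings first so that numeric columns do not fail on `.str`.

```python
import re
from io import StringIO

import pandas as pd


def drop_columns_with_chars(df, chars="(%"):
    """Return a copy of df without columns where any cell contains one of chars.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Build a character class; re.escape keeps regex metacharacters literal.
    pattern = "[" + "".join(re.escape(c) for c in chars) + "]"
    flagged = [
        col
        for col in df.columns
        # dropna() skips empty cells; astype(str) lets numeric columns be checked too
        if df[col].dropna().astype(str).str.contains(pattern).any()
    ]
    return df.drop(columns=flagged)


text = """Product,Perc,Storage,Price
Azure,(2.4%,Server,£540
AWS,,Server,£640
GCP,,Server,£540
"""
data = pd.read_csv(StringIO(text))
cleaned = drop_columns_with_chars(data)
print(list(cleaned.columns))
```

Returning a new frame instead of using `inplace=True` leaves the original `data` untouched, which makes it easier to compare before and after.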

You're using the eq function, which checks whether a value in the column is exactly equal to '(%'. Instead of eq, test whether either character appears anywhere in the value:

df.drop([col for col in df.columns if df[col].apply(lambda x: any(ch in str(x) for ch in '(%')).any()], axis=1, inplace=True)
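
As a rough illustration of why an exact comparison and a test for the two-character substring '(%' both miss a value like '(2.4%', while a character class finds it, here is a sketch on a single column. The values in `col` are made up for the example.

```python
import pandas as pd

col = pd.Series(["(2.4%", "Server", None])

# eq compares the whole cell against the literal string "(%"
exact = col.eq("(%")

# "in" with the two-character string looks for "(%" as one contiguous substring
substring = col.apply(lambda x: "(%" in str(x))

# a character class matches either character anywhere in the cell
either = col.str.contains(r"[(%]", na=False)

print(exact.any(), substring.any(), either.any())  # False False True
```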

You can try this (I guess the name of the column you want to drop is ""):

import re

change_col = False
for elem in df[""]:
    # str() guards against NaN or other non-string cells, which re.search cannot handle
    if re.search(r'[(%]', str(elem)):
        change_col = True

if change_col:
    df = df.drop("", axis=1)
