
How to remove the special character '^' in a Python string without removing whitespace

I've been wondering how to remove the special character '^' from a Python string. It doesn't seem to be treated like the other special characters.

I was trying to remove some special characters from a DataFrame column using the code below:

def remove_special_characters(text, remove_digits=True):
    text=re.sub(r'[^a-zA-z0-9\s]+','',text)
    
    return text


df['review']=df['review'].apply(remove_special_characters)

But the symbol '^' still appears in my data. Do you know of some code to remove it?

The use case you're tackling is already handled by str.translate(), without any need to resort to power tools like regexes.

https://docs.python.org/3/library/stdtypes.html#str.maketrans
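
As a minimal sketch of the translate() approach (the names `DROP_CARET` and `remove_caret` are made up for illustration, not taken from the question), the third argument of `str.maketrans` lists characters to delete, so only '^' is removed and whitespace is left alone:

```python
# Build a translation table whose third argument lists the characters
# to delete; here that is only the caret.
DROP_CARET = str.maketrans("", "", "^")


def remove_caret(text: str) -> str:
    """Delete every '^' and leave all other characters, spaces included, untouched."""
    return text.translate(DROP_CARET)


print(remove_caret("a ^great^ movie"))  # a great movie
```

With a DataFrame like the one in the question, this could presumably be applied the same way as the original function, e.g. `df['review'].apply(remove_caret)`.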

But suppose you really want to use a regex. This unit test passes. Note that the pattern here also strips whitespace, because \s is not part of the character class.

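    import re
    import unittest

    import pandas as pd


    class RemoveCaretTest(unittest.TestCase):  # wrapper class name is illustrative

        def test_battle(self):
            # Anything that is not an ASCII letter or digit gets removed.
            non_alnum = re.compile(r"[^a-zA-Z0-9]")

            def remove(text):
                return non_alnum.sub("", text)

            d = dict(word="Bat^tle", definition="Combat between opponents,")
            df = pd.DataFrame([d])
            self.assertEqual(["Bat^tle"], list(df.word))
            df["word"] = df.word.apply(remove)
            self.assertEqual(["Battle"], list(df.word))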
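
A likely reason the pattern in the question leaves '^' behind is the range `A-z` (with a lowercase z). In ASCII that range runs from code point 65 to 122, so besides the letters it also covers `[`, `\`, `]`, `^`, `_` and the backtick, and the negated class treats all of them as characters to keep. The sketch below compares the question's pattern with the same pattern using `A-Z`; the names `broken` and `fixed` are only for illustration:

```python
import re

# Pattern from the question: "A-z" spans ASCII 65..122 and therefore
# also allows [ \ ] ^ _ and the backtick.
broken = re.compile(r"[^a-zA-z0-9\s]+")

# Same pattern with the range corrected to "A-Z"; \s keeps whitespace.
fixed = re.compile(r"[^a-zA-Z0-9\s]+")


def remove_special_characters(text: str) -> str:
    """Remove everything except ASCII letters, digits and whitespace."""
    return fixed.sub("", text)


sample = "Bat^tle was great!"
print(broken.sub("", sample))             # Bat^tle was great
print(remove_special_characters(sample))  # Battle was great
```

Keeping `\s` inside the character class is what preserves the spaces, which is what the title of the question asks for.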

Depending on the specifics of your use case, this code might be preferable:

        non_alnum = re.compile(r"[^\w]")  # We choose to ignore the "_" underscore detail.
        b = dict(word="Bat^tle", definition="Combat between opponents,")
        c = dict(word="Coup d'état", definition="Diplomacy through other means")
        df = pd.DataFrame([b, c])
        self.assertEqual(["Bat^tle", "Coup d'état"], list(df.word))
        df["word"] = df.word.apply(remove)
        self.assertEqual(['Battle', 'Coupdétat'], list(df.word))
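
If the spaces should survive, as in the question's title, one possible variation (an assumption about the intent, not part of the answer above) is to add `\s` to the negated class. For `str` patterns in Python 3, `\w` matches Unicode letters and digits plus the underscore, so accented letters are kept. The name `remove_keep_spaces` is hypothetical:

```python
import re

# Keep word characters (Unicode letters, digits, "_") and whitespace;
# remove everything else, including "^" and apostrophes.
non_word_or_space = re.compile(r"[^\w\s]")


def remove_keep_spaces(text: str) -> str:
    return non_word_or_space.sub("", text)


print(remove_keep_spaces("Bat^tle royale"))
print(remove_keep_spaces("Coup d'\u00e9tat"))
```

The underscore is still kept by this pattern; if that matters, an explicit class such as `[^a-zA-Z0-9\s]` is the stricter choice.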
