Removing many types of chars from a Python string

I have some string X and I wish to remove semicolons, periods, commas, colons, etc., all in one go. Is there a way to do this that doesn't require a big chain of .replace(somechar, "") calls?

You can use re.sub to match a pattern and replace it. The following replaces only h and i with empty strings:

In [1]: import re

In [2]: s = 'byehibyehbyei'

In [3]: re.sub('[hi]', '', s)
Out[3]: 'byebyebye'

Don't forget to import re.

>>> import re
>>> foo = "asdf;:,*_-"
>>> re.sub('[;:,*_-]', '', foo)
'asdf'
  • [;:,*_-] - The set of characters to match (the trailing - is literal because it comes last in the class)
  • '' - Replace each match with nothing (an empty string)
  • foo - The string to operate on

For more information, take a look at the re.sub(pattern, repl, string, count=0, flags=0) documentation.
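When the characters to remove come from a variable rather than a hand-written pattern, some of them (such as -, ], ^ or \) have special meaning inside a character class. A minimal sketch of one way to handle that, assuming a helper named remove_chars (a hypothetical name, not part of re), builds the class with re.escape:

```python
import re

def remove_chars(text, chars):
    """Remove every character in `chars` from `text` (hypothetical helper)."""
    if not chars:
        return text  # an empty class "[]" would not be a valid pattern
    # re.escape guards characters such as "-", "]", "^" and "\" that are
    # special inside a character class.
    pattern = "[" + re.escape(chars) + "]"
    return re.sub(pattern, "", text)

print(remove_chars("a;b.c,d:e", ";.,:"))  # abcde
print(remove_chars("x-y]z^w", "-]^"))     # xyzw
```

If the same set of characters is applied to many strings, compiling the pattern once with re.compile is probably worth considering.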

You can use the translate method with a first argument of None:

string2 = string1.translate(None, ";.,:")

Alternatively, you can use the filter function:

string2 = filter(lambda x: x not in ";,.:", string1)

Note that both of these options only work for non-Unicode strings and only in Python 2.
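In Python 3, str.translate takes a single table argument, and filter returns an iterator rather than a string. A rough sketch of the equivalents, assuming Python 3 and a sample value for string1 chosen only for illustration:

```python
string1 = "a;b.c,d:e"  # sample input, chosen only for illustration

# With three arguments, str.maketrans treats the third as the set of
# characters to delete.
table = str.maketrans("", "", ";.,:")
string2 = string1.translate(table)

# filter() returns an iterator in Python 3, so join the pieces back together.
string3 = "".join(filter(lambda x: x not in ";,.:", string1))

print(string2)  # abcde
print(string3)  # abcde
```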

I don't know about the speed, but here's another example that doesn't use re.

commas_and_stuff = ",+;:"
words = "words; and stuff!!!!"
cleaned_words = "".join(c for c in words if c not in commas_and_stuff)

Gives you:

'words and stuff!!!!'
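On the question of speed, one way to check is to time the approaches on your own data with timeit. The sketch below assumes Python 3. The function names and the repeated input are made up for illustration, and the timings will vary by machine and input.

```python
import re
import timeit

chars = ",+;:"
words = "words; and stuff!!!!" * 100  # longer input so timings are measurable

pattern = re.compile("[" + re.escape(chars) + "]")
table = str.maketrans("", "", chars)

def with_join():
    return "".join(c for c in words if c not in chars)

def with_re():
    return pattern.sub("", words)

def with_translate():
    return words.translate(table)

# All three should produce the same cleaned string.
assert with_join() == with_re() == with_translate()

for fn in (with_join, with_re, with_translate):
    seconds = timeit.timeit(fn, number=1000)
    print(f"{fn.__name__}: {seconds:.4f} s")
```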
