Removing many types of chars from a Python string
I have some string X and I wish to remove semicolons, periods, commas, colons, etc., all in one go. Is there a way to do this that doesn't require a big chain of .replace(somechar, "") calls?
You can use re.sub to pattern match and replace. The following replaces only h and i with empty strings:
In [1]: import re
In [2]: s = 'byehibyehbyei'
In [3]: re.sub('[hi]', '', s)
Out[3]: 'byebyebye'
Don't forget to import re.
>>> import re
>>> foo = "asdf;:,*_-"
>>> re.sub('[;:,*_-]', '', foo)
'asdf'
[;:,*_-] - List of characters to be matched
'' - Replace each match with nothing
foo - The string to search
The hyphen is placed last in the character class so that it is treated as a literal character rather than as a range.
For more information take a look at the re.sub(pattern, repl, string, count=0, flags=0) documentation.
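If the characters to remove come from a variable rather than a hand-written pattern, some of them (such as ], ^, - or \) can change the meaning of the character class. A minimal sketch of one way to handle that, assuming a hypothetical helper named remove_chars that is not part of the original answer:

```python
import re

def remove_chars(text, chars):
    """Remove every character in `chars` from `text` in one regex pass.

    `remove_chars` is an illustrative helper name, not a library function.
    """
    if not chars:
        # An empty class "[]" is not a valid pattern, so return early.
        return text
    # re.escape neutralises characters such as ']', '^', '-' and '\' that
    # would otherwise be interpreted inside the character class.
    pattern = "[" + re.escape(chars) + "]"
    return re.sub(pattern, "", text)

print(remove_chars("asdf;:,*_-", ";:,*_-"))  # asdf
print(remove_chars("a-b^c]d;e", "-^];"))     # abcde
```

Escaping the whole set is slightly more than strictly necessary, but it means the order of the characters no longer matters.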
You can use the translate method with a first argument of None:
string2 = string1.translate(None, ";.,:")
Alternatively, you can use the filter function:
string2 = filter(lambda x: x not in ";,.:", string1)
Note that both of these options only work for non-Unicode strings and only in Python 2. In Python 3, str.translate accepts only a translation table, and filter returns an iterator rather than a string.
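For Python 3, roughly equivalent code might look like the sketch below. It relies on the three-argument form of str.maketrans, whose third argument lists characters to delete; the sample value of string1 and the name string3 are made up for illustration.

```python
string1 = "a;b.c,d:e"  # sample input, chosen only for illustration

# Python 3: the third argument of str.maketrans lists characters to delete.
table = str.maketrans("", "", ";.,:")
string2 = string1.translate(table)

# filter() returns an iterator in Python 3, so join it back into a str.
string3 = "".join(filter(lambda ch: ch not in ";.,:", string1))

print(string2)  # abcde
print(string3)  # abcde
```

The translation table can be built once and reused for many strings.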
I don't know about the speed, but here's another example that doesn't use re.
commas_and_stuff = ",+;:"
words = "words; and stuff!!!!"
cleaned_words = "".join(c for c in words if c not in commas_and_stuff)
Gives you:
'words and stuff!!!!'
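On the speed question, a rough way to compare the approaches in this thread is timeit. The following is only a sketch for Python 3: the function names, the repeated test string and the repetition count are arbitrary choices, and the timings will vary by machine and Python version, so none are shown here.

```python
import re
import timeit

CHARS = ",+;:"
TEXT = "words; and stuff!!!!" * 1000  # arbitrary test input

def with_join(text):
    # Generator expression plus str.join, as in the answer above.
    return "".join(c for c in text if c not in CHARS)

def with_regex(text, _pattern=re.compile("[" + re.escape(CHARS) + "]")):
    # Character class compiled once and reused on every call.
    return _pattern.sub("", text)

def with_translate(text, _table=str.maketrans("", "", CHARS)):
    # Python 3 deletion table built once and reused on every call.
    return text.translate(_table)

# Sanity check: all three approaches produce the same result.
assert with_join(TEXT) == with_regex(TEXT) == with_translate(TEXT)

for fn in (with_join, with_regex, with_translate):
    seconds = timeit.timeit(lambda: fn(TEXT), number=100)
    print(f"{fn.__name__}: {seconds:.4f} s")
```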