Removing \xa0, \n, \t from a Python string
I have a list item that I've converted into a string:
[u'\n Door:\xa0Novum \t ']
I need to remove everything so that I'm left with:
Door:Novum
I have tried various methods:
string = string.replace("\xa0", "")
string.rstrip('\n')
string.translate(string.maketrans("\n\t", ""))
I am obviously doing something wrong, but can't figure out what.
You need to store the return value; strings are immutable, so methods return a new string with the change applied.
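To make the point about immutability concrete, here is a minimal sketch using the string from the question. The variable names are illustrative and not part of the original post.

```python
# Strings are immutable: methods return a new string and leave the original alone.
original = u'\n Door:\xa0Novum \t '

# The return value is discarded here, so nothing is kept. On top of that,
# rstrip('\n') only strips trailing newlines, and this string ends in a space,
# so even the returned string would be identical to the original.
original.rstrip('\n')
unchanged = original

# Binding the return value to a name is what actually keeps the change.
without_nbsp = original.replace(u'\xa0', u'')

print(repr(unchanged))
print(repr(without_nbsp))
```

Only the `replace` line in the question stored its result, which is why the other two attempts appeared to do nothing.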
You can use translate to remove all those characters in one pass, but use the unicode form of the method:
toremove = dict.fromkeys((ord(c) for c in u'\xa0\n\t '))
outputstring = inputstring.translate(toremove)
I'm assuming you wanted to get rid of spaces as well.
Demo:
>>> inputstring = u'\n Door:\xa0Novum \t '
>>> toremove = dict.fromkeys((ord(c) for c in u'\xa0\n\t '))
>>> outputstring = inputstring.translate(toremove)
>>> outputstring
u'Door:Novum'
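On Python 3, where every str is Unicode, an equivalent deletion table can also be built with `str.maketrans`. The following is a minimal sketch under that assumption; the helper name `strip_chars` is hypothetical and does not come from the original answer.

```python
def strip_chars(text, chars=u'\xa0\n\t '):
    """Return `text` with every character in `chars` deleted."""
    # The third argument to str.maketrans lists characters to map to None,
    # which str.translate treats as "delete this character".
    table = str.maketrans('', '', chars)
    return text.translate(table)


inputstring = u'\n Door:\xa0Novum \t '
print(repr(strip_chars(inputstring)))
```

The `dict.fromkeys(...)` table shown above works on Python 3 as well; `str.maketrans` is just a shorter way to build the same kind of mapping.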
A better method still would be to use str.split(), then join again:
outputstring = u''.join(inputstring.split())
\xa0, spaces, tabs and newlines are all included in what str.split() will split on, as well as carriage returns.
Demo:
>>> u''.join(inputstring.split())
u'Door:Novum'
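Since the question starts from a list, the split-and-join approach can be applied to each item in turn. This is a sketch assuming Python 3; the helper name `squeeze_whitespace` and the second list entry are made up for illustration.

```python
def squeeze_whitespace(text):
    """Remove all whitespace that str.split() recognises from `text`."""
    # With no arguments, split() breaks on any run of whitespace
    # and drops the empty strings at both ends.
    return u''.join(text.split())


# The first entry is from the question; the second is a made-up example
# that includes a carriage return.
items = [u'\n Door:\xa0Novum \t ', u'Window:\r\nOak\t']
cleaned = [squeeze_whitespace(item) for item in items]
print(cleaned)
```

Joining with `u' '` instead of `u''` would keep a single space between the pieces rather than removing whitespace entirely. On Python 2 this approach relies on the value being a unicode object, as in the question; a byte str generally does not treat \xa0 as whitespace.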
This is better because it is a lot faster for this job than using str.translate():
>>> import timeit
>>> timeit.timeit('inputstring.translate(toremove)', 'from __main__ import inputstring, toremove')
3.4527599811553955
>>> timeit.timeit('u"".join(inputstring.split())', 'from __main__ import inputstring')
0.5409181118011475
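The timings above come from the answerer's own interpreter session, so they are best treated as indicative. A self-contained way to repeat the comparison is sketched below, assuming Python 3.5 or later (for the `globals` argument of `timeit.timeit`); the `namespace` dict and the iteration count are choices made for this example.

```python
import timeit

inputstring = u'\n Door:\xa0Novum \t '
toremove = dict.fromkeys(ord(c) for c in u'\xa0\n\t ')

# Both approaches must agree before their speed is worth comparing.
assert inputstring.translate(toremove) == u''.join(inputstring.split())

# Pass the names explicitly instead of importing them from __main__.
namespace = {'inputstring': inputstring, 'toremove': toremove}
translate_time = timeit.timeit('inputstring.translate(toremove)',
                               globals=namespace, number=100000)
split_time = timeit.timeit('u"".join(inputstring.split())',
                           globals=namespace, number=100000)

print('translate:  %.4f s' % translate_time)
print('split/join: %.4f s' % split_time)
```

The absolute numbers, and possibly the ratio between them, will vary with the Python version and the machine, so it is worth measuring on the interpreter you actually use.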