
Pythonic way of replacing multiple characters

I've created a one-off function:

a = lambda x: x.replace('\n', '')
b = lambda y: y.replace('\t', '').strip()
c = lambda x: b(a(x))

Is there a Pythonic and compact way (a one-liner?) to do this that improves readability and performance? Mainly performance.

(Note: I know I can do lambda x: x.replace('\n', '').replace('\t', '').strip(), but that doesn't change anything. Preferably there's a built-in method for this type of problem that I'm not aware of. I also know that the performance improvements are negligible.)

Input: 'my \t\t\t test, case \ntest\n LoremIpsum'

Desired output: 'my test, case test LoremIpsum'

Option 1
str.translate
For starters, if you're replacing a lot of characters with the same thing, I'd 100% recommend str.translate.

>>> from string import whitespace as wsp
>>> '\n\ttext   \there\r'.translate(str.maketrans(dict.fromkeys(wsp, '')))
'texthere'

This syntax is valid with Python 3.x only. For Python 2.x, you will need to import string and use string.maketrans to build the mapping instead (or, for pure deletion, pass the characters to delete as the second argument, as in s.translate(None, wsp)).

If you want to keep the space character itself, that is, exclude it from the set of characters being removed, then:

wsp = set(wsp) - {' '}
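
Applied to the input from the question, this could look like the sketch below. The names TABLE and clean are illustrative and not part of the original answer; the trailing .strip() mirrors the question's own code.

```python
from string import whitespace

# Delete every whitespace character except the plain space.
# TABLE and clean are hypothetical names used for illustration.
TABLE = str.maketrans(dict.fromkeys(set(whitespace) - {' '}, ''))

def clean(text):
    """Remove tabs, newlines, etc., then trim leading/trailing whitespace."""
    return text.translate(TABLE).strip()

print(repr(clean('my \t\t\t test, case \ntest\n LoremIpsum')))
```

Spaces that surrounded the removed characters are kept, so 'my \t\t\t test' becomes 'my  test' with two spaces. Collapsing those runs would need an extra step.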

Option 2
re.sub
The regex equivalent of the above would be to use re.sub.

>>> import re
>>> re.sub(r'\s+', '', '\n\ttext   \there\r')
'texthere'

However, performance-wise, str.translate beats this hands down.
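
One way to check the performance claim on your own machine is a small timeit comparison. This is only a sketch: the test string, the repeat count, and the function names are arbitrary choices, and the timings will vary with input and Python version.

```python
import re
import timeit
from string import whitespace

# Arbitrary sample input: the example string repeated many times.
text = '\n\ttext   \there\r' * 1000

# Build the translation table and compile the pattern once, outside the timed code.
table = str.maketrans(dict.fromkeys(whitespace, ''))
pattern = re.compile(r'\s+')

def with_translate():
    return text.translate(table)

def with_re_sub():
    return pattern.sub('', text)

# Both approaches should produce the same cleaned string for this input.
assert with_translate() == with_re_sub()

for func in (with_translate, with_re_sub):
    seconds = timeit.timeit(func, number=200)
    print(f'{func.__name__}: {seconds:.4f} s for 200 runs')
```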

The improvements are pretty straightforward:

Drop the lambdas. str.replace() is already a function (a method), and in the first line of your snippet you define a function that calls another function and does nothing else. Why do you need the wrapping lambda? The same applies to the second line.

Use return values. In the docs for str.replace() we see:

Return a copy of the string with all occurrences of substring old replaced by new.

So you can call replace() once, then call it a second time on the result.

To sum up, you'll have:

c = x.replace('\n', '').replace('\t', '').strip()

Note: if you have many characters to remove, you'd better use str.translate(), but for just two of them str.replace() is far more readable.
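
If the cleanup is needed in more than one place, the chained calls can be wrapped in an ordinary def rather than an assigned lambda. A minimal sketch, where clean is a hypothetical name:

```python
def clean(text):
    """Remove newlines and tabs, then trim surrounding whitespace."""
    return text.replace('\n', '').replace('\t', '').strip()

print(repr(clean('\n\ttext   \there\r')))  # 'text   here'
```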

Cheers!
