What is the pythonic way to count the leading spaces in a string?

I know I can count the leading spaces in a string with this:

>>> a = "   foo bar baz qua   \n"
>>> print("Leading spaces", len(a) - len(a.lstrip()))
Leading spaces 3
>>>

But is there a more pythonic way?

Your way is pythonic but incorrect: it also counts other whitespace characters. To count only spaces, be explicit and use a.lstrip(' '):

>>> a = "   \r\t\n\tfoo bar baz qua   \n"
>>> print("Leading spaces", len(a) - len(a.lstrip()))
Leading spaces 7
>>> print("Leading spaces", len(a) - len(a.lstrip(' ')))
Leading spaces 3

You could use itertools.takewhile:

sum(1 for _ in itertools.takewhile(str.isspace, a))

And demonstrating that it gives the same result as your code:

>>> import itertools
>>> a = "    leading spaces"
>>> print(sum(1 for _ in itertools.takewhile(str.isspace, a)))
4
>>> print("Leading spaces", len(a) - len(a.lstrip()))
Leading spaces 4

I'm not sure whether this code is actually better than your original solution. It has the advantage that it doesn't create more temporary strings, but that's pretty minor (unless the strings are really big). I don't find either version to be immediately clear about what that line of code does, so I would definitely wrap it in a nicely named function if you plan on using it more than once (with appropriate comments in either case).
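
To make the "nicely named function" suggestion concrete, here is a minimal sketch of how the takewhile version might be wrapped. The function names are made up for illustration; they are not from any library.

```python
import itertools


def count_leading_whitespace(text):
    """Return how many whitespace characters (spaces, tabs, newlines, ...) start text."""
    # takewhile stops at the first character for which str.isspace is False,
    # so no stripped copy of the string is created.
    return sum(1 for _ in itertools.takewhile(str.isspace, text))


def count_leading_spaces(text):
    """Return how many literal space characters (' ') start text."""
    return sum(1 for _ in itertools.takewhile(lambda ch: ch == " ", text))


print(count_leading_whitespace("    leading spaces"))  # 4
print(count_leading_spaces("  \tmixed"))               # 2 (the tab is not counted)
```

Having two separate names also makes the spaces-versus-any-whitespace distinction visible at the call site.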

Just for variety, you could theoretically use regex. It's a little shorter, and looks nicer than the double call to len().

>>> import re
>>> a = "   foo bar baz qua   \n"
>>> re.search(r'\S', a).start() # index of the first non-whitespace char
3

Or alternatively:

>>> re.search('[^ ]', a).start() # index of the first non-space char
3

But I don't recommend this; according to a quick test I did, it's much less efficient than len(a) - len(a.lstrip()).
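
If you want to repeat that kind of quick test yourself, something along these lines should work. This is only a sketch: the helper names with_lstrip and with_regex are mine, and the timings will vary by machine and Python version, so no numbers are shown.

```python
import re
import timeit

a = "   foo bar baz qua   \n"


def with_lstrip(s):
    return len(s) - len(s.lstrip())


def with_regex(s):
    # Assumes s contains at least one non-whitespace character;
    # otherwise re.search returns None and .start() raises AttributeError.
    return re.search(r"\S", s).start()


# Both approaches agree on the sample string.
assert with_lstrip(a) == with_regex(a) == 3

for func in (with_lstrip, with_regex):
    seconds = timeit.timeit(lambda: func(a), number=100_000)
    print(f"{func.__name__}: {seconds:.3f}s for 100,000 calls")
```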

I recently had a similar task of counting indents, where I wanted to count a tab as four spaces:

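def indent(string: str) -> int:
    leading = string[:len(string) - len(string.lstrip())]  # also correct for all-whitespace input
    return sum(4 if char == '\t' else 1 for char in leading)  # == rather than is for str comparison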
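
Note that indent counts every tab as a flat four columns. If you would rather have tabs advance to the next tab stop, the way most editors render them, a different technique is str.expandtabs. A rough sketch, where indent_width and the tabsize default are my own choices:

```python
def indent_width(line, tabsize=4):
    """Width of the leading spaces/tabs of line, with tabs advancing to the next tab stop."""
    # Only spaces and tabs are treated as indentation here (an assumption).
    prefix = line[:len(line) - len(line.lstrip(" \t"))]
    # str.expandtabs pads each tab up to the next multiple of tabsize,
    # so " \t" is 4 columns wide here, not 1 + 4 = 5.
    return len(prefix.expandtabs(tabsize))


print(indent_width("\tfoo"))      # 4
print(indent_width(" \tfoo"))     # 4
print(indent_width("    \tfoo"))  # 8
```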

Using next and enumerate:

next((i for i, c in enumerate(a) if c != ' '), len(a))

For any whitespace:

next((i for i, c in enumerate(a) if not c.isspace()), len(a))
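
The len(a) default is what makes this safe for strings with no non-space character at all. A small sketch to show the edge cases (the function name leading_spaces is just for illustration):

```python
def leading_spaces(a):
    # The second argument to next() is the fallback returned when the generator
    # is exhausted, i.e. when every character in a is a space (or a is empty).
    return next((i for i, c in enumerate(a) if c != " "), len(a))


print(leading_spaces("   foo"))  # 3
print(leading_spaces("     "))   # 5
print(leading_spaces(""))        # 0
```

Without the default, next() would raise StopIteration for the last two inputs.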

You can use a regular expression:

import re

def count_leading_space(s):
    match = re.search(r"^\s*", s)
    return 0 if not match else match.end()

In [17]: count_leading_space("    asd fjk gl")                                  
Out[17]: 4

In [18]: count_leading_space(" asd fjk gl")                                     
Out[18]: 1

In [19]: count_leading_space("asd fjk gl")                                      
Out[19]: 0
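
As a side note, \s* can match zero characters, so this pattern matches every string and the None branch is never taken. A slightly shorter sketch under that observation, with a precompiled pattern (the name _LEADING_WS is hypothetical):

```python
import re

# Compiled once at import time; the name is only for illustration.
_LEADING_WS = re.compile(r"\s*")


def count_leading_space(s):
    # Pattern.match anchors at position 0, and \s* can match the empty string,
    # so match() never returns None here.
    return _LEADING_WS.match(s).end()


print(count_leading_space("    asd fjk gl"))  # 4
print(count_leading_space("asd fjk gl"))      # 0
```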

That looks... great to me. Usually I answer "Is X Pythonic?" questions with some functional magic, but I don't feel that approach is appropriate for string manipulation.

If there were a built-in to return only the leading spaces, and then take the len() of that, I'd say go for it, but AFAIK there isn't, and re and other solutions are absolutely overkill.
