How to find the index of the first non-whitespace character in a string in Python?
Scenario:
>>> a='   Hello world'
index = 3
In this case the index of "H" is 3. But I need a more general method: for any string that the variable 'a' holds, how do I find the index of the first character?
Alternative scenario:
>>> a='\tHello world'
index = 1
If you mean the first non-whitespace character, I'd use something like this:
>>> a='   Hello world'
>>> len(a) - len(a.lstrip())
3
Another one that is a little fun:
>>> import itertools
>>> sum(1 for _ in itertools.takewhile(str.isspace, a))
3
But I'm willing to bet that the first version is faster, as it does essentially this exact loop, only in C. Of course, it needs to construct a new string when it's done, but that's essentially free.
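The bet above is easy to check rather than take on faith. Here is a minimal sketch using timeit; the function names via_lstrip and via_takewhile and the sample string are made up for illustration, and the timings will vary by machine and Python version:

```python
import itertools
import timeit


def via_lstrip(s):
    # Count how many characters lstrip() removed from the left.
    return len(s) - len(s.lstrip())


def via_takewhile(s):
    # Count leading characters for which str.isspace() is true.
    return sum(1 for _ in itertools.takewhile(str.isspace, s))


if __name__ == "__main__":
    # Hypothetical input: 50 leading spaces, then some text.
    sample = " " * 50 + "Hello world"
    for fn in (via_lstrip, via_takewhile):
        seconds = timeit.timeit(lambda: fn(sample), number=20_000)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

Both functions return the same count, so only the speed differs.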
For completeness, if the string is empty or composed entirely of whitespace, both of these will return len(a) (which is invalid if you try to index with it):
>>> a = "foobar"
>>> a[len(a)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
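One way to guard against that out-of-range result is to wrap the lstrip approach in a small helper. This is only a sketch: the name first_non_whitespace_index and the choice of None as the "not found" value are assumptions, not part of the answer above.

```python
def first_non_whitespace_index(s):
    """Return the index of the first non-whitespace character, or None."""
    idx = len(s) - len(s.lstrip())
    # idx == len(s) means the string was empty or all whitespace,
    # so there is no valid index to return.
    return idx if idx < len(s) else None
```

Returning None rather than -1 is a deliberate choice here: -1 is itself a valid index in Python (it refers to the last character), so a caller who forgets to check would silently get the wrong character instead of an error.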
Using regex:
>>> import re
>>> a='   Hello world'
>>> re.search(r'\S',a).start()
3
>>> a='\tHello world'
>>> re.search(r'\S',a).start()
1
>>>
Function to handle the cases when the string is empty or contains only whitespace:
>>> def func(strs):
...     match = re.search(r'\S', strs)
...     if match:
...         return match.start()
...     else:
...         return 'No character found!'
...
>>> func('\t\tfoo')
2
>>> func('   foo')
3
>>> func(' ')
'No character found!'
>>> func('')
'No character found!'
You can also try:
a = '   Hello world'
a.index(a.lstrip()[0])
=> 3
It'll work as long as the string contains at least one non-whitespace character. We can be a bit more careful and check for that first:
a = ' '
-1 if not a or a.isspace() else a.index(a.lstrip()[0])
=> -1
Another method, just for fun... Using a special function!
>>> def first_non_space_index(s):
...     for idx, c in enumerate(s):
...         if not c.isspace():
...             return idx
...
>>> a = '   Hello world'
>>> first_non_space_index(a)
3
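The same loop can be written as a single expression with next() and a generator. This is a sketch of an alternative, not the answerer's code: the name first_non_space_index_or and the default parameter are hypothetical. Unlike the loop version, which implicitly returns None when nothing is found, it lets the caller pick the fallback value.

```python
def first_non_space_index_or(s, default=None):
    # next() returns the first index produced by the generator, or
    # `default` if the string is empty or all whitespace.
    return next((idx for idx, c in enumerate(s) if not c.isspace()), default)
```

For example, first_non_space_index_or('   Hello world') gives 3, and first_non_space_index_or('   ', default=-1) gives -1.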
Following mgilson's answer, you can use lstrip to strip any characters you'd like:
unwanted = ':!@#$%^&*()_+ \t\n'
a= ' _Hello world'
res = len(a) - len(a.lstrip(unwanted))
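A worked version of the snippet above, with an assumed input (two spaces and an underscore before the text) so that the result can be checked by hand. The helper name leading_count is hypothetical. Note that the argument to lstrip is treated as a set of characters to remove, not as a literal prefix.

```python
def leading_count(s, chars):
    # Number of leading characters of s that belong to the set `chars`.
    return len(s) - len(s.lstrip(chars))


unwanted = ':!@#$%^&*()_+ \t\n'
a = '  _Hello world'  # assumed input: two spaces, then an underscore
res = leading_count(a, unwanted)
print(res)  # 3: two spaces plus the underscore are skipped
```

As with the plain lstrip version, the result equals len(a) when every character is in the unwanted set, so check for that before indexing.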