How do I find the index of the first trailing whitespace character in Python?
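I'm writing a function that finds the index of the first trailing whitespace character in a string, but I'm not sure how to do it. Can someone show me?
For example, "i am here.   " has three spaces after the sentence, so the function should give me 10.
The input is a Python source file that has been split into lines (a list of strings).
This is what I tried: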
alplist = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"]
space = [' ', ',', '.', '(', ')', ':', ':']

def TRAIL_WHITESPACE(python_filename):
    whitespace = []
    LINE_NUMBER = 1
    index = 0
    for item in lines:
        for index in range(len(item)):
            if len(item) > 0:
                if item[index] + item[index + 1] in alplist + space:
                    index = index
                    if item[index:] in " ":
                        whitespace.append({'ERROR_TYPE': 'TRAIL_WHITESPACE',
                                           'LINE_NUMBER': str(LINE_NUMBER),
                                           'COLUMN': str(index),
                                           'INFO': '',
                                           'SOURCE_LINE': str(lines[len(item) - 1])})
                        LINE_NUMBER += 1
                    else:
                        LINE_NUMBER += 1
                else:
                    LINE_NUMBER += 1
            else:
                LINE_NUMBER += 1
    return whitespace
Thanks.
This is easy to do with the str.rstrip() method:
#!/usr/bin/env python3

# Find the index of any trailing whitespace in string s.
# If there is none, the result equals len(s).
def trail(s):
    return len(s.rstrip())

for s in ("i am here.   ", "nospace", "    no trail", "All sorts of spaces \t \n", ""):
    i = trail(s)
    print(repr(s), i, repr(s[:i]))

Output

'i am here.   ' 10 'i am here.'
'nospace' 7 'nospace'
'    no trail' 12 '    no trail'
'All sorts of spaces \t \n' 19 'All sorts of spaces'
'' 0 ''
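To connect the rstrip() idea back to the question's list of lines, here is a minimal sketch. The function name find_trailing_whitespace is hypothetical, the dict keys are copied from the question's attempt, and it assumes that the line terminator itself should not count as trailing whitespace.

```python
def find_trailing_whitespace(lines):
    """Return one report dict per line that ends in whitespace.

    COLUMN is the 0-based index of the first trailing whitespace
    character; the line terminator is ignored (an assumption).
    """
    reports = []
    for line_number, line in enumerate(lines, start=1):
        # Drop the line terminator first so a bare "\n" is not reported.
        body = line.rstrip("\r\n")
        column = len(body.rstrip())
        if column < len(body):
            reports.append({
                "ERROR_TYPE": "TRAIL_WHITESPACE",
                "LINE_NUMBER": str(line_number),
                "COLUMN": str(column),
                "INFO": "",
                "SOURCE_LINE": line,
            })
    return reports


sample = ["i am here.   \n", "nospace\n", "    no trail\n", "\n"]
for report in find_trailing_whitespace(sample):
    print(report["LINE_NUMBER"], report["COLUMN"])
```

Only the first sample line has trailing whitespace, so only that line produces a report. Using enumerate() keeps the line counter in step with the loop, which avoids the repeated LINE_NUMBER += 1 branches in the original attempt.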
You could try a regular expression. Something like this:
import re

my_re = re.compile(r'\S\s')
res = my_re.search("some long string")
if res:
    print("start: {}, end: {}".format(res.start(0), res.end(0)))
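Note that the \S\s pattern matches the first whitespace character that follows any non-whitespace character, which is not necessarily trailing: in "some long string" it matches at the first space (start 3, end 5). A variant anchored at the end of the string might look like the sketch below; the names TRAILING_WS and trailing_ws_index are hypothetical, and returning -1 for "no trailing whitespace" is an assumed convention.

```python
import re

# One or more whitespace characters running up to the very end of the string.
TRAILING_WS = re.compile(r"\s+\Z")


def trailing_ws_index(s):
    """Return the index where trailing whitespace starts, or -1 if none."""
    m = TRAILING_WS.search(s)
    return m.start() if m else -1


for s in ("some long string", "i am here.   ", "tabs\t \n", ""):
    print(repr(s), trailing_ws_index(s))
```

Because \s also matches "\n", a line read from a file with its newline still attached will always report an index; strip the terminator first if that is not wanted.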
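As @Alexey said, a regular expression seems like the way to go.
The following should do what you want. Note that "whitespace" includes newline characters.
Call it like this: list_of_indexes = find_ws("/path/to/file.txt")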
import re

def find_ws(filename):
    """Return a list of indexes, each indicating the location of the
    first trailing whitespace character on a line. Return an index of
    -1 if there is no trailing whitespace character (at the end of a file)"""
    # Read all lines; each keeps its trailing newline, if any
    with open(filename) as f:
        text = f.readlines()
    # Any characters, then whitespace, then end of line
    # Use a non-"greedy" match
    # Make the whitespace before the end of the line a group
    match_space = re.compile(r'^.*?\S*?(\s+?)\Z')
    indexes = []
    for s in text:
        m = match_space.match(s)
        if m is None:
            indexes.append(-1)
        else:
            # find the start of the matching group
            indexes.append(m.start(1))
    return indexes
Documentation on regular expressions in Python is available in the official Python documentation for the re module.