
How do I find the index of the first trailing whitespace in Python?

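I am writing a function that finds the index of the first trailing whitespace character in a string, but I am not sure how to do it. Can someone show me?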

For example, take "i am here.   " with three spaces after the sentence. The function should give me 10.

The input is a Python text file that has been split into sentences (a list of strings).

This is what I have tried:

alplist = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"] 
space = [' ', ',', '.', '(', ')', ':', ':']

def TRAIL_WHITESPACE(python_filename): 
    whitespace = [] 
    LINE_NUMBER = 1 
    index = 0 
    for item in lines: 
        for index in range(len(item)): 
            if len(item) > 0: 
                if item[index] + item[index + 1] in alplist + space: 
                    index = index 
                    if item[index:] in " ": 
                        whitespace.append({'ERROR_TYPE':'TRAIL_WHITESPACE', 'LINE_NUMBER': str(LINE_NUMBER),'COLUMN': str(index),'INFO': '','SOURCE_LINE': str(lines[ len(item) - 1])}) 
                        LINE_NUMBER += 1 
                    else: 
                        LINE_NUMBER += 1 
                else: 
                    LINE_NUMBER += 1 
            else: 
                LINE_NUMBER += 1 
    return whitespace

Thanks.

This is easily done with the str.rstrip() method:

#! /usr/bin/env python

#Find index of any trailing whitespace of string s
def trail(s):
    return len(s.rstrip())

for s in ("i am here. ", "nospace", "   no  trail", "All sorts of spaces \t \n", ""):
    i = trail(s)
    print(repr(s), i, repr(s[:i]))

Output

'i am here. ' 10 'i am here.'
'nospace' 7 'nospace'
'   no  trail' 12 '   no  trail'
'All sorts of spaces \t \n' 19 'All sorts of spaces'
'' 0 ''
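As a rough sketch of how the rstrip() idea could be applied to the list of lines described in the question, the function below builds one report entry per offending line. The function name trailing_whitespace_report is hypothetical, the dictionary keys are copied from the question's code, and treating the line ending itself as "not an error" is an assumption rather than something the question states.

```python
def trailing_whitespace_report(lines):
    """Return one dict for every line that ends in spaces or tabs."""
    report = []
    for line_number, line in enumerate(lines, start=1):
        # Assumption: the line terminator itself is not reported.
        body = line.rstrip("\r\n")
        # Length of the right-stripped text == index of the first trailing whitespace.
        column = len(body.rstrip())
        if column < len(body):
            report.append({
                "ERROR_TYPE": "TRAIL_WHITESPACE",
                "LINE_NUMBER": str(line_number),
                "COLUMN": str(column),
                "INFO": "",
                "SOURCE_LINE": body,
            })
    return report


if __name__ == "__main__":
    for entry in trailing_whitespace_report(["i am here.   \n", "nospace\n", "x = 1 \t"]):
        print(entry)
```

Using enumerate(lines, start=1) keeps the line counter tied to the line being examined, so it cannot drift the way a manually incremented counter inside a nested loop can.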

You could try using a regular expression. Something like this:

import re

my_re = re.compile(r'\S\s')

res = my_re.search("some long string")

if res:
    print("start: {}, end: {}".format(res.start(0), res.end(0)))
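Note that the pattern \S\s above matches the first non-whitespace character that is followed by a whitespace character, so for "some long string" it reports start 3 and end 5 rather than a trailing position. A minimal sketch of a variant anchored to the end of the string might look like the following; the names TRAILING_WS and first_trailing_ws are made up for illustration, and returning -1 when nothing matches is an assumed convention.

```python
import re

# One or more whitespace characters immediately before the end of the string.
TRAILING_WS = re.compile(r"\s+\Z")


def first_trailing_ws(s):
    """Return the index of the first trailing whitespace character, or -1."""
    m = TRAILING_WS.search(s)
    return m.start() if m else -1


if __name__ == "__main__":
    for s in ("i am here.   ", "nospace", "   no  trail", "a b  \n", ""):
        print(repr(s), first_trailing_ws(s))
```

Because \s also matches newline characters, a line that still carries its "\n" will always report at least the position of that newline.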

As @Alexey said, a regular expression seems to be the way to go.

The following should do what you want. Note that "whitespace" includes newline characters.

Call it like this: list_of_indexes = find_ws("/path/to/file.txt")

import re

def find_ws(filename):
    """Return a list of indexes, each indicating the location of the 
    first trailing whitespace character on a line.  Return an index of 
    -1 if there is no trailing whitespace character (at the end of a file)"""

    # Read all lines; the context manager closes the file afterwards
    with open(filename) as f:
        text = f.readlines()

    # Any characters, then whitespace, then end of line
    # Use a non-"greedy" match
    # Make the whitespace before the end of the line a group
    match_space = re.compile(r'^.*?\S*?(\s+?)\Z') 

    indexes = []

    for s in text:
        m = match_space.match(s)
        if m is None:
            indexes.append(-1)
        else:
            # find the start of the matching group
            indexes.append(m.start(1)) 

    return indexes
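Because the newline counts as whitespace, find_ws reports the position of the "\n" for every line that ends with one, even when the line is otherwise clean. If only spaces and tabs should count, one possible variation is sketched below. It works on text already read into a string, and both the function name find_trailing_blanks and the choice to ignore line endings are assumptions, not part of the answer above.

```python
import re

# Spaces or tabs immediately before the end of a line (line ending already removed).
_TRAILING_BLANKS = re.compile(r"[ \t]+\Z")


def find_trailing_blanks(text):
    """Return, per line, the index of the first trailing space/tab, or -1."""
    indexes = []
    # splitlines() drops the line terminators, so they are never counted.
    for line in text.splitlines():
        m = _TRAILING_BLANKS.search(line)
        indexes.append(m.start() if m else -1)
    return indexes


if __name__ == "__main__":
    sample = "i am here.   \nclean\n\tx \t\n"
    print(find_trailing_blanks(sample))
```

With this variation, a value of -1 means the line has no trailing spaces or tabs, regardless of whether it ends in a newline.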

Documentation on regular expressions in Python is available (see the re module).
