Issues using .startswith() for a specific location in a string

I have a text file with many lines of data in it. I need to check each line of this file and process the data it contains accordingly (i.e. save it to a separate, tabulated .txt file for analysis).

The text file is in the following format:

  • Number 1 or 0 (denoting relevance of data)
  • An ID for each line (referring to what the data is)
  • The data itself (contained in the rest of the line)

So this is what two example lines may look like:

1 ID:K-95 list of data

0 ID:D-56 list of other data

Here the first line has relevant data for ID K-95 and the second has irrelevant data for ID D-56.

I want to parse the text file and sort the data contained in each line by relevance (0 or 1) and data ID, i.e. save all lines with the same ID in order of relevance (first all the lines with 1, then those with 0). Lines can have the same ID but different data. Lines are also always of a fixed length.

To do this I came up with:

idtag = input('Enter ID:')

with open("example.txt", 'r') as f:
    for line in f.readlines():
        if line.startswith('1') and line.startswith(idtag, 5, 3):
            print(line)

I'm having trouble with this, however, specifically with the second condition after the and operator. I can print/select lines based on whether there is a 0 or 1, no problem. However, using the .startswith() method with a defined position seems to return nothing: no error, no printing. It simply executes and outputs nothing.

Any ideas? Maybe there is a better way of parsing this data to meet my objective?

For str.startswith, start and end are interpreted as absolute positions (specifically, end is not interpreted relative to start):

str.startswith(prefix[, start[, end]])

Return True if string starts with the prefix, otherwise return False. prefix can also be a tuple of prefixes to look for. With optional start, test string beginning at that position. With optional end, stop comparing string at that position.

So instead of

line.startswith(idtag, 5, 3)

you need to use (the ID starts at index 5 and is 4 characters long, so end is 5+4)

line.startswith(idtag, 5, 5+4)

The two parameters are equivalent to slicing notation:

line[5:5+4].startswith(idtag)

For example:

>>> a = 'abcdefg'
>>> a.startswith('c', 2, 1)
False
>>> a[2:1]
''

>>> a.startswith('c', 2)
True
>>> a[2:]
'cdefg'

>>> a.startswith('c', 2, 3)
True
>>> a[2:3]
'c'
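
Applied to the sample lines from the question, this could look like the sketch below. The helper name has_id_at and the constant ID_START are hypothetical, and the code assumes the ID text always begins at index 5 (right after "1 ID:"). Computing end from len(idtag) avoids hard-coding the ID length.

```python
ID_START = 5  # assumption: the ID text begins at index 5, right after "1 ID:"


def has_id_at(line, idtag, start=ID_START):
    """Return True if idtag occurs in line exactly at position start."""
    # end is an absolute index, so it is start + len(idtag), not a length.
    return line.startswith(idtag, start, start + len(idtag))


# The two example lines from the question.
lines = [
    "1 ID:K-95 list of data",
    "0 ID:D-56 list of other data",
]

for line in lines:
    print(has_id_at(line, "K-95"), line)
```

Since startswith only compares the prefix, line.startswith(idtag, 5) with no end argument gives the same result here.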

I realise there's already an answer, but as an alternative you could also just check whether idtag exists in the line:

idtag = input('Enter ID:')

with open("example.txt", 'r') as f:
    for line in f.readlines():
        # Keep relevant lines (leading '1') that mention the ID anywhere
        if line.startswith('1') and idtag in line:
            print(line)
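
Note that idtag in line also matches when the ID text happens to appear in the data part of the line. For the broader goal in the question (grouping lines by ID, relevant ones first), one possible approach is to split each line into its fields and sort on them. This is only a sketch: parse_line and the sample lines are made up for illustration, and it assumes every line has the "N ID:XXX data" layout with whitespace between fields.

```python
def parse_line(line):
    """Split one line into (relevance, id, data), assuming 'N ID:XXX data'."""
    parts = line.split(None, 2)           # at most three whitespace-separated fields
    relevance = int(parts[0])             # 1 = relevant, 0 = irrelevant
    id_value = parts[1].partition(":")[2]  # text after "ID:"
    data = parts[2].rstrip("\n") if len(parts) > 2 else ""
    return relevance, id_value, data


# Hypothetical sample lines in the format described in the question.
lines = [
    "0 ID:K-95 other data",
    "1 ID:D-56 some data",
    "1 ID:K-95 list of data",
    "0 ID:D-56 list of other data",
]

records = [parse_line(line) for line in lines]

# Sort by ID, then by relevance with 1 before 0 (hence the negation).
ordered = sorted(records, key=lambda r: (r[1], -r[0]))

for relevance, id_value, data in ordered:
    print(relevance, id_value, data, sep="\t")
```

Because sorted is stable, lines with the same ID and relevance keep their original order from the file.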
