Reading a certain section of a file in Python

Question

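I am trying to follow the answer given here:

on reading only the lines after a specific phrase, where I went the boolean route (the second answer).

I only need to get the numbers that sit between an opening and a closing section marker in a file: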

<type>
1 
2
3
</type>

However, when I use this code:

found_type = False
t_ype = [] 
with open('test.xml', 'r') as f:
    for line in f:
        if '<type>' in line:
            found_type = True
        if found_type:
            if '</type>' in line:
               found_type = False               
            else:    
                t_line = str(line).rstrip('\n')
                t_ype.append(t_line)

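I can't skip the first line, and I get:

'<type>', '1','2','3'

where I only want

'1','2','3'

I need to stop appending to the list when I hit </type>, because I don't want that line in the list either.

I'm not sure what I'm doing wrong, and I can't ask on that page because my rep isn't high enough.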

Answer 1

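After detecting the "header", you have to skip the rest of the for loop body. In your code, you set found_type to True, and then the if found_type: check matches on that same line, so the <type> line itself gets appended.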

found_type = False
t_ype = [] 
with open('test.xml', 'r') as f:
    for line in f:
        if '<type>' in line:
            found_type = True
            continue                    # This is the only change to your code.
                                        # When the header is found, immediately go to the next line
        if found_type:
            if '</type>' in line:
               found_type = False               
            else:    
                t_line = str(line).rstrip('\n')
                t_ype.append(t_line)
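To check the fix without a file on disk, here is a minimal sketch that wraps the same loop in a function and feeds it in-memory data. The function name read_type_section and the SAMPLE text are hypothetical, made up for illustration; they are not part of the original question.

```python
import io

# Hypothetical sample standing in for the contents of test.xml.
SAMPLE = "header\n<type>\n1\n2\n3\n</type>\nfooter\n"

def read_type_section(lines):
    """Collect the lines strictly between <type> and </type>."""
    found_type = False
    collected = []
    for line in lines:
        if '<type>' in line:
            found_type = True
            continue  # skip the opening marker itself
        if found_type:
            if '</type>' in line:
                found_type = False  # closing marker: stop collecting
            else:
                collected.append(line.rstrip('\n'))
    return collected

result = read_type_section(io.StringIO(SAMPLE))
print(result)  # ['1', '2', '3']
```

Because the function accepts any iterable of lines, an open file object can be passed in exactly the same way.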

Answer 2

The simplest approach is a double loop with yield:

def section(fle, begin, end):
    with open(fle) as f:
        for line in f:
            # found start of section so start iterating from next line
            if line.startswith(begin):
                for line in f: 
                    # found end so end function
                    if line.startswith(end):
                        return
                    # yield every line in the section
                    yield line.rstrip()

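Then just call list(section('test.xml','<type>','</type>')), or iterate with for line in section('test.xml','<type>','</type>'): and use the lines. If you have repeating sections, swap the return for a break. You also don't need to call str on the lines, since they are already strings. If the file is large, the groupby approach mentioned in the comments may be a better alternative.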
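As a sketch of the repeating-sections variant, the version below replaces return with break so the outer loop keeps scanning for the next opening marker. The name sections and the sample list are hypothetical. To make it runnable without a file, it takes any iterable of lines and calls iter() on it so both loops share one iterator; a file object already behaves this way.

```python
def sections(lines, begin, end):
    """Yield stripped lines from every begin/end section in an iterable of lines."""
    it = iter(lines)  # both loops must consume the same iterator
    for line in it:
        if line.startswith(begin):
            for line in it:
                if line.startswith(end):
                    break  # leave the inner loop, keep scanning for the next section
                yield line.rstrip()

# Hypothetical input with two sections separated by an unrelated line.
sample = ["<type>\n", "1\n", "2\n", "</type>\n", "noise\n", "<type>\n", "3\n", "</type>\n"]
values = list(sections(sample, "<type>", "</type>"))
print(values)  # ['1', '2', '3']
```

With a real file, this would be called as sections(f, '<type>', '</type>') inside a with open(...) as f: block.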

2 answers

Answer 1: score 1, accepted, 2016-02-15 19:44:38

Answer 2: score 0, 2016-02-15 20:05:21