
Reading and grouping contents of a text file in Python

I have a text file to read in Python.

Contents:

line1
line2

line3
line4
line5

line6

....

Reading:

with open(path, encoding="utf8", errors='ignore') as f1:
    contents = f1.readlines()
    print(contents)

Output:

[line1, line2,.... line6]

But I want to read the contents grouped by the blank lines that separate them.

Expected output:

[[line1, line2], [line3,line4,line5], [line6]]

Is there a shorter approach than reading the entire file, iterating through the list, and then grouping on the blank lines? Any suggestions?

Something like this should do what you need:

In [8]: result = []

In [9]: with open(path, encoding="utf8", errors='ignore') as fh:
   ...:     group = []
   ...:     for l in fh:
   ...:         l = l.strip()
   ...:         if not l:
   ...:             if group:  # ignore leading or repeated blank lines
   ...:                 result.append(group)
   ...:                 group = []
   ...:         else:
   ...:             group.append(l)
   ...:     if group:
   ...:         result.append(group)
   ...:

In [10]: result
Out[10]: [['line1', 'line2'], ['line3', 'line4', 'line5'], ['line6']]
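
The same loop can also be wrapped in a generator, so it works on any iterable of lines and holds only one group in memory at a time. This is a minimal sketch, not part of the original answer: the name `iter_groups` is hypothetical, and an in-memory `io.StringIO` stands in for the real file.

```python
import io

def iter_groups(lines):
    """Yield lists of stripped, non-blank lines separated by blank lines."""
    group = []
    for line in lines:
        line = line.strip()
        if line:
            group.append(line)
        elif group:
            # A blank line closes the current group; repeated blanks are ignored.
            yield group
            group = []
    if group:
        # The last group has no trailing blank line to close it.
        yield group

# Demonstration on an in-memory file (assumed sample data);
# a file object returned by open() can be passed in the same way.
sample = io.StringIO("line1\nline2\n\nline3\nline4\nline5\n\nline6\n")
result = list(iter_groups(sample))
print(result)  # [['line1', 'line2'], ['line3', 'line4', 'line5'], ['line6']]
```

Because the function only iterates, it never needs `readlines()`, which may matter for large files.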

Or another (less readable) one-liner version using itertools.groupby:

from itertools import groupby
# Keep only the runs whose key is True, i.e. the runs of non-blank lines.
[list(g) for nonblank, g in groupby(open(path).read().splitlines(), lambda l: bool(l.strip())) if nonblank]
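
For readers who prefer the `groupby` idea in a more readable form, here is a hedged sketch that filters on the grouping key and strips each line. The function name `group_by_blank_lines` and the sample text are assumptions made for illustration.

```python
import io
from itertools import groupby

def group_by_blank_lines(lines):
    """Group consecutive non-blank lines; runs of blank lines act as separators."""
    return [
        [line.strip() for line in chunk]
        for has_text, chunk in groupby(lines, key=lambda line: bool(line.strip()))
        if has_text  # drop the runs of blank lines themselves
    ]

# Assumed sample data: includes a doubled blank line and a whitespace-only line.
sample = io.StringIO("line1\nline2\n\n\nline3\n   \nline4\n")
grouped = group_by_blank_lines(sample)
print(grouped)  # [['line1', 'line2'], ['line3'], ['line4']]
```

With a real file, calling the function inside a `with open(path, encoding="utf8", errors="ignore") as fh:` block and passing `fh` ensures the file is closed afterwards.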
