How to grab a block of lines from a text file using Python
I have a large file whose lines follow a pattern like this:
Headline type 1
=============================
Common line type 1
Headline type 2
Common line type 2
Some random text
Headline type 2
Common line type 2
Common line type 2
Headline type 2
Common line type 2
Common line type 2
Common line type 2
Common line type 2
Headline type 2
Common line type 2
Common line type 2
Headline type 2
Common line type 2
Common line type 2
Headline type 2
Headline type 2
Common line type 2
Common line type 2
Headline type 2
Headline type 2
Some random text
Headline type 1
===============================
Common line type 1
Headline type 2
Common line type 2
Some random text
Headline type 2
Common line type 2
Common line type 2
My question is how to grab each block of lines as a separate set, like this:
--------------------------------------------set1------------------------------------------------
Headline type 1
=============================
Common line type 1
Headline type 2
Common line type 2
Some random text
Headline type 2
Common line type 2
Common line type 2
Headline type 2
Common line type 2
Common line type 2
Common line type 2
Common line type 2
Headline type 2
Common line type 2
Common line type 2
Headline type 2
Common line type 2
Common line type 2
Headline type 2
Headline type 2
Common line type 2
Common line type 2
Headline type 2
Headline type 2
Some random text
--------------------------------------------set2------------------------------------------------
Headline type 1
=============================
Common line type 1
Headline type 2
Common line type 2
Some random text
Headline type 2
Common line type 2
Common line type 2
Headline type 2
Common line type 2
Common line type 2
Common line type 2
Common line type 2
I do not know how to use the line "=============================" to identify the start and end of a block of lines.
I would appreciate any help.
You can use a comparison inside a for loop to do that.
# only needed if you cannot guarantee that the first line is '===='
start_record = False
# store all lines that follow the same '====' line in the same list
results = []
# "input.txt" is a placeholder; use the path to your own file
with open("input.txt") as file:
    for line in file:
        # strip the trailing newline first, otherwise the set also contains '\n'
        stripped = line.strip()
        # converting a string to a set gives its unique characters;
        # if a line contains only '=', start a new list for a new block
        if set(stripped) == {'='}:
            start_record = True
            results.append([])
            continue
        if start_record:
            results[-1].append(line)
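Note that the loop above starts each block at the '====' line itself, so the "Headline type 1" line that sits just before it ends up at the end of the previous block (or is skipped for the first block). If you want each set to begin with that headline, as in the set1/set2 layout in the question, here is a minimal sketch of one way to do it. It assumes the headline is always the single line directly before the '====' line; the function name split_into_sets is only illustrative.

```python
def split_into_sets(lines):
    """Group lines into blocks, each starting with the line just before a '====' line."""
    blocks = []
    current = []
    for raw in lines:
        line = raw.rstrip("\n")
        # a separator is a non-empty line made up only of '=' characters
        if set(line.strip()) == {"="}:
            # assumption: the line right before the separator is the block's headline
            headline = current.pop() if current else None
            if current:
                blocks.append(current)
            current = [headline] if headline is not None else []
            current.append(line)
        else:
            current.append(line)
    # do not forget the last block
    if current:
        blocks.append(current)
    return blocks


if __name__ == "__main__":
    sample = [
        "Headline type 1\n",
        "=============================\n",
        "Common line type 1\n",
        "Some random text\n",
        "Headline type 1\n",
        "=============================\n",
        "Common line type 1\n",
    ]
    for number, block in enumerate(split_into_sets(sample), start=1):
        print(f"---- set{number} ----")
        print("\n".join(block))
```

To run it on a real file, pass the open file object, for example split_into_sets(open("input.txt")), where "input.txt" is again a placeholder path. Iterating over the file object reads it line by line, which matters for a large file.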