I have a large text file with lines in this format:
DELIM
filename1
information
information
DELIM
filename2
information
information
information
information
DELIM
and so on, where the amount of data in between the delimiters varies. How do I go about writing everything between the delimiters as a list?
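Provided that DELIM cannot appear in the in-between lines, you could do that quite easily by reading the whole file, calling str.split on DELIM, and then, in a list comprehension, dropping the empty pieces and splitting each remaining piece into its lines. My proposal: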
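with open("file.txt") as f:
    # strip() drops the newlines around each block; splitlines() keeps lines that contain spaces whole
    lines = [x.strip().splitlines() for x in f.read().split("DELIM") if x.strip()]
print(lines)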
Result with your input (a list of lists of lines):
[['filename1', 'information', 'information'], ['filename2', 'information', 'information', 'information', 'information']]
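To see why the empty pieces have to be filtered out, here is a small self-contained sketch that runs the same steps on an in-memory string instead of a file. The `sample` text and the `split_blocks` helper name are illustrative assumptions, not part of the original proposal.

```python
sample = (
    "DELIM\n"
    "filename1\n"
    "information\n"
    "information\n"
    "DELIM\n"
    "filename2\n"
    "information\n"
    "DELIM\n"
)


def split_blocks(text, delim="DELIM"):
    """Split text on delim and return each non-empty block as a list of lines."""
    blocks = []
    for chunk in text.split(delim):
        chunk = chunk.strip()  # drop the newlines left around each block
        if chunk:  # skip the empty pieces before the first and after the last delimiter
            blocks.append(chunk.splitlines())
    return blocks


raw_chunks = sample.split("DELIM")  # includes "" at the start and "\n" at the end
blocks = split_blocks(sample)
print(raw_chunks)
print(blocks)
```

The first and last entries of `raw_chunks` hold no data, which is why the comprehension needs a filter that also rejects whitespace-only pieces.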
Edit: with a big file, you could use itertools.groupby as follows (this avoids reading the whole file at once):
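import itertools
with open("file.txt") as f:
    # k is True for DELIM lines; keep the other groups, minus their trailing newlines
    lines = [[s.rstrip("\n") for s in v] for k, v in itertools.groupby(f, key=lambda x: x.strip() == "DELIM") if not k]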
groupby groups consecutive non-delimiter lines together, and consecutive delimiter lines together as well, under a True/False key. We filter out the groups whose key is True, which correspond to DELIM lines, and convert the remaining groups to lists. The result has the same list-of-lists structure as the previous code, but the file is read line by line rather than all at once, so it also works with a huge file.
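If even the final list of blocks is too large to hold in memory, the same groupby idea can be wrapped in a generator that hands back one block at a time. This is a minimal sketch under that assumption; `iter_blocks` is a hypothetical helper name, and `io.StringIO` stands in for a real file so the example runs on its own.

```python
import io
import itertools


def iter_blocks(lines, delim="DELIM"):
    """Yield each run of non-delimiter lines as a list, one block at a time."""
    def is_delim(line):
        return line.strip() == delim

    for found_delim, group in itertools.groupby(lines, key=is_delim):
        if not found_delim:
            # strip the line endings so each entry is just the line's text
            yield [line.rstrip("\n") for line in group]


# StringIO iterates line by line, like an open text file
fake_file = io.StringIO(
    "DELIM\nfilename1\ninformation\nDELIM\nDELIM\n"
    "filename2\ninformation\ninformation\nDELIM\n"
)
blocks = list(iter_blocks(fake_file))
print(blocks)
```

With a real file, `for block in iter_blocks(f):` inside a `with open(...)` block would process each group as it is read. Consecutive delimiter lines fall into a single True group, so they do not produce empty blocks.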