I'm trying to create a list of lists from a text file. My text file contains different categories, each containing three sentences. It looks something like this:
Sentence 1
Sentence 2
Sentence 3

Sentence 1
Sentence 2
Sentence 3

Sentence 1... etc.
I want to read these and save each category into a list, and then make a list of those lists/categories. Unfortunately, all of my attempts have failed so far, since they can't handle more than one line at a time. The blank line between the categories is intended as a separator.
You can use a list comprehension:
with open('file', 'r') as f:
    data = f.readlines()
result = [data[i:i+3] for i in range(0, len(data), 4)]
Here data holds every line of the file (the blank separator lines included), data[i:i+3] is one category, and the list comprehension steps through the file four lines at a time to build the list of categories.
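One caveat with the slicing approach: readlines() keeps the trailing newline on every line, and the step of 4 assumes exactly three sentences followed by one blank line per category. A minimal sketch that also strips the newlines, using an in-memory list in place of the file (the sample lines and the name categories are made up for illustration):

```python
# Stand-in for f.readlines(); each entry keeps its trailing newline.
data = [
    "Sentence 1\n", "Sentence 2\n", "Sentence 3\n", "\n",
    "Sentence 1\n", "Sentence 2\n", "Sentence 3\n", "\n",
    "Sentence 1\n",
]

# Step through the lines four at a time (three sentences plus one blank line)
# and keep the first three of each step, stripped of surrounding whitespace.
categories = [
    [line.strip() for line in data[i:i + 3]]
    for i in range(0, len(data), 4)
]

print(categories)
```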
You can use itertools.groupby:
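>>> from itertools import groupby
>>> with open('filename') as f:
...     # blank lines get key True, sentences get key False; keep only the False runs
...     lis = [[line.strip() for line in g]
...            for k, g in groupby(f, key=lambda x: not x.strip()) if not k]
...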
>>> lis
[['Sentence 1', 'Sentence 2', 'Sentence 3'],
['Sentence 1', 'Sentence 2', 'Sentence 3'],
['Sentence 1']]
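The key function is true for blank lines and false for everything else, so groupby yields alternating runs of blank and non-blank lines, and keeping only the non-blank runs gives the categories. A small sketch of the same idea as a reusable function for Python 3 (the name read_categories, the sample text, and the use of io.StringIO as a stand-in for the open file are assumptions for illustration):

```python
import io
from itertools import groupby


def read_categories(lines):
    """Group consecutive non-blank lines into lists; blank lines separate groups."""
    return [
        [line.strip() for line in group]
        for is_blank, group in groupby(lines, key=lambda line: not line.strip())
        if not is_blank
    ]


# io.StringIO can be iterated line by line, like an open text file.
sample = io.StringIO("A1\nA2\nA3\n\nB1\nB2\nB3\n\n\nC1\n")
result = read_categories(sample)
print(result)
```

Unlike the fixed-step slicing, this does not depend on every category having exactly three lines, or on there being exactly one blank line between categories.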
If the file is small then this is also fine:
>>> with open('abc1') as f:
...     print([[s.strip() for s in x.split('\n')] for x in f.read().strip().split('\n\n')])
...
[['Sentence 1', 'Sentence 2', 'Sentence 3'],
['Sentence 1', 'Sentence 2', 'Sentence 3'],
['Sentence 1']]
This can be done as a Python one-liner :)
result = list(list(l for l in e.split("\n") if l) for e in open("file").read().split("\n\n"))
How does it work?
open("file").read().split("\n\n") opens the file, reads it, and splits it into blocks separated by a blank line (two consecutive newlines).
list(l for l in e.split("\n") if l) splits one block (named e) into lines and makes a list from them. The if l filter drops empty strings, which appear if you used more than two newlines in a row or if the file ends with an empty last line.
The last step is to connect the two: result = list(expression_2 for e in expression_1). We simply apply expression_2 to every block from expression_1 and make a list from the results. Simple, and in one line :)
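For comparison, the same steps can be spelled out as separate statements. This is only a sketch: the string text stands in for the result of open("file").read(), and the variable names are invented here.

```python
# Stand-in for open("file").read()
text = (
    "Sentence 1\nSentence 2\nSentence 3\n\n"
    "Sentence 1\nSentence 2\nSentence 3\n\n"
    "Sentence 1\n"
)

# Step 1: split the text into blocks at each blank line.
blocks = text.split("\n\n")

# Step 2: split each block into lines, dropping empty strings
# (for example the one produced by the trailing newline).
result = []
for block in blocks:
    lines = [line for line in block.split("\n") if line]
    result.append(lines)

print(result)
```

Note that a long enough run of blank lines (for example three in a row) leaves an empty block behind, which ends up as an empty list in the result; it can be filtered out afterwards if that matters.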