I'm having trouble splitting the text in a data file. Suppose the data file consists of:
Row 1
apple
bob
cat
dog
ear
fun

Row 2
glow
horse
idea
joke
kick
lemon

Row 3
money
new
odd
park
queen
run
I want to split it so that it becomes a nested list like the following:
[[apple, bob], [cat, dog], [ear, fun]],
[[glow, horse], [idea, joke], [kick, lemon]],
[[money, new], [odd, park], [queen, run]]
This is my work so far:
def text_file(data_file):
    nested_list = []
    main_list = []
    my_list = ''
    for index in data_file:
        index = index.strip()
        if (index in my_list):
            main_list.append(nested_list)
            nested_list = []
        else:
            nested_list.append(index)
    if (nested_list):
        main_list.append(nested_list)
    return (main_list)
but this returns:
text_file(open("data_file.txt", "r"))
[['Row 1', 'apple', 'bob', 'cat', 'dog', 'ear', 'fun'],
['Row 2', 'glow', 'horse', 'idea', 'joke', 'kick', 'lemon'],
['Row 3', 'money', 'new', 'odd', 'park', 'queen', 'run']]
How can I achieve this without importing anything? If possible, what can I add to my code?
What you need to do is split the file on \n\n (two newlines), which gives you the groups, then split each group into lines, then use zip to step over those lines in pairs and build the required lists. For example:
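s = """Row 1
apple
bob
cat
dog
ear
fun

Row 2
glow
horse
idea
joke
kick
lemon

Row 3
money
new
odd
park
queen
run"""
# Each group is separated from the next by a blank line.
groups = s.split('\n\n')
for group in groups:
    words = group.splitlines()
    # words[0] is the "Row N" header, so pairing starts at index 1.
    print([[i, j] for i, j in zip(words[1::2], words[2::2])])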
[['apple', 'bob'], ['cat', 'dog'], ['ear', 'fun']]
[['glow', 'horse'], ['idea', 'joke'], ['kick', 'lemon']]
[['money', 'new'], ['odd', 'park'], ['queen', 'run']]
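The same idea can be wrapped in a function that returns the nested list instead of printing it. This is only a sketch: the name parse_groups is made up for illustration, and it assumes that groups are separated by a blank line and that each group starts with a header line.

```python
def parse_groups(text):
    """Return one list of [first, second] pairs per blank-line-separated group."""
    result = []
    for block in text.strip().split('\n\n'):
        words = block.splitlines()[1:]  # drop the "Row N" header line
        # Pair items 0-1, 2-3, ...; zip stops early, so an unpaired last item is dropped.
        result.append([[a, b] for a, b in zip(words[::2], words[1::2])])
    return result


# Shortened sample data (assumption: same layout as the file in the question).
sample = "Row 1\napple\nbob\ncat\ndog\n\nRow 2\nglow\nhorse\nidea\njoke"
print(parse_groups(sample))
```

With a real file, the text could be obtained with open("data_file.txt").read(), which needs no imports.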
Something like this, using regex and iterators. Use a regex to split at each "Row <number>" header, and then you can use either zip or an iterator to get the expected output.
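In [8]: import re
   ...: with open("data.txt") as f:
   ...:     # Split on the "Row N" headers; the first piece (before "Row 1") is empty.
   ...:     spl = re.split(r"Row \d+", f.read())[1:]
   ...:     for x in spl:
   ...:         sp = x.split()
   ...:         it = iter(sp)
   ...:         # Each next(it) call consumes one word, so every pass takes two.
   ...:         print([[next(it), next(it)] for _ in range(len(sp) // 2)])
   ...: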
[['apple', 'bob'], ['cat', 'dog'], ['ear', 'fun']]
[['glow', 'horse'], ['idea', 'joke'], ['kick', 'lemon']]
[['money', 'new'], ['odd', 'park'], ['queen', 'run']]
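Since the question asks for a solution without imports, the header split can also be done without re. The following is a rough sketch, not the answerer's code: split_rows is a hypothetical name, and it assumes every header line starts with "Row " and that no data line does. It pairs items with zip(it, it), which draws two values from the same iterator on each step.

```python
def split_rows(lines):
    """Group lines under each 'Row N' header and pair consecutive items."""
    groups = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank separator lines
        if line.startswith("Row "):
            groups.append([])  # a header starts a new group
        elif groups:
            groups[-1].append(line)  # lines before the first header are skipped

    paired = []
    for items in groups:
        it = iter(items)
        # zip(it, it) reads two items from the same iterator per pair;
        # an unpaired trailing item is silently dropped.
        paired.append([[a, b] for a, b in zip(it, it)])
    return paired


print(split_rows(["Row 1", "apple", "bob", "cat", "dog", "", "Row 2", "glow", "horse"]))
```

Because a file object yields its lines when iterated, split_rows(open("data.txt")) should work the same way as the list used here.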
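if (nested_list):
    new_list = nested_list[1:]  # drop the "Row N" header
    # list(...) is needed in Python 3, where zip returns a lazy iterator of tuples
    main_list.append([list(pair) for pair in zip(new_list[::2], new_list[1::2])])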
Try this out.
Instead of appending the nested list to the main list as it is, the code above first drops the "Row" header, forms pairs of consecutive elements, and then appends those pairs.
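For completeness, here is one way the pairing step might be folded into the function from the question so that it applies to every group, not only the last one. Treat it as a sketch under assumptions: the helper name pairs is invented here, and groups are assumed to be separated by blank lines with a header line at the start of each.

```python
def text_file(data_file):
    def pairs(items):
        # Skip the "Row N" header, then pair consecutive elements as lists.
        rest = items[1:]
        return [list(p) for p in zip(rest[::2], rest[1::2])]

    main_list = []
    nested_list = []
    for line in data_file:
        line = line.strip()
        if not line:
            # A blank line ends the current group.
            if nested_list:
                main_list.append(pairs(nested_list))
            nested_list = []
        else:
            nested_list.append(line)
    if nested_list:
        # The last group has no trailing blank line.
        main_list.append(pairs(nested_list))
    return main_list


# A list of lines stands in for an open file here.
demo = ["Row 1\n", "apple\n", "bob\n", "cat\n", "dog\n", "\n", "Row 2\n", "glow\n", "horse\n"]
print(text_file(demo))
```

Calling text_file(open("data_file.txt", "r")) as in the question should then return the nested pairs rather than the flat groups.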