I have a txt file with data from various series. It looks like this:
2017.07.16
Games
7x01
60
1
2017.07.23
Games
7x02
60
1
2017.07.30
Games
7x03
60
1
...
Every five lines describe one episode of a series (the file contains several series), so the first five lines mean:
Date of air: 2017.07.16
Title: Games
SeasonXepisode: 7x01
Length: 60
Seen it or not: 1
I want to store all of this data in a dictionary, but I couldn't figure out a way to do so.
One of the ways I tried that didn't work:
series = []
with open("lista.txt", "rt", encoding="utf-8") as file:
    lines = file.readlines()
count = 0
while count < len(lines):
    serie = {"air_date": lines[count],
             "title": lines[count + 1],
             "season_episode": lines[count + 2],
             "length": lines[count + 3],
             "seen": lines[count + 4]}
    series.append(serie)
    count += 1
This also doesn't work:
while count < len(lines):
    count += 1
    air_date = lines[count]
    title = lines[count + 1]
    season_episode = lines[count + 2]
    length = lines[count + 3]
    seen = lines[count + 4]
    series.append([air_date, title, season_episode, length, seen])
This also doesn't work:
lg = len(lines)
for line in range(lg):
    serie = {"air_date": lines[0],
             "title": lines[1],
             "season_episode": lines[3],
             "length": lines[4]}
    series.append(serie)
I tried the Java way as well:
air_date = []
title = []
season_episode = []
length = []
seen = []
line = ""
count = 0
for lines in file:
    count += 1
    air_date[count - 1] = file.readline()
    title[count - 1] = file.readline()
    season_episode[count - 1] = file.readline()
    length[count - 1] = file.readline()
    seen[count - 1] = file.readline()
I'm very new to Python, and after Java and JavaScript it's quite strange why iterating doesn't work. Any ideas?
If the sample below is your data in a file, say data.txt:
2017.07.16
Games
7x01
60
1
2017.07.23
Games
7x02
60
1
2017.07.30
Games
7x03
60
1
Then you might want to try this:
import json

with open("data.txt") as f:
    # Strip the trailing newline from every line
    data = [l.strip() for l in f.readlines()]

# Split the flat list into groups of five lines, one group per episode
chunked = [data[i:i+5] for i in range(0, len(data), 5)]
your_list_of_dicts = [
    {
        "air_date": i[0],
        "title": i[1],
        "season_episode": i[2],
        "length": i[3],
        "seen": i[4],
    } for i in chunked
]
print(json.dumps(your_list_of_dicts, indent=2))
Output:
[
{
"air_date": "2017.07.16",
"title": "Games",
"season_episode": "7x01",
"length": "60",
"seen": "1"
},
{
"air_date": "2017.07.23",
"title": "Games",
"season_episode": "7x02",
"length": "60",
"seen": "1"
},
{
"air_date": "2017.07.30",
"title": "Games",
"season_episode": "7x03",
"length": "60",
"seen": "1"
}
]
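The step of 5 in range(0, len(data), 5) is what the attempts in the question were missing: the first while loop advances with count += 1, so records overlap and lines[count + 4] eventually raises IndexError. A minimal sketch of that same loop with the step corrected, assuming an in-memory list in place of the file (the helper name parse_episodes and the sample lines are made up for illustration):

```python
def parse_episodes(lines, size=5):
    """Group lines into one dict per episode (hypothetical helper)."""
    lines = [line.strip() for line in lines]
    series = []
    count = 0
    # Only read a record when all `size` lines of it are present.
    while count + size <= len(lines):
        series.append({
            "air_date": lines[count],
            "title": lines[count + 1],
            "season_episode": lines[count + 2],
            "length": lines[count + 3],
            "seen": lines[count + 4],
        })
        count += size  # jump to the start of the next record
    return series


# Stand-in for file.readlines(); real data would come from the file.
sample = ["2017.07.16\n", "Games\n", "7x01\n", "60\n", "1\n",
          "2017.07.23\n", "Games\n", "7x02\n", "60\n", "1\n"]
episodes = parse_episodes(sample)
print(episodes[1]["season_episode"])  # 7x02
```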
Make a generator function that chunks an iterator into groups of n elements, using itertools.islice:
from itertools import islice

def chunk(iterable, n):
    iterable = iter(iterable)
    # Take the next n items; an empty list means the iterator is exhausted
    ch = list(islice(iterable, 0, n))
    while ch:
        yield ch
        ch = list(islice(iterable, 0, n))

keys = ["Date of air", "Title", "SeasonXepisode", "Length", "Seen it or not"]
with open('input.txt') as fh:
    database = {}
    for ch in chunk(fh, 5):
        ch = map(str.strip, ch)
        epi = dict(zip(keys, ch))
        # Key each episode by "<title>_<season x episode>"
        database[f"{epi['Title']}_{epi['SeasonXepisode']}"] = epi
print(database)
Output:
{'Games_7x01': {'Date of air': '2017.07.16',
'Length': '60',
'SeasonXepisode': '7x01',
'Seen it or not': '1',
'Title': 'Games'},
'Games_7x02': {'Date of air': '2017.07.23',
'Length': '60',
'SeasonXepisode': '7x02',
'Seen it or not': '1',
'Title': 'Games'},
'Games_7x03': {'Date of air': '2017.07.30',
'Length': '60',
'SeasonXepisode': '7x03',
'Seen it or not': '1',
'Title': 'Games'}}
You could go a little further with collections.defaultdict and nest the episodes under each title. Since your main problem is accessing 5 successive elements, how you format the dict is your choice:
from collections import defaultdict

database = defaultdict(dict)
with open('input.txt') as fh:
    for ch in chunk(fh, 5):
        ch = map(str.strip, ch)
        epi = dict(zip(keys, ch))
        # Outer key is the title, inner key is the season/episode code
        database[f"{epi['Title']}"][f"{epi['SeasonXepisode']}"] = epi
print(database)
Output:
defaultdict(<class 'dict'>,
{'Games': {'7x01': {'Date of air': '2017.07.16',
'Length': '60',
'SeasonXepisode': '7x01',
'Seen it or not': '1',
'Title': 'Games'},
'7x02': {'Date of air': '2017.07.23',
'Length': '60',
'SeasonXepisode': '7x02',
'Seen it or not': '1',
'Title': 'Games'},
'7x03': {'Date of air': '2017.07.30',
'Length': '60',
'SeasonXepisode': '7x03',
'Seen it or not': '1',
'Title': 'Games'}}})
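As an alternative to the chunk generator, the same grouping can be sketched with the zip idiom that repeats one iterator five times. This swaps in a different technique from the answer above, and unlike chunk it silently drops a trailing group with fewer than five lines. The in-memory lines list is an assumption standing in for the open file:

```python
keys = ["Date of air", "Title", "SeasonXepisode", "Length", "Seen it or not"]

# Stand-in for iterating over the file handle.
lines = ["2017.07.16\n", "Games\n", "7x01\n", "60\n", "1\n",
         "2017.07.23\n", "Games\n", "7x02\n", "60\n", "1\n"]

stripped = (line.strip() for line in lines)
# Five references to the SAME iterator: each tuple zip builds
# consumes five consecutive lines.
groups = zip(*[stripped] * 5)

database = {f"{g[1]}_{g[2]}": dict(zip(keys, g)) for g in groups}
print(list(database))  # ['Games_7x01', 'Games_7x02']
```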
with open("lista.txt", "r") as f:
    # delete '\n' and strip white spaces in each element of the list
    data = [elem.strip() for elem in f.readlines()]

data_json = []
for i, elem in enumerate(data):
    # every fifth index starts a new episode record
    if i % 5 == 0:
        data_json.append(
            {
                "air_date": data[i],
                "title": data[i+1],
                "season_episode": data[i+2],
                "length": data[i+3],
                "seen": data[i+4]
            }
        )
print(data_json)
Output:
[{'air_date': '2017.07.16', 'title': 'Games', 'season_episode': '7x01', 'length': '60', 'seen': '1'}, {'air_date': '2017.07.23', 'title': 'Games', 'season_episode': '7x02', 'length': '60', 'seen': '1'}, {'air_date': '2017.07.30', 'title': 'Games', 'season_episode': '7x03', 'length': '60', 'seen': '1'}]
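Note that indexing data[i+4] directly raises IndexError if the number of lines is not a multiple of five, for example when the last record is cut short. One possible way to handle that, sketched here with a hypothetical helper to_records and made-up sample data, is to build records only from the complete groups and hand back whatever is left over:

```python
FIELDS = ("air_date", "title", "season_episode", "length", "seen")


def to_records(data, size=len(FIELDS)):
    """Return (records, leftover) so a truncated file is visible to the caller."""
    usable = len(data) - len(data) % size  # largest multiple of `size`
    records = [dict(zip(FIELDS, data[i:i + size]))
               for i in range(0, usable, size)]
    return records, data[usable:]


# One complete record followed by an incomplete one.
data = ["2017.07.16", "Games", "7x01", "60", "1", "2017.07.23", "Games"]
records, leftover = to_records(data)
print(len(records), leftover)  # 1 ['2017.07.23', 'Games']
```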