Easiest way to store data from txt file?

I have a txt file containing data for various series. It looks like this:

2017.07.16
Games
7x01
60
1
2017.07.23
Games
7x02
60
1
2017.07.30
Games
7x03
60
1
...

Every 5 lines describe 1 episode of a series (there are several series), so the first five look like this:

Date of air: 2017.07.16
Title: Games
SeasonXepisode: 7x01
Length: 60
Seen it or not: 1

I want to store all of this data in a dictionary, but I can't find a way to do it.

One of the things I tried that didn't work:

series = []
with open("lista.txt", "rt", encoding="utf-8") as file:
    lines = file.readlines()

count = 0

while count < len(lines):
    serie = {"air_date": lines[count],
             "title": lines[count + 1],
             "season_episode": lines[count + 2],
             "length": lines[count + 3],
             "seen": lines[count + 4]}
    series.append(serie)
    count += 1

This didn't work either:

while count < len(lines):
    count += 1
    air_date = lines[count]
    title = lines[count + 1]
    season_episode = lines[count + 2]
    length = lines[count + 3]
    seen = lines[count + 4]
    series.append([air_date, title, season_episode, length, seen])

Nor did this:

lg = len(lines)
    for line in range(lg):
        serie = {"air_date": lines[0],
                   "title": lines[1],
                   "season_episode": lines[3],
                   "length": lines[4]}
        series.append(serie)

I also tried the Java way:

air_date = []
title = []
season_episode = []
length = []
seen = []

line = ""
count = 0

for lines in file:
    count += 1
    air_date[count - 1] = file.readline()
    title[count - 1] = file.readline()
    season_episode[count - 1] = file.readline()
    length[count - 1] = file.readline()
    seen[count - 1] = file.readline()

I'm new to Python, and coming from Java and JavaScript it feels strange that the iteration doesn't work. Any ideas?

Assuming the sample below is the data in your file, say data.txt:

2017.07.16
Games
7x01
60
1
2017.07.23
Games
7x02
60
1
2017.07.30
Games
7x03
60
1

Then you might want to try this:

import json

with open("data.txt") as f:
    data = [l.strip() for l in f.readlines()]
    chunked = [data[i:i+5] for i in range(0, len(data), 5)]
    your_list_of_dicts = [
        {
            "air_date": i[0],
            "title": i[1],
            "season_episode": i[2],
            "length": i[3],
            "seen": i[4],
        } for i in chunked
    ]

    print(json.dumps(your_list_of_dicts, indent=2))

Output:

[
  {
    "air_date": "2017.07.16",
    "title": "Games",
    "season_episode": "7x01",
    "length": "60",
    "seen": "1"
  },
  {
    "air_date": "2017.07.23",
    "title": "Games",
    "season_episode": "7x02",
    "length": "60",
    "seen": "1"
  },
  {
    "air_date": "2017.07.30",
    "title": "Games",
    "season_episode": "7x03",
    "length": "60",
    "seen": "1"
  }
]
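
For reference, the first attempt in the question is close to working. A minimal sketch of a fix, assuming the file holds five-line records: advance the counter by 5 instead of 1 (stepping by 1 lets `lines[count + 4]` run past the end of the list, which raises IndexError), and strip the trailing newline that `readlines()` keeps on each line. The in-memory `lines` list below stands in for the file contents and is an assumption for illustration.

```python
# Stand-in for file.readlines(); each entry keeps its trailing newline.
lines = [
    "2017.07.16\n", "Games\n", "7x01\n", "60\n", "1\n",
    "2017.07.23\n", "Games\n", "7x02\n", "60\n", "1\n",
]

series = []
count = 0

# Stop once fewer than five lines remain, so an incomplete record is skipped.
while count + 5 <= len(lines):
    serie = {
        "air_date": lines[count].strip(),
        "title": lines[count + 1].strip(),
        "season_episode": lines[count + 2].strip(),
        "length": lines[count + 3].strip(),
        "seen": lines[count + 4].strip(),
    }
    series.append(serie)
    count += 5  # jump to the first line of the next record

print(series)
```

The "Java way" attempt fails for a different reason: assigning to `air_date[count - 1]` on an empty list raises IndexError, because Python lists only grow through `append()` and similar methods.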

Use itertools.islice to create a generator function that splits an iterator into chunks of n elements:

from itertools import islice

def chunk(iterable, n):
    iterable = iter(iterable)
    ch = list(islice(iterable, 0, n))
    while ch:
        yield ch
        ch = list(islice(iterable, 0, n))
        
keys = ["Date of air", "Title", "SeasonXepisode", "Length", "Seen it or not"]

with open('input.txt') as fh:
    database = {}
    for ch in chunk(fh, 5):
        ch = map(str.strip, ch)
        epi = dict(zip(keys, ch))
        database[f"{epi['Title']}_{epi['SeasonXepisode']}"] = epi

from pprint import pprint  # the output below is pretty-printed

pprint(database)

Output:

{'Games_7x01': {'Date of air': '2017.07.16',
                'Length': '60',
                'SeasonXepisode': '7x01',
                'Seen it or not': '1',
                'Title': 'Games'},
 'Games_7x02': {'Date of air': '2017.07.23',
                'Length': '60',
                'SeasonXepisode': '7x02',
                'Seen it or not': '1',
                'Title': 'Games'},
 'Games_7x03': {'Date of air': '2017.07.30',
                'Length': '60',
                'SeasonXepisode': '7x03',
                'Seen it or not': '1',
                'Title': 'Games'}}
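
One property of this chunking approach is worth knowing: if the number of lines is not a multiple of 5, the last chunk is shorter, and `dict(zip(keys, ch))` then builds a dictionary with missing keys without raising an error. A small sketch with an in-memory list (the `lines` data and the length check are illustrative assumptions, not part of the answer above):

```python
from itertools import islice

def chunk(iterable, n):
    """Yield successive lists of up to n items from iterable."""
    iterator = iter(iterable)
    while True:
        ch = list(islice(iterator, n))
        if not ch:
            return
        yield ch

keys = ["Date of air", "Title", "SeasonXepisode", "Length", "Seen it or not"]

# Seven lines: one complete record plus two leftover lines.
lines = ["2017.07.16", "Games", "7x01", "60", "1", "2017.07.23", "Games"]

chunks = list(chunk(lines, 5))
print([len(ch) for ch in chunks])  # [5, 2]

# Keep only complete records; zip() alone would quietly build a partial dict.
episodes = [dict(zip(keys, ch)) for ch in chunks if len(ch) == len(keys)]
print(len(episodes))  # 1
```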

You can go a step further and use collections.defaultdict, but since your main problem was accessing 5 consecutive elements, how you format the dictionary is up to you:

from collections import defaultdict

database = defaultdict(dict)

with open('input.txt') as fh:
    for ch in chunk(fh, 5):
        ch = map(str.strip, ch)
        epi = dict(zip(keys, ch))
        database[f"{epi['Title']}"][f"{epi['SeasonXepisode']}"] = epi

from pprint import pprint  # the output below is pretty-printed

pprint(database)

Output:

defaultdict(<class 'dict'>,
            {'Games': {'7x01': {'Date of air': '2017.07.16',
                                'Length': '60',
                                'SeasonXepisode': '7x01',
                                'Seen it or not': '1',
                                'Title': 'Games'},
                       '7x02': {'Date of air': '2017.07.23',
                                'Length': '60',
                                'SeasonXepisode': '7x02',
                                'Seen it or not': '1',
                                'Title': 'Games'},
                       '7x03': {'Date of air': '2017.07.30',
                                'Length': '60',
                                'SeasonXepisode': '7x03',
                                'Seen it or not': '1',
                                'Title': 'Games'}}})
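
A short usage sketch for the nested layout, assuming records shaped like the dictionaries built above (the `episodes` list, including the "0" flag, is made-up sample data). It also shows one caveat of `defaultdict`: indexing a missing key inserts an empty entry, so read-only checks are safer with `in` or `.get()`.

```python
from collections import defaultdict

# Illustrative records; only the fields needed for the demo are included.
episodes = [
    {"Title": "Games", "SeasonXepisode": "7x01", "Seen it or not": "1"},
    {"Title": "Games", "SeasonXepisode": "7x02", "Seen it or not": "0"},
]

database = defaultdict(dict)
for epi in episodes:
    database[epi["Title"]][epi["SeasonXepisode"]] = epi

# Direct lookup by title, then by season/episode code.
print(database["Games"]["7x02"]["Seen it or not"])  # 0

# Read-only checks that do not modify the defaultdict.
print("Other" in database)    # False
print(database.get("Other"))  # None

# Caveat: plain indexing on a missing title creates an empty entry.
_ = database["Other"]
print("Other" in database)    # True
```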

with open("lista.txt", "r") as f:
    # strip the trailing '\n' and surrounding whitespace from each line
    data = [elem.strip() for elem in f.readlines()]


data_json = []

# step through the list five lines at a time; i is the first line of each record
for i in range(0, len(data), 5):
    data_json.append(
        {
            "air_date": data[i],
            "title": data[i + 1],
            "season_episode": data[i + 2],
            "length": data[i + 3],
            "seen": data[i + 4],
        }
    )

print(data_json)

Output:

[{'air_date': '2017.07.16', 'title': 'Games', 'season_episode': '7x01', 'length': '60', 'seen': '1'}, {'air_date': '2017.07.23', 'title': 'Games', 'season_episode': '7x02', 'length': '60', 'seen': '1'}, {'air_date': '2017.07.30', 'title': 'Games', 'season_episode': '7x03', 'length': '60', 'seen': '1'}]
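
All of the approaches above keep every field as a string. If typed values are more convenient, one possible sketch converts each five-line record into a dataclass. The `Episode` class and `parse_episode` helper are hypothetical names introduced here, and the interpretation of the fields (length as a whole number, "1" meaning seen) is an assumption based on the sample data.

```python
from dataclasses import dataclass
from datetime import date, datetime

@dataclass
class Episode:  # hypothetical record type, not from the answers above
    air_date: date
    title: str
    season_episode: str
    length: int  # assumed to be a whole number; the unit is not stated
    seen: bool   # assumed: "1" means seen, anything else means not seen

def parse_episode(fields):
    """Convert five stripped strings into a typed Episode."""
    air_date, title, season_episode, length, seen = fields
    return Episode(
        air_date=datetime.strptime(air_date, "%Y.%m.%d").date(),
        title=title,
        season_episode=season_episode,
        length=int(length),
        seen=(seen == "1"),
    )

# Stand-in for the stripped lines read from the file.
data = ["2017.07.16", "Games", "7x01", "60", "1",
        "2017.07.23", "Games", "7x02", "60", "1"]

episodes = [parse_episode(data[i:i + 5]) for i in range(0, len(data), 5)]
print(episodes[0])
```

Unpacking the five fields in `parse_episode` raises ValueError for an incomplete record, which surfaces a truncated file early instead of producing a partial entry.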
