
How to parse for specific unique values from a JSON lines file with Python and store into an array

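The program needs to parse a JSON lines file and store the data in an array. The only data that actually needs to go into the array is the unique values found under "SRC/Word1".

Here is a sample of the JSON lines file: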

{"Event UTC": "2020-12-21 05:23:06", "Event Time": "00:23:06:94", "SRC/Word1": " ", "Word2": " ", "Word3": " "}
{"Event UTC": "2020-12-21 05:30:53", "Event Time": "00:30:53:95", "SRC/Word1": "E1F25701", "Word2": "A29C7E68", "Word3": " "}
{"Event UTC": "2020-12-21 05:31:04", "Event Time": "00:31:04:34", "SRC/Word1": "E1F25701", "Word2": "D529F3D7", "Word3": " "}
{"Event UTC": "2020-12-21 10:18:54", "Event Time": "05:18:54:45", "SRC/Word1": "E15511D7", "Word2": "1F6FC55C", "Word3": " "}

Here is the code I have so far:

import json

data = []
with open('stela_zerrl_t01_201222_084053_test.json') as fin:
    for line in fin:
        data.append(json.loads(line))
        print(data)

The data array should end up containing something like data = [E1F25701, E15511D7]

Any idea how to do this?

See below (data represents the lines loaded from the file):

data = [{"Event UTC": "2020-12-21 05:23:06", "Event Time": "00:23:06:94", "SRC/Word1": " ", "Word2": " ", "Word3": " "},
        {"Event UTC": "2020-12-21 05:30:53", "Event Time": "00:30:53:95", "SRC/Word1": "E1F25701", "Word2": "A29C7E68",
         "Word3": " "},
        {"Event UTC": "2020-12-21 05:31:04", "Event Time": "00:31:04:34", "SRC/Word1": "E1F25701", "Word2": "D529F3D7",
         "Word3": " "},
        {"Event UTC": "2020-12-21 10:18:54", "Event Time": "05:18:54:45", "SRC/Word1": "E15511D7", "Word2": "1F6FC55C",
         "Word3": " "}]
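# dict.fromkeys drops duplicates and keeps first-seen order; a plain set gives no order guarantee
data_sub_set = list(dict.fromkeys(x["SRC/Word1"] for x in data if x["SRC/Word1"].strip()))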
print(data_sub_set)

Output

['E1F25701', 'E15511D7']

A parsed JSON object just needs to be accessed like a dictionary. If you are looking for the SRC/Word1 field, you would ask for it like this:

import json

data = []
with open('stela_zerrl_t01_201222_084053_test.json') as fin:
    for line in fin:
        data.append(json.loads(line)['SRC/Word1'])  # note the field access here
print(data)  # print once, after the whole file has been read

However, you may want to omit empty-string entries, or add some error handling if the JSON does not always contain that field.
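As a rough illustration of that kind of error handling, here is a minimal sketch. The function name `extract_field` and the decision to skip (rather than abort on) malformed lines are assumptions made for this example, not something the question specifies. It accepts any iterable of text lines, so an open file object works too.

```python
import json


def extract_field(lines, key="SRC/Word1"):
    """Collect the value of `key` from each JSON line, skipping unusable input.

    `extract_field` is a hypothetical helper written for this example.
    """
    values = []
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines, such as a trailing newline
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            # Assumption: a malformed line is reported and skipped, not fatal
            print(f"skipping line {line_number}: {exc}")
            continue
        if not isinstance(record, dict):
            continue  # valid JSON, but not an object with fields
        value = record.get(key)  # None when the field is missing
        if isinstance(value, str) and value.strip():
            values.append(value)  # keep only non-blank string values
    return values
```

With the file from the question, this could be called as `with open('stela_zerrl_t01_201222_084053_test.json') as fin: data = extract_field(fin)`. Duplicates are kept here; only missing, blank, and malformed entries are dropped.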

Edit: just saw your "skip duplicates and omit empty entries" comment.

import json

data = []
with open('stela_zerrl_t01_201222_084053_test.json') as fin:
    for line in fin:
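        value = json.loads(line).get('SRC/Word1', '')
        # skip empty or whitespace-only values, and values already in the list
        # (''.isspace() is False, so strip() is used to catch the empty string too)
        if value.strip() and value not in data:
            data.append(value)
print(data)  # print once, after the whole file has been read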
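For larger files, checking `value not in data` against a list gets slower as the list grows. One possible variation, sketched below, tracks already-seen values in a set while still returning them in first-seen order. The helper name `unique_values` and the in-memory `sample_lines` are made up for this example; the sample rows are the ones from the question.

```python
import json


def unique_values(lines, key="SRC/Word1"):
    """Return the non-blank values of `key` in first-seen order, without duplicates.

    `unique_values` is a hypothetical helper written for this example.
    """
    seen = set()   # fast membership checks
    ordered = []   # preserves the order in which values first appear
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        value = json.loads(line).get(key, "")
        if value.strip() and value not in seen:
            seen.add(value)
            ordered.append(value)
    return ordered


# The four sample rows from the question, as an in-memory stand-in for the file
sample_lines = [
    '{"Event UTC": "2020-12-21 05:23:06", "Event Time": "00:23:06:94", "SRC/Word1": " ", "Word2": " ", "Word3": " "}',
    '{"Event UTC": "2020-12-21 05:30:53", "Event Time": "00:30:53:95", "SRC/Word1": "E1F25701", "Word2": "A29C7E68", "Word3": " "}',
    '{"Event UTC": "2020-12-21 05:31:04", "Event Time": "00:31:04:34", "SRC/Word1": "E1F25701", "Word2": "D529F3D7", "Word3": " "}',
    '{"Event UTC": "2020-12-21 10:18:54", "Event Time": "05:18:54:45", "SRC/Word1": "E15511D7", "Word2": "1F6FC55C", "Word3": " "}',
]
print(unique_values(sample_lines))  # ['E1F25701', 'E15511D7']
```

Because the function only iterates over lines, an open file object can be passed in place of `sample_lines`.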

