How to parse for specific unique values from a JSON lines file with Python and store into an array
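The program needs to parse a JSON Lines file and store data in an array. The only data that actually needs to go into the array is the value of the "SRC/Word1" field.
Here is a sample of the JSON Lines file: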
{"Event UTC": "2020-12-21 05:23:06", "Event Time": "00:23:06:94", "SRC/Word1": " ", "Word2": " ", "Word3": " "}
{"Event UTC": "2020-12-21 05:30:53", "Event Time": "00:30:53:95", "SRC/Word1": "E1F25701", "Word2": "A29C7E68", "Word3": " "}
{"Event UTC": "2020-12-21 05:31:04", "Event Time": "00:31:04:34", "SRC/Word1": "E1F25701", "Word2": "D529F3D7", "Word3": " "}
{"Event UTC": "2020-12-21 10:18:54", "Event Time": "05:18:54:45", "SRC/Word1": "E15511D7", "Word2": "1F6FC55C", "Word3": " "}
Here is my code so far:
import json
data = []
with open('stela_zerrl_t01_201222_084053_test.json') as fin:
    for line in fin:
        data.append(json.loads(line))
print(data)
The data array should end up containing something like data = [E1F25701, E15511D7]
Any idea how to do this?
See below (data represents the lines loaded from the file):
data = [{"Event UTC": "2020-12-21 05:23:06", "Event Time": "00:23:06:94", "SRC/Word1": " ", "Word2": " ", "Word3": " "},
{"Event UTC": "2020-12-21 05:30:53", "Event Time": "00:30:53:95", "SRC/Word1": "E1F25701", "Word2": "A29C7E68",
"Word3": " "},
{"Event UTC": "2020-12-21 05:31:04", "Event Time": "00:31:04:34", "SRC/Word1": "E1F25701", "Word2": "D529F3D7",
"Word3": " "},
{"Event UTC": "2020-12-21 10:18:54", "Event Time": "05:18:54:45", "SRC/Word1": "E15511D7", "Word2": "1F6FC55C",
"Word3": " "}]
data_sub_set = list(set(x["SRC/Word1"] for x in data if x["SRC/Word1"].strip()))
print(data_sub_set)
Output (a set is unordered, so the two values may print in either order):
['E1F25701', 'E15511D7']
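If the order in which the values first appear in the file matters, one possible variation is to deduplicate with dict.fromkeys instead of set, since dictionaries keep insertion order in Python 3.7 and later. This is only a sketch; the data list here is trimmed to the one field used, and ordered_unique is a name made up for illustration:

```python
# Trimmed-down stand-in for the records loaded from the file.
data = [
    {"SRC/Word1": " "},
    {"SRC/Word1": "E1F25701"},
    {"SRC/Word1": "E1F25701"},
    {"SRC/Word1": "E15511D7"},
]

# dict.fromkeys keeps only the first occurrence of each key,
# in insertion order; blank values are filtered out first.
ordered_unique = list(dict.fromkeys(
    x["SRC/Word1"] for x in data if x["SRC/Word1"].strip()
))
print(ordered_unique)  # ['E1F25701', 'E15511D7']
```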
A JSON object just needs to be accessed like a dictionary. If you are looking for the SRC/Word1 field, you would ask for:
import json
data = []
with open('stela_zerrl_t01_201222_084053_test.json') as fin:
    for line in fin:
        data.append(json.loads(line)['SRC/Word1'])  # note the field access here
print(data)
But you may want to omit the empty-string entries, or add some error handling if the JSON doesn't always have that field.
Edit: just saw your "skip duplicates and omit empty entries" comment.
import json
data = []
with open('stela_zerrl_t01_201222_084053_test.json') as fin:
    for line in fin:
        # .get() returns '' when the field is missing instead of raising KeyError
        value = json.loads(line).get('SRC/Word1', '')
        # skip empty or all-whitespace values and values already in the list
        if value.strip() and value not in data:
            data.append(value)
print(data)
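For larger files, or files that may contain blank or malformed lines, the same idea could be wrapped in a small helper. The sketch below is one way to do it, not the only one: unique_field_values and sample are hypothetical names, and silently skipping bad lines is an assumption about what is wanted here. A set tracks what has been seen, so the membership check stays fast as the list grows.

```python
import json

def unique_field_values(lines, field="SRC/Word1"):
    """Return distinct non-blank values of `field`, in first-seen order."""
    seen = set()
    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: malformed lines are skipped, not reported
        if not isinstance(record, dict):
            continue  # a line holding a bare list or number has no fields
        value = record.get(field, "")
        if not isinstance(value, str):
            continue  # only string values are collected in this sketch
        value = value.strip()
        if value and value not in seen:
            seen.add(value)
            result.append(value)
    return result

# Stand-in for the file contents; any iterable of lines works.
sample = [
    '{"SRC/Word1": " ", "Word2": " "}',
    '{"SRC/Word1": "E1F25701", "Word2": "A29C7E68"}',
    '{"SRC/Word1": "E1F25701", "Word2": "D529F3D7"}',
    '{"SRC/Word1": "E15511D7", "Word2": "1F6FC55C"}',
]
print(unique_field_values(sample))  # ['E1F25701', 'E15511D7']
```

Because an open file iterates line by line, passing the file object (fin in the code above) in place of sample should work the same way.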