
Read JSON array from a text file in python

I am trying to read a text file with a JSON array in it. Can someone help me to find my mistake?

   ["io", {"in": 8, "out": 0, "dev": "68", "time": 1532035082.614868}]
   ["io", {"in": 0, "out": 0, "dev": "68", "time": 1532035082.97122}]
   ["test", {"A": [{"para1":[], "para2": true, "para3": 68, "name":"", "observation":[[2,3],[3,2]],"time": 1532035082.97122}]}]

I could not get it to work with:

  import gzip
  import json

  with gzip.open('myfile', 'rb') as f:
      json_data = json.load(f)  # raises JSONDecodeError: Extra data
  print(json_data)

I get this error: json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 278). I can see that my file is not a single JSON document but a sequence of JSON arrays, one per line. I managed to make it work with pandas, but I'd like to find out how to do it without pandas.

import pandas as pd
data = pd.read_json('myfile', lines=True)
print(data.head())

data = [json.loads(e) for e in f if e.strip()]

Example of usage
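A minimal, self-contained sketch of how that one-liner might be used. The file name `sample.jsonl.gz` and the helper name `read_json_lines` are made up for illustration, and the sample rows are copied from the question. It assumes the file is gzip-compressed, as the question's code suggests.

```python
import gzip
import json
import os
import tempfile


def read_json_lines(path):
    """Parse a gzip-compressed file that holds one JSON document per line."""
    # 'rt' opens the gzip stream in text mode, so each line is a str.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        # Skip blank lines; json.loads would raise on an empty string.
        return [json.loads(line) for line in f if line.strip()]


# Build a small sample file (hypothetical name) from rows like those in the question.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.jsonl.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as f:
    f.write('["io", {"in": 8, "out": 0, "dev": "68", "time": 1532035082.614868}]\n')
    f.write('["io", {"in": 0, "out": 0, "dev": "68", "time": 1532035082.97122}]\n')
    f.write("\n")  # a blank line, to show that it is skipped

data = read_json_lines(sample_path)
print(len(data))         # 2
print(data[0][1]["in"])  # 8
```

Each element of `data` is the Python list parsed from one line of the file.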

The data in your text file is not valid JSON as a whole. It is actually three separate JSON documents, one after another.

You have at least two options for fixing it:

  • combine all the JSON documents into a single one, for example by wrapping them in a list at the top level
  • read your file line by line and parse each JSON document separately
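Both options can be sketched with the standard library alone. The sample text below is a shortened version of the question's data. The comma-joining trick for the first option is just one possible way to build a single list, and it assumes every document sits on exactly one line.

```python
import json

# Three separate JSON documents, one per line, as in the question (shortened).
text = (
    '["io", {"in": 8, "out": 0, "dev": "68"}]\n'
    '["io", {"in": 0, "out": 0, "dev": "68"}]\n'
    '["test", {"A": [{"para2": true, "observation": [[2, 3], [3, 2]]}]}]\n'
)

# json.loads(text) would fail here with "Extra data", because the parser
# finishes the first document and then finds more input behind it.

lines = [line for line in text.splitlines() if line.strip()]

# First option: join the documents with commas and wrap them in one top-level list.
as_single_array = json.loads("[" + ",".join(lines) + "]")

# Second option: parse each line as its own JSON document.
line_by_line = [json.loads(line) for line in lines]

print(as_single_array == line_by_line)  # True
print(len(line_by_line))                # 3
```

The second option is usually the simpler one when you cannot change how the file is written.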

I found 2 options:

Option 1: pandas

pd.read_json(filepath, compression='infer', orient='records', lines=True)

Option 2: read line by line

import gzip
import json

with gzip.open('myfile', 'rb') as f:
    # json.loads accepts bytes (Python 3.6+), so 'rb' mode works here
    data = [json.loads(line) for line in f if line.strip()]
print(data)

Option 1 was the faster of the two on my file:

Option 1 (pandas) = 0.31 s

Option 2 (line by line) = 0.42 s
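Timings like these depend on the file size, the machine, and the library versions, so it may be worth measuring on your own data. Below is a rough sketch of one way to do that with `timeit`. The helper names `load_line_by_line` and `time_loader` and the synthetic file `bench.jsonl.gz` are made up for illustration, and this is not necessarily how the numbers above were obtained.

```python
import gzip
import json
import os
import tempfile
import timeit


def load_line_by_line(path):
    """Line-by-line approach: parse each line of the gzip file separately."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def time_loader(loader, path, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` runs."""
    return min(timeit.repeat(lambda: loader(path), number=1, repeat=repeat))


# Synthetic test file (hypothetical); real timings depend on your data.
bench_path = os.path.join(tempfile.mkdtemp(), "bench.jsonl.gz")
with gzip.open(bench_path, "wt", encoding="utf-8") as f:
    for i in range(10_000):
        f.write(json.dumps(["io", {"in": i, "out": 0, "dev": "68"}]) + "\n")

elapsed = time_loader(load_line_by_line, bench_path)
print(f"line by line: {elapsed:.3f} s")

# To compare with pandas (if installed), time it the same way, e.g.:
# import pandas as pd
# time_loader(lambda p: pd.read_json(p, compression="infer", lines=True), bench_path)
```

Taking the minimum over several runs reduces noise from other processes.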
