
ValueError: Expected object or value <-> Can't load a json file to pandas dataframe, or convert to csv, either will suffice


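I have a JSON file of about 1.5 GB that I need to use as a pandas DataFrame. I have been working through the approaches from existing questions one by one to load it into pandas. As a second option I tried converting it to CSV and then loading that as a DataFrame, but this failed too, and in previously answered questions people only explain the error instead of giving code. Here is a sample of the data: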

{'work': '2505753', 'flags': [], 'unixtime': 1260403200, 'stars': 1.0, 'nhelpful': 0, 'time': 'Dec 10, 2009', 'comment': "I really thought that I would like this book. I'm fascinated by this time period, and the plots to assassinate Hitler have always intrigued me. However, this book was so boring that I had to force myself to read it. The author no doubt has a commanding vocabulary, but his writing style and word choices made the book a chore to read. I've read dry textbooks that had more life to them than this novel. ", 'user': 'schatzi'}
{'work': '12458291', 'flags': [], 'unixtime': 1361664000, 'stars': 4.0, 'nhelpful': 0, 'time': 'Feb 24, 2013', 'comment': "After her father's death, Lena discovers that her father had been keeping many secrets from her. Lena is a member of the. Silenti, telepaths who came to our world through a portal. She must learn to navigate through the social, religious, and political pitfalls of her new life. Who can she trust? What will her role be? I enjoyed this story and the world the author created very much. ", 'user': 'aztwinmom'}

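I tried the code below as the second option, converting to CSV. As far as I could debug it, the error comes from the single quotes, but replacing "\'" with "\"" across data this large would take a lot of time.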

Attempt using json

import json
import csv
import os

f = open('test.json')
data = json.load(f)
f.close()

f = open('data.json')
csv_file = csv.writer(f)
count=0
for item in data:
    f.writerow(item)
    count+=1
    if(count==10):
        break

f.close()

Traceback

---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
<ipython-input-115-d75bae392cae> in <module>
      1 f = open('test.json')
----> 2 data = json.load(f)
      3 f.close()

e:\Anaconda3\lib\json\__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    294         cls=cls, object_hook=object_hook,
    295         parse_float=parse_float, parse_int=parse_int,
--> 296         parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
    297 
    298 

e:\Anaconda3\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    346             parse_int is None and parse_float is None and
    347             parse_constant is None and object_pairs_hook is None and not kw):
--> 348         return _default_decoder.decode(s)
    349     if cls is None:
    350         cls = JSONDecoder

e:\Anaconda3\lib\json\decoder.py in decode(self, s, _w)
    335 
    336         """
--> 337         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338         end = _w(s, end).end()
    339         if end != len(s):

e:\Anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
    351         """
    352         try:
--> 353             obj, end = self.scan_once(s, idx)
    354         except StopIteration as err:
    355             raise JSONDecodeError("Expecting value", s, err.value) from None

JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
  • Result of pd.read_json('test.json'):
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-118-771e17311e28> in <module>
----> 1 pd.read_json('test.json')

e:\Anaconda3\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
    212                 else:
    213                     kwargs[new_arg_name] = new_arg_value
--> 214             return func(*args, **kwargs)
    215 
    216         return cast(F, wrapper)

e:\Anaconda3\lib\site-packages\pandas\io\json\_json.py in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit, encoding, lines, chunksize, compression)
    606         return json_reader
    607 
--> 608     result = json_reader.read()
    609     if should_close:
    610         filepath_or_buffer.close()

e:\Anaconda3\lib\site-packages\pandas\io\json\_json.py in read(self)
    729             obj = self._get_object_parser(self._combine_lines(data.split("\n")))
    730         else:
--> 731             obj = self._get_object_parser(self.data)
    732         self.close()
    733         return obj

e:\Anaconda3\lib\site-packages\pandas\io\json\_json.py in _get_object_parser(self, json)
    751         obj = None
    752         if typ == "frame":
--> 753             obj = FrameParser(json, **kwargs).parse()
    754 
    755         if typ == "series" or obj is None:

e:\Anaconda3\lib\site-packages\pandas\io\json\_json.py in parse(self)
    855 
    856         else:
--> 857             self._parse_no_numpy()
    858 
    859         if self.obj is None:

e:\Anaconda3\lib\site-packages\pandas\io\json\_json.py in _parse_no_numpy(self)
   1087         if orient == "columns":
   1088             self.obj = DataFrame(
-> 1089                 loads(json, precise_float=self.precise_float), dtype=None
   1090             )
   1091         elif orient == "split":

ValueError: Expected object or value
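  • The error states quite clearly that the lines are not valid JSON: the file has {'work' where JSON requires {"work". JSON strings must use double quotes, not single quotes.
  • Using .replace("'", '"') will not work, because the 'comment' values are (correctly) wrapped in double quotes ("...") whenever the text contains an apostrophe (e.g. "...father's..."). A blanket replace would turn that into "...father"s...", which is still not valid JSON.
  • What you have is a file in which every line is the text of a Python dict.
  • Read the file in; each line is read as a str.
  • Use ast.literal_eval to convert each line back into a dict.
  • Load the resulting list of dicts, rows, into a dataframe.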
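As a small illustration of the quoting problem, the sketch below applies the naive quote swap to a single line and then parses the same line with ast.literal_eval. The line is a shortened, hypothetical sample in the same style as the file (only three of the keys, and a truncated comment); it is not one of the real records.

```python
import json
from ast import literal_eval

# Hypothetical, shortened sample line: a Python dict repr with single-quoted
# keys and a double-quoted value that contains an apostrophe.
line = """{'stars': 4.0, 'comment': "After her father's death", 'user': 'aztwinmom'}"""

# Naive quote swap: the apostrophe in "father's" also becomes a double quote,
# which ends the string early and leaves text that json cannot parse.
swapped = line.replace("'", '"')
try:
    json.loads(swapped)
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

# literal_eval parses the line as a Python literal, so both quote styles work.
record = literal_eval(line)

print(naive_ok)
print(record["comment"])
```

This is why the steps above parse each line with ast.literal_eval rather than trying to repair the quotes first.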
import pandas as pd
from ast import literal_eval
from pathlib import Path

# read file
file = Path('e:/PythonProjects/stack_overflow/test.json')  # path to file
with file.open('r', encoding='utf-8') as f:  # open the file
    rows = [literal_eval(row) for row in f]  # iterate over the file line by line and convert each row back to a dict

# convert rows to a dataframe
df = pd.DataFrame(rows)

# display(df)
       work flags    unixtime  stars  nhelpful          time                                                                                                                                                                                                                                                                                                                                                                                                              comment       user
0   2505753    []  1260403200    1.0         0  Dec 10, 2009  I really thought that I would like this book. I'm fascinated by this time period, and the plots to assassinate Hitler have always intrigued me. However, this book was so boring that I had to force myself to read it. The author no doubt has a commanding vocabulary, but his writing style and word choices made the book a chore to read. I've read dry textbooks that had more life to them than this novel.     schatzi
1  12458291    []  1361664000    4.0         0  Feb 24, 2013                   After her father's death, Lena discovers that her father had been keeping many secrets from her. Lena is a member of the. Silenti, telepaths who came to our world through a portal. She must learn to navigate through the social, religious, and political pitfalls of her new life. Who can she trust? What will her role be? I enjoyed this story and the world the author created very much.   aztwinmom
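
Since the file is about 1.5 GB and the question also asks for a CSV, one possible variation is to stream the file line by line and write each parsed dict straight to CSV, so the full list of rows never has to sit in memory. This is only a sketch under assumptions: the helper name dict_lines_to_csv is made up for illustration, every line is assumed to have the same keys as the first one, and the demo uses a temporary two-line file built from shortened versions of the sample rows rather than the real data.

```python
import csv
import tempfile
from ast import literal_eval
from pathlib import Path


def dict_lines_to_csv(src, dst):
    """Stream a file with one Python dict literal per line into a CSV file.

    Hypothetical helper. Assumes every line has the same keys as the first
    line; csv.DictWriter raises ValueError if a later line has extra keys.
    Returns the number of records written.
    """
    count = 0
    with Path(src).open("r", encoding="utf-8") as fin, \
            Path(dst).open("w", encoding="utf-8", newline="") as fout:
        writer = None
        for line in fin:
            line = line.strip()
            if not line:  # skip blank lines
                continue
            record = literal_eval(line)  # one dict per line
            if writer is None:
                # Take the column names from the first record.
                writer = csv.DictWriter(fout, fieldnames=list(record))
                writer.writeheader()
            writer.writerow(record)  # lists such as 'flags' are written as text
            count += 1
    return count


# Demo on a tiny temporary file (shortened sample rows, not the real data).
sample = [
    {'work': '2505753', 'flags': [], 'stars': 1.0,
     'comment': "I've read dry textbooks that had more life to them than this novel.",
     'user': 'schatzi'},
    {'work': '12458291', 'flags': [], 'stars': 4.0,
     'comment': "After her father's death, Lena discovers that her father had been keeping many secrets from her.",
     'user': 'aztwinmom'},
]
tmp_dir = Path(tempfile.mkdtemp())
src = tmp_dir / "test.json"
dst = tmp_dir / "test.csv"
src.write_text("\n".join(repr(r) for r in sample) + "\n", encoding="utf-8")

n_written = dict_lines_to_csv(src, dst)

# Read the CSV back to check the round trip.
with dst.open("r", encoding="utf-8", newline="") as f:
    rows_back = list(csv.DictReader(f))

print(n_written, rows_back[0]["user"], rows_back[1]["user"])
```

The resulting CSV can then be loaded with pd.read_csv, optionally with its chunksize parameter to process the data in pieces. Note that literal_eval still parses every line in Python, so a file this size may take a while to convert.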

