
ValueError: Expected object or value when reading json as pandas dataframe

Sample data:

{
   "_id": "OzE5vaa3p7",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "nebCwWd2Fr"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
   "barcode": "8908001921015",
   "isFmcg": true,
   "itemName": "Anil puttu flour 500g",
   "mrp": 58,
   "_created_at": "2016-10-02T13:49:03.281Z",
   "_updated_at": "2017-02-22T08:48:09.548Z"
}

{
   "_id": "ENPCL8ph1p",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "B4nZeUHmVK"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
   "barcode": "8901725181222",
   "isFmcg": true,
   "itemName": "Yippee Magic Masala Noodles, 70 G",
   "mrp": 12,
   "_created_at": "2016-10-02T13:49:03.284Z",
   "_updated_at": "2017-02-22T08:48:09.074Z"
}

I tried:

import pandas as pd
data= pd.read_json('Data.json')

This raises ValueError: Expected object or value

I also tried:

import json
with open('gdb.json') as datafile:
    data = json.load(datafile)
retail = pd.DataFrame(data)

This raises json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 509)

And reading the file line by line:

with open('gdb.json') as datafile:
    for line in datafile:
        data = json.loads(line)
retail = pd.DataFrame(data)

This raises json.decoder.JSONDecodeError: Extra data: line 1 column 577 (char 576)

How can I read this JSON into pandas?

I got the same error. Reading the function documentation and experimenting with different parameters helped.

I solved it with:

data = pd.read_json('Data.json', lines=True)

You can also try other combinations, such as:

data = pd.read_json('Data.json', lines=True, orient='records')

data = pd.read_json('Data.json', orient='records')

Here orient must be a string naming the expected layout, such as 'split', 'records', 'index', 'columns', or 'values'.
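To illustrate what lines=True expects, here is a minimal sketch. It assumes each record sits on its own line (the JSON Lines layout); the two records are shortened from the question's sample data and the temporary file is only for the demonstration.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Two records, shortened from the sample data in the question.
records = [
    {"_id": "OzE5vaa3p7", "itemName": "Anil puttu flour 500g", "mrp": 58},
    {"_id": "ENPCL8ph1p", "itemName": "Yippee Magic Masala Noodles, 70 G", "mrp": 12},
]

# Write one complete JSON object per line (the JSON Lines layout).
path = Path(tempfile.mkdtemp()) / "Data.json"
path.write_text("\n".join(json.dumps(r) for r in records), encoding="utf-8")

# lines=True tells pandas to parse each line as a separate record.
df = pd.read_json(path, lines=True)
print(df)
```

If the records are pretty-printed across several lines, as in the question's listing, lines=True alone is unlikely to help, because no single line is then a complete JSON object.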

Your JSON is malformed.

ValueError: Expected object or value can occur if you mistyped the file name. Does Data.json exist? I noticed that your other attempts use gdb.json.

Once you confirm the file name is correct, you have to fix your JSON. What you have now is two disconnected records separated only by whitespace. A list in JSON must be a valid array: the records go inside square brackets and are separated by commas: [{record1}, {record2}, ...]

Also, to load it with the split orientation used below, put your array under a root element called "data":

{ "data": [ {record1}, {record2}, ... ] }

Your JSON should end up looking like this:

{"data":
    [{
        "_id": "OzE5vaa3p7",
        "categories": [
            {
                "__type": "Pointer",
                "className": "Category",
                "objectId": "nebCwWd2Fr"
            }
        ],
        "isActive": true,
        "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
        "barcode": "8908001921015",
        "isFmcg": true,
        "itemName": "Anil puttu flour 500g",
        "mrp": 58,
        "_created_at": "2016-10-02T13:49:03.281Z",
        "_updated_at": "2017-02-22T08:48:09.548Z"
    }
    ,
    {
        "_id": "ENPCL8ph1p",
        "categories": [
            {
                "__type": "Pointer",
                "className": "Category",
                "objectId": "B4nZeUHmVK"
            }
        ],
        "isActive": true,
        "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
        "barcode": "8901725181222",
        "isFmcg": true,
        "itemName": "Yippee Magic Masala Noodles, 70 G",
        "mrp": 12,
        "_created_at": "2016-10-02T13:49:03.284Z",
        "_updated_at": "2017-02-22T08:48:09.074Z"
    }]}

Finally, pandas calls this format split orientation, so you have to load it as follows:

df = pd.read_json('gdb.json', orient='split')

df now contains the following data frame:

          _id                                                   categories  isActive                                                     imageUrl        barcode  isFmcg                           itemName  mrp                      _created_at                      _updated_at
0  OzE5vaa3p7  [{'__type': 'Pointer', 'className': 'Category', 'objectI...      True  https://firebasestorage.googleapis.com/v0/b/shopgro-1376...  8908001921015    True              Anil puttu flour 500g   58 2016-10-02 13:49:03.281000+00:00 2017-02-22 08:48:09.548000+00:00
1  ENPCL8ph1p  [{'__type': 'Pointer', 'className': 'Category', 'objectI...      True  https://firebasestorage.googleapis.com/v0/b/kirananearby...  8901725181222    True  Yippee Magic Masala Noodles, 70 G   12 2016-10-02 13:49:03.284000+00:00 2017-02-22 08:48:09.074000+00:00
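If editing the file by hand is impractical, the disconnected records can also be parsed in Python. The following is a sketch, not part of the answer above: parse_concatenated_json is a hypothetical helper built on the standard library's json.JSONDecoder.raw_decode, and the two records are shortened from the question.

```python
import json

import pandas as pd


def parse_concatenated_json(text):
    """Return a list of the JSON values that appear back to back in text."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # Skip whitespace between records before handing over to raw_decode.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


raw = '''
{"_id": "OzE5vaa3p7", "mrp": 58}

{"_id": "ENPCL8ph1p", "mrp": 12}
'''
df = pd.DataFrame(parse_concatenated_json(raw))
print(df)
```

This leaves the file untouched and builds the list of records in memory, which pd.DataFrame accepts directly.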

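You should make sure the terminal's working directory is the same as the file's directory. (When I hit this error I was using VS Code, and the terminal directory in VS Code was different from the directory of the Python file I wanted to run.)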
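A quick way to check this is to print the working directory and resolve the file name against it before reading. This is a minimal sketch; find_data_file is a hypothetical helper, not a pandas function.

```python
from pathlib import Path


def find_data_file(name, base_dir=None):
    """Resolve name against base_dir (default: the current working directory)."""
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    path = (base / name).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"{name!r} not found in {base} (resolved to {path})")
    return path


# Relative paths are resolved against this directory, which may differ
# from the folder that contains the script.
print("Working directory:", Path.cwd())
```

Inside a script, calling find_data_file('Data.json', Path(__file__).parent) would anchor the lookup to the script's own folder instead of the terminal's directory.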

I don't think this is the problem, since read mode should be the default (I think), but have you tried this? Adding 'r' opens the file explicitly in read mode.

import json
import pandas as pd

with open('gdb.json', 'r') as datafile:
    data = json.load(datafile)
retail = pd.DataFrame(data)

Keep your path simple; it makes the data easier to read. For example, put the file on your desktop and pass that path when reading the data. It works.

You can try changing the relative path to an absolute path. For your situation, change

import pandas as pd
data= pd.read_json('Data.json')

to

import pandas as pd
data = pd.read_json('C://Data.json')  # the absolute path shown in Explorer

I got the same error when I moved the same code from Jupyter Notebook to PyCharm's Jupyter console.

Another variation: the tips from this thread all failed when I tried them one at a time, but combining them worked for me:

pd.read_json('file.json', lines=True, encoding = 'utf-8-sig')

If you type in the absolute path of the file and use \ it should work. At least that's how I fixed the issue.
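One caveat with backslashes: in a normal Python string literal they start escape sequences, so a Windows path can be silently corrupted. The sketch below uses a made-up path, C:\temp\new.json, to show three safe spellings.

```python
from pathlib import PureWindowsPath

# In a normal string literal, "\t" is a tab and "\n" is a newline.
broken = "C:\temp\new.json"

# A raw string keeps the backslashes as written.
raw = r"C:\temp\new.json"

# Doubled backslashes produce the same string as the raw literal.
escaped = "C:\\temp\\new.json"

# Forward slashes also work on Windows and need no escaping.
forward = PureWindowsPath(raw).as_posix()

print(repr(broken))
print(raw == escaped)
print(forward)
```

Any of raw, escaped, or forward can be passed to pd.read_json; broken cannot, because it no longer names the file.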

I am not sure I understood your question clearly. Are you just trying to read JSON data?

I collected your sample data into a list, as shown below

[
  {
   "_id": "OzE5vaa3p7",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "nebCwWd2Fr"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
   "barcode": "8908001921015",
   "isFmcg": true,
   "itemName": "Anil puttu flour 500g",
   "mrp": 58,
   "_created_at": "2016-10-02T13:49:03.281Z",
   "_updated_at": "2017-02-22T08:48:09.548Z"
},
{
   "_id": "ENPCL8ph1p",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "B4nZeUHmVK"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
   "barcode": "8901725181222",
   "isFmcg": true,
   "itemName": "Yippee Magic Masala Noodles, 70 G",
   "mrp": 12,
   "_created_at": "2016-10-02T13:49:03.284Z",
   "_updated_at": "2017-02-22T08:48:09.074Z"
}
]

and ran this code

import pandas as pd
df = pd.read_json('Data.json')
print(df)

Output:

              _created_at ... mrp
0 2016-10-02 13:49:03.281 ...  58
1 2016-10-02 13:49:03.284 ...  12

[2 rows x 10 columns]

If you try the code below, it will solve the problem:

data_set = pd.read_json(r'json_file_address\file_name.json', lines=True)

I encountered this error message today, and in my case the problem was that the encoding of the text file was UTF-8-BOM instead of UTF-8, which is the default for read_json(). This can be solved by specifying the encoding:

data= pd.read_json('Data.json', encoding = 'utf-8-sig')
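To check whether a file really starts with a BOM before changing the call, the first bytes can be inspected. This is a small sketch on a made-up temporary file; the record is shortened from the question.

```python
import codecs
import tempfile
from pathlib import Path

import pandas as pd

# Create a hypothetical file that starts with a UTF-8 byte order mark.
path = Path(tempfile.mkdtemp()) / "Data.json"
path.write_bytes(codecs.BOM_UTF8 + b'[{"itemName": "Anil puttu flour 500g", "mrp": 58}]')

# The UTF-8 BOM is the three bytes EF BB BF at the very start of the file.
has_bom = path.read_bytes().startswith(codecs.BOM_UTF8)
print("BOM present:", has_bom)

# "utf-8-sig" decodes UTF-8 and drops a leading BOM if there is one,
# so it is safe to use for files with or without the mark.
df = pd.read_json(path, encoding="utf-8-sig")
print(df)
```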

I faced the same problem. The reason is that the JSON file contains something that doesn't follow JSON rules. In my case I had used single quotes instead of double quotes in one of the values.
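The standard json module reports where parsing fails, which helps to locate this kind of mistake. A minimal sketch follows; find_json_error is a hypothetical helper and the sample strings are made up.

```python
import json


def find_json_error(text):
    """Return None if text is valid JSON, else (message, line, column)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return exc.msg, exc.lineno, exc.colno
    return None


good = '{"itemName": "Anil puttu flour 500g"}'
bad = "{\"itemName\": 'Anil puttu flour 500g'}"  # single-quoted value

print(find_json_error(good))
print(find_json_error(bad))
```

For a file, the same check can be run on the text returned by open(...).read() before handing the path to pandas.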

This worked for me: pd.read_json('./dataset/healthtemp.json', typ="series")
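The typ="series" option fits a file that holds a single flat object of scalar values. The sketch below assumes such a layout; the keys and values are invented for illustration and are not the contents of the healthtemp.json mentioned above.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical file: one flat object whose values are all scalars.
path = Path(tempfile.mkdtemp()) / "healthtemp.json"
path.write_text('{"temperature": 98.6, "heart_rate": 72}', encoding="utf-8")

# typ="series" maps the keys to the index and the values to the data.
s = pd.read_json(path, typ="series")
print(s)
```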

Everything is OK except for one thing.

In the .json file, put the content below:

{
"a": {
    "_id": "OzE5vaa3p7",
    "categories": [
    {
        "__type": "Pointer",
        "className": "Category",
        "objectId": "nebCwWd2Fr"
    }
    ],
    "isActive": true,
    "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
    "barcode": "8908001921015",
    "isFmcg": true,
    "itemName": "Anil puttu flour 500g",
    "mrp": 58,
    "_created_at": "2016-10-02T13:49:03.281Z",
    "_updated_at": "2017-02-22T08:48:09.548Z"
},
"b": {
    "_id": "ENPCL8ph1p",
    "categories": [
    {
        "__type": "Pointer",
        "className": "Category",
        "objectId": "B4nZeUHmVK"
    }
    ],
    "isActive": true,
    "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
    "barcode": "8901725181222",
    "isFmcg": true,
    "itemName": "Yippee Magic Masala Noodles, 70 G",
    "mrp": 12,
    "_created_at": "2016-10-02T13:49:03.284Z",
    "_updated_at": "2017-02-22T08:48:09.074Z"
}
}

Thank you
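With this keyed layout, the orient argument decides whether "a" and "b" become columns or rows. The following sketch uses shortened records and a temporary file, both only for illustration.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Shortened version of the keyed layout above.
keyed = {
    "a": {"_id": "OzE5vaa3p7", "itemName": "Anil puttu flour 500g", "mrp": 58},
    "b": {"_id": "ENPCL8ph1p", "itemName": "Yippee Magic Masala Noodles, 70 G", "mrp": 12},
}
path = Path(tempfile.mkdtemp()) / "Data.json"
path.write_text(json.dumps(keyed), encoding="utf-8")

# The default orient ("columns") turns "a" and "b" into columns.
by_column = pd.read_json(path)

# orient="index" turns "a" and "b" into row labels, one record per row.
by_row = pd.read_json(path, orient="index")
print(by_row)
```

The by_row frame has one row per product, which is usually the shape wanted for this kind of data.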

The ValueError: All arrays must be of the same length error raised by

df = pd.read_json(r'./filename.json')

can be solved by changing the line above to the following.

df = pd.read_json(r'./filename.json', lines=True)

I just solved this problem by adding a "/" at the beginning of the absolute path.

import pandas as pd
pd_from_json = pd.read_json("/home/miguel/folder/information.json")

It seems like a million things can cause this. In my case, my JSON file started with a byte order mark, shown as [BOM] [unix] in vim-airline. I don't know what the byte order mark is or when it would be needed. To remove it, I ran :set nobomb in vim and then saved the file. After that, pandas could read it and I was good to go.
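For reference, a UTF-8 byte order mark is the three bytes EF BB BF at the very start of a file. A rough Python counterpart to :set nobomb is sketched below; strip_utf8_bom is a hypothetical helper and the temporary file exists only for the demonstration.

```python
import codecs
import tempfile
from pathlib import Path


def strip_utf8_bom(path):
    """Rewrite the file without a leading UTF-8 BOM; return True if one was removed."""
    path = Path(path)
    data = path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True


# Demonstration on a hypothetical temporary file.
demo = Path(tempfile.mkdtemp()) / "Data.json"
demo.write_bytes(codecs.BOM_UTF8 + b'{"mrp": 58}')
print(strip_utf8_bom(demo))  # the first call removes the BOM
print(strip_utf8_bom(demo))  # the second call finds nothing to remove
```

The helper overwrites the file in place, so it is worth keeping a copy when running it on real data.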
