
Read a JSON file into a pandas dataframe

I'm trying to read a series of JSON files and convert them to a pandas DataFrame, but none of the examples I've followed worked for the reading part.

This is an example of the JSON files I have:

{
    "created_at": "Thu Nov 02 01:09:12 +0000 2017",
    "text": "RT @coindesk: SEC: Celebrity ICO Endorsements Could Be Illegal gHoWduXOBp t.co/iyWla0Ryuk",
    "tweet_id": 925892516087558145,
    "user_id": 153962533,
    "user_name": "Christine Duhaime"
}{
    "created_at": "Thu Nov 02 01:09:44 +0000 2017",
    "text": "Cornell Professor C t.co/RuNu6UQyr9",
    "tweet_id": 925892650884108289,
    "user_id": 1255045351,
    "user_name": "Local SEO Somerset"
}

I've tried:

import codecs
import json

with codecs.open('./output/streamer_20171022-2010.json', 'r+', encoding='utf-8') as data_file:
    data = json.load(data_file)

That resulted in

JSONDecodeError: Extra data: line 1 column 416 (char 415)

I also tried reading the file line by line, with no success.

Any ideas?

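Your file is not valid JSON. A valid JSON document can have only one top-level element, and your file contains several objects written back to back. That is why json.load reports "Extra data" as soon as the first object ends.

Try placing the top-level objects into an array, separated by commas: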

[
    {
        "created_at": "Thu Nov 02 01:09:12 +0000 2017",
        "text": "RT @coindesk: SEC: Celebrity ICO Endorsements Could Be Illegal gHoWduXOBp t.co/iyWla0Ryuk",
        "tweet_id": 925892516087558145,
        "user_id": 153962533,
        "user_name": "Christine Duhaime"
    },
    {
        "created_at": "Thu Nov 02 01:09:44 +0000 2017",
        "text": "Cornell Professor C t.co/RuNu6UQyr9",
        "tweet_id": 925892650884108289,
        "user_id": 1255045351,
        "user_name": "Local SEO Somerset"
    }
]
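
If you can't change how the files are written, another option is to decode the objects one after another rather than editing the file by hand. Below is a minimal sketch of that idea using json.JSONDecoder.raw_decode from the standard library. The helper name parse_concatenated_json and the shortened sample string are made up for illustration, and the sketch assumes the file contains nothing but JSON values and whitespace.

```python
import json


def parse_concatenated_json(text):
    """Return a list of the top-level JSON values written back to back in text.

    Hypothetical helper, not part of the original question or answer.
    """
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it manually.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the decoded value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records


# Shortened stand-in for the file contents shown in the question.
sample = '{"tweet_id": 1, "user_name": "a"}{\n    "tweet_id": 2, "user_name": "b"\n}\n'
records = parse_concatenated_json(sample)
```

The resulting list of dicts can then be passed to pandas.DataFrame(records) to get one row per object. For a real file, read its contents into a string first (for example with open(path, encoding='utf-8') and .read()) and pass that string to the helper.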


 