
JSON file to Pandas df

I'm trying to convert a JSON file into a pandas DataFrame so that I can remove unwanted data and reduce it to a CSV of IDs. The data looks like this:

{
     "data": [
    {
      "message": "Uneeded message",
      "created_time": "2017-04-02T17:20:37+0000",
      "id": "723456782912449_1008262099345654"
    },
    {
      "message": "Uneeded message",
      "created_time": "2017-03-28T06:26:28+0000",
      "id": "771345678912449_1003934567871010"
    },

I've not used JSON before, but the code I've used to load this data is:

import pandas as pd
import json

with open('fileName.json', encoding="utf8" ) as f:
    w = json.loads(f.read(), strict=False)

The end output should just be a CSV with a single column of IDs.

I think you need json_normalize:

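# json_normalize is exposed at the top level since pandas 1.0;
# the older pandas.io.json import path is deprecated.
from pandas import json_normalize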
import json

with open('file.json') as data_file:    
    d = json.load(data_file)

print(d)
{
    "data": [{
        "message": "Uneeded message",
        "created_time": "2017-04-02T17:20:37+0000",
        "id": "723456782912449_1008262099345654"
    }, {
        "message": "Uneeded message",
        "created_time": "2017-03-28T06:26:28+0000",
        "id": "771345678912449_1003934567871010"
    }]
}

df = json_normalize(d, 'data')
print(df)
               created_time                                id          message
0  2017-04-02T17:20:37+0000  723456782912449_1008262099345654  Uneeded message
1  2017-03-28T06:26:28+0000  771345678912449_1003934567871010  Uneeded message
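
To get from this DataFrame to the CSV of IDs that the question asks for, one option is to keep only the id column and write it out with to_csv. This is a minimal sketch, not part of the original answer: it assumes pandas 1.0 or later (for pd.json_normalize), and the in-memory buffer stands in for a real output file such as a hypothetical 'ids.csv'.

```python
import io

import pandas as pd

# Same structure as the sample data in the question.
d = {
    "data": [
        {
            "message": "Uneeded message",
            "created_time": "2017-04-02T17:20:37+0000",
            "id": "723456782912449_1008262099345654",
        },
        {
            "message": "Uneeded message",
            "created_time": "2017-03-28T06:26:28+0000",
            "id": "771345678912449_1003934567871010",
        },
    ]
}

df = pd.json_normalize(d, "data")

# Double brackets keep a one-column DataFrame, so the CSV gets an "id" header.
ids = df[["id"]]

# Write to an in-memory buffer here; passing a path such as "ids.csv"
# (a hypothetical name) would write to disk instead.
buf = io.StringIO()
ids.to_csv(buf, index=False)  # index=False leaves out the 0, 1, ... row labels
csv_text = buf.getvalue()
print(csv_text)
```

Selecting the wanted column is usually simpler than dropping the unwanted ones, since it keeps working if the JSON later gains extra fields.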

Using json.loads

Setup

json_str = """{
 "data": [
        {
          "message": "Uneeded message",
          "created_time": "2017-04-02T17:20:37+0000",
          "id": "723456782912449_1008262099345654"
        },
        {
          "message": "Uneeded message",
          "created_time": "2017-03-28T06:26:28+0000",
          "id": "771345678912449_1003934567871010"
        }]}"""

Solution

import json
import pandas as pd

pd.DataFrame(json.loads(json_str)['data'])

               created_time                                id          message
0  2017-04-02T17:20:37+0000  723456782912449_1008262099345654  Uneeded message
1  2017-03-28T06:26:28+0000  771345678912449_1003934567871010  Uneeded message

Or with the JSON in a file:

with open('neutraluk1.json') as f:
    print(pd.DataFrame(json.load(f)['data']))

               created_time                                id          message
0  2017-04-02T17:20:37+0000  723456782912449_1008262099345654  Uneeded message
1  2017-03-28T06:26:28+0000  771345678912449_1003934567871010  Uneeded message
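
Putting the pieces together, the whole job (JSON file in, CSV of IDs out) could look roughly like the sketch below. The function name json_file_to_id_csv and the file names are hypothetical and only for illustration; strict=False is carried over from the question's own loading code, on the assumption that the real file contains control characters inside strings.

```python
import json
import os
import tempfile

import pandas as pd


def json_file_to_id_csv(json_path, csv_path):
    """Read {"data": [...]} from json_path and write only the "id" column to csv_path."""
    with open(json_path, encoding="utf8") as f:
        # strict=False lets the parser accept control characters inside strings.
        payload = json.loads(f.read(), strict=False)
    df = pd.DataFrame(payload["data"])
    df[["id"]].to_csv(csv_path, index=False)
    return df["id"].tolist()


# Demo on a throwaway copy of the sample data (the paths are placeholders).
sample = {
    "data": [
        {
            "message": "Uneeded message",
            "created_time": "2017-04-02T17:20:37+0000",
            "id": "723456782912449_1008262099345654",
        },
        {
            "message": "Uneeded message",
            "created_time": "2017-03-28T06:26:28+0000",
            "id": "771345678912449_1003934567871010",
        },
    ]
}

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "sample.json")
    dst = os.path.join(tmp, "ids.csv")
    with open(src, "w", encoding="utf8") as f:
        json.dump(sample, f)

    ids = json_file_to_id_csv(src, dst)

    with open(dst, encoding="utf8") as f:
        written = f.read().splitlines()

print(written)
```

In real use you would call the function with your own input and output paths instead of the temporary ones.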
