
How do I handle `{key: value}` pairs that get imported as-is when parsing a JSON file?

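I have a standard Android device-status JSON file that I am trying to read into a pandas DataFrame and then export to an Excel file.

The first two lines of the file are pasted below: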

{"ageCorrectionFactor":{"d":"Age Correction Factor","i":"1252"},"backCamera":{"d":"Working Fine. No issues","i":"79"},"battery":{"d":"Working Fine. No issues","i":"86"},"bill":{"d":"No"},"bluetooth":{"d":"Working Fine. No issues"},"box":{"d":"No"},"boxHidden":{"d":"Box hidden","i":"467"},"cameraHidden":{"d":"Camera hidden ","i":"494"},"charger":{"d":"No","i":"87"},"chargerHidden":{"d":"Charger hidden","i":"476"},"chargingDefect":{"d":"Working Fine. No issues"},"chargingPortHidden":{"d":"Charging Port Hidden","i":"764"},"earphone":{"d":"No","i":"88"},"frontCamera":{"d":"Front Camera"},"hiddenBattery":{"d":"Hidden battery","i":"777"},"mobileAge":{"d":"Above 11 months","i":"97"},"physicalCondition":{"d":"Physical Condition","i":"800"},"powerButton":{"d":"Working Fine. No issues"},"screen":{"d":"Working Fine. No issues"},"screenHidden":{"d":"NA","i":"220"},"screenIssue":{"d":"Screen Touch Issue"},"speakers":{"d":"Working Fine. No issues"},"tmsPrice":{"d":"TMS Price Improvement"},"volumeButton":{"d":"Working Fine. No issues"},"wifiGpsBluetooth":{"d":"Working Fine. No issues"},"workingNonworking":{"d":"Yes","i":"76"}},
{"ageCorrectionFactor":{"d":"Age Correction Factor","i":"1252"},"backCamera":{"d":"Working Fine. No issues","i":"79"},"battery":{"d":"Working Fine. No issues"},"bill":{"d":"No","i":"90"},"bluetooth":{"d":"Working Fine. No issues"},"box":{"d":"No","i":"89"},"boxHidden":{"d":"Box hidden","i":"467"},"cameraHidden":{"d":"Camera hidden ","i":"496"},"charger":{"d":"No","i":"87"},"chargerHidden":{"d":"Charger hidden","i":"477"},"chargingDefect":{"d":"Working Fine. No issues"},"chargingPortHidden":{"d":"Charging Port Hidden","i":"764"},"earphone":{"d":"No","i":"88"},"frontCamera":{"d":"Front Camera"},"hiddenBattery":{"d":"Hidden battery","i":"779"},"mobileAge":{"d":"Above 11 months","i":"96"},"physicalCondition":{"d":"Physical Condition","i":"91"},"powerButton":{"d":"Working Fine. No issues"},"screen":{"d":"Working Fine. No issues"},"screenHidden":{"d":"NA","i":"219"},"screenIssue":{"d":"Screen Touch Issue"},"speakers":{"d":"Working Fine. No issues"},"tmsPrice":{"d":"TMS Price Improvement"},"volumeButton":{"d":"Working Fine. No issues"},"wifiGpsBluetooth":{"d":"Working Fine. No issues","i":"81"},"workingNonworking":{"d":"Yes"}},

I converted the file to a pandas DataFrame with the read_json() function and got the following result (part of the first row is pasted here):

>>> df.head(1)


ageCorrectionFactor  \
0  {u'i': u'1252', u'd': u'Age Correction Factor'}   

                                        backCamera  \
0  {u'i': u'79', u'd': u'Working Fine. No issues'}   

                                           battery           bill  \
0  {u'i': u'86', u'd': u'Working Fine. No issues'}  {u'd': u'No'}   

                            bluetooth            box  \
0  {u'd': u'Working Fine. No issues'}  {u'd': u'No'}   

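Clearly, the problem is that I cannot break apart the inner "key": "value" pairs, so each cell holds a whole dict and the output is not what I want.

Alternatively, I used a regular expression to strip the unwanted pairs, but my aim is to avoid changing the original data as far as possible.

Is there a way to get the correct output using pandas, or a combination of regex and Python's native JSON parsing functions?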


The same operation in R gives the expected result:

json_file <- fromJSON("E:/pathto/file.json")
json_file <- lapply(json_file, function(x) {
  x[sapply(x, is.null)] <- NA
  unlist(x)
})    
JSON_DF <- as.data.frame(do.call("rbind", json_file))


Try Vaishali Garg's approach, but load the file with the json module first.

import json
import pandas as pd

with open('E:/pathto/file.json') as f:
    data = json.load(f)

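# pd.json_normalize is the top-level name since pandas 1.0;
# older releases expose it as pd.io.json.json_normalize
df = pd.json_normalize(data)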
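To illustrate what the flattening does, here is a minimal sketch on a two-record sample cut down from the question's data. It assumes pandas 1.0 or later (for `pd.json_normalize`) and that the file is a JSON array of objects; the inline `raw` string stands in for the real file.

```python
import json

import pandas as pd

# Two shortened records in the same shape as the question's file.
raw = """[
  {"battery": {"d": "Working Fine. No issues", "i": "86"}, "bill": {"d": "No"}},
  {"battery": {"d": "Working Fine. No issues"}, "bill": {"d": "No", "i": "90"}}
]"""

records = json.loads(raw)         # list of dicts, each value is itself a dict
df = pd.json_normalize(records)   # nested keys become dotted column names

# Each inner key gets its own column; a missing "i" becomes NaN.
print(sorted(df.columns))
print(df)
```

Every `{"d": ..., "i": ...}` pair ends up as two scalar columns (`battery.d`, `battery.i`, and so on) instead of one column of dicts, and the original values are left unchanged.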

Try this:

df = pd.json_normalize(f)  # f is the parsed JSON data (a dict or list of dicts), not the filename

It returns a DataFrame with 40 columns.
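If json_normalize is not an option, the same result can be approximated by hand, in the spirit of the R snippet in the question (flatten each record, then bind the rows). This is only a sketch: `flatten_record` is a hypothetical helper, the dotted column naming is an assumption chosen to match json_normalize, and it handles a single level of nesting as in the sample data.

```python
import pandas as pd


def flatten_record(record, sep="."):
    """Map {"box": {"d": "No", "i": "89"}} to {"box.d": "No", "box.i": "89"}."""
    flat = {}
    for outer_key, inner in record.items():
        if isinstance(inner, dict):
            # One output entry per inner key, prefixed with the outer key.
            for inner_key, value in inner.items():
                flat[f"{outer_key}{sep}{inner_key}"] = value
        else:
            # Plain values are copied through untouched.
            flat[outer_key] = inner
    return flat


# Shortened records in the same shape as the question's file.
records = [
    {"charger": {"d": "No", "i": "87"}, "box": {"d": "No"}},
    {"charger": {"d": "No", "i": "87"}, "box": {"d": "No", "i": "89"}},
]

# Keys missing from a record are filled with NaN when the rows are combined.
df = pd.DataFrame([flatten_record(r) for r in records])
print(df)
```

The resulting frame can then be written out with `df.to_excel(...)`, which needs an Excel writer engine such as openpyxl to be installed.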
