How to read a JSON retrieved from an API and save it into a CSV file?

I am using a weather API that responds with a JSON file. Here is a sample of the returned readings:

{
  'data': {
    'request': [{
      'type': 'City',
      'query': 'Karachi, Pakistan'
    }],
    'weather': [{
      'date': '2019-03-10',
      'astronomy': [{
        'sunrise': '06:46 AM',
        'sunset': '06:38 PM',
        'moonrise': '09:04 AM',
        'moonset': '09:53 PM',
        'moon_phase': 'Waxing Crescent',
        'moon_illumination': '24'
      }],
      'maxtempC': '27',
      'maxtempF': '80',
      'mintempC': '22',
      'mintempF': '72',
      'totalSnow_cm': '0.0',
      'sunHour': '11.6',
      'uvIndex': '7',
      'hourly': [{
        'time': '24',
        'tempC': '27',
        'tempF': '80',
        'windspeedMiles': '10',
        'windspeedKmph': '16',
        'winddirDegree': '234',
        'winddir16Point': 'SW',
        'weatherCode': '116',
        'weatherIconUrl': [{
          'value': 'http://cdn.worldweatheronline.net/images/wsymbols01_png_64/wsymbol_0002_sunny_intervals.png'
        }],
        'weatherDesc': [{
          'value': 'Partly cloudy'
        }],
        'precipMM': '0.0',
        'humidity': '57',
        'visibility': '10',
        'pressure': '1012',
        'cloudcover': '13',
        'HeatIndexC': '25',
        'HeatIndexF': '78',
        'DewPointC': '15',
        'DewPointF': '59',
        'WindChillC': '24',
        'WindChillF': '75',
        'WindGustMiles': '12',
        'WindGustKmph': '19',
        'FeelsLikeC': '25',
        'FeelsLikeF': '78',
        'uvIndex': '0'
      }]
    }]
  }
}

I used the following Python code in my attempt to read the data stored in the JSON file:

import simplejson as json 
data_file = open("new.json", "r") 
values = json.load(data_file)

But this fails with the following error:

JSONDecodeError: Expecting value: line 1 column 1 (char 0) error

I am also wondering how I can save the result in a structured format in a CSV file using Python.

As stated below by Rami, the simplest way to do this would be to use pandas, either a) .read_json(), or b) pd.DataFrame.from_dict(). However, the issue in this particular case is that you have a nested dictionary/JSON. What do I mean by nested? Well, if you were to simply put this into a dataframe, you'd have this:

print (df)
                                          request                                            weather
0  {'type': 'City', 'query': 'Karachi, Pakistan'}  {'date': '2019-03-10', 'astronomy': [{'sunrise...

Which is fine if that's what you want. However, I am assuming you'd like all the data for an instance flattened into a single row.
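For reference, here is a minimal sketch of how a nested frame like the one printed above can arise. It assumes the response has already been parsed into a dict. `sample` is a trimmed stand-in for the full payload, introduced only for illustration.

```python
import pandas as pd

# Trimmed stand-in for the parsed API response (only a few keys kept).
sample = {
    'data': {
        'request': [{'type': 'City', 'query': 'Karachi, Pakistan'}],
        'weather': [{'date': '2019-03-10', 'maxtempC': '27', 'mintempC': '22'}],
    }
}

# Each top-level key becomes a column and each list element becomes a row.
# The cells still hold whole dicts, which is what "nested" means here.
df = pd.DataFrame.from_dict(sample['data'])

print(df.columns.tolist())
print(type(df.loc[0, 'weather']))
```

Writing this frame to CSV would store each dict as one long string in a single cell, which is rarely what is wanted.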

So you'll need to either use json_normalize to unravel it (which is possible, but you'd need to be certain the JSON file follows the same format/keys throughout, and you'd still need to pull out each of the dictionaries within the lists, within the dictionaries), or use some function to flatten out the nested JSON. From there you can simply write to file.
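To illustrate the json_normalize route, here is a rough sketch rather than a full solution. It assumes pandas 1.0 or later, where the function is available as `pd.json_normalize`. `weather` is a trimmed stand-in for `data['data']['weather']`, keeping only a few of the keys from the sample.

```python
import pandas as pd

# Trimmed stand-in for data['data']['weather'] from the response above.
weather = [{
    'date': '2019-03-10',
    'maxtempC': '27',
    'hourly': [{
        'time': '24',
        'tempC': '27',
        'weatherDesc': [{'value': 'Partly cloudy'}],
    }],
}]

# One row per hourly reading; 'date' is copied down from the parent day.
hourly = pd.json_normalize(weather, record_path='hourly', meta=['date'])

# Lists nested inside each record (weatherDesc here) are not expanded,
# so they still have to be pulled out by hand.
hourly['weatherDesc'] = hourly['weatherDesc'].map(lambda cell: cell[0]['value'])

print(hourly)
```

This shows the caveat mentioned above: the call depends on every day having an `hourly` list, and inner lists such as `weatherDesc` need separate handling.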

I chose to flatten it using a function, then construct the dataframe:

import re

import pandas as pd


data = {'data': {'request': [{'type': 'City', 'query': 'Karachi, Pakistan'}], 'weather': [{'date': '2019-03-10', 'astronomy': [{'sunrise': '06:46 AM', 'sunset': '06:38 PM', 'moonrise': '09:04 AM', 'moonset': '09:53 PM', 'moon_phase': 'Waxing Crescent', 'moon_illumination': '24'}], 'maxtempC': '27', 'maxtempF': '80', 'mintempC': '22', 'mintempF': '72', 'totalSnow_cm': '0.0', 'sunHour': '11.6', 'uvIndex': '7', 'hourly': [{'time': '24', 'tempC': '27', 'tempF': '80', 'windspeedMiles': '10', 'windspeedKmph': '16', 'winddirDegree': '234', 'winddir16Point': 'SW', 'weatherCode': '116', 'weatherIconUrl': [{'value': 'http://cdn.worldweatheronline.net/images/wsymbols01_png_64/wsymbol_0002_sunny_intervals.png'}], 'weatherDesc': [{'value': 'Partly cloudy'}], 'precipMM': '0.0', 'humidity': '57', 'visibility': '10', 'pressure': '1012', 'cloudcover': '13', 'HeatIndexC': '25', 'HeatIndexF': '78', 'DewPointC': '15', 'DewPointF': '59', 'WindChillC': '24', 'WindChillF': '75', 'WindGustMiles': '12', 'WindGustKmph': '19', 'FeelsLikeC': '25', 'FeelsLikeF': '78', 'uvIndex': '0'}]}]}}

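def flatten_json(y):
    """Flatten nested dicts/lists into one dict with '_'-joined keys."""
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for key in x:
                flatten(x[key], name + key + '_')
        elif isinstance(x, list):
            # List positions become part of the key, e.g. 'weather_0_date'.
            for i, item in enumerate(x):
                flatten(item, name + str(i) + '_')
        else:
            # Leaf value: drop the trailing '_' from the accumulated name.
            out[name[:-1]] = x

    flatten(y)
    return out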


flat = flatten_json(data['data'])


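results = pd.DataFrame()
special_cols = []

columns_list = list(flat.keys())
for item in columns_list:
    # The first '_<n>_' in the key is taken as the row index.
    try:
        row_idx = re.findall(r'\_(\d+)\_', item)[0]
    except IndexError:
        # No list index in the key: handle it after the loop.
        special_cols.append(item)
        continue
    # Everything after that index becomes the column name.
    column = re.findall(r'\_\d+\_(.*)', item)[0]
    column = column.replace('_', '')

    row_idx = int(row_idx)
    value = flat[item]

    results.loc[row_idx, column] = value

# Keys without a list index are broadcast to every row.
for item in special_cols:
    results[item] = flat[item]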

results.to_csv('path/filename.csv', index=False)

Output:

print (results.to_string())
   type              query        date astronomy0sunrise astronomy0sunset astronomy0moonrise astronomy0moonset astronomy0moonphase astronomy0moonillumination maxtempC maxtempF mintempC mintempF totalSnowcm sunHour uvIndex hourly0time hourly0tempC hourly0tempF hourly0windspeedMiles hourly0windspeedKmph hourly0winddirDegree hourly0winddir16Point hourly0weatherCode                        hourly0weatherIconUrl0value hourly0weatherDesc0value hourly0precipMM hourly0humidity hourly0visibility hourly0pressure hourly0cloudcover hourly0HeatIndexC hourly0HeatIndexF hourly0DewPointC hourly0DewPointF hourly0WindChillC hourly0WindChillF hourly0WindGustMiles hourly0WindGustKmph hourly0FeelsLikeC hourly0FeelsLikeF hourly0uvIndex
0  City  Karachi, Pakistan  2019-03-10          06:46 AM         06:38 PM           09:04 AM          09:53 PM     Waxing Crescent                         24       27       80       22       72         0.0    11.6       7          24           27           80                    10                   16                  234                    SW                116  http://cdn.worldweatheronline.net/images/wsymb...            Partly cloudy             0.0              57                10            1012                13                25                78               15               59                24                75                   12                  19                25                78              0
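The code above does not cover the JSONDecodeError from the question. "Expecting value: line 1 column 1 (char 0)" means the parser found nothing it could read at the very start of the input. That usually points to an empty file or to content that is not JSON at all. Separately, the sample in the question uses single quotes, so it looks like a printed Python dict rather than JSON, and the json module would reject that too. The sketch below assumes new.json holds one of those two things, which the question does not confirm. `load_payload` is a hypothetical helper name.

```python
import ast
import json


def load_payload(text):
    """Parse API output that is either JSON or a printed Python dict."""
    if not text.strip():
        # Empty input is one common cause of "Expecting value ... (char 0)".
        raise ValueError("input is empty; check that the API response was saved")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fallback for single-quoted text such as the sample in the question.
        # ast.literal_eval only evaluates Python literals, never arbitrary code.
        return ast.literal_eval(text)


print(load_payload('{"date": "2019-03-10"}'))
print(load_payload("{'date': '2019-03-10'}"))
```

If the file is written by your own script, saving the parsed response with json.dump instead of writing str(response) should avoid the single-quoted form in the first place.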
