
Parse nested JSON response from API service into CSV in Python

I'm trying, with no luck, to save the output of an API response to a CSV file in a clear and ordered way. This is the script that retrieves the API data:

import json
import requests
import csv

# List of keywords to be checked
keywords = open("/test.txt", encoding="ISO-8859-1")

keywords_to_check = []

try:
    for keyword in keywords:
        keyword = keyword.replace("\n", "")
        keywords_to_check.append(keyword)
except Exception:
    print("An error occurred. I will try again!")

apikey = "..."  # my api key
apiurl = "..."  # api url
apiparams = {
    'apikey': apikey, 
    'keyword': json.dumps(keywords_to_check), 
    'metrics_location': '2840',
    'metrics_language': 'en',
    'metrics_network': 'googlesearchnetwork',
    'metrics_currency': 'USD',
    'output': 'csv'
}
response = requests.post(apiurl, data=apiparams)
jsonize = json.dumps(response.json(), indent=4, sort_keys=True)

if response.status_code == 200:
    print(json.dumps(response.json(), indent=4, sort_keys=True))

The output I get is the following:

{
    "results": {
        "bin": {
            "cmp": 0.795286539,
            "cpc": 3.645033,
            "m1": 110000,
            "m10": 90500,
            "m10_month": 2,
            "m10_year": 2019,
            "m11": 135000,
            "m11_month": 1,
            "m11_year": 2019,
            "m12": 135000,
            "m12_month": 12,
            "m12_year": 2018,
            "m1_month": 11,
            "m1_year": 2019,
            "m2": 110000,
            "m2_month": 10,
            "m2_year": 2019,
            "m3": 110000,
            "m3_month": 9,
            "m3_year": 2019,
            "m4": 135000,
            "m4_month": 8,
            "m4_year": 2019,
            "m5": 135000,
            "m5_month": 7,
            "m5_year": 2019,
            "m6": 110000,
            "m6_month": 6,
            "m6_year": 2019,
            "m7": 110000,
            "m7_month": 5,
            "m7_year": 2019,
            "m8": 90500,
            "m8_month": 4,
            "m8_year": 2019,
            "m9": 90500,
            "m9_month": 3,
            "m9_year": 2019,
            "string": "bin",
            "volume": 110000
        },
        "chair": {
            "cmp": 1,
            "cpc": 1.751945,
            "m1": 1000000,
            "m10": 823000,
            "m10_month": 2,
            "m10_year": 2019,
            "m11": 1500000,
            "m11_month": 1,
            "m11_year": 2019,
            "m12": 1500000,
            "m12_month": 12,
            "m12_year": 2018,
            "m1_month": 11,
            "m1_year": 2019,
            "m2": 1000000,
            "m2_month": 10,
            "m2_year": 2019,
            "m3": 1000000,
            "m3_month": 9,
            "m3_year": 2019,
            "m4": 1220000,
            "m4_month": 8,
            "m4_year": 2019,
            "m5": 1220000,
            "m5_month": 7,
            "m5_year": 2019,
            "m6": 1000000,
            "m6_month": 6,
            "m6_year": 2019,
            "m7": 1000000,
            "m7_month": 5,
            "m7_year": 2019,
            "m8": 1000000,
            "m8_month": 4,
            "m8_year": 2019,
            "m9": 1000000,
            "m9_month": 3,
            "m9_year": 2019,
            "string": "chair",
            "volume": 1220000
        }, ....

What I'd like to achieve is a CSV file with the following content and ordering, with the columns being string, cmp, cpc and volume:

string;cmp;cpc;volume
bin;0.795286539;3.645033;110000
chair;1;1.751945;1220000

Following Sidous' suggestion, I've come to the following:

import pandas as pd
data = response.json()
df = pd.DataFrame.from_dict(data)
df.head()

Which gave me the following output:

        results
bin     {'string': 'bin', 'volume': 110000, 'm1': 1100...
chair   {'string': 'chair', 'volume': 1220000, 'm1': 1...
flower  {'string': 'flower', 'volume': 1830000, 'm1': ...
table   {'string': 'table', 'volume': 673000, 'm1': 82...
water   {'string': 'water', 'volume': 673000, 'm1': 67...

Close, but how can I show "string", "volume", etc. as columns and avoid displaying the braces of the dictionary?

Thanks a lot to whoever can help me sort this out :)

Askew

I suggest loading the response into a pandas DataFrame and then writing it out with pandas, which handles CSV files easily.

import pandas as pd


# receiving results in a dictionary
dic = response.json()

# pull the nested dictionary out from under the "results" key
dic = dic.pop("results", None)

# convert dictionary to dataframe
data = pd.DataFrame.from_dict(dic, orient='index')

# string;cmp;cpc;volume
new_data = pd.concat([data['string'], data['cmp'], data['cpc'], data['volume']], axis=1)

# removing the default index (bin and chair keys)
new_data.reset_index(drop=True, inplace=True)

print(new_data)

# saving new_data into a csv file (semicolon-separated, without the index column)
new_data.to_csv('name_of_file.csv', sep=';', index=False)

The CSV file is written to the current working directory (typically the directory you run the script from); to put it elsewhere, pass a full path to .to_csv().

You can see the final result in the screenshot below.

[Screenshot of the final result; image not included]
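
As a self-contained check of the pandas approach, the sketch below runs the same steps on a two-keyword sample shaped like the response in the question and produces semicolon-separated text without the index column. The `sample` dict and the `results_to_csv` helper are made-up names for illustration; they are not part of the API or of pandas.

```python
import io

import pandas as pd

# Hypothetical sample mimicking the shape of response.json() from the question
sample = {
    "results": {
        "bin": {"string": "bin", "cmp": 0.795286539, "cpc": 3.645033,
                "volume": 110000, "m1": 110000},
        "chair": {"string": "chair", "cmp": 1, "cpc": 1.751945,
                  "volume": 1220000, "m1": 1000000},
    }
}


def results_to_csv(payload, columns=("string", "cmp", "cpc", "volume")):
    """Return the chosen columns of payload["results"] as semicolon-separated text."""
    # One row per keyword: outer keys become the index, inner keys the columns
    df = pd.DataFrame.from_dict(payload["results"], orient="index")
    buffer = io.StringIO()
    # index=False drops the keyword index; sep=";" matches the requested layout
    df[list(columns)].to_csv(buffer, sep=";", index=False)
    return buffer.getvalue()


csv_text = results_to_csv(sample)
print(csv_text)
```

Because the `cmp` column mixes `0.795286539` and `1`, pandas stores it as floats, so the value for chair is written as `1.0` rather than `1`. To write a file instead of a string, a path can be passed to `to_csv` in place of the buffer.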

Open a text file using with open and write the data by iterating through the whole dict:

with open("text.csv", "w+") as f:
    f.write('string;cmp;cpc;volume\n')
    for res in response.values():     #This is after I assumed that `response` is of type dict
        for r in res.values():
            f.write(r['string']+';'+str(r['cmp'])+';'+str(r['cpc'])+';'+str(r['volume'])+'\n')
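
The same loop can be handed to the standard library's `csv.writer`, which takes care of quoting if a keyword ever contains a semicolon. This is a minimal sketch, assuming the data has the same shape as `response.json()` in the question; the `sample` dict, `COLUMNS` list and `write_rows` helper are illustrative names only.

```python
import csv
import io

# Hypothetical stand-in for response.json()
sample = {
    "results": {
        "bin": {"string": "bin", "cmp": 0.795286539, "cpc": 3.645033, "volume": 110000},
        "chair": {"string": "chair", "cmp": 1, "cpc": 1.751945, "volume": 1220000},
    }
}

COLUMNS = ["string", "cmp", "cpc", "volume"]


def write_rows(data, file_obj):
    """Write a header plus one semicolon-separated row per keyword."""
    writer = csv.writer(file_obj, delimiter=";", lineterminator="\n")
    writer.writerow(COLUMNS)
    for entry in data["results"].values():
        # Pick the wanted fields in a fixed order; other keys are ignored
        writer.writerow([entry[name] for name in COLUMNS])


# An in-memory buffer is used here; for a real file, pass
# open("text.csv", "w", newline="") instead.
buffer = io.StringIO()
write_rows(sample, buffer)
print(buffer.getvalue())
```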

Try this:

import pandas as pd

data = response.json()
cleaned_data = []

for key, val in data["results"].items():
    cleaned_data.append(val)

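# a list of dicts goes straight into the DataFrame constructor
df = pd.DataFrame(cleaned_data)
df1 = df[["string","cmp","cpc","volume"]]
print(df1.head())
# sep=";" and index=False match the layout asked for in the question
df1.to_csv("output.csv", sep=";", index=False)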

What about using csv.DictWriter, since your data is almost in the shape it needs?

import csv

if __name__ == "__main__":
  results = {"chair": {"cmp": 1, "cpc": 3.64}, "bin": {"cmp": 0.5, "cpc": 1.75}} # two rows will do for the example
  # Now let's get the data structure we really want: a list of rows
  rows = []
  for key, value in results.items():
    # Copy each inner dict and, while we're at it, set the string part
    rows.append({"string": key, **value})

  # Create the header (a list, so the column order stays stable)
  fieldnames = []
  for row in rows:
    for fname in row:
      if fname not in fieldnames:
        fieldnames.append(fname)

  # Write to the file
  with open("mycsv.csv", "w", newline="") as file_:
    writer = csv.DictWriter(file_, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
      writer.writerow(row)

That should be enough, without needing any other library.
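
If only some of the fields are wanted, `csv.DictWriter` can drop the rest itself: passing `extrasaction="ignore"` makes it skip keys not listed in `fieldnames`, and `delimiter=";"` gives the semicolon layout from the question. The following is a sketch under the assumption that the data looks like the question's response; `sample_results` and `results_to_csv_text` are made-up names.

```python
import csv
import io

# Hypothetical stand-in for response.json()["results"], with extra monthly keys
sample_results = {
    "bin": {"string": "bin", "cmp": 0.795286539, "cpc": 3.645033,
            "volume": 110000, "m1": 110000, "m1_month": 11},
    "chair": {"string": "chair", "cmp": 1, "cpc": 1.751945,
              "volume": 1220000, "m1": 1000000, "m1_month": 11},
}


def results_to_csv_text(results, fieldnames):
    """Serialize only the listed fields of each entry as semicolon-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        delimiter=";",
        extrasaction="ignore",  # silently skip keys such as m1, m1_month
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(results.values())
    return buffer.getvalue()


text = results_to_csv_text(sample_results, ["string", "cmp", "cpc", "volume"])
print(text)
```

The order of `fieldnames` decides the column order, so the header comes out exactly as listed regardless of how the keys are ordered in the response.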
