How do I get my output which is json converted in a dataframe format?

I am new to Python and am trying to format the output I get from an API.

The output is:

**data**
Out[8]: b'[{"date":"2020-01-19","stats":[{"metrics":{"blocks":5,"bounce_drops":6,"bounces":16,"clicks":278,"deferred":8,"delivered":1453,"invalid_emails":6,"opens":2502,"processed":155,"requests":1484,"spam_report_drops":0,"spam_reports":0,"unique_clicks":199,"unique_opens":1013,"unsubscribe_drops":0,"unsubscribes":0}}]}]\n'

I want to turn it into a table so that I can export it to CSV.

I tried:

import pandas as pd
merge_HOO = {'blocks': [], 'bounce_drops': [], 'bounces': [], 'clicks': []}
for i, restaurant in enumerate(data):
    for item in restaurant['metrics']:
        merge_HOO['blocks'].append(i)
        merge_HOO['bounce_drops'].append(item['bounce_drops'])
        merge_HOO['bounces'].append(item['bounces'])
        merge_HOO['clicks'].append(item['clicks'])


merge_HOO = pd.DataFrame(merge_HOO,
                         columns=['blocks', 'bounce_drops', 'bounces', 'clicks'])
print(merge_HOO)
Traceback (most recent call last):

  File "<ipython-input-9-ad0eecd65eba>", line 4, in <module>
    for item in restaurant['metrics']:

TypeError: 'int' object is not subscriptable

But I get the error above.

I would like the CSV to look like the following, with a header for each metric and its value underneath:

blocks bounce_drops  bounces
5       6             16

Here is one way to do it:

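import pandas as pd

# let's say d is your list containing dicts
# pd.json_normalize is the top-level name in pandas 1.0+; older versions use
# from pandas.io.json import json_normalize
f = pd.json_normalize(d)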

# reshape the data
cols = ['blocks','bounce_drops','bounces']
df = f['stats'].apply(lambda x: pd.Series(x[0]))['metrics'].apply(pd.Series)[cols]

   blocks  bounce_drops  bounces
0       5             6       16

Sample data

d = [{"date":"2020-01-19","stats":[{"metrics":{"blocks":5,"bounce_drops":6,"bounces":16,"clicks":278,"deferred":8,"delivered":1453,"invalid_emails":6,"opens":2502,"processed":155,"requests":1484,"spam_report_drops":0,"spam_reports":0,"unique_clicks":199,"unique_opens":1013,"unsubscribe_drops":0,"unsubscribes":0}}]}]
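As an alternative to the chained apply calls, the nesting can be flattened with a plain list comprehension before pandas sees it. This is only a sketch, assuming every entry has the date / stats / metrics shape of the sample data (trimmed here to four metrics); the names rows and csv_text are illustrative and not part of the original answer.

```python
import pandas as pd

# Trimmed copy of the sample data: a list of dicts, each holding a "stats" list.
d = [{"date": "2020-01-19",
      "stats": [{"metrics": {"blocks": 5, "bounce_drops": 6,
                             "bounces": 16, "clicks": 278}}]}]

# Build one flat dict per (date, stats entry); ** unpacks the metrics
# so that each metric becomes its own column.
rows = [
    {"date": day["date"], **entry["metrics"]}
    for day in d
    for entry in day["stats"]
]

df = pd.DataFrame(rows)

# index=False keeps the row index out of the CSV text.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a file path to to_csv instead of printing the returned string would write the same table to disk.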

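You missed one level of nesting. In the line that raises the error, item should iterate over the stats list; metrics is a key inside each entry of that list: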

File "<ipython-input-9-ad0eecd65eba>", line 4, in <module>
    for item in restaurant['metrics']:
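The 'int' object in the traceback is also consistent with data still being the raw bytes payload shown in the question (the b'...' prefix): iterating over bytes yields integers, so restaurant is an int. A minimal sketch, assuming the payload has the shape shown in the question (shortened here to four metrics), that decodes it first and then walks date, stats, and metrics in that order:

```python
import json

# Raw API payload (shortened copy of the bytes shown in the question).
data = (b'[{"date":"2020-01-19","stats":[{"metrics":'
        b'{"blocks":5,"bounce_drops":6,"bounces":16,"clicks":278}}]}]\n')

# Iterating over bytes yields ints, so indexing an element with a
# string key fails with "'int' object is not subscriptable".
first = next(iter(data))
print(type(first))

# json.loads accepts bytes (Python 3.6+) and returns a list of dicts here.
records = json.loads(data)

rows = []
for day in records:                    # one dict per date
    for entry in day["stats"]:         # "stats" is a list of dicts
        rows.append(entry["metrics"])  # "metrics" is the dict of counters

print(rows)
```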

You don't need pandas just to output CSV; the csv module is enough.
Assuming the JSON is a list of dicts:

import csv, io

j = [{
        "date": "2020-01-19",
        "stats": [
                    { "metrics":{
                                "blocks": 5,
                                "bounce_drops": 6,
                                "bounces": 16,
                                "clicks": 278,
                                "deferred": 8,
                                "delivered": 1453,
                                "invalid_emails": 6,
                                "opens": 2502,
                                "processed": 155,
                                "requests": 1484,
                                "spam_report_drops": 0,
                                "spam_reports": 0,
                                "unique_clicks": 199,
                                "unique_opens": 1013,
                                "unsubscribe_drops": 0,
                                "unsubscribes": 0
                                }
                    }
                ]
    },
    {
        "date": "2020-01-18",
        "stats": [
                    { "metrics":{
                                "blocks": 5,
                                "bounce_drops": 6,
                                "bounces": 16,
                                "clicks": 278,
                                "deferred": 8,
                                "delivered": 1453,
                                "invalid_emails": 6,
                                "opens": 2502,
                                "processed": 155,
                                "requests": 1484,
                                "spam_report_drops": 0,
                                "spam_reports": 0,
                                "unique_clicks": 199,
                                "unique_opens": 1013,
                                "unsubscribe_drops": 0,
                                "unsubscribes": 0
                                }
                    }
                ]
    },
    {
        "date": "2020-01-17",
        "stats": [
                    { "metrics":{
                                "blocks": 5,
                                "bounce_drops": 6,
                                "bounces": 16,
                                "clicks": 278,
                                "deferred": 8,
                                "delivered": 1453,
                                "invalid_emails": 6,
                                "opens": 2502,
                                "processed": 155,
                                "requests": 1484,
                                "spam_report_drops": 0,
                                "spam_reports": 0,
                                "unique_clicks": 199,
                                "unique_opens": 1013,
                                "unsubscribe_drops": 0,
                                "unsubscribes": 0
                                }
                    }
                ]
    }]

merged = {'blocks': 0, 'bounce_drops': 0, 'bounces': 0, 'clicks': 0}

for i, d in enumerate(j):
    for lst in d['stats']:
        metrics = lst['metrics']
        merged['blocks']       += metrics['blocks']
        merged['bounce_drops'] += metrics['bounce_drops']
        merged['bounces']      += metrics['bounces']
        merged['clicks']       += metrics['clicks']

print(merged)
# {'blocks': 15, 'bounce_drops': 18, 'bounces': 48, 'clicks': 834}
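If every metric should be summed rather than a hand-picked four, collections.Counter can do the accumulation. This is a sketch under the assumption that all metric values are plain numbers, using a shortened two-day copy of the data above; totals is an illustrative name.

```python
from collections import Counter

# Shortened two-day copy of the data above.
j = [
    {"date": "2020-01-19",
     "stats": [{"metrics": {"blocks": 5, "bounce_drops": 6,
                            "bounces": 16, "clicks": 278}}]},
    {"date": "2020-01-18",
     "stats": [{"metrics": {"blocks": 5, "bounce_drops": 6,
                            "bounces": 16, "clicks": 278}}]},
]

totals = Counter()
for day in j:
    for entry in day["stats"]:
        # Counter.update with a mapping adds the values key by key.
        totals.update(entry["metrics"])

print(dict(totals))
```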

Test with io.StringIO before writing to a file:

result = io.StringIO(initial_value='', newline='\n')
fieldnames = list(merged.keys())
writer = csv.DictWriter(result, fieldnames=fieldnames)
writer.writeheader()
writer.writerow(merged)
print(result.getvalue())
# blocks,bounce_drops,bounces,clicks
# 15,18,48,834
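The example above writes a single row of totals. If one row per date is wanted instead, which is closer to the table in the question, the same DictWriter can be fed a flat dict per stats entry. A sketch only, assuming the data shape used above (shortened to three metrics); the date column and the name buffer are additions for illustration.

```python
import csv
import io

# Shortened two-day copy of the data above.
j = [
    {"date": "2020-01-19",
     "stats": [{"metrics": {"blocks": 5, "bounce_drops": 6, "bounces": 16}}]},
    {"date": "2020-01-18",
     "stats": [{"metrics": {"blocks": 5, "bounce_drops": 6, "bounces": 16}}]},
]

fieldnames = ["date", "blocks", "bounce_drops", "bounces"]

# newline="" leaves line endings to the csv module, as with a real file.
buffer = io.StringIO(newline="")

# extrasaction="ignore" skips any metric not listed in fieldnames.
writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
writer.writeheader()
for day in j:
    for entry in day["stats"]:
        # Merge the date with the metrics into one flat row.
        writer.writerow({"date": day["date"], **entry["metrics"]})

print(buffer.getvalue())
```

To write an actual file, the StringIO object can be swapped for a file opened in text mode with newline="".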

If you insist on using pandas:

import pandas as pd

df = pd.DataFrame(merged, index=[0])
csv_text = df.to_csv(index=False)  # named csv_text to avoid shadowing the csv module imported above
print(csv_text)
# blocks,bounce_drops,bounces,clicks
# 15,18,48,834
