繁体   English   中英

如何遍历字典以在最终数据框中的不同行中为它们获取输出?

[英]How do I loop through dictionary to get output for them in separate rows in my final dataframe?

我是 python 新手,并尝试处理数据循环:

我正在使用 sendgrid api 类别 stats 端点,并且我成功获取了一个 API 密钥的输出。

我的代码:

import pandas as pd
import json
from pandas.io.json import json_normalize
import datetime
yesterday = datetime.date.fromordinal(datetime.date.today().toordinal()-1).strftime("%F")
import http.client

conn = http.client.HTTPSConnection("api.sendgrid.com")
payload = "{}"

keys = {
    #"CF_APHI_marketing" : "SG.UdhzjmjYR**",
#    "CF_APHI_operational": "SG.FO0b2O*",
#    "CF_HPI_marketing": "SG._dnhFlnrSL-**",
#    "CF_HPI_operational": "SG.4duN1bHiT9G1**",
#    "CF_PPI_operational": "SG.F3zH1XEB**",
#    "CF_PP_operational": "SG.9TcrDUAkSLWT**",
#    "CF_PP_marketing" : "SG.GD5wMhE_**",
    "CF_SandBox" : "SG.eDU7TY**",
#    "CF_PKN_operational" : "SG.JsIws8hLT5iYxry0oJlGqw.**",
#    "CF_PTC_operational" : "SG.2-dxeY8BRsabipDlRuvZMg.-**",
#    "CF_SP_operational" : "SG.tSF_YmxlRkmmsiHNrK2CWQ.xtyhO4UX3U--**",
#    "CF_SP_marketing" : "SG.KDuHYR-bRq-T74UexS4WpA.**-**",
#    "CF_EI_operational" : "SG.2TRc1VQMT-WDirJG-U-PtQ.sAx-**",
#    "CF_D_operational" :  "SG.KNFxNavYQ1yJwGTpnIDKxA.**-**"   
}
df = []  # Create new Dataframe

for name, value in keys.items():
    headers = { 'authorization': "Bearer " + value }

    conn.request("GET", "/v3/categories/stats/sums?aggregated_by=&start_date={d}&end_date={d}".format(d=yesterday).format(d=yesterday), payload, headers)

    res = conn.getresponse()
    data = res.read()
    print(data.decode("utf-8"))

    d = json.loads(data.decode("utf-8"))
    c=d['stats']
#    row = d['stats'][0]['name']
    # Add Brand to data row here with 'name'
    df.append(c)  # Load data row into df

当我调用 df 时,我的输出是:

df输出[8]:

[[{'type': 'category',
   'name': 'Claim Notification Email',
   'metrics': {'blocks': 0,
    'bounce_drops': 0,
    'bounces': 0,
    'clicks': 0,
    'deferred': 0,
    'delivered': 5,
    'invalid_emails': 0,
    'opens': 0,
    'processed': 0,
    'requests': 5,
    'spam_report_drops': 0,
    'spam_reports': 0,
    'unique_clicks': 0,
    'unique_opens': 0,
    'unsubscribe_drops': 0,
    'unsubscribes': 0}},
  {'type': 'category',
   'name': 'Records Request 2',
   'metrics': {'blocks': 0,
    'bounce_drops': 0,
    'bounces': 0,
    'clicks': 0,
    'deferred': 0,
    'delivered': 1,
    'invalid_emails': 0,
    'opens': 0,
    'processed': 0,
    'requests': 1,
    'spam_report_drops': 0,
    'spam_reports': 0,
    'unique_clicks': 0,
    'unique_opens': 0,
    'unsubscribe_drops': 0,
    'unsubscribes': 0}},
  {'type': 'category',
   'name': 'Records Request 1',
   'metrics': {'blocks': 0,
    'bounce_drops': 0,
    'bounces': 0,
    'clicks': 0,
    'deferred': 0,
    'delivered': 1,
    'invalid_emails': 0,
    'opens': 0,
    'processed': 0,
    'requests': 1,
    'spam_report_drops': 0,
    'spam_reports': 0,
    'unique_clicks': 0,
    'unique_opens': 0,
    'unsubscribe_drops': 0,
    'unsubscribes': 0}},
  {'type': 'category',
   'name': 'Appeal Received Email',
   'metrics': {'blocks': 0,
    'bounce_drops': 0,
    'bounces': 0,
    'clicks': 0,
    'deferred': 0,
    'delivered': 0,
    'invalid_emails': 0,
    'opens': 1,
    'processed': 0,
    'requests': 0,
    'spam_report_drops': 0,
    'spam_reports': 0,
    'unique_clicks': 0,
    'unique_opens': 0,
    'unsubscribe_drops': 0,
    'unsubscribes': 0}}]]

我需要我的最终数据框看起来像:

Brand CF_APHI_marketing

name                  blocks  bounce_drops  bounces  ...  unique_opens  unsubscribe_dropsunsubscribes 
Appeal Received Email  0         7            14                9  ...          1240               0             
Records Request       18         12            7               2000                 0              0
Records Request2       2           10           8               3333                 1             0

基本上想要获得上述 13 个品牌中每个品牌的所有类别名称的品牌(键)输出。 有没有更好的方法来创建这个数据框?

我试过

sendgrid = pd.DataFrame(df)

但是当我将它导出到 csv 时,我没有收到像上面那样的输出,只是得到了一些json文本

如果:

df = [[{'type': 'category',
   'name': 'Claim Notification Email',
   'metrics': {'blocks': 0,
    'bounce_drops': 0,
    'bounces': 0,
    'clicks': 0,
    'deferred': 0,
    'delivered': 5,
    'invalid_emails': 0,
    'opens': 0,
    'processed': 0,
    'requests': 5,
    'spam_report_drops': 0,
    'spam_reports': 0,
    'unique_clicks': 0,
    'unique_opens': 0,
    'unsubscribe_drops': 0,
    'unsubscribes': 0}},
  {'type': 'category',
   'name': 'Records Request 2',
   'metrics': {'blocks': 0,
    'bounce_drops': 0,
    'bounces': 0,
    'clicks': 0,
    'deferred': 0,
    'delivered': 1,
    'invalid_emails': 0,
    'opens': 0,
    'processed': 0,
    'requests': 1,
    'spam_report_drops': 0,
    'spam_reports': 0,
    'unique_clicks': 0,
    'unique_opens': 0,
    'unsubscribe_drops': 0,
    'unsubscribes': 0}},
  {'type': 'category',
   'name': 'Records Request 1',
   'metrics': {'blocks': 0,
    'bounce_drops': 0,
    'bounces': 0,
    'clicks': 0,
    'deferred': 0,
    'delivered': 1,
    'invalid_emails': 0,
    'opens': 0,
    'processed': 0,
    'requests': 1,
    'spam_report_drops': 0,
    'spam_reports': 0,
    'unique_clicks': 0,
    'unique_opens': 0,
    'unsubscribe_drops': 0,
    'unsubscribes': 0}},
  {'type': 'category',
   'name': 'Appeal Received Email',
   'metrics': {'blocks': 0,
    'bounce_drops': 0,
    'bounces': 0,
    'clicks': 0,
    'deferred': 0,
    'delivered': 0,
    'invalid_emails': 0,
    'opens': 1,
    'processed': 0,
    'requests': 0,
    'spam_report_drops': 0,
    'spam_reports': 0,
    'unique_clicks': 0,
    'unique_opens': 0,
    'unsubscribe_drops': 0,
    'unsubscribes': 0}}]]

尝试这个:

df = pd.DataFrame(df[0])
df_new = df[['type', 'name']]
df_metric =pd.DataFrame(list(df['metrics'].values))
sendgrid = pd.concat([df_new, df_metric], axis=1, sort=False)
sendgrid.set_index('name', inplace = True)
sendgrid.to_csv("sendgrid.csv")

你应该得到:

                              type  blocks  bounce_drops  bounces  clicks  \
name                                                                        
Claim Notification Email  category       0             0        0       0   
Records Request 2         category       0             0        0       0   
Records Request 1         category       0             0        0       0   
Appeal Received Email     category       0             0        0       0   

                          deferred  delivered  invalid_emails  opens  \
name                                                                   
Claim Notification Email         0          5               0      0   
Records Request 2                0          1               0      0   
Records Request 1                0          1               0      0   
Appeal Received Email            0          0               0      1   

                          processed  requests  spam_report_drops  \
name                                                               
Claim Notification Email          0         5                  0   
Records Request 2                 0         1                  0   
Records Request 1                 0         1                  0   
Appeal Received Email             0         0                  0   

                          spam_reports  unique_clicks  unique_opens  \
name                                                                  
Claim Notification Email             0              0             0   
Records Request 2                    0              0             0   
Records Request 1                    0              0             0   
Appeal Received Email                0              0             0   

                          unsubscribe_drops  unsubscribes  
name                                                       
Claim Notification Email                  0             0  
Records Request 2                         0             0  
Records Request 1                         0             0  
Appeal Received Email                     0             0  

IIUC,我们可以遍历您的嵌套列表并创建一个数据框,然后对其进行整形

注意我已将您的变量从 df 更改为 json_d 以使其更加明确,因为它根本不是数据帧。

dfs = []
for _ in json_d:
    for inner_ in _:
        df = pd.DataFrame.from_dict(inner_)
        dfs.append(df)

new_df = pd.concat(dfs)

print(new_df)

                       type                      name  metrics
blocks             category  Claim Notification Email        0
bounce_drops       category  Claim Notification Email        0
bounces            category  Claim Notification Email        0
clicks             category  Claim Notification Email        0
deferred           category  Claim Notification Email        0
...                     ...                       ...      ...
spam_reports       category     Appeal Received Email        0
unique_clicks      category     Appeal Received Email        0
unique_opens       category     Appeal Received Email        0
unsubscribe_drops  category     Appeal Received Email        0
unsubscribes       category     Appeal Received Email        0

然后只是一个简单的重塑到你的目标df。

df = pd.crosstab(new_df.name, new_df.index, new_df.metrics, aggfunc=lambda x: x)

del df.columns.name

print(df)
                          blocks  bounce_drops  bounces  clicks  deferred  \
name                                                                        
Appeal Received Email          0             0        0       0         0   
Claim Notification Email       0             0        0       0         0   
Records Request 1              0             0        0       0         0   
Records Request 2              0             0        0       0         0

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM