How do I write nested dictionaries to a CSV file in Python, with the keys as column headers and their values as the row contents?

Question

I have a dictionary named "output" that contains several other nested dictionaries, as shown below:

>>> output.keys() 
dict_keys(['posts', 'totalResults', 'moreResultsAvailable', 'next', 'requestsLeft', 'warnings'])

>>> output['posts'][0].keys() 
dict_keys(['thread', 'uuid', 'url', 'ord_in_thread', 'parent_url', 'author', 'published', 'title','text', 'highlightText', 'highlightTitle', 'highlightThreadTitle', 'language', 'external_links', 'external_images', 'entities', 'rating', 'crawled', 'updated'])

>>> output['posts'][0]['thread'].keys() 
dict_keys(['uuid', 'url', 'site_full', 'site', 'site_section', 'site_categories', 'section_title', 'title', 'title_full', 'published', 'replies_count', 'participants_count', 'site_type', 'country', 'spam_score', 'main_image', 'performance_score', 'domain_rank', 'reach', 'social'])

>>> output['posts'][0]['thread']['social'].keys() 
dict_keys(['facebook', 'gplus', 'pinterest', 'linkedin', 'stumbledupon', 'vk'])

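I want to create a CSV file containing a selected list of keys from output['posts'][0], output['posts'][0]['thread'] and output['posts'][0]['thread']['social'], with the corresponding values as the contents of each row. I came up with the following code: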

post_keys = output['posts'][0].keys()
post_thread_keys = output['posts'][0]['thread'].keys()
social_keys = output['posts'][0]['thread']['social'].keys()

with open('file.csv', 'w', encoding='utf-8') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=post_thread_keys)
    writer.writeheader()

    for i in range(len(output['posts'])):
         for key in output['posts'][i]['thread']:
            writer.writerow(output['posts'][i]['thread'])

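It only works for the first-level dictionary, output['posts'][0]['thread'], and not for the other inner dictionaries. It also doubles the number of rows: there are now 200 instead of 100.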

The current result looks like this:

I would like it to look like this:

Please see the output file I stored on Google Drive for more detail: file.csv

Answer 1

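You need a function that creates the sub-keys in the format you specified. Once it is a function, you can also call it to get the list of extra column names needed for the header.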
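As a minimal sketch of what such a helper produces, the snippet below runs the same get_social_entries logic on a small sample. The shape of the sample dictionary and its metric names (likes, comments, shares) are assumptions for illustration; only the top-level network names come from the question.

```python
def get_social_entries(social):
    """Flatten {'network': {'metric': value}} into {'network_metric': value}."""
    social_entries = {}
    for social_key, social_values in social.items():
        for key, value in social_values.items():
            social_entries[f"{social_key}_{key}"] = value
    return social_entries


# Assumed shape of post['thread']['social']; the metric names are illustrative.
sample_social = {
    "facebook": {"likes": 12, "comments": 3, "shares": 5},
    "gplus": {"shares": 0},
}

flat = get_social_entries(sample_social)

# The same call also yields the extra column names for the CSV header.
extra_columns = list(flat)

print(flat)
print(extra_columns)
```

Because the column names are derived from the data, the header and the rows stay consistent as long as every post has the same social structure as the first one.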

As you add the three sub-entries (thread, entities and social), remove them from their parent dictionaries with .pop() to avoid duplicates.
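The reason for removing the nested keys can be seen in a small, self-contained sketch. By default csv.DictWriter raises ValueError when a row contains a key that is not in fieldnames, so a leftover 'thread' key would stop the write. The sample post and the write_row helper are hypothetical and exist only for this illustration.

```python
import csv
import io

# Hypothetical, trimmed-down post; the real one has many more keys.
post = {"uuid": "p1", "thread": {"site": "example.com"}}


def write_row(row, fieldnames, **kwargs):
    """Write a header plus one row to an in-memory CSV and return the text."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, **kwargs)
    writer.writeheader()
    writer.writerow(row)
    return buffer.getvalue()


# Leaving 'thread' in the row fails, because it is not a column name.
try:
    write_row(dict(post), ["uuid", "site"])
    leftover_key_error = None
except ValueError as exc:
    leftover_key_error = exc

# Popping the nested dict removes the key and hands back its contents.
row = dict(post)  # shallow copy, so `post` itself is left untouched
thread = row.pop("thread")
row.update(thread)
flat_csv = write_row(row, ["uuid", "site"])

# Alternative: keep the key and let DictWriter silently drop unknown keys.
ignored_csv = write_row(dict(post), ["uuid"], extrasaction="ignore")
```

The extrasaction="ignore" option is also one way to write only a selected list of keys, as the question asks: pass just the wanted names as fieldnames and the remaining keys are skipped.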

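import webhoseio  # only needed to fetch `output`
import csv

def get_social_entries(social):
    # Flatten {'network': {'metric': value}} into {'network_metric': value}
    social_entries = {}

    for social_key, social_values in social.items():
        for key, value in social_values.items():
            social_entries[f'{social_key}_{key}'] = value

    return social_entries

# <<Get output here>>

# Build the header from the first post
csv_columns = []

first_post = output['posts'][0]

# Keys of the thread dictionary
for key in first_post['thread']:
    csv_columns.append(key)

# Keys of the post itself, skipping the nested dictionaries.
# Note: post and thread share some key names (e.g. 'uuid', 'url');
# in the merged row below, the thread value wins for those.
for key in first_post:
    if key not in ['entities', 'thread', 'social']:
        csv_columns.append(key)

# Keys of the entities dictionary
for key in first_post['entities']:
    csv_columns.append(key)

# Flattened social keys, e.g. 'facebook_<metric>'
csv_columns.extend(list(get_social_entries(first_post['thread']['social']).keys()))

with open('file.csv', 'w', encoding='utf-8', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=csv_columns)
    writer.writeheader()

    for post in output['posts']:
        # Pop the nested dictionaries so they do not end up in the row as-is
        thread = post.pop('thread')
        entities = post.pop('entities')
        social = thread.pop('social')
        social_entries = get_social_entries(social)
        writer.writerow(post | thread | entities | social_entries)     # the | operator needs Python 3.9+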

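The writer.writerow(post | thread | entities | social_entries) call assumes Python 3.9 or later. If you are on an older version, you can use the following instead: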

row = post
row.update(thread)
row.update(entities)
row.update(social_entries)
writer.writerow(row)
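Two things are worth keeping in mind with either merge style. Both .pop() and row = post modify the original output dictionary, and when two dictionaries share a key the later one wins. The key lists in the question show that post and thread both contain uuid, url, title and published, so the post values are overwritten. A minimal sketch of one way around both points follows; build_row and the thread_ prefix are hypothetical names, not part of the answer above.

```python
def build_row(post, thread_prefix="thread_"):
    """Return a flat row for one post without modifying `post`."""
    # Copy every top-level key except the nested thread dictionary.
    row = {key: value for key, value in post.items() if key != "thread"}
    for key, value in post.get("thread", {}).items():
        if key == "social":
            continue  # nested one level deeper; flatten it separately
        # Prefix thread keys so they cannot overwrite post keys.
        row[f"{thread_prefix}{key}"] = value
    return row


# Hypothetical, trimmed-down post for illustration.
post = {
    "uuid": "post-1",
    "title": "Reply",
    "thread": {"uuid": "thread-1", "title": "Original", "social": {}},
}

row = build_row(post)
print(row)
```

With this variant the header needs the prefixed names as well, for example by collecting the keys of build_row(first_post).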

Note: newline='' is passed to open() to remove the extra blank lines in the output.
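The effect of newline='' can be checked by reading the file back as bytes. The csv writer ends each row with \r\n itself; without newline='', text mode on Windows translates the \n again and produces \r\r\n, which shows up as a blank line after every row. The sketch below uses a temporary file and a hypothetical write_sample helper.

```python
import csv
import os
import tempfile


def write_sample(path, **open_kwargs):
    """Write a header and a single row using the given open() options."""
    with open(path, "w", encoding="utf-8", **open_kwargs) as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=["a", "b"])
        writer.writeheader()
        writer.writerow({"a": 1, "b": 2})


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")
    write_sample(path, newline="")
    with open(path, "rb") as f:
        raw = f.read()

# Each row ends with exactly one \r\n on every platform.
print(raw)
```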

You could use a similar approach to expand entities.
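A rough sketch of such an expansion is shown below. The structure of entities is not shown in the question, so the sample assumes each entity type maps to a list of dictionaries with a name field. Both that shape and the get_entity_entries helper are assumptions and should be adjusted to the real data.

```python
def get_entity_entries(entities, separator="; "):
    """Collapse each list of entity dicts into one delimited string per type."""
    entries = {}
    for entity_type, items in entities.items():
        names = [item.get("name", "") for item in items]
        entries[f"entities_{entity_type}"] = separator.join(names)
    return entries


# Assumed shape of post['entities']; field names are illustrative only.
sample_entities = {
    "persons": [{"name": "Ada Lovelace"}, {"name": "Alan Turing"}],
    "organizations": [{"name": "ACME"}],
    "locations": [],
}

entity_entries = get_entity_entries(sample_entities)
print(entity_entries)
```

If this helper is used, its keys need to be added to the header in place of the raw entities keys, in the same way the social column names are collected.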

Solution 1
0 Accepted 2021-07-14 09:05:51
