How to write a nested dictionary to a CSV file with Python when each row holds the values of the keys used as column headers?

I have a dictionary named "output" with several other dictionaries nested inside it, like this:

>>> output.keys() 
dict_keys(['posts', 'totalResults', 'moreResultsAvailable', 'next', 'requestsLeft', 'warnings'])

>>> output['posts'][0].keys() 
dict_keys(['thread', 'uuid', 'url', 'ord_in_thread', 'parent_url', 'author', 'published', 'title','text', 'highlightText', 'highlightTitle', 'highlightThreadTitle', 'language', 'external_links', 'external_images', 'entities', 'rating', 'crawled', 'updated'])

>>> output['posts'][0]['thread'].keys() 
dict_keys(['uuid', 'url', 'site_full', 'site', 'site_section', 'site_categories', 'section_title', 'title', 'title_full', 'published', 'replies_count', 'participants_count', 'site_type', 'country', 'spam_score', 'main_image', 'performance_score', 'domain_rank', 'reach', 'social'])

>>> output['posts'][0]['thread']['social'].keys() 
dict_keys(['facebook', 'gplus', 'pinterest', 'linkedin', 'stumbledupon', 'vk'])

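I want to make a CSV file whose columns are a selected list of keys from output['posts'][0], output['posts'][0]['thread'] and output['posts'][0]['thread']['social'], with the related values as the content of each row. I came up with the following code: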

post_keys = output['posts'][0].keys()
post_thread_keys = output['posts'][0]['thread'].keys()
social_keys = output['posts'][0]['thread']['social'].keys()

with open('file.csv', 'w', encoding='utf-8') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=post_thread_keys)
    writer.writeheader()

    for i in range(len(output['posts'])):
         for key in output['posts'][i]['thread']:
            writer.writerow(output['posts'][i]['thread'])

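It only works for the first-level dictionary, i.e. output['posts'][0]['thread'], and not for the other nested ones. It also doubles the number of rows, which is now 200 instead of 100.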

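The result currently looks like this: current output

I would like it to look like this: desired output

Please see the output file I stored on Google Drive for a more specific idea: file.csv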

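You need a function to create the sub-keys in the format you specified. Because it is a function, you can also call it to get the list of extra column names needed for the header.

Because the three nested entries (thread, entities and social) are added to the row separately, they can be removed from their parent dictionaries with .pop() to avoid writing them twice.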
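As a quick illustration of both ideas, here is a minimal sketch on made-up sample data. The helper name `flatten_social` and the sub-keys `likes` and `shares` are assumptions for the example, not taken from the real response.

```python
# Made-up sample post; the real API response has many more keys.
sample_post = {
    'uuid': 'abc',
    'thread': {
        'site': 'example.com',
        'social': {
            'facebook': {'likes': 3, 'shares': 1},
            'vk': {'shares': 0},
        },
    },
}


def flatten_social(social):
    """Turn {'facebook': {'likes': 3}} into {'facebook_likes': 3}."""
    return {
        f'{network}_{metric}': value
        for network, metrics in social.items()
        for metric, value in metrics.items()
    }


# .pop() returns the nested dictionary and removes it from its parent,
# so the nested value does not also end up in a column of its own.
thread = sample_post.pop('thread')
social = thread.pop('social')

print(sample_post)             # {'uuid': 'abc'}
print(thread)                  # {'site': 'example.com'}
print(flatten_social(social))  # {'facebook_likes': 3, 'facebook_shares': 1, 'vk_shares': 0}
```

Note that .pop() modifies the original dictionaries, so work on a copy if the nested data is still needed afterwards.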

import webhoseio
import csv

def get_social_entries(social):
    social_entries = {}

    for social_key, social_values in social.items():
        for key, value in social_values.items():
            social_entries[f'{social_key}_{key}'] = value
            
    return social_entries
        
    
    
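# <<Get output here>>

csv_columns = []

first_post = output['posts'][0]

# Thread-level columns; 'social' is flattened separately below
for key in first_post['thread']:
    if key != 'social':
        csv_columns.append(key)

# Post-level columns; the nested dictionaries are expanded separately, and
# keys already taken from the thread (e.g. 'uuid', 'url') are not repeated
for key in first_post:
    if key not in ['entities', 'thread'] and key not in csv_columns:
        csv_columns.append(key)

for key in first_post['entities']:
    csv_columns.append(key)

# Flattened social columns, named '<social key>_<sub-key>'
csv_columns.extend(get_social_entries(first_post['thread']['social']))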

with open('file.csv', 'w', encoding='utf-8', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=csv_columns)
    writer.writeheader()
    
    for post in output['posts']:
        thread = post.pop('thread')
        entities = post.pop('entities')
        social = thread.pop('social')
        social_entries = get_social_entries(social)
        writer.writerow(post | thread | entities | social_entries)     # | operator needs Python 3.9

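This assumes you are using Python 3.9 or later. On older versions, you can use the following in place of the final writer.writerow(...) line: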

row = post
row.update(thread)
row.update(entities)
row.update(social_entries)
writer.writerow(row)
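Since `row = post` does not copy anything, the update() calls above also modify `post` itself. If that matters, dictionary unpacking (available since Python 3.5) is another option. The sketch below uses made-up sample dictionaries and also shows what happens when two levels share a key such as 'uuid':

```python
# Made-up sample dictionaries standing in for post, thread, entities and social_entries.
post = {'uuid': 'post-1', 'author': 'alice'}
thread = {'uuid': 'thread-1', 'site': 'example.com'}
entities = {'persons': []}
social_entries = {'facebook_likes': 3}

# Unpacking builds a new dictionary and leaves the inputs untouched.
row = {**post, **thread, **entities, **social_entries}

# Later dictionaries win when a key is repeated,
# so the thread's 'uuid' replaces the post's in the merged row.
print(row['uuid'])   # thread-1
print(post['uuid'])  # post-1 (unchanged)
```

The same "last one wins" rule applies to the | operator and to update(), so if both values are needed, one set of keys has to be renamed (for example with a prefix) before merging.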

Note: newline='' is added to open() to remove the extra blank lines in the output.
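The reason is that the csv module writes its own '\r\n' line endings. Without newline='', text mode on Windows translates the '\n' again, giving '\r\r\n', which shows up as an empty row after every record. A small sketch with made-up rows (the file name demo.csv is arbitrary) that reads the raw bytes back to check the line endings:

```python
import csv
import os
import tempfile

rows = [
    {'site': 'example.com', 'title': 'first'},
    {'site': 'example.org', 'title': 'second'},
]

path = os.path.join(tempfile.mkdtemp(), 'demo.csv')

# newline='' leaves line-ending handling entirely to the csv module.
with open(path, 'w', encoding='utf-8', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=['site', 'title'])
    writer.writeheader()
    writer.writerows(rows)

# Read the raw bytes back: every line ends in exactly one '\r\n'.
with open(path, 'rb') as f:
    raw = f.read()

print(raw)  # b'site,title\r\nexample.com,first\r\nexample.org,second\r\n'
```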

You could also use a similar approach to expand entities.
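One possible sketch of that, assuming each value in entities is a list of dictionaries with a 'name' key (an assumption about the response, so check it against your data). The helper name `get_entity_entries`, the 'entities_' prefix and the sample values are made up for the example:

```python
def get_entity_entries(entities):
    """Join the names in each entity list into one cell per entity type."""
    entity_entries = {}

    for entity_type, items in entities.items():
        # ASSUMPTION: every item is a dict that has a 'name' key.
        names = [item['name'] for item in items]
        entity_entries[f'entities_{entity_type}'] = '; '.join(names)

    return entity_entries


# Made-up sample data in the assumed shape.
sample_entities = {
    'persons': [{'name': 'ada lovelace', 'sentiment': 'none'}],
    'organizations': [],
    'locations': [
        {'name': 'london', 'sentiment': 'none'},
        {'name': 'paris', 'sentiment': 'none'},
    ],
}

print(get_entity_entries(sample_entities))
# {'entities_persons': 'ada lovelace', 'entities_organizations': '', 'entities_locations': 'london; paris'}
```

As with the social entries, the same function can be called on the first post to get the extra column names for the header.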
