How to write nested dictionary in csv with python when the row contents are key values of related key (the header of each column)?
I have a dictionary named "output" with several other dictionaries nested inside it, as shown below:
>>> output.keys()
dict_keys(['posts', 'totalResults', 'moreResultsAvailable', 'next', 'requestsLeft', 'warnings'])
>>> output['posts'][0].keys()
dict_keys(['thread', 'uuid', 'url', 'ord_in_thread', 'parent_url', 'author', 'published', 'title','text', 'highlightText', 'highlightTitle', 'highlightThreadTitle', 'language', 'external_links', 'external_images', 'entities', 'rating', 'crawled', 'updated'])
>>> output['posts'][0]['thread'].keys()
dict_keys(['uuid', 'url', 'site_full', 'site', 'site_section', 'site_categories', 'section_title', 'title', 'title_full', 'published', 'replies_count', 'participants_count', 'site_type', 'country', 'spam_score', 'main_image', 'performance_score', 'domain_rank', 'reach', 'social'])
>>> output['posts'][0]['thread']['social'].keys()
dict_keys(['facebook', 'gplus', 'pinterest', 'linkedin', 'stumbledupon', 'vk'])
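I want to create a CSV file whose columns are a list of selected keys from output['posts'][0], output['posts'][0]['thread'] and output['posts'][0]['thread']['social'], with the corresponding values as the contents of each row. I came up with the following code: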
post_keys = output['posts'][0].keys()
post_thread_keys = output['posts'][0]['thread'].keys()
social_keys = output['posts'][0]['thread']['social'].keys()

with open('file.csv', 'w', encoding='utf-8') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=post_thread_keys)
    writer.writeheader()
    for i in range(len(output['posts'])):
        for key in output['posts'][i]['thread']:
            writer.writerow(output['posts'][i]['thread'])
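It only works for the first-level dictionary, i.e. output['posts'][0]['thread'], and not for the other inner dictionaries. It also doubles the number of rows: there are now 200 instead of 100.
Please take a look at the output file I stored on Google Drive to get a more concrete idea: file.csv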
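You need a function to create the sub-keys in the format you specified. Because it is a function, it can also be called to give you the list of extra column names needed for the header.
As you are adding the three sub-entries yourself, they can be removed from the row (by using .pop()) to avoid duplication.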
import webhoseio
import csv

def get_social_entries(social):
    social_entries = {}

    for social_key, social_values in social.items():
        for key, value in social_values.items():
            social_entries[f'{social_key}_{key}'] = value

    return social_entries

# <<Get output here>>

csv_columns = []
first_post = output['posts'][0]

for key in first_post['thread']:
    csv_columns.append(key)

for key in first_post:
    if key not in ['entities', 'thread', 'social']:
        csv_columns.append(key)

for key in first_post['entities']:
    csv_columns.append(key)

csv_columns.extend(list(get_social_entries(first_post['thread']['social']).keys()))

with open('file.csv', 'w', encoding='utf-8', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=csv_columns)
    writer.writeheader()

    for post in output['posts']:
        thread = post.pop('thread')
        entities = post.pop('entities')
        social = thread.pop('social')
        social_entries = get_social_entries(social)
        writer.writerow(post | thread | entities | social_entries)  # | operator needs Python 3.9
This assumes you are using Python 3.9 or later. If not, you can replace the last line with the following:
row = post
row.update(thread)
row.update(entities)
row.update(social_entries)
writer.writerow(row)
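Another option that should work on Python 3.5 and later is dictionary unpacking, which builds a new dictionary instead of updating `post` in place. This is only a minimal sketch, and the sample values below are made up for illustration rather than taken from real API output:

```python
# Illustrative sample values only; not real API output.
post = {'uuid': 'p1', 'author': 'alice'}
thread = {'site': 'example.com', 'country': 'US'}
entities = {'persons': 'n/a'}
social_entries = {'facebook_likes': 3}

# Later dictionaries win when keys collide, the same as with update().
row = {**post, **thread, **entities, **social_entries}
print(row)
```

Unlike `row = post` followed by `update()`, this leaves `post` itself unchanged.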
Note: newline='' is added to remove the extra blank lines in the output.
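The reason is that the csv module writes its own `\r\n` line endings. If the file is opened in text mode without newline='', Windows translates each `\n` again, giving `\r\r\n`, which shows up as a blank row. A small sketch to inspect the raw bytes, using a temporary file and a made-up sample row (both are assumptions for the demo):

```python
import csv
import os
import tempfile

rows = [{'site': 'example.com', 'country': 'US'}]  # made-up sample row

# Write to a throwaway file so the real file.csv is not touched.
path = os.path.join(tempfile.mkdtemp(), 'demo.csv')
with open(path, 'w', encoding='utf-8', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=['site', 'country'])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back as bytes to see the exact line endings.
with open(path, 'rb') as f:
    raw = f.read()
print(raw)
```

With newline='' each line should end in a single `\r\n` on every platform.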
You can also use a similar approach to expand entities.
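As a rough sketch of what that could look like: the helper name `get_entity_entries` is hypothetical, and the code assumes that each value in entities is a list of dictionaries with a 'name' key. That structure is not shown in the question, so check it against your own data first.

```python
def get_entity_entries(entities):
    """Flatten entities into one column per entity type (hypothetical helper)."""
    entity_entries = {}

    for entity_type, items in entities.items():
        # Assumption: items is a list of dicts that each carry a 'name' key.
        names = [item['name'] for item in items]
        entity_entries[f'entities_{entity_type}'] = '; '.join(names)

    return entity_entries

# Made-up sample data in the assumed shape.
sample = {
    'persons': [{'name': 'ada lovelace', 'sentiment': 'none'}],
    'organizations': [],
    'locations': [{'name': 'london', 'sentiment': 'none'},
                  {'name': 'paris', 'sentiment': 'none'}],
}
print(get_entity_entries(sample))
```

The returned keys would then go into csv_columns in place of the raw entity keys, and the returned dictionary would be merged into the row in place of entities.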