Iterate through a nested, mixed dictionary

I have been using the CrunchBase API, and the output it provides looks like the following example (actual example here):

 output = {'name':'StackOverflow',
       'competitors':[{   'competitor':'bing',
                          'link':'bing.com'},
                      {   'competitor':'google',
                          'link':'google.com'}],
       'acquisition': {'acquired_day': 16,
                       'acquired_month': 12,
                       'acquired_year': 2013,
                       'acquiring_company': {'name': 'Viggle',
                                             'permalink': 'viggle'}}}  

(This is just an example.)

The point is that the output dict contains values that can be unicode/int, lists, or dictionaries. These values can in turn hold lists, dicts, or unicode as well.

How could I iterate through the dict? I tried itertools.product, but it only seems to work when the structure of the dict is uniform. My goal is to turn this JSON output into a CSV file.

I am not completely sure what exactly you wish to achieve, but if your output is actually one row in the requested CSV, you may need to "flatten" the nested dictionary first.

Assuming your structure is a dict whose values are either "simple" (strings, floats, etc.), dicts, or lists (nested to arbitrary depth), and assuming there is some character (for example, "_") that does not appear in any of the keys, you can flatten the dict with the following recursive function (or any similar one):

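def _flatten_items(items, sep, prefix):
  # items: iterable of (key, value) pairs; prefix: flattened key path so far
  _items = []
  for key, value in items:
    _prefix = "{}{}".format(prefix, key)
    if isinstance(value, list):
      # Treat a list like a dict keyed by element index (0, 1, ...)
      _items.extend(_flatten_items(list(enumerate(value)), sep=sep,
                    prefix=_prefix+sep))
    elif isinstance(value, dict):
      # Recurse into the nested dict, extending the key prefix
      _items.extend(_flatten_items(value.items(), sep=sep,
                    prefix=_prefix+sep))
    else:
      # Leaf value: store it under its full flattened key
      _items.append((_prefix, value))
  return _items


def flatten_dict(d, sep='_'):
  # Flatten nested dicts/lists into a single-level dict with joined keys
  return dict(_flatten_items(d.items(), sep=sep, prefix=""))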

As an example, for your output this should give:

output = {'name':'StackOverflow',
       'competitors':[{   'competitor':'bing',
                          'link':'bing.com'},
                      {   'competitor':'google',
                          'link':'google.com'}],
       'acquisition': {'acquired_day': 16,
                       'acquired_month': 12,
                       'acquired_year': 2013,
                       'acquiring_company': {'name': 'Viggle',
                                             'permalink': 'viggle'}}}

print(flatten_dict(output))
# Output (key order may vary with the Python version):
# {'acquisition_acquired_year': 2013, 'acquisition_acquiring_company_name': 'Viggle', 'name': 'StackOverflow', 'acquisition_acquiring_company_permalink': 'viggle', 'competitors_0_competitor': 'bing', 'acquisition_acquired_month': 12, 'competitors_1_link': 'google.com', 'acquisition_acquired_day': 16, 'competitors_1_competitor': 'google', 'competitors_0_link': 'bing.com'}
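
The flattening relies on the separator not appearing in any key, yet several keys in the sample (such as 'acquired_day') already contain "_". No two flattened keys happen to collide here, but the assumption can be checked explicitly. The helper below, keys_containing, is a hypothetical name and not part of the original answer; it is a minimal sketch that lists every key, at any depth, containing a candidate separator, so that a safer one (for example ".") can be chosen.

```python
def keys_containing(obj, sep):
    """Return every dict key (at any depth) whose string form contains sep."""
    found = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            if sep in str(key):
                found.append(key)
            # Keep looking inside the value, whatever its type
            found.extend(keys_containing(value, sep))
    elif isinstance(obj, list):
        for item in obj:
            found.extend(keys_containing(item, sep))
    # Simple values (strings, numbers, ...) contribute no keys
    return found


# Reduced version of the sample structure, for illustration only
sample = {'acquisition': {'acquired_day': 16,
                          'acquiring_company': {'name': 'Viggle'}},
          'competitors': [{'link': 'bing.com'}]}

print(keys_containing(sample, '_'))   # ['acquired_day', 'acquiring_company']
print(keys_containing(sample, '.'))   # []
```

An empty result means the separator is safe to use for this structure; only keys are inspected, so a value such as 'bing.com' does not count against ".".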

Then you can use csv.DictWriter (or similar) to write the flattened data to CSV.
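
A minimal sketch of that last step, assuming Python 3 and records that have already been flattened. The write_rows helper and the second record ('Example Co') are hypothetical and used only for illustration. Because different records may flatten to different key sets (one company may have more competitors than another), the header is built as the union of all keys, and missing cells are left empty via restval.

```python
import csv
import io


def write_rows(rows, fileobj):
    """Write flat dicts as CSV rows; the header is the union of all keys."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)  # keep first-seen order
    # restval fills in columns that a given row does not have
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


# Already-flattened records (shortened; the second one is made up)
rows = [
    {'name': 'StackOverflow', 'competitors_0_competitor': 'bing',
     'acquisition_acquired_year': 2013},
    {'name': 'Example Co', 'competitors_0_competitor': 'google'},
]

buffer = io.StringIO()  # stands in for a real file
header = write_rows(rows, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with open(path, 'w', newline='') in place of the StringIO buffer.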


 