Append the union of two other keys to a key in a dictionary using Python

Question

This is my input:

ClientData = {
'ClientName1': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2)],

           'aggregate_Pageviews_VisitsByWeek': [],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                         ('2013-05-12', 1)]

                                       },


'ClientName2': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2),
                                       ('2013-03-24', 1),
      ],
           'aggregate_Pageviews_VisitsByWeek': [],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                      ('2013-03-31', 1),
                                      ('2013-05-12', 1),
                                      ('2013-05-19', 2),
                                      ('2013-06-30', 2)]
                                       }

}

How can I append to the key 'aggregate_Pageviews_VisitsByWeek' the union of 'aggregate_PageviewsByWeek' and 'aggregate_VisitsByWeek', matched on the date key?

The output would look something like this:

{
'ClientName1': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2)],

           'aggregate_Pageviews_VisitsByWeek': [

                                               ('2013-01-06', 2, 0),
                                               ('2013-02-03', 1, 0),
                                               ('2013-02-10', 1, 0),
                                               ('2013-02-24', 1, 0),
                                               ('2013-03-03', 2, 1),
                                               ('2013-05-12', 0, 1)],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                         ('2013-05-12', 1)]

                                       },



'ClientName2': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2),
                                       ('2013-03-24', 1)],

           'aggregate_Pageviews_VisitsByWeek': [
                                       ('2013-01-06', 2, 0),
                                       ('2013-02-03', 1, 0),
                                       ('2013-02-10', 1, 0),
                                       ('2013-02-24', 1, 0),
                                       ('2013-03-03', 2, 1),
                                       ('2013-03-24', 1, 0),
                                       ('2013-03-31', 0, 1),
                                       ('2013-05-12', 0, 1),
                                       ('2013-05-19', 0, 2),
                                       ('2013-06-30', 0, 2)],

           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                      ('2013-03-31', 1),
                                      ('2013-05-12', 1),
                                      ('2013-05-19', 2),
                                      ('2013-06-30', 2)]
                                       }

}

If the key (the date, in this case) is not in the other list, I want to use 0 for that value. Each tuple has the form (Date, aggregate_PageviewsByWeek_Value, aggregate_VisitsByWeek_Value).

Example:
aggregate_PageviewsByWeek: ('2013-01-06', 12) and aggregate_VisitsByWeek: ('2013-01-13', 30)

The output will be:
aggregate_Pageviews_VisitsByWeek: [('2013-01-06', 12, 0), ('2013-01-13', 0, 30)]

My goal with this question is to get the trends of page views and visits by date.

Thanks!

Answer 1

First, you need a function that merges a single client's entries.

There are two easy ways to merge parallel sequences that might each be missing some values: you can iterate the two in parallel, or you can build a dictionary (or sorted map) of keys and handle each sequence separately. You can see an example of the first, e.g., here. But the second is simpler, at least in Python, so long as the keys are hashable. So:

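def merge_client(client):
    # Map each date to [pageviews, visits]; a missing side stays 0.
    merged = {}
    for day, views in client['aggregate_PageviewsByWeek']:
        merged[day] = [views, 0]
    for day, visits in client['aggregate_VisitsByWeek']:
        # Reuse the existing entry if the date already has pageviews.
        merged.setdefault(day, [0, 0])[1] = visits
    # Flatten to (date, pageviews, visits) tuples, sorted by date.
    flattened = [tuple([key] + value) for key, value in merged.items()]
    client['aggregate_Pageviews_VisitsByWeek'] = sorted(flattened)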

To extend this algorithm to more than two sequences, you'd use append; or, if there may be a huge number of entries, just use a dict instead of a list (so we don't have to fill in all those default 0's until the end).
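
As a rough sketch of that generalization (the name merge_sequences and its signature are illustrative assumptions, not part of the original answer), each date can map to a small dict of {sequence index: value}, with the zeros filled in only when the rows are flattened at the end:

```python
def merge_sequences(*sequences):
    """Merge any number of [(key, value), ...] lists into sorted rows.

    Hypothetical helper for illustration; missing values become 0.
    """
    merged = {}
    for index, sequence in enumerate(sequences):
        for key, value in sequence:
            # Store only the values that actually exist for this key.
            merged.setdefault(key, {})[index] = value
    # Fill in the default 0s only once, while flattening.
    return [
        (key,) + tuple(merged[key].get(i, 0) for i in range(len(sequences)))
        for key in sorted(merged)
    ]


views = [('2013-01-06', 12)]
visits = [('2013-01-13', 30)]
print(merge_sequences(views, visits))
# [('2013-01-06', 12, 0), ('2013-01-13', 0, 30)]
```

With only two sequences this produces the same rows as merge_client, so it is probably only worth the extra indirection when the number of sequences is large or not known in advance.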

Now we just need to call merge_client on each client in ClientData:

for client in ClientData.values():
    merge_client(client)
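
For comparison, the first approach (iterating the two lists in parallel) could look roughly like the sketch below. The name merge_sorted is a made-up helper, and the sketch assumes both lists are already sorted by date with no repeated dates; it relies on 'YYYY-MM-DD' strings comparing in chronological order.

```python
def merge_sorted(views, visits):
    """Merge two date-sorted [(date, value), ...] lists in one pass.

    Illustrative sketch: assumes each list is sorted by date and has
    no repeated dates.
    """
    merged = []
    i = j = 0
    while i < len(views) and j < len(visits):
        view_day, view_count = views[i]
        visit_day, visit_count = visits[j]
        if view_day == visit_day:
            merged.append((view_day, view_count, visit_count))
            i += 1
            j += 1
        elif view_day < visit_day:
            merged.append((view_day, view_count, 0))
            i += 1
        else:
            merged.append((visit_day, 0, visit_count))
            j += 1
    # At most one of the two lists still has entries left to emit.
    merged.extend((day, count, 0) for day, count in views[i:])
    merged.extend((day, 0, count) for day, count in visits[j:])
    return merged


views = [('2013-02-24', 1), ('2013-03-03', 2)]
visits = [('2013-03-03', 1), ('2013-05-12', 1)]
print(merge_sorted(views, visits))
# [('2013-02-24', 1, 0), ('2013-03-03', 2, 1), ('2013-05-12', 0, 1)]
```

This avoids building a dictionary and works with unhashable keys, but it gives wrong results if either input is unsorted, which is one reason the dictionary version is the simpler choice here.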

Answer 2

Convert each list to a dict, combine the keys of these dicts, then loop through the keys and generate another list in which each element is the date, the value from the first dict or 0, and the value from the second dict or 0. It is better explained via code :)

def merge_lists(list1, list2):
    dict1 = dict(list1)
    dict2 = dict(list2)
    dates = list(set(dict1.keys())|set(dict2.keys()))
    dates.sort()
    merged_list = []
    for date in dates:
        item = [date]
        item.append(dict1.get(date,0))
        item.append(dict2.get(date,0))
        merged_list.append(item)

    return merged_list

merged_list = merge_lists([('2013-01-06', 2),
            ('2013-02-03', 1),
            ('2013-02-10', 1),
            ('2013-02-24', 1),
            ('2013-03-03', 2),
            ('2013-03-24', 1)],
            [('2013-03-03', 1),
            ('2013-03-31', 1),
            ('2013-05-12', 1),
            ('2013-05-19', 2),
            ('2013-06-30', 2)])


import pprint
pprint.pprint(merged_list)

Output:

[['2013-01-06', 2, 0],
 ['2013-02-03', 1, 0],
 ['2013-02-10', 1, 0],
 ['2013-02-24', 1, 0],
 ['2013-03-03', 2, 1],
 ['2013-03-24', 1, 0],
 ['2013-03-31', 0, 1],
 ['2013-05-12', 0, 1],
 ['2013-05-19', 0, 2],
 ['2013-06-30', 0, 2]]

You can make it generic by merging any number of lists:

def merge_lists(*lists):
    dicts = [dict(l) for l in lists]
    dates = set()
    for d in dicts:
        dates |= set(d.keys())
    dates = list(dates)
    dates.sort()
    merged_list = []
    for date in dates:
        item = [date]
        for d in dicts:
            item.append(d.get(date,0))
        merged_list.append(item)

    return merged_list
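
To tie this back to the question, here is a hedged sketch of applying the generic function to every client and storing tuples rather than lists. merge_lists is restated in compact form so the block runs on its own; fill_merged and the shortened sample data are illustrative assumptions, not part of the original answer.

```python
def merge_lists(*lists):
    # Compact restatement of the generic function above.
    dicts = [dict(l) for l in lists]
    dates = sorted(set().union(*dicts))
    return [[date] + [d.get(date, 0) for d in dicts] for date in dates]


def fill_merged(client_data):
    """Store the merged rows, as tuples, under the combined key (in place)."""
    for client in client_data.values():
        rows = merge_lists(client['aggregate_PageviewsByWeek'],
                           client['aggregate_VisitsByWeek'])
        client['aggregate_Pageviews_VisitsByWeek'] = [tuple(r) for r in rows]


# Shortened sample in the same shape as the question's ClientData.
sample = {
    'ClientName1': {
        'aggregate_PageviewsByWeek': [('2013-02-24', 1), ('2013-03-03', 2)],
        'aggregate_Pageviews_VisitsByWeek': [],
        'aggregate_VisitsByWeek': [('2013-03-03', 1), ('2013-05-12', 1)],
    },
}
fill_merged(sample)
print(sample['ClientName1']['aggregate_Pageviews_VisitsByWeek'])
# [('2013-02-24', 1, 0), ('2013-03-03', 2, 1), ('2013-05-12', 0, 1)]
```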


Answer 1
2 2013-09-18 19:48:28

Answer 2
1 Accepted 2013-09-18 19:39:08
