
Append to a key in a dictionary the union of two other keys using Python

This is my input:

ClientData = {
'ClientName1': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2)],

           'aggregate_Pageviews_VisitsByWeek': [],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                         ('2013-05-12', 1)]

                                       },


'ClientName2': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2),
                                       ('2013-03-24', 1),
      ],
           'aggregate_Pageviews_VisitsByWeek': [],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                      ('2013-03-31', 1),
                                      ('2013-05-12', 1),
                                      ('2013-05-19', 2),
                                      ('2013-06-30', 2)]
                                       }

}

How can I append to the key 'aggregate_Pageviews_VisitsByWeek' the union of 'aggregate_PageviewsByWeek' and 'aggregate_VisitsByWeek', matched on the date?

The output should look something like this:

{
'ClientName1': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2)],

           'aggregate_Pageviews_VisitsByWeek': [

                                               ('2013-01-06', 2, 0),
                                               ('2013-02-03', 1, 0),
                                               ('2013-02-10', 1, 0),
                                               ('2013-02-24', 1, 0),
                                               ('2013-03-03', 2, 1),
                                               ('2013-05-12', 0, 1)],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                         ('2013-05-12', 1)]

                                       },



'ClientName2': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2),
                                       ('2013-03-24', 1)],

           'aggregate_Pageviews_VisitsByWeek': [
                                       ('2013-01-06', 2, 0),
                                       ('2013-02-03', 1, 0),
                                       ('2013-02-10', 1, 0),
                                       ('2013-02-24', 1, 0),
                                       ('2013-03-03', 2, 1),
                                       ('2013-03-24', 1, 0),
                                       ('2013-03-31', 0, 1),
                                       ('2013-05-12', 0, 1),
                                       ('2013-05-19', 0, 2),
                                       ('2013-06-30', 0, 2)],

           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                      ('2013-03-31', 1),
                                      ('2013-05-12', 1),
                                      ('2013-05-19', 2),
                                      ('2013-06-30', 2)]
                                       }

}

If the key (the date, in this case) is not in the other list, I want to use 0 for the missing value. Each tuple has the form (Date, aggregate_PageviewsByWeek_Value, aggregate_VisitsByWeek_Value).

Example:
aggregate_PageviewsByWeek: [('2013-01-06', 12)] and aggregate_VisitsByWeek: [('2013-01-13', 30)]

The output will be:
aggregate_Pageviews_VisitsByWeek: [('2013-01-06', 12, 0), ('2013-01-13', 0, 30)]

My goal with this question is to get the trends of page views and visits by date.

Thanks!

First, you need a function that merges a single client's entries.

There are two easy ways to merge parallel sequences that might each be missing some values: you can iterate the two in parallel, or you can build a dictionary (or sorted map) keyed by date and handle each sequence separately. You can see an example of the first, e.g., here. But the second is simpler, at least in Python, so long as the keys are hashable. So:

def merge_client(client):
    merged = {}
    for day, views in client['aggregate_PageviewsByWeek']:
        merged[day] = [views, 0]
    for day, visits in client['aggregate_VisitsByWeek']:
        merged.setdefault(day, [0, 0])[1] = visits
    flattened = [tuple([key] + value) for key, value in merged.items()]
    client['aggregate_Pageviews_VisitsByWeek'] = sorted(flattened)

To extend this algorithm to more than two sequences, you'd use append to grow each value list, or, if there may be a huge number of sequences, use a dict instead of a list for each value (so we don't have to fill in all those default 0s until the end).
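The dict-based generalization described above could be sketched roughly as follows. The name merge_series and its variadic signature are assumptions made for illustration; they are not part of the original answer.

```python
def merge_series(*series):
    """Merge any number of [(day, count), ...] lists into
    a sorted list of (day, count_0, count_1, ...) tuples."""
    merged = {}
    for index, entries in enumerate(series):
        for day, count in entries:
            # Store only the values that exist, keyed by series position.
            merged.setdefault(day, {})[index] = count
    # Fill in the default 0s once, while flattening each day to a tuple.
    return sorted(
        (day,) + tuple(counts.get(index, 0) for index in range(len(series)))
        for day, counts in merged.items()
    )
```

Because missing values are only filled in during the final flattening step, the intermediate dictionary stays small even when most series have no entry for a given day.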

Now we just need to call merge_client on each client in ClientData:

for client in ClientData.values():
    merge_client(client)
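For comparison, the parallel-iteration approach mentioned at the start of this answer might look something like the sketch below. It assumes both lists are already sorted by date and that the dates are ISO-formatted strings (so plain string comparison orders them correctly). The function name merge_sorted is hypothetical.

```python
def merge_sorted(views, visits):
    """Merge two date-sorted lists of (day, count) pairs
    into a list of (day, views, visits) tuples."""
    merged = []
    i = j = 0
    while i < len(views) and j < len(visits):
        view_day, view_count = views[i]
        visit_day, visit_count = visits[j]
        if view_day == visit_day:
            # The day appears in both lists: emit both counts.
            merged.append((view_day, view_count, visit_count))
            i += 1
            j += 1
        elif view_day < visit_day:
            # The day appears only in the pageviews list.
            merged.append((view_day, view_count, 0))
            i += 1
        else:
            # The day appears only in the visits list.
            merged.append((visit_day, 0, visit_count))
            j += 1
    # At most one of the two remaining tails is non-empty.
    merged.extend((day, count, 0) for day, count in views[i:])
    merged.extend((day, 0, count) for day, count in visits[j:])
    return merged


print(merge_sorted([('2013-01-06', 12)], [('2013-01-13', 30)]))
# [('2013-01-06', 12, 0), ('2013-01-13', 0, 30)]
```

This avoids building an intermediate dictionary, but it only works on sorted input, which is part of why the dictionary version is usually the simpler choice in Python.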

Convert each list to a dict, combine the keys of these dicts, then loop through the keys and generate another list in which each element is the date, the value from the first dict or 0, and the value from the second dict or 0. It is better explained via code :)

def merge_lists(list1, list2):
    dict1 = dict(list1)
    dict2 = dict(list2)
    dates = list(set(dict1.keys())|set(dict2.keys()))
    dates.sort()
    merged_list = []
    for date in dates:
        item = [date]
        item.append(dict1.get(date,0))
        item.append(dict2.get(date,0))
        merged_list.append(item)

    return merged_list

merged_list = merge_lists([('2013-01-06', 2),
            ('2013-02-03', 1),
            ('2013-02-10', 1),
            ('2013-02-24', 1),
            ('2013-03-03', 2),
            ('2013-03-24', 1)],
            [('2013-03-03', 1),
            ('2013-03-31', 1),
            ('2013-05-12', 1),
            ('2013-05-19', 2),
            ('2013-06-30', 2)])


import pprint
pprint.pprint(merged_list)

Output:

[['2013-01-06', 2, 0],
 ['2013-02-03', 1, 0],
 ['2013-02-10', 1, 0],
 ['2013-02-24', 1, 0],
 ['2013-03-03', 2, 1],
 ['2013-03-24', 1, 0],
 ['2013-03-31', 0, 1],
 ['2013-05-12', 0, 1],
 ['2013-05-19', 0, 2],
 ['2013-06-30', 0, 2]]

You can make it generic so that it merges any number of lists:

def merge_lists(*lists):
    dicts = [dict(l) for l in lists]
    dates = set()
    for d in dicts:
        dates |= set(d.keys())
    dates = list(dates)
    dates.sort()
    merged_list = []
    for date in dates:
        item = [date]
        for d in dicts:
            item.append(d.get(date,0))
        merged_list.append(item)

    return merged_list
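To tie this back to the question, the merged result still has to be stored under 'aggregate_Pageviews_VisitsByWeek' for each client. Here is one way that might look. It is a sketch, not the answerer's code: merge_lists is rewritten as a compact variant that returns tuples (as in the question's expected output), and client_data is a shortened copy of the question's input.

```python
def merge_lists(*lists):
    # Compact variant of the generic function, returning tuples instead of lists.
    dicts = [dict(pairs) for pairs in lists]
    dates = sorted(set().union(*dicts))  # union of all dates across the lists
    return [(date,) + tuple(d.get(date, 0) for d in dicts) for date in dates]


# Shortened copy of the question's input, for illustration only.
client_data = {
    'ClientName1': {
        'aggregate_PageviewsByWeek': [('2013-01-06', 2), ('2013-03-03', 2)],
        'aggregate_Pageviews_VisitsByWeek': [],
        'aggregate_VisitsByWeek': [('2013-03-03', 1), ('2013-05-12', 1)],
    },
}

for client in client_data.values():
    client['aggregate_Pageviews_VisitsByWeek'] = merge_lists(
        client['aggregate_PageviewsByWeek'],
        client['aggregate_VisitsByWeek'],
    )

print(client_data['ClientName1']['aggregate_Pageviews_VisitsByWeek'])
# [('2013-01-06', 2, 0), ('2013-03-03', 2, 1), ('2013-05-12', 0, 1)]
```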
