Max values for partial matching keys in python dictionary

I have the following dictionary where the keys are 'month,country:ID' and the values are just totals:

ID_dict = {'11,United Kingdom:14416': 129.22, '11,United Kingdom:17001': 357.6,
           '12,United States:14035': 90000.0, '12,United Kingdom:17850': 241.16,
           '12,United States:14099': 90000.0, '12,France:12583': 252.0,
           '12,United Kingdom:13047': 215.13, '01,Germany:12662': 78.0,
           '01,Germany:12600': 14000}

The actual dictionary will be much larger than this one.

I am trying to return the key for each 'month,country' that contains the highest total. If there is a tie, the IDs should be separated by a comma. Example output based on the dictionary above:

'11,United Kingdom:17001'
'12,United Kingdom:17850'
'12,United States:14035, 14099'
'12,France:12583'
'01,Germany:12600'

I can get the strings of the highest values using the following code:

highest = max(ID_dict.values())
print([k for k, v in ID_dict.items() if v == highest])

But I am really struggling to get past this point. I experimented with re.match and re.search but did not get very far with those.

You can find the maximum for each (month, country) pair and store this relation in a dictionary. Then create a dictionary that has the (month, country) pairs as keys and, as values, a list of the IDs whose total equals the maximum for that pair:

import re

ID_dict = {'11,United Kingdom:14416': 129.22, '11,United Kingdom:17001': 357.6,
           '12,United States:14035': 90000.0, '12,United Kingdom:17850': 241.16, '12,United States:14099': 90000.0,
           '12,France:12583': 252.0, '12,United Kingdom:13047': 215.13, '01,Germany:12662': 78.0,
           '01,Germany:12600': 14000}

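# Map each (month, country) pair to its maximum total: iterating in ascending
# order of value means the largest total for each pair is written last.
table = {tuple(re.split(',|:', key)[:2]): value for key, value in sorted(ID_dict.items(), key=lambda e: e[1])}

# Collect every ID whose total equals the maximum for its (month, country) pair.
result = {}
for key, value in ID_dict.items():
    splits = re.split(',|:', key)
    if value == table[tuple(splits[:2])]:
        result.setdefault(tuple(splits[:2]), []).append(splits[2])

# Print each pair as 'month,country:ID', with tied IDs separated by commas.
for key, value in result.items():
    print('{}:{}'.format(','.join(key), ', '.join(value)))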

Output

01,Germany:12600
12,United States:14099, 14035
12,United Kingdom:17850
11,United Kingdom:17001
12,France:12583

The above approach is O(n log n) because it uses sorted. To make it O(n), replace the dictionary comprehension with this loop:

table = {}
for s, v in ID_dict.items():
    key = tuple(re.split(',|:', s)[:2])
    table[key] = max(table.get(key, v), v)
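
For reference, the two linear passes can be put together roughly as follows. This is only a sketch: the function name top_ids_per_group is hypothetical, and it uses str.rpartition instead of re.split, which assumes the ID part never contains a colon.

```python
def top_ids_per_group(totals):
    """Return {'month,country': [IDs holding the highest total]}."""
    # First pass: highest total seen for each 'month,country' prefix.
    best = {}
    for key, value in totals.items():
        group, _, _ = key.rpartition(':')
        if group not in best or value > best[group]:
            best[group] = value

    # Second pass: keep every ID whose total equals its group's maximum.
    winners = {}
    for key, value in totals.items():
        group, _, id_num = key.rpartition(':')
        if value == best[group]:
            winners.setdefault(group, []).append(id_num)
    return winners


ID_dict = {'11,United Kingdom:14416': 129.22, '11,United Kingdom:17001': 357.6,
           '12,United States:14035': 90000.0, '12,United Kingdom:17850': 241.16,
           '12,United States:14099': 90000.0, '12,France:12583': 252.0,
           '12,United Kingdom:13047': 215.13, '01,Germany:12662': 78.0,
           '01,Germany:12600': 14000}

for group, ids in top_ids_per_group(ID_dict).items():
    print('{}:{}'.format(group, ', '.join(ids)))
```

Both loops touch each entry once, so the total work stays proportional to the size of the dictionary.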

The following code creates a new dictionary with 'month,country' keys and lists of (value, IDnum) as the values. It then sorts each list and collects all the IDnums that correspond to the highest value.

ID_dict = {
    '11,United Kingdom:14416': 129.22, '11,United Kingdom:17001': 357.6, 
    '12,United States:14035': 90000.0, '12,United Kingdom:17850': 241.16,
    '12,United States:14099': 90000.0, '12,France:12583': 252.0, 
    '12,United Kingdom:13047': 215.13, '01,Germany:12662': 78.0, 
    '01,Germany:12600': 14000
}

# Create a new dict with 'month,country' keys 
# and lists of (value, IDnum) as the values
new_data = {}
for key, val in ID_dict.items():
    newkey, idnum = key.split(':')
    new_data.setdefault(newkey, []).append((val, idnum))

# Sort the values for each 'month,country' key,
# and get the IDnums corresponding to the highest values
for key, val in new_data.items():
    val = sorted(val, reverse=True)
    highest = val[0][0]
    # Collect all IDnums that have the highest value
    ids = []
    for v, idnum in val:
        if v != highest:
            break
        ids.append(idnum)
    print(key + ':' + ', '.join(ids))

Output

11,United Kingdom:17001
12,United States:14099, 14035
12,United Kingdom:17850
12,France:12583
01,Germany:12600
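
Sorting each list is not strictly required just to find the top entries. As a possible variation on the same grouping idea, the sketch below takes max() of each group and filters on it; the names grouped and lines are illustrative, and collections.defaultdict is swapped in for setdefault.

```python
from collections import defaultdict

ID_dict = {
    '11,United Kingdom:14416': 129.22, '11,United Kingdom:17001': 357.6,
    '12,United States:14035': 90000.0, '12,United Kingdom:17850': 241.16,
    '12,United States:14099': 90000.0, '12,France:12583': 252.0,
    '12,United Kingdom:13047': 215.13, '01,Germany:12662': 78.0,
    '01,Germany:12600': 14000
}

# Group (value, IDnum) pairs under their 'month,country' prefix.
grouped = defaultdict(list)
for key, val in ID_dict.items():
    newkey, idnum = key.split(':')
    grouped[newkey].append((val, idnum))

# For each group, find the highest value and keep the IDnums that reach it.
lines = []
for newkey, pairs in grouped.items():
    highest = max(v for v, _ in pairs)
    ids = [idnum for v, idnum in pairs if v == highest]
    lines.append(newkey + ':' + ', '.join(ids))

print('\n'.join(lines))
```

With this variation, tied IDs come out in the order they appear in ID_dict rather than in descending sorted order.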
