How to group by a nested list of dict keys in Python

Question

I am having trouble grouping IDs by a nested list of dict keys.

The code below works for me when grouping the id and st values by location:

import itertools

null = ''  # stand-in so the JSON-style null values below evaluate in Python
dataset={"users": [
    {"id": 20, "loc": "Chicago", "st":"4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]}, 
    {"id": 21, "loc": "Frankfurt", "st":"4", "sectors": [{"sname": null}]}, 
    {"id": 22, "loc": "Berlin", "st":"6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"},{"sname": "Agri"}]}, 
    {"id": 23, "loc": "Chicago", "st":"2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 24, "loc": "Bern", "st":"1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]},
    {"id": 25, "loc": "Bern", "st":"4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]}
    ]}

byloc = lambda x: x['loc']

it = (
    (loc, list(user_grp))
    for loc, user_grp in itertools.groupby(
        sorted(dataset['users'], key=byloc), key=byloc
    )
)
fs_loc = [
    {'loc': loc, 'ids': [{'id':x['id'],'st':x['st']} for x in grp], 'count': len(grp)}
    for loc, grp in it
]

print(fs_loc)

fs_loc gives me the list of IDs with their respective st values, as shown below (with a count of ids):

[
    {"loc": "Chicago","count":2,"ids": [{"id":"20","st":"4"}, {"id":"23","st":"2"}]}, 
    {"loc": "Bern","count":2,"ids": [{"id":"24","st":"1"}, {"id":"25","st":"4"}]}, 
    {"loc": "Frankfurt","count":1,"ids": [{"id":"21","st":"4"}]}, 
    {"loc": "Berlin","count":1,"ids": [{"id":"22","st":"6"}]}
]
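
A side note on why the snippet above sorts before grouping: itertools.groupby only merges consecutive items with equal keys, so unsorted input yields one group per run rather than one per location. A minimal sketch, assuming a reduced list that holds only the loc values in dataset order (the `locs` name is hypothetical):

```python
import itertools

# Hypothetical reduced input: only the 'loc' values, in dataset order.
locs = ["Chicago", "Frankfurt", "Berlin", "Chicago", "Bern", "Bern"]

# Without sorting, "Chicago" shows up as two separate groups.
unsorted_groups = [(k, len(list(g))) for k, g in itertools.groupby(locs)]

# After sorting, each location forms exactly one group.
sorted_groups = [(k, len(list(g))) for k, g in itertools.groupby(sorted(locs))]

print(unsorted_groups)
print(sorted_groups)
```

This is also why the groups come out in sorted key order rather than in the order the locations first appear.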

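Now I am trying to group by sector name. I tried the code further below, but it failed, and I cannot work out how to get the following result.

Desired result: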

[
    {"sname": "Retail","count":3,"ids": [{"id":"20","st":"4"}, {"id":"24","st":"1"}, {"id":"25","st":"4"}]}, 
    {"sname": "Manufacturing","count":2,"ids": [{"id":"20","st":"4"}, {"id":"22","st":"6"}]}, 
    {"sname": "Banking","count":3,"ids": [{"id":"22","st":"6"},{"id":"23","st":"2"},{"id":"25","st":"4"}]}, 
    {"sname": "Agri","count":4,"ids": [{"id":"22","st":"6"},{"id":"23","st":"2"},{"id":"24","st":"1"},{"id":"25","st":"4"}]}    
]

I tried the code below, but it does not handle the nested list of dict keys:

bysname = lambda x: x['sectors'][0]['sname']

it = (
    (sname, list(user_grp))
    for sname, user_grp in itertools.groupby(
        sorted(dataset['users'], key=bysname), key=bysname
    )
)
fs_sname= [
    {'sname': sname, 'ids': [{'id':x['id'],'st':x['st']} for x in grp], 'count': len(grp)}
    for sname, grp in it
]

print(fs_sname)

Edit: the code above runs, but it only considers the first item of each sectors list, so it gives the following result:

[
    {"sname": "","count":1,"ids": [{"id":"21","st":"4"}]}, 
    {"sname": "Manufacturing","count":1,"ids": [{"id":"22","st":"6"}]}, 
    {"sname": "Banking","count":1,"ids": [{"id":"23","st":"2"}]}, 
    {"sname": "Retail","count":3,"ids": [{"id":"20","st":"4"},{"id":"24","st":"1"},{"id":"25","st":"4"}]}    
]

How can I get the desired result described above?

Answer 1

This should work; adjust the summarize function as needed.

allsectornames = set( sec['sname'] for record in dataset['users'] for sec in record['sectors'] )

summarize = lambda record:  record[ 'id' ]   # customize this to return whatever details you want (even just return the whole record itself if you prefer)

result = [
    {
        'sname':sname,
        'count':len(matches),
        'matches':[ summarize( match ) for match in matches ]
    }
    for sname in allsectornames
    for matches in [[
        record for record in dataset['users'] if sname in [ sec['sname'] for sec in record['sectors'] ]
    ]]
]

print(result)
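
An alternative sketch, not the method from the answer above: a single pass with collections.defaultdict that visits every entry of each user's sectors list and builds the ids/count shape shown in the question. It assumes that empty sector names should be skipped (the desired result omits them), and `group_by_sname` is a hypothetical helper name introduced only for illustration.

```python
from collections import defaultdict

# Copy of the question's records, with '' standing in for null.
users = [
    {"id": 20, "loc": "Chicago", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": ""}]},
    {"id": 21, "loc": "Frankfurt", "st": "4", "sectors": [{"sname": ""}]},
    {"id": 22, "loc": "Berlin", "st": "6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 23, "loc": "Chicago", "st": "2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 24, "loc": "Bern", "st": "1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]},
    {"id": 25, "loc": "Bern", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]},
]


def group_by_sname(users):
    """Group users under every non-empty sector name they list."""
    groups = defaultdict(list)
    for user in users:
        for sector in user["sectors"]:
            sname = sector["sname"]
            if sname:  # assumption: skip empty or missing sector names
                groups[sname].append({"id": user["id"], "st": user["st"]})
    # Groups appear in the order each sector name was first seen.
    return [
        {"sname": sname, "count": len(ids), "ids": ids}
        for sname, ids in groups.items()
    ]


fs_sname = group_by_sname(users)
print(fs_sname)
```

With the question's data, user 25 also lands in the Banking group, because that record lists Banking as its third sector. A user who repeated the same sector name would be appended once per repetition, so deduplicate first if that can occur.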


Solution 1
1 Accepted 2015-12-14 19:53:27
