How to group by nested list of dictionary key in python

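I am having trouble grouping IDs by a key that sits inside a nested list of dictionaries.

The code below works for me when grouping the id and st values by location: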

import itertools

null = ''  # stand-in so the JSON-style null values below parse as Python
dataset={"users": [
    {"id": 20, "loc": "Chicago", "st":"4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]}, 
    {"id": 21, "loc": "Frankfurt", "st":"4", "sectors": [{"sname": null}]}, 
    {"id": 22, "loc": "Berlin", "st":"6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"},{"sname": "Agri"}]}, 
    {"id": 23, "loc": "Chicago", "st":"2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 24, "loc": "Bern", "st":"1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]},
    {"id": 25, "loc": "Bern", "st":"4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]}
    ]}

byloc = lambda x: x['loc']

it = (
    (loc, list(user_grp))
    for loc, user_grp in itertools.groupby(
        sorted(dataset['users'], key=byloc), key=byloc
    )
)
fs_loc = [
    {'loc': loc, 'ids': [{'id':x['id'],'st':x['st']} for x in grp], 'count': len(grp)}
    for loc, grp in it
]

print(fs_loc)

fs_loc gives me the list of IDs with their respective st values, as shown below (together with an id count):

[
    {"loc": "Chicago","count":2,"ids": [{"id":"20","st":"4"}, {"id":"23","st":"2"}]}, 
    {"loc": "Bern","count":2,"ids": [{"id":"24","st":"1"}, {"id":"25","st":"4"}]}, 
    {"loc": "Frankfurt","count":1,"ids": [{"id":"21","st":"4"}]}, 
    {"loc": "Berlin","count":1,"ids": [{"id":"22","st":"6"}]}
]

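Now I am trying to group by sector name. I tried the code further below, but it fails, and I cannot figure out how to get the following result.

Desired result: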

[
    {"sname": "Retail","count":3,"ids": [{"id":"20","st":"4"}, {"id":"24","st":"1"}, {"id":"25","st":"4"}]}, 
    {"sname": "Manufacturing","count":2,"ids": [{"id":"20","st":"4"}, {"id":"22","st":"6"}]}, 
    {"sname": "Banking","count":3,"ids": [{"id":"22","st":"6"},{"id":"23","st":"2"},{"id":"25","st":"4"}]}, 
    {"sname": "Agri","count":4,"ids": [{"id":"22","st":"6"},{"id":"23","st":"2"},{"id":"24","st":"1"},{"id":"25","st":"4"}]}    
]

I tried the code below, but it does not handle the nested list of dictionaries:

bysname = lambda x: x['sectors'][0]['sname']

it = (
    (sname, list(user_grp))
    for sname, user_grp in itertools.groupby(
        sorted(dataset['users'], key=bysname), key=bysname
    )
)
fs_sname= [
    {'sname': sname, 'ids': [{'id':x['id'],'st':x['st']} for x in grp], 'count': len(grp)}
    for sname, grp in it
]

print(fs_sname)

Edit: the code above runs, but it only considers the first item of each sectors list, so it gives the following result:

[
    {"sname": "","count":1,"ids": [{"id":"21","st":"4"}]}, 
    {"sname": "Manufacturing","count":1,"ids": [{"id":"22","st":"6"}]}, 
    {"sname": "Banking","count":1,"ids": [{"id":"23","st":"2"}]}, 
    {"sname": "Retail","count":3,"ids": [{"id":"20","st":"4"},{"id":"24","st":"1"},{"id":"25","st":"4"}]}    
]

How can I get the desired result described above?

This should work; adjust the summarize function as needed:

allsectornames = set( sec['sname'] for record in dataset['users'] for sec in record['sectors'] )

summarize = lambda record:  record[ 'id' ]   # customize this to return whatever details you want (even just return the whole record itself if you prefer)

result = [
    {
        'sname':sname,
        'count':len(matches),
        'matches':[ summarize( match ) for match in matches ]
    }
    for sname in allsectornames
    for matches in [[
        record for record in dataset['users'] if sname in [ sec['sname'] for sec in record['sectors'] ]
    ]]
]

print(result)
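
As an alternative to scanning all users once per sector name, here is a minimal sketch that builds the groups in a single pass with collections.defaultdict. The function name group_by_sector is made up for illustration. It assumes that empty sname values should be skipped, that each entry should carry both id and st, and that no user lists the same sector twice.

```python
from collections import defaultdict

null = ''  # same stand-in for JSON null as in the question

dataset = {"users": [
    {"id": 20, "loc": "Chicago", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]},
    {"id": 21, "loc": "Frankfurt", "st": "4", "sectors": [{"sname": null}]},
    {"id": 22, "loc": "Berlin", "st": "6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 23, "loc": "Chicago", "st": "2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 24, "loc": "Bern", "st": "1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]},
    {"id": 25, "loc": "Bern", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]},
]}


def group_by_sector(users):
    """Group users by every sector they list, not just the first one.

    Hypothetical helper: skips empty sector names (an assumption).
    """
    groups = defaultdict(list)  # sname -> list of {'id', 'st'} dicts
    for user in users:
        for sector in user["sectors"]:
            sname = sector["sname"]
            if not sname:  # assumption: '' (null) sectors are not wanted
                continue
            groups[sname].append({"id": user["id"], "st": user["st"]})
    # dicts keep insertion order (Python 3.7+), so sectors appear in
    # the order in which they are first seen in the data
    return [
        {"sname": sname, "count": len(ids), "ids": ids}
        for sname, ids in groups.items()
    ]


fs_sname = group_by_sector(dataset["users"])
for row in fs_sname:
    print(row)
```

Each user is visited once and added to every sector it lists. This avoids itertools.groupby, which needs a single sort key per record and therefore cannot place one user in several groups unless the data is flattened first.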
