How to group by nested list of dictionary key in Python
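I am having trouble grouping ids by a key that sits inside a nested list of dictionaries.
The code below works for me when grouping the id and st values by location: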
import itertools

null = ''
dataset = {"users": [
    {"id": 20, "loc": "Chicago", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]},
    {"id": 21, "loc": "Frankfurt", "st": "4", "sectors": [{"sname": null}]},
    {"id": 22, "loc": "Berlin", "st": "6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 23, "loc": "Chicago", "st": "2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 24, "loc": "Bern", "st": "1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]},
    {"id": 25, "loc": "Bern", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]}
]}

byloc = lambda x: x['loc']
# groupby only merges adjacent items, so the users are sorted by the same key first
it = (
    (loc, list(user_grp))
    for loc, user_grp in itertools.groupby(
        sorted(dataset['users'], key=byloc), key=byloc
    )
)
fs_loc = [
    {'loc': loc, 'ids': [{'id': x['id'], 'st': x['st']} for x in grp], 'count': len(grp)}
    for loc, grp in it
]
print(fs_loc)
fs_loc gives me the list of ids with their respective st values (plus the id count), as shown below:
[
{"loc": "Chicago","count":2,"ids": [{"id":"20","st":"4"}, {"id":"23","st":"2"}]},
{"loc": "Bern","count":2,"ids": [{"id":"24","st":"1"}, {"id":"25","st":"4"}]},
{"loc": "Frankfurt","count":1,"ids": [{"id":"21","st":"4"}]},
{"loc": "Berlin","count":1,"ids": [{"id":"22","st":"6"}]}
]
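Now I am trying to group by sector name. I tried the code further below, but it fails, and I cannot figure out how to get the following result.
Desired result: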
[
{"sname": "Retail","count":3,"ids": [{"id":"20","st":"4"}, {"id":"24","st":"1"}, {"id":"25","st":"4"}]},
{"sname": "Manufacturing","count":2,"ids": [{"id":"20","st":"4"}, {"id":"22","st":"6"}]},
{"sname": "Banking","count":3,"ids": [{"id":"22","st":"6"},{"id":"23","st":"2"},{"id":"25","st":"4"}]},
{"sname": "Agri","count":4,"ids": [{"id":"22","st":"6"},{"id":"23","st":"2"},{"id":"24","st":"1"},{"id":"25","st":"4"}]}
]
I tried the code below, but it does not handle the nested list of dictionaries:
bysname = lambda x: x['sectors'][0]['sname']
it = (
    (sname, list(user_grp))
    for sname, user_grp in itertools.groupby(
        sorted(dataset['users'], key=bysname), key=bysname
    )
)
fs_sname = [
    {'sname': sname, 'ids': [{'id': x['id'], 'st': x['st']} for x in grp], 'count': len(grp)}
    for sname, grp in it
]
print(fs_sname)
Edit: the code above runs, but it only considers the first item of each sectors list, i.e. it gives the following result:
[
{"sname": "","count":1,"ids": [{"id":"21","st":"4"}]},
{"sname": "Manufacturing","count":1,"ids": [{"id":"22","st":"6"}]},
{"sname": "Banking","count":1,"ids": [{"id":"23","st":"2"}]},
{"sname": "Retail","count":3,"ids": [{"id":"20","st":"4"},{"id":"24","st":"1"},{"id":"25","st":"4"}]}
]
How can I get the desired result described above?
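The first-item limitation comes from the key function, which reads only x['sectors'][0], so every user lands in exactly one group. One way to keep the itertools.groupby approach is to flatten the data into one (sname, user) pair per sector entry before sorting and grouping. The following is a minimal sketch, not a definitive solution: the names pairs and by_name are illustrative, and skipping empty sector names is an assumption based on the desired result.

```python
import itertools

null = ''
dataset = {"users": [
    {"id": 20, "loc": "Chicago", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]},
    {"id": 21, "loc": "Frankfurt", "st": "4", "sectors": [{"sname": null}]},
    {"id": 22, "loc": "Berlin", "st": "6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 23, "loc": "Chicago", "st": "2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 24, "loc": "Bern", "st": "1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]},
    {"id": 25, "loc": "Bern", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]},
]}

# One (sname, user) pair per sector entry; empty names are skipped
# (assumption: the desired result has no entry for '').
pairs = [
    (sec['sname'], user)
    for user in dataset['users']
    for sec in user['sectors']
    if sec['sname']
]

by_name = lambda pair: pair[0]  # hypothetical helper: group on the sector name only

fs_sname = []
# sorted() is stable, so users keep their original order inside each group
for sname, grp in itertools.groupby(sorted(pairs, key=by_name), key=by_name):
    users = [user for _, user in grp]
    fs_sname.append({
        'sname': sname,
        'count': len(users),
        'ids': [{'id': u['id'], 'st': u['st']} for u in users],
    })

print(fs_sname)
```

Because the pairs are sorted by name, the groups come out in alphabetical order (Agri, Banking, Manufacturing, Retail) rather than in order of first appearance.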
This should work. Adjust the summarize function as needed:
# collect every distinct sector name across all users
allsectornames = set(sec['sname'] for record in dataset['users'] for sec in record['sectors'])

# customize this to return whatever details you want
# (even just return the whole record itself if you prefer)
summarize = lambda record: record['id']

result = [
    {
        'sname': sname,
        'count': len(matches),
        'matches': [summarize(match) for match in matches]
    }
    for sname in allsectornames
    # single-element loop: binds the list of matching records to a name inside the comprehension
    for matches in [[
        record for record in dataset['users']
        if sname in [sec['sname'] for sec in record['sectors']]
    ]]
]
print(result)
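Note that the code above also produces an entry for the empty name '' (from the null sectors), that iterating a set gives no guaranteed order, and that it rescans all users once per sector name. As an alternative, a single pass with collections.defaultdict can build the groups in order of first appearance. This is a hedged sketch: the names groups and seen are illustrative, skipping empty names is an assumption taken from the desired result, and the output shape ({'id', 'st'} pairs under 'ids') mirrors the question rather than the summarize helper.

```python
from collections import defaultdict

null = ''
dataset = {"users": [
    {"id": 20, "loc": "Chicago", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]},
    {"id": 21, "loc": "Frankfurt", "st": "4", "sectors": [{"sname": null}]},
    {"id": 22, "loc": "Berlin", "st": "6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 23, "loc": "Chicago", "st": "2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 24, "loc": "Bern", "st": "1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]},
    {"id": 25, "loc": "Bern", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]},
]}

groups = defaultdict(list)  # sector name -> list of {'id', 'st'} dicts
for user in dataset['users']:
    seen = set()  # guards against a user listing the same sector twice
    for sec in user['sectors']:
        sname = sec['sname']
        if sname and sname not in seen:  # skip empty names (assumption)
            seen.add(sname)
            groups[sname].append({'id': user['id'], 'st': user['st']})

# dicts keep insertion order (Python 3.7+), so sectors appear in order of first use
result = [
    {'sname': sname, 'count': len(ids), 'ids': ids}
    for sname, ids in groups.items()
]
print(result)
```

With the sample data this yields the sectors in the order Retail, Manufacturing, Banking, Agri. Banking has three members here, since user 25 also lists it.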