
How do I create a nested dictionary from a pandas DataFrame while summing the numbers?

I am trying to create a nested dictionary keyed by office, where each office maps to the totals of the remaining columns for that office.

It should look something like this:

final_dict = {'YELLOW': {'Files Loaded': 21332, 'Files Assigned': 10613}, 'RED':....}....

My current code is below, and I'm completely stuck on how to nest and sum the values.

import pandas as pd

d = {'Office': ['Yellow', 'Yellow', 'Red', 'Red', 'Blue', 'Blue'],
     'Files Loaded': [1223, 3062, 10, 100, 1520, 75],
     'Files Assigned': [1223, 30, 1500, 10, 75, 12],
     'Files Analyzed': [1223, 15, 25, 34, 98, 1000],
     'Discrepancies Identified': [17, 30, 150, 1456, 186, 1896]}

df = pd.DataFrame(data=d)

fields = ['Files Loaded', 'Files Assigned', 'Files Analyzed', 'Discrepancies Identified']

final_dict = df.groupby('Office')[fields].apply(list).to_dict()
print(final_dict)

{'Blue': ['Files Loaded', 'Files Assigned', 'Files Analyzed', 'Discrepancies Identified'], 'Red': ['Files Loaded', 'Files Assigned', 'Files Analyzed', 'Discrepancies Identified'], 'Yellow': ['Files Loaded', 'Files Assigned', 'Files Analyzed', 'Discrepancies Identified']}


With the following input:

import pandas as pd
from pprint import pprint

d = {'Office': ['Yellow', 'Yellow', 'Red', 'Red', 'Blue', 'Blue'], 
     'Files Loaded': [1223, 3062, 10, 100, 1520, 75],
     'Files Assigned': [1223, 30, 1500, 10, 75, 12],
     'Files Analyzed': [1223, 15, 25, 34, 98, 1000], 
     'Discrepancies Identified': [17, 30, 150, 1456, 186, 1896]}
df = pd.DataFrame(data=d)

We can use the pandas groupby and aggregation ( agg ) functions to sum the columns per office. Then, by calling to_dict with 'index' as the orientation, we get the data as a nested dictionary: each outer key is an Office, and each value is a dictionary whose keys are the column names and whose values are the summed totals.

data = df.groupby('Office').agg('sum')
answer = data.to_dict('index')

pprint(answer)

Output:

{'Blue': {'Discrepancies Identified': 2082,
          'Files Analyzed': 1098,
          'Files Assigned': 87,
          'Files Loaded': 1595},
 'Red': {'Discrepancies Identified': 1606,
         'Files Analyzed': 59,
         'Files Assigned': 1510,
         'Files Loaded': 110},
 'Yellow': {'Discrepancies Identified': 47,
            'Files Analyzed': 1238,
            'Files Assigned': 1253,
            'Files Loaded': 4285}}
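
The example in the question uses upper-case office names and only some of the columns. If that is what is wanted, a minimal sketch of one way to get there follows. The choice of the two columns in fields and the upper-casing step are assumptions read off the question's example, not something the question states outright.

```python
import pandas as pd

d = {'Office': ['Yellow', 'Yellow', 'Red', 'Red', 'Blue', 'Blue'],
     'Files Loaded': [1223, 3062, 10, 100, 1520, 75],
     'Files Assigned': [1223, 30, 1500, 10, 75, 12],
     'Files Analyzed': [1223, 15, 25, 34, 98, 1000],
     'Discrepancies Identified': [17, 30, 150, 1456, 186, 1896]}
df = pd.DataFrame(data=d)

# Assumption: only these two columns are needed, as in the question's example.
fields = ['Files Loaded', 'Files Assigned']

# Sum the selected columns per office; the result is indexed by Office.
totals = df.groupby('Office')[fields].sum()

# Assumption: the keys should be upper-case ('YELLOW' rather than 'Yellow').
totals.index = totals.index.str.upper()

# One inner dictionary per office, keyed by column name.
final_dict = totals.to_dict(orient='index')
print(final_dict)
```

Selecting the columns before calling sum keeps unwanted columns out of the result, and renaming the index before to_dict means the dictionary keys need no further processing.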
