Nested Dictionary from dataframe with inner dictionary containing a pandas series as the value

I am having trouble creating a nested dictionary where each value in the inner dictionary is a pandas Series.

Here's a simple dataframe:

import pandas as pd
import random

catGrpAll = ['Category_A']*3 + ['Category_B']*3
catGrpAll = catGrpAll*4
codeGrpAll = ['code1','code2','code3']
codeGrpAll = codeGrpAll*8
dateGrpAll = [pd.to_datetime('2021-03-31')]*6 + [pd.to_datetime('2021-04-30')]*6 +\
             [pd.to_datetime('2021-05-31')]*6 + [pd.to_datetime('2021-06-30')]*6

random.seed(0)
numAll = [ random.randint(100, 5000) for _ in range(24)]


df = pd.DataFrame(data={'Category':catGrpAll,
                        'Code':codeGrpAll,
                        'Time':dateGrpAll,
                        'Amount':numAll})                    
del catGrpAll,codeGrpAll,dateGrpAll,numAll



df.info()
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----         
 0   Category  24 non-null     object        
 1   Code      24 non-null     object        
 2   Time      24 non-null     datetime64[ns]
 3   Amount    24 non-null     int64 

df.head()
Out[294]: 
     Category   Code       Time  Amount
0  Category_A  code1 2021-03-31    3255
1  Category_A  code2 2021-03-31    3545
2  Category_A  code3 2021-03-31     431
3  Category_B  code1 2021-03-31    2221
4  Category_B  code2 2021-03-31    4288

I'm looking for a result like this, where the outer dictionary is keyed by Category and each inner dictionary maps a Code to a Series (Amount indexed by Time):

nested_dict = { 
       'Category_A': [
             { 'code1': Series(Time/Amount),
               'code2': Series(Time/Amount),
               'code3': Series(Time/Amount) }
        ],
       'Category_B': [
             { 'code1': Series(Time/Amount),
               'code2': Series(Time/Amount),
               'code3': Series(Time/Amount) }
        ]
       }

Any help would be greatly appreciated.

######################## UPDATED ########################################

Here is an example of how I would like the dictionary to look, but I am wondering whether there is a way to avoid the loops:

data = {}
category = df.Category.unique()
code = df.Code.unique()

for i in category:
    data[i] = {}
    for j in code:
        data[i][j] = []   


for i in category:
    for j in code:
        data[i][j] = df[(df.Category == i) & (df.Code == j)]
        data[i][j].index = data[i][j]['Time']
        data[i][j] = data[i][j]['Amount']

I don't see any built-in function that gives the output you want. The closest is df.to_dict(orient='index'), but that returns a dictionary of plain dictionaries keyed by row label rather than Series.
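To illustrate why a built-in conversion falls short, here is a small sketch of what `DataFrame.to_dict(orient='index')` returns. The frame `toy` and its row labels are made up for this example and are not part of the question's data.

```python
import pandas as pd

# Hypothetical two-row frame, only to show the shape of the result.
toy = pd.DataFrame(
    {'Code': ['code1', 'code2'], 'Amount': [10, 20]},
    index=['row_a', 'row_b'],
)

# One outer key per row label; each value is a {column: scalar} dict.
as_dict = toy.to_dict(orient='index')
print(as_dict)
# {'row_a': {'Code': 'code1', 'Amount': 10}, 'row_b': {'Code': 'code2', 'Amount': 20}}
```

The inner values are scalars for a single row, so there is no way to get one Series per Category/Code pair from this call alone.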

You can build the result dict manually:

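from collections import defaultdict

# Each category maps to a one-element list holding a {code: Series} dict,
# matching the layout sketched in the question.
result = defaultdict(list)
for category, group in df.groupby('Category'):
    result[category].append({
        # Amount values indexed by Time for this (category, code) pair
        code: subgroup.set_index('Time')['Amount']
        for code, subgroup in group.groupby('Code')
    })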
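As a variation on building the dictionary by hand, grouping on both columns at once avoids the nested groupby. This is a minimal sketch, assuming a plain dict of dicts is acceptable (no wrapping list, as in the question's updated loop) and that each inner value should be the Amount column indexed by Time. The small frame below is made up for illustration; the name `nested` is also just a placeholder.

```python
import pandas as pd

# Toy frame with the same columns as the question (values are made up).
df = pd.DataFrame({
    'Category': ['Category_A', 'Category_A', 'Category_B', 'Category_B'] * 2,
    'Code': ['code1', 'code2'] * 4,
    'Time': pd.to_datetime(['2021-03-31'] * 4 + ['2021-04-30'] * 4),
    'Amount': [10, 20, 30, 40, 50, 60, 70, 80],
})

nested = {}
# Move Time into the index first, so every per-group Amount Series
# comes out already indexed by Time.
grouped = df.set_index('Time').groupby(['Category', 'Code'])['Amount']
for (category, code), amounts in grouped:
    # Create the inner dict on first sight of a category, then store the Series.
    nested.setdefault(category, {})[code] = amounts

print(nested['Category_A']['code1'])
```

There is still one flat loop over the groups, but the filtering itself is done once by groupby instead of once per Category/Code combination.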
