
How to convert a column in a DataFrame to a nested dictionary in Python?

I have a column with named work records like this:

Records
Name: hours on date, Name: hours on date
Aya: 20 on 18/9/2021, Asmaa: 10 on 20/9/2021, Aya: 20 on 20/9/2021

I want to restructure this column so that when I aggregate over a range of dates (say from 1/9/2021 to 30/9/2021), I get the total hours spent by each name.

I tried changing the column to a list then to a dictionary, but it is not working.

How can I change this column's structure in Python? Should I use regex? I am aiming for something like this:

{18/9/2021: {Aya:20}, 20/9/2021: {Asmaa:10}, 20/9/2021: {Aya:20} }

You can use a dict here, but it will have to be nested, because you have multiple entries per date.

import pandas as pd
df = pd.DataFrame({'Records': ['Name: hours on date, Name: hours on date',
  'Aya: 20 on 18/9/2021, Asmaa: 10 on 20/9/2021, Aya: 20 on 20/9/2021']})

# Keep only rows that have the actual data
data = df.loc[~df['Records'].str.contains('Name')]

# Split on the comma delimiter and explode into a unique row per employee
data = data['Records'].str.split(',').explode()

# Use regex to capture the relevant data and construct the dictionary
data = data.str.extract(r'([a-zA-Z]+):\s(\d{1,2})\son\s(\d{1,2}/\d{1,2}/\d{4})').reset_index(drop=True)

# Group by date (column 2) and map each name (column 0) to its hours (column 1)
data.groupby(2).apply(lambda x: dict(zip(x[0],x[1]))).to_dict()

Output

{'18/9/2021': {'Aya': '20'}, '20/9/2021': {'Asmaa': '10', 'Aya': '20'}}
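Note that the hours in that dictionary are still strings and the dates are not parsed, so it cannot be summed over a date range directly. For the aggregation the question asks about, a flat table may be easier to work with than a nested dict. The following is only a sketch under a few assumptions: it swaps the split/explode step for `str.extractall`, the column names `name`, `hours`, and `date` are made up for illustration, and the dates are assumed to be day/month/year.

```python
import pandas as pd

records = pd.Series(
    ["Aya: 20 on 18/9/2021, Asmaa: 10 on 20/9/2021, Aya: 20 on 20/9/2021"]
)

# One row per "Name: hours on date" entry; the named groups become column names
# (name / hours / date are illustrative names, not from the original post)
pattern = r"(?P<name>[a-zA-Z]+):\s(?P<hours>\d+)\son\s(?P<date>\d{1,2}/\d{1,2}/\d{4})"
tidy = records.str.extractall(pattern).reset_index(drop=True)

# Convert the captured strings to proper types (assumes day/month/year dates)
tidy["hours"] = tidy["hours"].astype(int)
tidy["date"] = pd.to_datetime(tidy["date"], format="%d/%m/%Y")

# Keep entries inside the date range (both ends inclusive) and sum hours per name
start, end = pd.Timestamp("2021-09-01"), pd.Timestamp("2021-09-30")
in_range = tidy[tidy["date"].between(start, end)]
totals = in_range.groupby("name")["hours"].sum().to_dict()
print(totals)  # {'Asmaa': 10, 'Aya': 40}
```

Because every match is captured with `extractall`, a header row such as 'Name: hours on date' simply produces no matches and does not need to be filtered out first.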


 