How to convert a column in a dataframe to a nested dictionary in Python?

Question

I have a column with named work records like this:

Records
Name: hours on date, Name: hours on date
Aya: 20 on 18/9/2021, Asmaa: 10 on 20/9/2021, Aya: 20 on 20/9/2021

I want to restructure this column so that when I aggregate over a range of dates (say from 1/9/2021 until 30/9/2021), I get the total hours spent by each name.

I tried changing the column to a list and then to a dictionary, but it is not working.

How can I change this column structure in Python? Should I use regex?

{18/9/2021: {Aya:20}, 20/9/2021: {Asmaa:10}, 20/9/2021: {Aya:20} }

Answer 1

You can use a dict here, but it will have to be nested, because you have multiple entries per date.

import pandas as pd
df = pd.DataFrame({'Records': ['Name: hours on date, Name: hours on date',
  'Aya: 20 on 18/9/2021, Asmaa: 10 on 20/9/2021, Aya: 20 on 20/9/2021']})

# Keep only rows that have the actual data
data = df.loc[~df['Records'].str.contains('Name')]

# Split on the comma delimiter and explode into a unique row per employee
data = data['Records'].str.split(',').explode()

# Use regex to capture the relevant data and construct the dictionary
data = data.str.extract(r'([a-zA-Z]+):\s(\d{1,2})\son\s(\d{1,2}/\d{1,2}/\d{4})').reset_index(drop=True)

# Group by date (column 2) and map each name (column 0) to its hours (column 1)
data.groupby(2).apply(lambda x: dict(zip(x[0],x[1]))).to_dict()

Output

{'18/9/2021': {'Aya': '20'}, '20/9/2021': {'Asmaa': '10', 'Aya': '20'}}
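If the end goal is the total hours per name over a date range, as the question describes, it may be easier to keep the parsed records in a DataFrame rather than a dict. The following is only a sketch under a few assumptions: the column names `name`, `hours` and `date` are made up for illustration, the dates are taken to be day/month/year, and the range is treated as inclusive on both ends.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({'Records': ['Name: hours on date, Name: hours on date',
  'Aya: 20 on 18/9/2021, Asmaa: 10 on 20/9/2021, Aya: 20 on 20/9/2021']})

# One record per row, then capture name, hours and date with named groups.
# Rows that do not match (such as the "Name: hours on date" header) become NaN
# and are dropped.
pattern = r'(?P<name>[a-zA-Z]+):\s(?P<hours>\d+)\son\s(?P<date>\d{1,2}/\d{1,2}/\d{4})'
records = (df['Records'].str.split(',').explode()
           .str.extract(pattern)
           .dropna()
           .reset_index(drop=True))

# Convert the captured strings to numeric and datetime types
records['hours'] = records['hours'].astype(int)
records['date'] = pd.to_datetime(records['date'], format='%d/%m/%Y')

# Keep records from 1/9/2021 to 30/9/2021 (inclusive) and sum the hours per name
start, end = pd.Timestamp('2021-09-01'), pd.Timestamp('2021-09-30')
in_range = records['date'].between(start, end)
totals = records.loc[in_range].groupby('name')['hours'].sum().to_dict()

print(totals)  # {'Asmaa': 10, 'Aya': 40}
```

Converting the dates to real datetime values is what makes the range filter reliable; comparing the original strings would sort '18/9/2021' and '2/9/2021' incorrectly.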

Solution 1
0 Accepted 2021-09-20 13:45:26
