Pandas: comma-separated Excel cell not converting to a list

Question

I've joined three Excel sheets to build my base dataframe. For each row, I want to count the integer values in the comma-separated DuAlloc column, divide Amount by that count, and then loop through the DuAlloc list and write one row per value, e.g.

Base Data:

Description	DuAlloc	Amount
Blah	1,2,3,4,5	1000
Yada	30,15,3,4,5	200

Processed Data:

Description	DuAlloc	Amount
Blah	1	200
Blah	2	200
Blah	3	200
Yada	3	40
Blah	4	200
Yada	4	40
Blah	5	200
Yada	5	40
Yada	15	40
Yada	30	40

I've tried numerous ways to convert the cell to a list, including list() and tolist(), but I either get the same number for every count, or, at best, [len(str(c)) for c in df3['DUAlloc']], which counts every character in the cell rather than the values.

How would I go about achieving this, and is Pandas the best route to take?
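For reference, the base data can be rebuilt by hand, which also makes it easy to see why the len(str(c)) attempt counts characters. This is only a sketch: the frame name df and the column spelling DuAlloc are assumptions taken from the tables above, and the Excel-loading step is left out.

```python
import pandas as pd

# Base data from the question, typed by hand instead of read from Excel.
df = pd.DataFrame({
    "Description": ["Blah", "Yada"],
    "DuAlloc": ["1,2,3,4,5", "30,15,3,4,5"],
    "Amount": [1000, 200],
})

# len() on the raw string counts characters, commas included.
char_counts = [len(str(c)) for c in df["DuAlloc"]]

# Splitting on the comma first and then taking the length counts values.
value_counts = df["DuAlloc"].str.split(",").str.len().tolist()

print(char_counts)   # [9, 11]
print(value_counts)  # [5, 5]
```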

Answer 1

Use Series.str.split to turn each cell into a list, df.explode to give every list element its own row, and Groupby.transform with df.div to divide Amount by the number of rows in each group:

In [501]: out = df.assign(DuAlloc=df['DuAlloc'].str.split(',')).explode('DuAlloc')

In [506]: out['Amount'] = out['Amount'].div(out.groupby('Description')['Amount'].transform('size'))

In [507]: out
Out[507]: 
  Description DuAlloc  Amount
0        Blah       1   200.0
0        Blah       2   200.0
0        Blah       3   200.0
0        Blah       4   200.0
0        Blah       5   200.0
1        Yada      30    40.0
1        Yada      15    40.0
1        Yada       3    40.0
1        Yada       4    40.0
1        Yada       5    40.0
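The output above keeps the original row order and leaves DuAlloc as strings. A self-contained variant that reproduces the ordering asked for in the question might look like the sketch below. Two choices here are assumptions rather than part of the answer: it groups on the original row index instead of Description (so two source rows sharing a description would still be divided separately), and it casts DuAlloc to int before sorting.

```python
import pandas as pd

df = pd.DataFrame({
    "Description": ["Blah", "Yada"],
    "DuAlloc": ["1,2,3,4,5", "30,15,3,4,5"],
    "Amount": [1000, 200],
})

# One row per allocation; exploded rows keep the index of their source row.
out = df.assign(DuAlloc=df["DuAlloc"].str.split(",")).explode("DuAlloc")

# Group on the original row index (level=0) and divide by the group size.
out["Amount"] = out["Amount"] / out.groupby(level=0)["Amount"].transform("size")

# The split values are strings; convert to int so the sort is numeric.
out["DuAlloc"] = out["DuAlloc"].astype(int)
out = out.sort_values(["DuAlloc", "Description"], ignore_index=True)

print(out)
```

Without the int cast, a string sort would place "15" and "30" among the single-digit values instead of at the end.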

Answer 2

You can use .str.count to count the commas in each cell; adding 1 gives the number of values, so Amount can be divided before the rows are exploded.

out = (df.assign(Amount=df['Amount'].div(df['DuAlloc'].str.count(',').add(1)),
                 DuAlloc=df['DuAlloc'].str.split(','))
       .explode('DuAlloc'))

print(out)

  Description DuAlloc  Amount
0        Blah       1   200.0
0        Blah       2   200.0
0        Blah       3   200.0
0        Blah       4   200.0
0        Blah       5   200.0
1        Yada      30    40.0
1        Yada      15    40.0
1        Yada       3    40.0
1        Yada       4    40.0
1        Yada       5    40.0
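Both answers assume every DuAlloc cell is a string. When the data comes from Excel, a cell holding a single number is usually read back as a number, and the .str accessor returns NaN for it. A hedged sketch of one way to guard against that follows; the "Solo" row and the spaces after the commas are made-up test data, not part of the question.

```python
import pandas as pd

# Hypothetical data: the second cell holds a single number, which a
# spreadsheet reader would typically return as an int, not a string.
df = pd.DataFrame({
    "Description": ["Blah", "Solo"],
    "DuAlloc": ["1, 2, 3, 4, 5", 7],
    "Amount": [1000, 50],
})

# Cast to str so the .str accessor works on every row, then split and
# convert each piece to int, stripping stray spaces around the values.
parts = df["DuAlloc"].astype(str).str.split(",")
parts = parts.apply(lambda values: [int(v.strip()) for v in values])

# Divide by the number of values per row, then give each value its own row.
out = (df.assign(Amount=df["Amount"] / parts.str.len(),
                 DuAlloc=parts)
         .explode("DuAlloc", ignore_index=True))

print(out)
```

Counting the parsed list with .str.len() instead of counting commas keeps the divisor and the exploded rows in agreement even if a cell is formatted unexpectedly.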
