
Group over Series of lists in pandas DataFrame

I have a dataframe with a list in each cell. For each row of the dataframe, I want to group by the first element of the lists and average the second element. Here is some dummy data and a printout of the df to illustrate the problem:

import pandas as pd
df = pd.DataFrame({"Column A":[["Winter 2012",5],["Sommer 2012",10]],
                   "Column B":[["Sommer 2012",20],["Winter 2012",10]],
                   "Column C":[["Winter 2012",15],["Sommer 2012",30]]})
df

            Column A           Column B           Column C
0   [Winter 2012, 5]  [Sommer 2012, 20]  [Winter 2012, 15]
1  [Sommer 2012, 10]  [Winter 2012, 10]  [Sommer 2012, 30]

The desired output should look like this:

            Column D           Column E
0  [Winter 2012, 10]  [Sommer 2012, 20]
1  [Sommer 2012, 20]  [Winter 2012, 10]

Being completely new to Python, I simply cannot wrap my head around how I could approach this.

Here's one way:

In [410]: df.apply(lambda x: pd.Series(
                   x.apply(pd.Series)
                    .groupby(0, as_index=False, sort=False)
                    .mean()
                    .values.tolist(), index=['Column D', 'Column E']),
                   axis=1)
Out[410]:
            Column D           Column E
0  [Winter 2012, 10]  [Sommer 2012, 20]
1  [Sommer 2012, 20]  [Winter 2012, 10]
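
If the nested `apply(pd.Series)` call is hard to follow, the same per-row grouping can be sketched in plain Python. This is only an illustrative alternative, not the answer's method: `average_by_label` is a hypothetical helper name, and the sketch assumes every row produces exactly two groups so that the result fits into `Column D` and `Column E`.

```python
import pandas as pd

df = pd.DataFrame({"Column A": [["Winter 2012", 5], ["Sommer 2012", 10]],
                   "Column B": [["Sommer 2012", 20], ["Winter 2012", 10]],
                   "Column C": [["Winter 2012", 15], ["Sommer 2012", 30]]})


def average_by_label(row):
    """Hypothetical helper: average the values of one row's [label, value] cells per label."""
    groups = {}
    for label, value in row:
        # dicts keep insertion order, so labels stay in first-seen order
        groups.setdefault(label, []).append(value)
    return [[label, sum(values) / len(values)] for label, values in groups.items()]


# Assumption: each row yields exactly two groups, matching the two output columns.
result = pd.DataFrame(
    [average_by_label(row) for row in df.itertuples(index=False)],
    columns=["Column D", "Column E"],
    index=df.index,
)
```

Because the averages are computed with true division, they come back as floats (for example `10.0` rather than `10`).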
