How to plot aggregated results after groupby in Pandas?

Question

I've recently started learning Pandas and I'm having some trouble working out how to plot results after using groupby and agg. Using Pandas, I created a data frame and grouped it by two columns, 'x' and 'ID'. Then I selected one specific column ('results') from the groups to calculate the sem (standard error of the mean) and the mean.

Specifically, the code:

import numpy as np
import pandas as pd
from scipy import stats
df = pd.read_csv('pandas_2015-11-7.csv')
df_group = df.groupby(['x','ID'])['results']  # one group per (x, ID) pair
df_group_results = df_group.agg([stats.sem, np.mean])

The results look like the following:

            sem      mean
x    ID                    
2.5  0     0.010606  0.226674
     1     0.000369  0.490820
     2     0.000508  0.494094
5.0  0     0.001672  0.005059
     1     0.012252  0.190962
     2     0.003696  0.170342
7.5  0     0.001630  0.004506
     1     0.002567  0.016109
     2     0.002081  0.047301
10.0 0     0.000000  0.000000
     1     0.000000  0.000000
     2     0.000000  0.000000
12.5 0     0.000000  0.000000
     1     0.000000  0.000000
     2     0.000000  0.000000

My question is: how do I make a line plot with error bars based on these results? The x-axis should be based on the 'x' value, and 'ID' determines the lines (in this case 3 lines with legend entries 0, 1, and 2). The plot I want to achieve looks like this one (image; source: matplotlib.org).

Answer 1

Aggregating after groupby() on two columns returns a result with a hierarchical index (multi-index):

http://pandas.pydata.org/pandas-docs/stable/advanced.html
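To see this on a small scale, here is a minimal sketch with made-up data (the values and the name `summary` are illustrative only, not the asker's file). It uses pandas' built-in 'mean' and 'sem' aggregations rather than the numpy/scipy functions from the question.

```python
import pandas as pd

# Made-up data: two x values, two IDs, two measurements per (x, ID) pair
df = pd.DataFrame({
    'x':       [2.5, 2.5, 2.5, 2.5, 5.0, 5.0, 5.0, 5.0],
    'ID':      [0, 0, 1, 1, 0, 0, 1, 1],
    'results': [0.2, 0.3, 0.5, 0.5, 0.0, 0.1, 0.2, 0.4],
})

# Group by both columns and aggregate the 'results' column
summary = df.groupby(['x', 'ID'])['results'].agg(['mean', 'sem'])

# The row index is a MultiIndex with one level per grouping column
print(type(summary.index).__name__)
print(summary.index.names)
print(summary)
```

Each row of `summary` is addressed by an (x, ID) tuple, which is what the selection code further down relies on.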

If I create a df with a similar hierarchical index:

import pandas as pd
df = pd.DataFrame({'mean':[0.5,0.25,0.7,0.8],'sem':[0.1,0.1,0.1,0.2]})
# 'codes' was named 'labels' before pandas 0.24
df.index = pd.MultiIndex(levels=[[2.5,5.0],[0,1]],codes=[[0,0,1,1],[0,1,0,1]],names=['x','ID'])

Then I have the following df:

        mean  sem
x   ID           
2.5 0   0.50  0.1
    1   0.25  0.1
5.0 0   0.70  0.1
    1   0.80  0.2

I can grab the relevant information from the multi-index, and use it to select and plot the correct rows in sequence:

import matplotlib.pyplot as plt

x_values = df.index.levels[0]
ID_values = df.index.levels[1]

for ID in ID_values:
    # Select the rows for this ID across all x values, in x order
    mean_data = df.loc[[(x,ID) for x in x_values],'mean'].tolist()
    error_data = df.loc[[(x,ID) for x in x_values],'sem'].tolist()
    plt.errorbar(x_values,mean_data,yerr=error_data,label=str(ID))

plt.legend(title='ID')
plt.show()
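As an alternative to the explicit loop, the same plot can probably be produced by pivoting the 'ID' level into columns with unstack() and passing the errors to DataFrame.plot via yerr. This is only a sketch on the small example df from this answer; the names `means` and `errors` are my own, and the Agg backend is selected just so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import pandas as pd

# Same small example frame as above, indexed by (x, ID)
index = pd.MultiIndex.from_product([[2.5, 5.0], [0, 1]], names=['x', 'ID'])
df = pd.DataFrame({'mean': [0.5, 0.25, 0.7, 0.8],
                   'sem':  [0.1, 0.1, 0.1, 0.2]}, index=index)

# Move the 'ID' level into columns: rows are x values, one column per ID
means = df['mean'].unstack('ID')
errors = df['sem'].unstack('ID')

# One line per ID column; yerr columns are matched to the data columns by name
ax = means.plot(yerr=errors, marker='o')
ax.set_xlabel('x')
ax.set_ylabel('mean')
```

In an interactive session, call matplotlib.pyplot.show() afterwards (or save with ax.figure.savefig) to see the figure. The legend is built from the column names, so the lines are labelled 0 and 1 without extra code.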

Answer 1: 2, accepted, 2015-11-08 10:33:20
