Plotting grouped data in the same plot using Pandas

Question

In pandas, I am doing:

bp = p_df.groupby('class').plot(kind='kde')

p_df is a DataFrame object.

However, this produces two plots, one for each class. How do I force both classes onto one plot?

Answer 1

Version 1:

You can create the axes and then use the ax keyword of DataFrameGroupBy.plot to add everything to these axes:

import matplotlib.pyplot as plt
import pandas as pd

p_df = pd.DataFrame({"class": [1,1,2,2,1], "a": [2,3,2,3,2]})
fig, ax = plt.subplots(figsize=(8,6))
# every group is drawn onto the same axes via ax=ax
bp = p_df.groupby('class').plot(kind='kde', ax=ax)

This is the result:

Unfortunately, the legend labels here are not very meaningful.
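
One way to get meaningful labels with the Version 1 approach is to select the column to plot and then relabel the drawn lines from the group keys. This is only a sketch under assumptions: the "class N" label text is made up for illustration, the `Agg` backend line is there only so the code runs without a display, and `kind='kde'` needs SciPy installed.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

p_df = pd.DataFrame({"class": [1, 1, 2, 2, 1], "a": [2, 3, 2, 3, 2]})

fig, ax = plt.subplots(figsize=(8, 6))
grouped = p_df.groupby("class")["a"]  # select the column so only "a" is drawn
grouped.plot(kind="kde", ax=ax)       # one curve per group, all on the same axes

# Groups are drawn in iteration order, so relabel the lines in that same order.
labels = [f"class {key}" for key, _ in grouped]
for line, label in zip(ax.lines, labels):
    line.set_label(label)
ax.legend()
```

Relabelling by position relies on the curves being added in group order, which is why the labels are built by iterating over the same grouped object.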

Version 2:

Another way is to loop through the groups and plot the curves manually:

import matplotlib.pyplot as plt
import pandas as pd

classes = ["class 1"] * 5 + ["class 2"] * 5
vals = [1,3,5,1,3] + [2,6,7,5,2]
p_df = pd.DataFrame({"class": classes, "vals": vals})

fig, ax = plt.subplots(figsize=(8,6))
# label is the group key, df is the sub-frame for that group
for label, df in p_df.groupby('class'):
    df.vals.plot(kind="kde", ax=ax, label=label)
plt.legend()

This way you can easily control the legend. This is the result:

[Image: plot 2]

Answer 2

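Another approach is to use the seaborn module. This plots the two density estimates on the same axes without specifying a variable to hold the axes, as follows (using some of the data frame setup from the other answer):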

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# %matplotlib inline   (Jupyter notebooks only)

# data to create an example data frame
classes = ["c1"] * 5 + ["c2"] * 5
vals = [1,3,5,1,3] + [2,6,7,5,2]
# the data frame
df = pd.DataFrame({"cls": classes, "vals": vals})

# this is to plot the kde; both calls draw on the current axes
sns.kdeplot(df.vals[df.cls == "c1"], label='c1')
sns.kdeplot(df.vals[df.cls == "c2"], label='c2')

# beautifying the labels
plt.xlabel('value')
plt.ylabel('density')
plt.legend()  # show the c1/c2 labels
plt.show()

This results in the following image.

Answer 3

import matplotlib.pyplot as plt
p_df.groupby('class').plot(kind='kde', ax=plt.gca())
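
This works because `plt.gca()` returns the current Axes (creating one if none exists yet), so every group is drawn onto that single Axes rather than onto a new figure per group. Here is a small sketch to check that behaviour. It assumes the sample frame from Answer 1, selects the column `a` explicitly, and uses `kind='line'` instead of `'kde'` so that it does not need SciPy. The `Agg` backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

p_df = pd.DataFrame({"class": [1, 1, 2, 2, 1], "a": [2, 3, 2, 3, 2]})

ax = plt.gca()  # current Axes, created on demand
p_df.groupby("class")["a"].plot(kind="line", ax=ax)

# Both groups ended up as separate lines on the one Axes.
print(len(ax.lines))
```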

Answer 4

Maybe you can try this:

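import matplotlib.pyplot as plt

# assumes df has a 'class' column and a 'vals' column
fig, ax = plt.subplots(figsize=(10,8))
classes = list(df['class'].unique())  # bracket access, since class is a Python keyword
for c in classes:
    df2 = df.loc[df['class'] == c]
    df2.vals.plot(kind="kde", ax=ax, label=c)
plt.legend()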

Answer 5

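There are two simple ways to plot each group in the same plot.
1. When using pandas.DataFrame.groupby, specify the column to be plotted (e.g. the aggregation column).
2. Use seaborn.kdeplot or seaborn.displot and specify the hue parameter.
Using pandas v1.2.4, matplotlib 3.4.2, seaborn 0.11.1.
The OP is specific to plotting the kde, but the steps are the same for many plot types (e.g. kind='line', sns.lineplot, etc.).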
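
To illustrate that the same steps carry over to another plot type, here is a hedged sketch that swaps `kind='kde'` for `kind='hist'`. The data are the `class`/`vals` values from Answer 1, Version 2. The shared `ax`, the `alpha` value and the `Agg` backend line are choices made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

classes = ["class 1"] * 5 + ["class 2"] * 5
vals = [1, 3, 5, 1, 3] + [2, 6, 7, 5, 2]
p_df = pd.DataFrame({"class": classes, "vals": vals})

fig, ax = plt.subplots()
# Same recipe as for the kde: group, select the column, plot onto one Axes.
p_df.groupby("class")["vals"].plot(kind="hist", ax=ax, alpha=0.5, legend=True)
```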

Imports and sample data

For the sample data, the groups are in the 'kind' column, and the kde of 'duration' will be plotted, ignoring 'waiting'.

import pandas as pd
import seaborn as sns

df = sns.load_dataset('geyser')

# display(df.head())
   duration  waiting   kind
0     3.600       79   long
1     1.800       54  short
2     3.333       74   long
3     2.283       62  short
4     4.533       85   long
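
`sns.load_dataset` fetches the data over the network on first use. If that is not available, a tiny stand-in frame can be built from the five rows printed above. This is an assumption for experimentation only, since five rows give a very rough density estimate compared with the full data set.

```python
import pandas as pd

# The first five rows of the geyser data, exactly as printed above.
df = pd.DataFrame({
    "duration": [3.600, 1.800, 3.333, 2.283, 4.533],
    "waiting": [79, 54, 74, 62, 85],
    "kind": ["long", "short", "long", "short", "long"],
})

# Number of observations per group in this stand-in frame.
counts = df.groupby("kind")["duration"].count()
print(counts)
```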

Using `pandas.DataFrame.plot`

Reshape the data using .groupby or .pivot

`.groupby`

Specify the aggregation column ['duration'] and kind='kde'.

ax = df.groupby('kind')['duration'].plot(kind='kde', legend=True)

`.pivot`

ax = df.pivot(columns='kind', values='duration').plot(kind='kde')
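
To see why the `.pivot` call gives one curve per group, it can help to look at the reshaped frame: each value of 'kind' becomes its own column, and `DataFrame.plot` draws one curve per column. A minimal sketch, assuming the five rows printed earlier as a stand-in for the full data set:

```python
import pandas as pd

# Stand-in data: the first five rows of the geyser data set shown earlier.
df = pd.DataFrame({
    "duration": [3.600, 1.800, 3.333, 2.283, 4.533],
    "waiting": [79, 54, 74, 62, 85],
    "kind": ["long", "short", "long", "short", "long"],
})

# Wide format: one column per group, NaN where a row belongs to the other group.
wide = df.pivot(columns="kind", values="duration")
print(wide)
```

Each row of `wide` holds a value in exactly one of the group columns and NaN in the other.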

Using `seaborn.kdeplot`

Specify hue='kind'

ax = sns.kdeplot(data=df, x='duration', hue='kind')

Using `seaborn.displot`

Specify hue='kind' and kind='kde'

fig = sns.displot(data=df, kind='kde', x='duration', hue='kind')

5 answers

Answer 1: 83, 2015-02-03 12:40:33
Answer 2: 14, 2017-09-01 21:37:11
Answer 3: 8, 2019-08-08 16:53:38
Answer 4: 4, 2019-07-09 20:40:51
Answer 5: 0, 2021-06-10 15:25:29
