Pandas groupby scatter plot in a single plot

Question

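This is a follow-up question to this solution. With kind=line, a different color is assigned to each group automatically, but that is not the case for scatter plots.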

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# random df
df = pd.DataFrame(np.random.randint(0,10,size=(25, 3)), columns=['label','x','y'])

# plot groupby results on the same canvas
# (every group is drawn in the same default color)
fig, ax = plt.subplots(figsize=(8,6))
df.groupby('label').plot(kind='scatter', x = "x", y = "y", ax=ax)

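There is a related issue here. Is there a simple workaround?

Update:

When I try the solution recommended by @ImportanceOfBeingErnest on a label column that contains strings, it does not work!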

df = pd.DataFrame(np.random.randint(0,10,size=(5, 2)), columns=['x','y'])
df['label'] = ['yes','no','yes','yes','no']
fig, ax = plt.subplots(figsize=(8,6))
ax.scatter(x='x', y='y', c='label', data=df)

It raises the following error:

ValueError: Invalid RGBA argument: 'yes'

During handling of the above exception, another exception occurred:

Answer 1

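IIUC, you can use seaborn (sns) for this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.DataFrame(np.random.randint(0,10,size=(100, 2)), columns=['x','y'])
df['label'] = np.random.choice(['yes','no','yes','yes','no'], 100)
fig, ax = plt.subplots(figsize=(8,6))
# hue colors the points by the values in the 'label' column
sns.scatterplot(x='x', y='y', hue='label', data=df, ax=ax)
plt.show()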


Another option, as suggested in the comments, is to map the values to numbers through a categorical type:

fig, ax = plt.subplots(figsize=(8,6))
ax.scatter(df.x, df.y, c = pd.Categorical(df.label).codes, cmap='tab20b')
plt.show()

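To make the mapping above concrete, here is a minimal sketch of what pd.Categorical(...).codes produces for the string labels from the question. The fixed label list is taken from the question's update; the variable names are illustrative only.

```python
import pandas as pd

# Labels from the question's update
labels = pd.Series(["yes", "no", "yes", "yes", "no"], name="label")

# Categories are inferred and sorted, so each label gets a stable integer code
cat = pd.Categorical(labels)
codes = cat.codes

print(list(cat.categories))  # ['no', 'yes']
print(codes.tolist())        # [1, 0, 1, 1, 0]
```

These integer codes are what gets passed as c to ax.scatter, and the colormap (cmap) then turns each code into a color.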

Answer 2

You can loop over the groupby and create one scatter per group. This is efficient for fewer than 10 categories.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# random df
df = pd.DataFrame(np.random.randint(0,10,size=(5, 2)), columns=['x','y'])
df['label'] = ['yes','no','yes','yes','no']

# plot groupby results on the same canvas 
fig, ax = plt.subplots(figsize=(8,6))

for n, grp in df.groupby('label'):
    ax.scatter(x = "x", y = "y", data=grp, label=n)
ax.legend(title="Label")

plt.show()
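If specific colors per group are wanted rather than the default color cycle, the same loop can take its color from a dictionary. This is only a sketch: the palette mapping and the fixed data below are assumptions made for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Fixed illustrative data (assumed, not from the original post)
df = pd.DataFrame({"x": [1, 4, 2, 7, 5],
                   "y": [3, 1, 6, 2, 8],
                   "label": ["yes", "no", "yes", "yes", "no"]})

# Hypothetical label-to-color mapping
palette = {"yes": "tab:green", "no": "tab:red"}

fig, ax = plt.subplots(figsize=(8, 6))
# groupby yields (group key, sub-DataFrame) pairs in sorted key order
for name, grp in df.groupby("label"):
    ax.scatter(grp["x"], grp["y"], color=palette[name], label=name)
ax.legend(title="Label")
```

Remove the matplotlib.use("Agg") line and call plt.show() at the end to view the figure interactively.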

Alternatively, you can create a single scatter like this:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# random df
df = pd.DataFrame(np.random.randint(0,10,size=(5, 2)), columns=['x','y'])
df['label'] = ['yes','no','yes','yes','no']

# plot groupby results on the same canvas 
fig, ax = plt.subplots(figsize=(8,6))

u, df["label_num"] = np.unique(df["label"], return_inverse=True)

sc = ax.scatter(x = "x", y = "y", c = "label_num", data=df)
ax.legend(sc.legend_elements()[0], u, title="Label")

plt.show()
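The single-scatter approach relies on np.unique(..., return_inverse=True) turning the labels into integer codes. A small sketch with fixed data (assumed here for reproducibility) shows the intermediate values; legend_elements() needs a reasonably recent matplotlib (it was added in 3.1, as far as I know).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Fixed illustrative data (assumed, not from the original post)
df = pd.DataFrame({"x": [1, 4, 2, 7, 5],
                   "y": [3, 1, 6, 2, 8],
                   "label": ["yes", "no", "yes", "yes", "no"]})

# u holds the sorted unique labels; codes[i] is the index into u for row i
u, codes = np.unique(df["label"].to_numpy(), return_inverse=True)
print(u.tolist())      # ['no', 'yes']
print(codes.tolist())  # [1, 0, 1, 1, 0]

fig, ax = plt.subplots(figsize=(8, 6))
sc = ax.scatter(df["x"], df["y"], c=codes)

# One legend handle per distinct code, relabeled with the original strings
handles, _ = sc.legend_elements()
ax.legend(handles, u, title="Label")
```

Because the handles come back in ascending code order, they line up with the sorted labels in u.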

Answer 3

If the data is already grouped, I found that the following solution can be useful.

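import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.randint(0,10,size=(5, 2)), columns=['x','y'])
df['label'] = ['yes','no','yes','yes','no']
fig, ax = plt.subplots(figsize=(7,3))


def plot_grouped_df(grouped_df,
                    ax,  x='x', y='y', cmap = plt.cm.autumn_r):

    # one color per group, sampled from the upper half of the colormap
    colors = cmap(np.linspace(0.5, 1, len(grouped_df)))

    for i, (name,group) in enumerate(grouped_df):
        # wrap the RGBA row in a list so it is read as a single color,
        # not as one value per point (ambiguous when a group has 4 rows)
        group.plot(ax=ax,
                   kind='scatter', 
                   x=x, y=y,
                   color=[colors[i]],
                   label = name)

# now we can use this function to plot the groupby data with categorical values
plot_grouped_df(df.groupby('label'),ax)
plt.show()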
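For reference, a short sketch of the color sampling used in the function above: calling a colormap on an array of floats in [0, 1] returns one RGBA row per value, and len() of a groupby object gives the number of groups. The variable names here are illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"x": [1, 4, 2, 7, 5],
                   "y": [3, 1, 6, 2, 8],
                   "label": ["yes", "no", "yes", "yes", "no"]})

grouped = df.groupby("label")
n_groups = len(grouped)  # number of distinct labels

# Sample the upper half of the reversed autumn colormap, one color per group
colors = plt.cm.autumn_r(np.linspace(0.5, 1, n_groups))

print(n_groups)      # 2
print(colors.shape)  # (2, 4): one RGBA row per group
```

Starting the sampling at 0.5 rather than 0 skips the lightest end of the colormap, which is presumably why the answer chose that range.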
