
Python plotting by different dataframe columns (using Seaborn?)

I'm trying to create a scatterplot of a dataset with point coloring based on different categorical columns. Seaborn works well here for one plot:

fg = sns.FacetGrid(data=plot_data, hue='col_1')
fg.map(plt.scatter, 'x_data', 'y_data', **kws).add_legend()
plt.show()

I then want to display the same data, but with hue='col_2' and hue='col_3'. It works fine if I just make 3 plots, but I'm really hoping to find a way to have them all appear as subplots in one figure. Unfortunately, I haven't found any way to change the hue from one plot to the next. I know there are plotting APIs that allow for an axis keyword, thereby letting you pop it into a matplotlib figure, but I haven't found one that simultaneously allows you to set 'ax=' and 'hue='. Any ideas? Thanks in advance!

Edit: Here's some sample code to illustrate the idea

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

xx = np.random.rand(10,2)
cat1 = np.array(['cat','dog','dog','dog','cat','hamster','cat','cat','hamster','dog'])
cat2 = np.array(['blond','brown','brown','black','black','blond','blond','blond','brown','blond'])
d = {'x':xx[:,0], 'y':xx[:,1], 'pet':cat1, 'hair':cat2}
df = pd.DataFrame(data=d)

sns.set(style='ticks')
# 'height' replaces the older 'size' argument (renamed in seaborn 0.9)
fg = sns.FacetGrid(data=df, hue='pet', height=5)
fg.map(plt.scatter, 'x', 'y').add_legend()
fg = sns.FacetGrid(data=df, hue='hair', height=5)
fg.map(plt.scatter, 'x', 'y').add_legend()
plt.show()

This plots what I want, but in two separate windows. The color scheme is set by grouping on 'pet' in the first plot and on 'hair' in the second. Is there any way to get both as subplots of one figure?

To draw three scatter plots, each colored by a different column, create three axes in matplotlib and draw one scatter on each of them. Passing a numeric column as c maps its values to colors through the colormap.

import pandas as pd
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.rand(10,5), 
                  columns=["x", "y", "col1", "col2", "col3"])

fig, axes = plt.subplots(nrows=3)
for ax, col in zip(axes, df.columns[2:]):
    ax.scatter(df.x, df.y, c=df[col])

plt.show()
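
With numeric coloring it can help to show which color stands for which value. The following is a minimal sketch, not part of the original answer, that attaches one colorbar per subplot using fig.colorbar; the titles, figure size and random data are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative random data with the same column layout as above
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.random((10, 5)),
                  columns=["x", "y", "col1", "col2", "col3"])

value_cols = ["col1", "col2", "col3"]
fig, axes = plt.subplots(nrows=len(value_cols), figsize=(5, 8))
for ax, col in zip(axes, value_cols):
    # scatter returns a mappable that the colorbar can read its colors from
    sc = ax.scatter(df["x"], df["y"], c=df[col])
    fig.colorbar(sc, ax=ax, label=col)
    ax.set_title(f"colored by {col}")
```

Call plt.show() afterwards to display the figure.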


For categorical data it is often easier to draw several scatter plots on the same axes, one per category, so that each category gets its own color and legend entry.

import pandas as pd
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
import seaborn as sns


xx = np.random.rand(10,2)
cat1 = np.array(['cat','dog','dog','dog','cat','hamster','cat','cat','hamster','dog'])
cat2 = np.array(['blond','brown','brown','black','black','blond','blond','blond','brown','blond'])
d = {'x':xx[:,0], 'y':xx[:,1], 'pet':cat1, 'hair':cat2}
df = pd.DataFrame(data=d)


cols = ['pet', 'hair']
fig, axes = plt.subplots(nrows=len(cols))
for ax, col in zip(axes, cols):
    # one scatter call per category; matplotlib cycles the color each time
    for n, group in df.groupby(col):
        ax.scatter(group.x, group.y, label=n)
    ax.legend()

plt.show()
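
The question asks for an API that accepts both 'ax=' and 'hue='. As a hedged alternative to the groupby loop, and assuming seaborn 0.9 or later is installed, sns.scatterplot takes both arguments. The sketch below reuses the sample data; the titles and figure size are illustrative additions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

np.random.seed(42)
xx = np.random.rand(10, 2)
df = pd.DataFrame({
    "x": xx[:, 0],
    "y": xx[:, 1],
    "pet": ["cat", "dog", "dog", "dog", "cat",
            "hamster", "cat", "cat", "hamster", "dog"],
    "hair": ["blond", "brown", "brown", "black", "black",
             "blond", "blond", "blond", "brown", "blond"],
})

cols = ["pet", "hair"]
fig, axes = plt.subplots(nrows=len(cols), figsize=(5, 8))
for ax, col in zip(axes, cols):
    # hue picks the coloring column, ax picks the target subplot
    sns.scatterplot(data=df, x="x", y="y", hue=col, ax=ax)
    ax.set_title(f"colored by {col}")
```

Each subplot gets its own legend, so the two color schemes stay independent. Call plt.show() afterwards to display the figure.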


You can still use a FacetGrid if you really want to, but the DataFrame first has to be reshaped into long format, with one row per point and categorical column, so that the grid can facet on the column name.

import pandas as pd
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
import seaborn as sns

xx = np.random.rand(10,2)
cat1 = np.array(['cat','dog','dog','dog','cat','hamster','cat','cat','hamster','dog'])
cat2 = np.array(['blond','brown','brown','black','black','blond','blond','blond','brown','blond'])
d = {'x':xx[:,0], 'y':xx[:,1], 'pet':cat1, 'hair':cat2}
df = pd.DataFrame(data=d)

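# reshape to long format: 'kind' holds the column name, 'category' its value
df2 = pd.melt(df, id_vars=['x','y'], value_name='category', var_name="kind")

# 'height' replaces the older 'size' argument (renamed in seaborn 0.9)
fg = sns.FacetGrid(data=df2, row="kind", hue='category', height=3)
fg.map(plt.scatter, 'x', 'y').add_legend()
plt.show()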
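
To make the reshaping step concrete, here is a small sketch of what pd.melt does to a frame of this shape. The three-row data is made up for illustration; only the column names follow the example above.

```python
import pandas as pd

# Tiny illustrative frame with the same columns as the sample data
df = pd.DataFrame({
    "x": [0.1, 0.5, 0.9],
    "y": [0.2, 0.6, 0.4],
    "pet": ["cat", "dog", "hamster"],
    "hair": ["blond", "brown", "black"],
})

# Every (x, y) point is repeated once per categorical column
df2 = pd.melt(df, id_vars=["x", "y"], var_name="kind", value_name="category")
print(df2)
```

The result has twice as many rows as the input: the first half carries kind == 'pet', the second half kind == 'hair'. That is why row="kind" produces one facet per original column.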

