I want to create (e.g.) violin plots from a pandas DataFrame whose rows can belong to multiple categories, ideally in a single figure. I'm not sure how to go about this, however. Any suggestions? Many thanks!
A simple example showing separate plots. Here, x is the main grouping variable, y holds the data to be grouped, and z defines membership/category. For simplicity, I've just set z to a random integer in [0,1,2].
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# dummy data
np.random.seed(12345)
x = np.random.randint(1,6,1000)
y = np.random.randn(1000)
z = np.random.randint(0,3,1000)
df = pd.DataFrame(data=np.array([x,y,z]).T,columns=['x','y','z'])
All data (for verification?):
sns.violinplot(x='x',y='y',data=df)
plt.title('all data')
[Figure: violin plot of all data, regardless of z]
Individual plots:
fig,ax = plt.subplots(nrows=3,ncols=1,sharex=True)
sns.violinplot(x='x',y='y',data=df.loc[df['z']<=0],ax=ax[0])
ax[0].set_title('z <= 0')
sns.violinplot(x='x',y='y',data=df.loc[df['z']<=1],ax=ax[1])
ax[1].set_title('z <= 1')
sns.violinplot(x='x',y='y',data=df.loc[df['z']<=2],ax=ax[2])
ax[2].set_title('z <= 2')
plt.tight_layout();
[Figure: three violin plots of the data with z <= 0, z <= 1 and z <= 2, respectively]
What I'd like is a plot that looks like the following, except that 'z' uses the grouping of the above plot:
plt.figure()
sns.violinplot(x='x',y='y',data=df,hue='z');
[Figure: violin plot using 'hue', where each color groups only the data with z == 0, z == 1 or z == 2]
You can do this by creating a new dataframe containing the selections of z that you want to show by hue:
import numpy as np # v 1.19.2
import pandas as pd # v 1.1.3
import seaborn as sns # v 0.11.0
# Create sample dataset
np.random.seed(12345)
x = np.random.randint(1,6,1000)
y = np.random.randn(1000)
z = np.random.randint(0,3,1000)
df = pd.DataFrame(data=np.array([x,y,z]).T,columns=['x','y','z'])
# Create new dataframe containing the selections of the 'z' variable
df0 = df.loc[df['z']<=0]
df1 = df.loc[df['z']<=1]
df2 = df.loc[df['z']<=2]
dfnew = pd.concat([df0, df1, df2], keys=['z <= 0', 'z <= 1', 'z <= 2'])
dfnew.reset_index(inplace=True)
dfnew.drop(columns='level_1', inplace=True)
dfnew.rename(columns={'level_0':'z selection'}, inplace=True)
dfnew.head()
#   z selection    x         y    z
# 0      z <= 0  3.0 -0.670121  0.0
# 1      z <= 0  3.0 -2.016201  0.0
# 2      z <= 0  2.0 -0.266742  0.0
# 3      z <= 0  2.0 -0.406730  0.0
# 4      z <= 0  2.0 -0.243281  0.0
ax = sns.violinplot(x='x', y='y', data=dfnew, hue='z selection')
ax.figure.set_size_inches(9, 6)
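As a variation on the same idea, the cumulative selections can probably be built in one pass instead of three separate dataframes. This is only a minimal sketch under a few assumptions: the `thresholds` list and the use of `DataFrame.assign` to attach the label column are my own choices for illustration, and the dummy data is built from a dict, so x and z stay integers rather than floats.

```python
import numpy as np
import pandas as pd

# Same dummy data as in the question (x and z are kept as integers here)
np.random.seed(12345)
df = pd.DataFrame({
    'x': np.random.randint(1, 6, 1000),
    'y': np.random.randn(1000),
    'z': np.random.randint(0, 3, 1000),
})

# Hypothetical list of cut-offs; one cumulative subset is built per entry
thresholds = [0, 1, 2]

# Select rows with z <= t, tag each subset with a label, then stack them.
# Rows with a small z are duplicated because they belong to several subsets.
dfnew = pd.concat(
    [df.loc[df['z'] <= t].assign(**{'z selection': f'z <= {t}'})
     for t in thresholds],
    ignore_index=True,
)

# Rows per selection; the 'z <= 2' group contains all 1000 rows
counts = dfnew['z selection'].value_counts()
```

The resulting dfnew has the same 'z selection' column as above, so it can be passed to sns.violinplot(x='x', y='y', data=dfnew, hue='z selection') unchanged. Changing the cut-offs then only means editing the thresholds list.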