Sort categorical x-axis in a seaborn scatter plot
I am trying to plot the top 30 percent of values in a data frame using a seaborn scatter plot, as shown below.
The reproducible code for the same plot:
import seaborn as sns
df = sns.load_dataset('iris')
# function to return the top 30 percent of values in a dataframe
def extract_top(df):
    n = int(0.3 * len(df))
    top = df.sort_values('sepal_length', ascending=False).head(n)
    return top
# storing the top values
top = extract_top(df)
# plotting
sns.scatterplot(data=top,
                x='species', y='sepal_length',
                color='black',
                s=100,
                marker='x')
Here, I want to sort the x-axis in the order ['virginica', 'setosa', 'versicolor']. When I tried to pass order as a parameter to sns.scatterplot(), it raised the error AttributeError: 'PathCollection' object has no property 'order'. What is the right way to do it?
Please note: setosa is also a category in species in the dataframe, but none of its values fall within the top 30%. Hence, that label is not shown in the example output from the reproducible code at the top. I want that label on the x-axis as well, in the given order, as shown below:
scatterplot() is not the correct tool for the job. Since you have a categorical axis, you want to use stripplot() and not scatterplot(). See the difference between relational and categorical plots here: https://seaborn.pydata.org/api.html
sns.stripplot(data=top,
              x='species', y='sepal_length',
              order=['virginica', 'setosa', 'versicolor'],
              color='black', jitter=False)
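To check that the empty category really keeps its tick, here is a minimal self-contained sketch. It uses a small hand-made frame (`toy`, a hypothetical stand-in for `top`, so no dataset download is needed) and the non-interactive Agg backend; both are assumptions made for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for `top`; note that no row is 'setosa'.
toy = pd.DataFrame({
    "species": ["versicolor", "virginica", "virginica", "versicolor"],
    "sepal_length": [7.0, 7.9, 7.7, 6.9],
})
order = ["virginica", "setosa", "versicolor"]

fig, ax = plt.subplots()
sns.stripplot(data=toy, x="species", y="sepal_length",
              order=order, color="black", jitter=False, ax=ax)

# Tick label text is only guaranteed to be populated after a draw.
fig.canvas.draw()
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
print(tick_labels)
```

The printed labels should follow `order`, including the category that has no points.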
The error means that sns.scatterplot() does not take order as one of its arguments. For the species setosa, you can use alpha to hide the scatter points while keeping the tick.
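import pandas as pd
import seaborn as sns
df = sns.load_dataset('iris')
# function to return the top 30 percent of values in a dataframe
def extract_top(df):
    n = int(0.3 * len(df))
    top = df.sort_values('sepal_length', ascending=False).head(n)
    return top
# storing the top values
top = extract_top(df)
# add one placeholder row labelled 'setosa' so that its tick exists on the axis
# (DataFrame.append returned a new frame and was removed in pandas 2.0, so use pd.concat)
placeholder = top.iloc[[0]].copy()
placeholder['species'] = 'setosa'
top = pd.concat([top, placeholder], ignore_index=True)
order = ['virginica', 'setosa', 'versicolor']
# plotting: one call per species, in the desired order
for species in order:
    # the placeholder point is drawn fully transparent
    alpha = 1 if species != 'setosa' else 0
    sns.scatterplot(x='species', y='sepal_length',
                    data=top[top['species'] == species],
                    alpha=alpha,
                    marker='x', color='k')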
The output is:
For those wanting to make use of the extra arguments available in sns.scatterplot over sns.stripplot (size and style mappings for variables), it's possible to set the order of the x-axis simply by sorting the dataframe before passing it to seaborn. The following sorts alphabetically, where feature is the name of the column plotted on the x-axis:
df = df.sort_values(feature)
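If a custom order is needed rather than an alphabetical one, the same idea can be sketched with the `key` argument of `sort_values` (available in pandas 1.1 and later). The frame `toy` and the `rank` mapping below are hypothetical names used only for illustration.

```python
import pandas as pd

# Hypothetical toy data; in the question this would be the `top` frame.
toy = pd.DataFrame({
    "species": ["versicolor", "virginica", "virginica", "versicolor"],
    "sepal_length": [7.0, 7.9, 7.7, 6.9],
})

order = ["virginica", "setosa", "versicolor"]
# Map each category to its position in the desired order.
rank = {name: position for position, name in enumerate(order)}

# Sort rows by the rank of their species; a stable sort keeps the
# original row order within each species.
sorted_toy = toy.sort_values("species", key=lambda s: s.map(rank), kind="stable")
species_sequence = sorted_toy["species"].unique().tolist()
print(species_sequence)
```

The sorted frame can then be passed to sns.scatterplot as described above. Sorting only reorders categories that are present in the data, so a category with no rows (setosa here) still gets no tick with this approach.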