I am trying to plot the top 30 percent values in a data frame using a seaborn scatter plot as shown below.
The reproducible code for the same plot:
import seaborn as sns
df = sns.load_dataset('iris')
# Function to return the top 30 percent of values in a dataframe.
def extract_top(df):
    n = int(0.3 * len(df))
    top = df.sort_values('sepal_length', ascending=False).head(n)
    return top
#storing the top values
top = extract_top(df)
#plotting
sns.scatterplot(data=top,
                x='species', y='sepal_length',
                color='black',
                s=100,
                marker='x')
Here, I want to sort the x-axis in the order ['virginica','setosa','versicolor']. When I tried to pass order as one of the parameters of sns.scatterplot(), it raised AttributeError: 'PathCollection' object has no property 'order'. What is the right way to do it?
Please note: in the dataframe, setosa is also a category in species; however, none of its values fall within the top 30%. Hence, that label is not shown in the output of the reproducible code above. I still want that label to appear on the x-axis, in the given order.
scatterplot() is not the correct tool for the job. Since you have a categorical axis, you want to use stripplot() and not scatterplot(). See the difference between relational and categorical plots here: https://seaborn.pydata.org/api.html
sns.stripplot(data=top,
              x='species', y='sepal_length',
              order=['virginica', 'setosa', 'versicolor'],
              color='black', jitter=False)
This means sns.scatterplot() does not take order as one of its arguments. For the species setosa, you can use alpha to hide the scatter points while keeping the tick.
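import pandas as pd
import seaborn as sns
df = sns.load_dataset('iris')
# Function to return the top 30 percent of values in a dataframe.
def extract_top(df):
    n = int(0.3 * len(df))
    top = df.sort_values('sepal_length', ascending=False).head(n)
    return top
# Storing the top values
top = extract_top(df)
# Append a copy of the first row as a placeholder and relabel it as 'setosa'.
# (DataFrame.append returned a new frame and was removed in pandas 2.0, so use pd.concat.)
top = pd.concat([top, top.iloc[[0]]], ignore_index=True)
top.loc[top.index[-1], 'species'] = 'setosa'
order = ['virginica', 'setosa', 'versicolor']
# Plotting: draw one species at a time so the x-axis categories appear in the given order;
# the placeholder 'setosa' point is made fully transparent.
for species in order:
    alpha = 1 if species != 'setosa' else 0
    sns.scatterplot(x="species", y="sepal_length",
                    data=top[top['species'] == species],
                    alpha=alpha,
                    marker='x', color='k')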
the output is
For those wanting to make use of the extra arguments available in sns.scatterplot over sns.stripplot (size and style mappings for variables), it's possible to set the order of the x-axis simply by sorting the dataframe before passing it to seaborn. The following will sort alphabetically (here, feature is the name of the column plotted on the x-axis).
df.sort_values(feature)
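If you need a custom order rather than an alphabetical one, the same sorting idea can be extended with a sort key. The following is only a minimal sketch: the helper name sort_by_custom_order and the small top frame (with illustrative values) are assumptions made for this example, not part of the original answers. It relies on the key argument of DataFrame.sort_values, which needs pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical stand-in for the `top` frame from the question
# (values are illustrative only).
top = pd.DataFrame({
    "species": ["versicolor", "virginica", "versicolor", "virginica"],
    "sepal_length": [7.0, 7.9, 6.9, 7.7],
})

order = ["virginica", "setosa", "versicolor"]


def sort_by_custom_order(df, column, order):
    """Return a copy of df with rows sorted so that `column` follows `order`."""
    # Map each category name to its position in the desired order.
    rank = {name: i for i, name in enumerate(order)}
    # kind="stable" keeps the original row order within each category.
    return df.sort_values(column, key=lambda s: s.map(rank), kind="stable")


top_sorted = sort_by_custom_order(top, "species", order)
print(top_sorted["species"].tolist())
```

The sorted frame can then be passed as data to sns.scatterplot. Note that sorting only reorders the categories that are actually present, so an empty category such as setosa still will not get a tick this way; for that, the stripplot() answer with order is the more direct route.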