
How to change x-axis labels in Boxplots?

I want to get a picture like this (just focus on the boxplots): [image description here] As you can see, the x-axis labels in the picture are the average spacing values, and each boxplot is drawn at that value on the x-axis. I'm trying to make something similar using seaborn, matplotlib, or the pandas boxplot, but I couldn't get it to work. The code I used is this:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
%matplotlib inline

x=np.array([  0.  ,  43.9 ,   0.  ,  21.43,   0.  ,  46.3 ,   0.  ,   0.  ,
        12.2 ,   0.  ,  54.88,  55.56,  53.44,  55.46,   0.  ,  95.58,
        91.03,  40.63,  65.22, 100.  ,  12.95,  97.89,  86.67, 100.  ,
        96.03,  94.12,  97.93, 100.  ,  95.8 ])
y=np.array([ 19.23,  54.29,  29.23,  23.33, 150.  ,  63.53,  35.  ,  12.5 ,
       117.14,  66.67, 273.33, 180.  , 163.75,  56.67, 230.  , 141.25,
       207.14, 118.75, 690.  ,  60.  ,  69.5 , 710.  ,  83.33, 760.  ,
       503.33, 218.57, 290.  , 158.33, 238.33])
z=np.array([ 82.66,  82.66,  82.66,  82.66,  82.66,  82.66, 120.  , 120.  ,
       120.  , 120.  , 120.  , 120.  , 120.  , 120.  , 120.  , 150.76,
       150.76, 150.76, 150.76, 150.76, 150.76, 150.76, 400.93, 400.93,
       400.93, 400.93, 400.93, 400.93, 400.93])
df=pd.DataFrame({'RQD (%)':x,
                'Spacing (mm)':y,
                'Avarage Spacing':z
                })
# fliersize expects a number; 0 hides the outlier markers
sns.boxplot(x='Avarage Spacing',y='RQD (%)',data=df,fliersize=0)

The code above produces this picture:

[image description here]

If you look carefully at the last picture, you will see that the x-axis labels are not plotted at their own numeric values: the boxes sit at evenly spaced positions, as if the labels were 1, 2, 3 and 4. If the labels were used as real x-coordinates, the boxplots would be spread further apart.

So I would like help with this problem. I want the x-axis labels in the last picture to sit at their actual values on the x-axis, as in the first picture. I need this because I then want to draw another function on the same axes to get something similar to the first picture.

Seaborn does not really support this directly, but you can find a workaround below. See also more details here: Matplotlib boxplot x axis

For Matplotlib, use the positions argument of boxplot (see also below). For a log scale, see the link posted by @JohanC in the comments (Matplotlib boxplot width in log scale).
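As a rough illustration of the log-scale case, here is a minimal sketch that reuses the question's data, already split by average spacing. It is not necessarily the approach taken in the linked answer. The helper log_widths and its log_width parameter are hypothetical names introduced here: the helper scales each box width with its position so that the boxes look roughly equally wide once the x-axis is logarithmic.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import NullFormatter
import numpy as np


def log_widths(positions, log_width=0.1):
    """Hypothetical helper: widths in data units that span `log_width`
    decades around each position, so boxes look similar on a log10 axis."""
    log_pos = np.log10(np.asarray(positions, dtype=float))
    return 10 ** (log_pos + log_width / 2) - 10 ** (log_pos - log_width / 2)


# 'RQD (%)' values from the question, grouped by 'Avarage Spacing'
positions = np.array([82.66, 120.0, 150.76, 400.93])
groups = [
    np.array([0.0, 43.9, 0.0, 21.43, 0.0, 46.3]),
    np.array([0.0, 0.0, 12.2, 0.0, 54.88, 55.56, 53.44, 55.46, 0.0]),
    np.array([95.58, 91.03, 40.63, 65.22, 100.0, 12.95, 97.89]),
    np.array([86.67, 100.0, 96.03, 94.12, 97.93, 100.0, 95.8]),
]

widths = log_widths(positions)

fig, ax = plt.subplots()
ax.boxplot(groups, positions=positions, widths=widths)
ax.set_xscale("log")

# Switching the scale resets the ticks, so put them back on the box positions
ax.set_xticks(positions)
ax.set_xticklabels([str(p) for p in positions])
ax.xaxis.set_minor_formatter(NullFormatter())  # hide minor log tick labels
ax.set_xlabel("Avarage Spacing")
ax.set_ylabel("RQD (%)")
```

Because each box is still centred on its position in data units, the boxes are only approximately symmetric on the log axis; a smaller log_width keeps the distortion small.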

For Matplotlib you need to restructure the data a bit, which I have included in the code example below.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
%matplotlib inline

x=np.array([  0.  ,  43.9 ,   0.  ,  21.43,   0.  ,  46.3 ,   0.  ,   0.  ,
        12.2 ,   0.  ,  54.88,  55.56,  53.44,  55.46,   0.  ,  95.58,
        91.03,  40.63,  65.22, 100.  ,  12.95,  97.89,  86.67, 100.  ,
        96.03,  94.12,  97.93, 100.  ,  95.8 ])
y=np.array([ 19.23,  54.29,  29.23,  23.33, 150.  ,  63.53,  35.  ,  12.5 ,
       117.14,  66.67, 273.33, 180.  , 163.75,  56.67, 230.  , 141.25,
       207.14, 118.75, 690.  ,  60.  ,  69.5 , 710.  ,  83.33, 760.  ,
       503.33, 218.57, 290.  , 158.33, 238.33])
z=np.array([ 82.66,  82.66,  82.66,  82.66,  82.66,  82.66, 120.  , 120.  ,
       120.  , 120.  , 120.  , 120.  , 120.  , 120.  , 120.  , 150.76,
       150.76, 150.76, 150.76, 150.76, 150.76, 150.76, 400.93, 400.93,
       400.93, 400.93, 400.93, 400.93, 400.93])
df=pd.DataFrame({'RQD (%)':x,
                'Spacing (mm)':y,
                'Avarage Spacing':z
                })


# reorganize the data into the form matplotlib expects:
# one array of 'RQD (%)' values per distinct 'Avarage Spacing'
grouped = df.groupby('Avarage Spacing')['RQD (%)']
positions = [key for key, _ in grouped]   # group keys, in the same order as data
data = [values.to_numpy() for _, values in grouped]

# use positions for the x position
# adjust widths to match the size of the boxplots to your preferences
plt.boxplot(data,
            positions=positions,
            widths=20)
plt.show()


# in seaborn it's a bit more difficult, as x positions are not supported
# you could convert the categories to int and then add a category for each
# integer step; your categories are then plotted at the right positions,
# because all the other categories are empty and plotted as blank spaces
df['Avarage Spacing'] = df['Avarage Spacing'].astype(int)
sns.boxplot(x='Avarage Spacing',y='RQD (%)',data=df,
            fliersize=False,
            order=range(df['Avarage Spacing'].min()-10,df['Avarage Spacing'].max()+10))
plt.show()
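Since the question also asks about drawing another function on the same axes, here is a minimal sketch of how that could look once the boxes sit at their real x values. It uses the question's data, already split by average spacing. The curve drawn here simply connects the group medians; it is only a stand-in chosen for illustration, not the function from the original picture.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# 'RQD (%)' values from the question, grouped by 'Avarage Spacing'
positions = [82.66, 120.0, 150.76, 400.93]
groups = [
    np.array([0.0, 43.9, 0.0, 21.43, 0.0, 46.3]),
    np.array([0.0, 0.0, 12.2, 0.0, 54.88, 55.56, 53.44, 55.46, 0.0]),
    np.array([95.58, 91.03, 40.63, 65.22, 100.0, 12.95, 97.89]),
    np.array([86.67, 100.0, 96.03, 94.12, 97.93, 100.0, 95.8]),
]

fig, ax = plt.subplots()

# Boxes are drawn at their numeric x positions, not at 1, 2, 3, 4
ax.boxplot(groups, positions=positions, widths=20)

# Stand-in curve (an assumption for illustration): the median of each group,
# plotted against the same numeric x values as the boxes
medians = [float(np.median(g)) for g in groups]
ax.plot(positions, medians, marker="o", label="group medians")

ax.set_xlabel("Avarage Spacing")
ax.set_ylabel("RQD (%)")
ax.legend()
```

Because the x-axis is numeric here, any other curve can be added in the same way by passing its own x and y arrays to ax.plot.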
