How can I specify the discrete values that I want to plot on the x-axis (matplotlib, boxplot)?

I'm using boxplot in matplotlib (Python) to create box plots, and I'm creating many graphs for different dates. On the x axis the data is discrete.

The values on the x axis, in seconds, are 0.25, 0.5, 1, 2, 5 ... 28800. These values were arbitrarily chosen (they are sampling periods). On some graphs one or two values are missing because the data wasn't available. On these graphs the x axis resizes itself to spread out the other values.

I would like all the graphs to have the same values at the same place on the x axis (it doesn't matter if the x axis shows a value for which no data is plotted).

Could someone tell me if there is a way to specify the x axis values, or another way to keep the same values in the same place?

The relevant section of code is as follows:


for i, group in myDataframe.groupby("Date"):

    graphFilename = (basename+'_' + str(i) + '.png')
    plt.figure(graphFilename)
    group.boxplot(by=["SamplePeriod_seconds"], sym='g+') ## colour = 'blue'
    plt.grid(True)
    axes = plt.gca()
    axes.set_ylim([0,30000])
    plt.ylabel('Average distance (m)', fontsize =8)
    plt.xlabel('GPS sample interval (s)', fontsize=8)
    plt.tick_params(axis='x', which='major', labelsize=8)
    plt.tick_params(axis='y', which='major', labelsize=8)
    plt.xticks(rotation=90)
    plt.title(str(i) + ' - ' + 'Average distance travelled by cattle over 24  hour period', fontsize=9) 
    plt.suptitle('')
    plt.savefig(graphFilename)
    plt.close()     

Any help appreciated, I will continue googling... thanks :)

If you try something like this:

plt.xticks(np.arange(x.min(), x.max(), 5))

where x is your array of x values and 5 is the step size along the axis.

The same applies to the y axis with yticks. Hope it helps! :)

EDIT:

I have removed the parts that I did not have, but the following code should give you a grid to plot onto:

import matplotlib.pyplot as plt
import numpy as np


plt.grid(True)
axes = plt.gca()
axes.set_ylim([0, 30000])
plt.ylabel('Average distance (m)', fontsize=8)
plt.xlabel('GPS sample interval (s)', fontsize=8)
plt.tick_params(axis='x', which='major', labelsize=8)
plt.tick_params(axis='y', which='major', labelsize=8)
plt.xticks(rotation=90)
plt.suptitle('')
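my_xticks = [0.25, 0.5, 1, 2, 5, 10, 20, 30, 60, 120, 300, 600, 1200, 1800, 2400, 3000, 3600, 7200, 10800, 14400, 18000, 21600, 25200, 28800]
x = np.arange(len(my_xticks))  # one evenly spaced tick position per sample period

plt.xticks(x, my_xticks)  # label each position with its sample period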
plt.show()

Try plugging in your values on top of this :)
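One detail worth checking when combining such a tick grid with boxplot: by default matplotlib places the boxes at positions 1, 2, ..., N, not at the data values, so custom ticks have to refer to those positions. The following is a minimal sketch with made-up data (the `samples` list and the three labels are illustrative assumptions, not the asker's data):

    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
    import matplotlib.pyplot as plt
    import numpy as np

    rng = np.random.default_rng(0)
    # three made-up samples standing in for three sample periods
    samples = [rng.normal(10000, 2000, 50) for _ in range(3)]

    fig, ax = plt.subplots()
    ax.boxplot(samples)

    # boxplot puts the boxes (and the ticks) at 1..N unless `positions` is given
    default_positions = [float(t) for t in ax.get_xticks()]
    print(default_positions)

    # relabel those positions with the sample periods they stand for
    ax.set_xticks([1, 2, 3])
    ax.set_xticklabels(['0.25', '0.5', '1'])
    plt.close(fig)

A grid that starts at 0, as above, would therefore be shifted by one relative to the default box positions unless the positions are set explicitly.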

By default, boxplot simply plots the available data at successive positions on the axes. Missing data are left out, simply because boxplot doesn't know they are missing. However, the positions of the boxes can be set manually using the positions argument. The following example does this and thereby produces plots of equal extent even when values are missing.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


basename = __file__+"_plot"
Nd = 4 # four different dates
Ns = 5 # five second intervals
N = 80 # each 80 values
date = []
seconds = []
avgdist = []
# fill lists
for i in range(Nd):
    # for each date, select a random SamplePeriod to be not part of the dataframe
    w = np.random.randint(0,5)
    for j in range(Ns):
        if j!=w:
            av = np.random.poisson(1.36+j/10., N)*4000+1000
            avgdist.append(av) 
            seconds.append([j]*N)
            date.append([i]*N)

date = np.array(date).flatten()
seconds = np.array(seconds).flatten()
avgdist = np.array(avgdist).flatten()
#put data into DataFrame
myDataframe = pd.DataFrame({"Date" : date, "SamplePeriod_seconds" : seconds, "avgdist" : avgdist}) 
# obtain a list of all possible Sampleperiods
globalunique = np.sort(myDataframe["SamplePeriod_seconds"].unique())

for i, group in myDataframe.groupby("Date"):

    graphFilename = (basename+'_' + str(i) + '.png')
    fig = plt.figure(graphFilename, figsize=(6,3))
    axes = fig.add_subplot(111)
    plt.grid(True)

    # omit the `Date` column
    dfgroup = group[["SamplePeriod_seconds", "avgdist"]]
    # obtain a list of Sampleperiods for this date
    unique = np.sort(dfgroup["SamplePeriod_seconds"].unique())
    # plot the boxes to the axes, one for each sample period in dfgroup
    # set the boxes' positions to the values in unique
    dfgroup.boxplot(by=["SamplePeriod_seconds"], sym='g+', positions=unique, ax=axes)

    # set xticks to the unique positions, where boxes are
    axes.set_xticks(unique)
    # make sure all plots share the same extent.
    axes.set_xlim([-0.5,globalunique[-1]+0.5])
    axes.set_ylim([0,30000])

    plt.ylabel('Average distance (m)', fontsize =8)
    plt.xlabel('GPS sample interval (s)', fontsize=8)
    plt.tick_params(axis='x', which='major', labelsize=8)
    plt.tick_params(axis='y', which='major', labelsize=8)
    plt.xticks(rotation=90)
    plt.suptitle(str(i) + ' - ' + 'Average distance travelled by cattle over 24  hour period', fontsize=9) 
    plt.title("")
    plt.savefig(graphFilename)
    plt.close()    


This will still work if the values in the SamplePeriod_seconds column are not equally spaced. Of course, if the spacings are extremely different, this will not produce nice results, because the boxes will overlap.

This, however, is not a problem with the plotting itself. For further help, one would need to know what you expect the graph to look like in the end.
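If evenly spaced boxes are acceptable, one possible way around the overlap is to position each box by the index of its sample period in the full list of periods instead of by the value itself. This is only a sketch using plain matplotlib rather than the pandas wrapper. The names `all_periods`, `available` and `slot` and the random data are assumptions made for illustration:

    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
    import matplotlib.pyplot as plt
    import numpy as np

    # every sample period that could appear on any graph (shortened list)
    all_periods = [0.25, 0.5, 1, 2, 5, 10, 20, 30, 60]
    # map each period to a fixed, evenly spaced slot on the x axis
    slot = {p: k for k, p in enumerate(all_periods)}

    rng = np.random.default_rng(0)
    # periods for which this particular date has data (made up)
    available = [0.25, 1, 2, 10, 60]
    data = [rng.normal(10000, 2000, 50) for _ in available]

    # each box goes to the slot of its period, leaving gaps for missing ones
    positions = [slot[p] for p in available]

    fig, ax = plt.subplots(figsize=(6, 3))
    ax.boxplot(data, positions=positions)
    # show a tick for every possible period, with or without data
    ax.set_xticks(range(len(all_periods)))
    ax.set_xticklabels([str(p) for p in all_periods], rotation=90)
    # same extent on every graph
    ax.set_xlim(-0.5, len(all_periods) - 0.5)
    plt.close(fig)

With this layout the x axis is categorical rather than to scale, which may or may not be what is wanted.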

Thank you everyone very much for the help. Using your answers, I got it working with the following code. (I realize it can probably be improved, but I'm happy that it works and I can look at the data now :) )

valuesShouldPlot = ['0.25','0.5','1.0','2.0','5.0','10.0','20.0','30.0','60.0','120.0','300.0','600.0','1200.0','1800.0','2400.0','3000.0','3600.0','7200.0','10800.0','14400.0','18000.0','21600.0','25200.0','28800.0']       


for xDate, group in myDataframe.groupby("Date"):            ## for each date

    graphFilename = (basename+'_' + str(xDate) + '.png')    ## make up a suitable filename for the graph

    plt.figure(graphFilename)

    group.boxplot(by=["SamplePeriod_seconds"], sym='g+', return_type='both')  ## create box plot, (boxplots are placed in default positions)

    ## get information on where the boxplots were placed by looking at the values on the x-axis                                                    
    axes = plt.gca()  
    checkXticks= axes.get_xticks()
    numOfValuesPlotted =len(checkXticks)            ## check how many boxplots were actually plotted by counting the labels printed on the x-axis
    lengthValuesShouldPlot = len(valuesShouldPlot)  ## (check how many boxplots should have been created if no data was missing)



    if (numOfValuesPlotted < lengthValuesShouldPlot): ## if number of values actually plotted is less than the maximum possible it means some values are missing
                                                ## if that occurs then want to move the plots across accordingly to leave gaps where the missing values should go


        labels = [item.get_text() for item in axes.get_xticklabels()]

        i=0                 ## counter to increment through the entire list of x values that should exist if no data was missing.
        j=0                 ## counter to increment through the list of x labels that were originally plotted (some labels may be missing, want to check what's missing)

        positionOfBoxesList =[] ## create a list which will eventually contain the positions on the x-axis where boxplots should be drawn  

        while ( j < numOfValuesPlotted): ## look at each value in turn in the list of x-axis labels (on the graph plotted earlier)

            if (labels[j] == valuesShouldPlot[i]):  ## if the value on the x axis matches the value in the list of 'valuesShouldPlot' 
                positionOfBoxesList.append(i)       ## then record that position as a suitable position to put a boxplot
                j = j+1
                i = i+1


            else :                                  ## if they don't match (there must be a value missing) skip the value and look at the next one             

                print("\n******** missing value ************")
                print("Date:", xDate, ", Position:", i, ":", valuesShouldPlot[i])
                i=i+1               


        plt.close()     ## close the original plot (the one that didn't leave gaps for missing data)
        group.boxplot(by=["SamplePeriod_seconds"], sym='g+', return_type='both', positions=positionOfBoxesList) ## replot with boxes in correct positions

    ## format graph to make it look better        
    plt.ylabel('Average distance (m)', fontsize =8)
    plt.xlabel('GPS sample interval (s)', fontsize=8)
    plt.tick_params(axis='x', which='major', labelsize=8)
    plt.tick_params(axis='y', which='major', labelsize=8)
    plt.xticks(rotation=90)   
    plt.title(str(xDate) + ' - ' + 'Average distance travelled by cattle over 24 hour period', fontsize=9) ## put the title above the first subplot (ie. at the top of the page)
    plt.suptitle('')
    axes = plt.gca() 
    axes.set_ylim([0,30000])

    ## save and close 
    plt.savefig(graphFilename)  
    plt.close()         
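The label-matching loop above could probably be condensed into a dictionary lookup. The helper below is a hypothetical sketch (the name `box_positions` is not part of matplotlib or pandas). It assumes the tick labels are formatted exactly like the entries of valuesShouldPlot and raises a KeyError otherwise:

    def box_positions(plotted_labels, expected_labels):
        """Return the slot of each plotted label and the labels that are missing."""
        # slot of every label that should exist when no data is missing
        index_of = {label: k for k, label in enumerate(expected_labels)}
        # slot for each label that was actually plotted
        positions = [index_of[label] for label in plotted_labels]
        plotted = set(plotted_labels)
        missing = [label for label in expected_labels if label not in plotted]
        return positions, missing

    # shortened, made-up example
    expected = ['0.25', '0.5', '1.0', '2.0', '5.0']
    plotted = ['0.25', '1.0', '5.0']
    positions, missing = box_positions(plotted, expected)
    print(positions)
    print(missing)

The returned positions could then be passed to boxplot through the positions argument, in the same way as positionOfBoxesList.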
