
Adjusting a horizontal bar chart in matplotlib to accommodate the bars

I am making a horizontal bar chart but am struggling to adjust ylim, or perhaps another parameter, so that my labels are clearer and all of them fit on the y axis. I played around with ylim, and I can make the text size bigger or smaller, but the bars do not fit the y axis. Any idea about the right approach?

My code:

import matplotlib.pyplot as plt  # pyplot provides the plotting functions
import numpy as np  # needed for np.arange below
from operator import itemgetter
D=[]        
for att, befor, after in zip(df_portion['attributes'], df_portion['2005_2011 (%)'], df_portion['2012_2015 (%)']):
    i=(att, befor, after)
    D.append(i)
Dsort = sorted(D, key=itemgetter(1), reverse=False) #sort the list in order of usage
attri = [x[0] for x in Dsort]
bef  = [x[1] for x in Dsort]  # x[1] holds the 2005_2011 values
aft  = [x[2] for x in Dsort]  # x[2] holds the 2012_2015 values

ind = np.arange(len(attri))
width=3

ax = plt.subplot(111)
ax.barh(ind, aft, width,align='center',alpha=1, color='r', label='from 2012 to 2015') #a horizontal bar chart (use .bar instead of .barh for vertical)
ax.barh(ind - width, bef, width, align='center',  alpha=1, color='b', label='from 2005 to 2008') #a horizontal bar chart (use .bar instead of .barh for vertical)
ax.set(yticks=ind, yticklabels=attri,ylim=[1, len(attri)/2])
plt.xlabel('Frequency distribution (%)')
plt.title('Frequency distribution (%) of common attributes between 2005_2008 and between 2012_2015')
plt.legend()
plt.show()

This is the plot for the above code:

[image: plot produced by the code above]

To make the labels fit, you need to set a smaller fontsize or use a larger figsize. Changing ylim will either show only a subset of the bars (if ylim is set too narrow) or add more whitespace (if ylim is wider than the data).
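As a rough illustration of the figsize advice, the sketch below estimates how tall a figure would need to be for a given number of y tick labels. The helper name `figure_height_for_labels` and its `line_spacing` and `extra_inches` parameters are hypothetical and not part of matplotlib; the only fixed fact used is that one point is 1/72 inch.

```python
def figure_height_for_labels(n_labels, fontsize_pt, line_spacing=1.5, extra_inches=1.0):
    """Estimate a figure height (in inches) that leaves room for every y tick label.

    Hypothetical helper: line_spacing and extra_inches are assumed values
    (room between labels, and room for the title, x label and margins).
    """
    points_per_inch = 72  # one typographic point is 1/72 inch
    label_inches = n_labels * fontsize_pt * line_spacing / points_per_inch
    return label_inches + extra_inches


# 24 labels at fontsize 10
height_in = figure_height_for_labels(24, 10)
print(height_in)  # 6.0
```

The result could then be passed on as `plt.subplots(figsize=(12, height_in))`; doubling the number of labels or the fontsize roughly doubles the height needed.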

The biggest problem in the code is that width is too large. Twice the width needs to fit within a distance of 1.0 (the ticks are placed via ind, which is the array 0, 1, 2, ...). Since matplotlib calls the thickness of a horizontal bar its "height", that name is used in the example code below. Using align='edge' lets you position the bars directly (align='center' would shift them by half their "height").
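The arithmetic behind this can be sketched in plain Python. The helper names `bar_span` and `group_offsets` and the `group_thickness` parameter are made up for illustration; the spans simply follow how `align='center'` and `align='edge'` are described above.

```python
def bar_span(y, height, align="center"):
    """Return the (lower, upper) extent of a horizontal bar on the y axis."""
    if align == "center":
        return (y - height / 2, y + height / 2)
    return (y, y + height)  # align='edge': y is the lower edge


def group_offsets(n_series, group_thickness=0.6):
    """Height of each bar and lower-edge offsets so the group is centered on its tick.

    group_thickness should stay below 1.0, the distance between neighboring ticks.
    """
    height = group_thickness / n_series
    offsets = [k * height - group_thickness / 2 for k in range(n_series)]
    return height, offsets


# With width=3 and align='center', the bar at tick 0 overlaps its neighbors
print(bar_span(0, 3))            # (-1.5, 1.5)

height, offsets = group_offsets(2)
print(height, offsets)           # 0.3 [-0.3, 0.0]
```

With two series this reproduces the positions used in the code below: a height of 0.3 with lower edges at `ind - 0.3` and `ind`, so each pair of bars is centered on its tick.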

Pandas has simple functions to sort dataframes by one or more columns.

Code to illustrate the ideas:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# first create some test data
df = pd.DataFrame({'attributes': ["alpha", "beta", "gamma", "delta", "epsilon", "zeta", "eta", "theta", "iota",
                                  "kappa", "lambda", "mu", "nu", "xi", "omicron", "pi", "rho", "sigma", "tau",
                                  "upsilon", "phi", "chi", "psi", "omega"]})
totals_2005_2011 = np.random.uniform(100, 10000, len(df))
totals_2012_2015 = totals_2005_2011 * np.random.uniform(0.70, 2, len(df))
df['2005_2011 (%)'] = totals_2005_2011 / totals_2005_2011.sum() * 100
df['2012_2015 (%)'] = totals_2012_2015 / totals_2012_2015.sum() * 100

# sort all rows via the '2005_2011 (%)' column, sort from large to small
df = df.sort_values('2005_2011 (%)', ascending=False)

ind = np.arange(len(df))
height = 0.3  # two times height needs to be at most 1

fig, ax = plt.subplots(figsize=(12, 6))
ax.barh(ind, df['2012_2015 (%)'], height, align='edge', alpha=1, color='crimson', label='from 2012 to 2015')
ax.barh(ind - height, df['2005_2011 (%)'], height, align='edge', alpha=1, color='dodgerblue', label='from 2005 to 2011')
ax.set_yticks(ind)
ax.set_yticklabels(df['attributes'], fontsize=10)
ax.grid(axis='x')

ax.set_xlabel('Frequency distribution (%)')
ax.set_title('Frequency distribution (%) of common attributes between 2005_2011 and between 2012_2015')
ax.legend()
ax.margins(y=0.01)  # use smaller margins in the y-direction
plt.tight_layout()
plt.show()

[image: matplotlib bar plot]

The seaborn library has functions to create bar plots with multiple bars per attribute, without the need to manually fiddle with bar positions. Seaborn prefers its data in "long form", which can be created via pandas' melt().
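To make "long form" concrete, here is a small sketch with made-up numbers (the three attributes and their values are purely illustrative). It assumes pandas is installed and reuses the column names from the question.

```python
import pandas as pd

# Made-up example values, only to show the reshaping
df_wide = pd.DataFrame({
    'attributes': ['alpha', 'beta', 'gamma'],
    '2005_2011 (%)': [50.0, 30.0, 20.0],
    '2012_2015 (%)': [40.0, 35.0, 25.0],
})

# One row per (attribute, period) pair instead of one column per period
df_long = df_wide.melt(id_vars='attributes',
                       value_vars=['2005_2011 (%)', '2012_2015 (%)'],
                       var_name='period', value_name='distribution')
print(df_long)
```

The wide frame has 3 rows and one column per period; the long frame has 6 rows with the columns attributes, period and distribution, which is the layout that `hue='period'` relies on.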

Example code:

import seaborn as sns

df = df.sort_values('2005_2011 (%)', ascending=True)
df_long = df.melt(id_vars='attributes', value_vars=['2005_2011 (%)', '2012_2015 (%)'],
                  var_name='period', value_name='distribution')
fig, ax = plt.subplots(figsize=(12, 6))
sns.barplot(data=df_long, y='attributes', x='distribution', hue='period', palette='turbo', ax=ax)
ax.set_xlabel('Frequency distribution (%)')
ax.set_title('Frequency distribution (%) of common attributes between 2005_2011 and between 2012_2015')
ax.grid(axis='x')
ax.tick_params(axis='y', labelsize=12)
sns.despine()
plt.tight_layout()
plt.show()

[image: seaborn horizontal bar plot]
