
How can I plot a bar chart with 5 grouped bars in matplotlib?

I have the following dataframe:

meteo = [["January", 9.2, 13.6, 4.7, 37, 70],
["February",9.9, 14.3, 5.4, 35, 70],
["March", 11.8, 16.1, 7.4, 36, 70],
["April", 13.7, 18.0, 9.4, 40, 69],
["May", 16.9, 21.1, 12.8, 47, 70],
["June", 20.9, 24.9, 16.8, 30, 68],
["July", 23.9, 28.0, 19.8, 21, 67],
["August", 24.4, 28.5, 20.2, 62, 68],
["September", 21.7, 26.0, 17.4, 81, 70],
["October", 17.8, 22.1, 13.5, 91, 73],
["November", 13.0, 17.3, 8.6, 59, 71],
["December", 10.0, 14.3, 5.7, 40, 69]]

import pandas as pd

# Create dataframe with above data

df = pd.DataFrame(meteo)

# Drop useless column

df.drop(0, inplace = True, axis = 1)

# Rename columns

df.rename(columns = {1: "Temp_media_anual_mes", 2: "Temp_máxima_media", 3: "Temp_mínima_media", 4: "Media_lluvias_mensual", 5:"Humedad_media_rel"}, inplace = True)
df["mes"] = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]

Now I would like to plot a grouped bar chart with 5 bars per month. I have tried the following, but I have a problem with the spacing between the bars:

import matplotlib.pyplot as plt

# Setting the positions and width for the bars
pos = list(range(len(df.mes))) 
width = 0.25 
    
# Plotting the bars
fig, ax = plt.subplots(figsize=(16,10))

# Create a bar with pre_score data,
# in position pos,
plt.bar(pos, 
        #using df['pre_score'] data,
        df['Temp_media_anual_mes'], 
        # of width
        width, 
        # with alpha 0.5
        alpha=0.5, 
        # with color
        color='red')
        # with label the first value in first_name
        #label=df['first_name'][0]) 

# Create a bar with mid_score data,
# in position pos + some width buffer,
plt.bar([p + width for p in pos], 
        #using df['mid_score'] data,
        df['Temp_máxima_media'],
        # of width
        width, 
        # with alpha 0.5
        alpha=0.5, 
        # with color
        color='green')
        # with label the second value in first_name
        #label=df['first_name'][1]) 

# Create a bar with post_score data,
# in position pos + some width buffer,
plt.bar([p + width*2 for p in pos], 
        #using df['post_score'] data,
        df['Temp_mínima_media'], 
        # of width
        width, 
        # with alpha 0.5
        alpha=0.5, 
        # with color
        color='blue')
        # with label the third value in first_name
        #label=df['first_name'][2]) 
           
plt.bar([p + width*2 for p in pos], 
        #using df['post_score'] data,
        df['Media_lluvias_mensual'], 
        # of width
        width, 
        # with alpha 0.5
        alpha=0.5, 
        # with color
        color='orange')
        # with label the third value in first_name
        #label=df['first_name'][2]) 
           
plt.bar([p + width*2 for p in pos], 
        #using df['post_score'] data,
        df['Humedad_media_rel'], 
        # of width
        width, 
        # with alpha 0.5
        alpha=0.5, 
        # with color
        color='purple')
        # with label the third value in first_name
        #label=df['first_name'][2]) 

# Set the y axis label
ax.set_ylabel('Amount')

# Set the chart's title
ax.set_title('Rain and temperature')

# Set the position of the x ticks
ax.set_xticks([p + 1.5 * width for p in pos])

# Set the labels for the x ticks
ax.set_xticklabels(df['mes'])

# Setting the x-axis and y-axis limits
plt.xlim(min(pos)-width, max(pos)+width*4)
plt.ylim([0, max(df['Temp_media_anual_mes'] + df['Temp_máxima_media'] + df['Temp_mínima_media'] + df["Media_lluvias_mensual"] + df["Humedad_media_rel"])] )

plt.grid()
plt.show()

This is the plot I'm getting:

[Image: bar chart]

As you can see, it shows 3 separate bars per month, and in the third position there are 3 bars drawn one behind another. I know the issue is in the spacing between the bars, but I don't know how to fix it. Could someone point me in the right direction, please?

EDIT:

I would also like to display the measurement unit of each plotted value above its bar. These are:

  • degrees Celsius for temperatures
  • mm for precipitation amounts
  • % for relative humidity

Thank you very much in advance

Here is some code that places the bars, puts the month name centered, ...

Note that the original calculation for ylim was wrong: it took the maximum of the per-month sum of all five columns, whereas it should be the maximum over all plotted values. I also added text with the units above the bars. I tried to find suitable colors: red and yellow tones for the temperatures, blue for rain, and blue-green for humidity.
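To make the ylim point concrete, here is a small sketch in plain Python using three rows of the question's data (the variable names are only for illustration). The question's expression adds the columns month by month and then takes the maximum, which gives about 217.4 for October, while the tallest bar actually drawn is 91.

```python
# Three rows from the question's data: month, three temperatures, rain, humidity.
rows = [["July", 23.9, 28.0, 19.8, 21, 67],
        ["September", 21.7, 26.0, 17.4, 81, 70],
        ["October", 17.8, 22.1, 13.5, 91, 73]]

# What the question computed: the largest per-month sum of all five columns.
max_of_sums = max(sum(row[1:]) for row in rows)

# What the y-axis needs: the height of the tallest single bar.
tallest_bar = max(max(row[1:]) for row in rows)

print(max_of_sums, tallest_bar)
```

Using the sum as the upper limit leaves more than half of the plot area empty, which is why the bars looked so short.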

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

meteo = [["January", 9.2, 13.6, 4.7, 37, 70],
         ["February", 9.9, 14.3, 5.4, 35, 70],
         ["March", 11.8, 16.1, 7.4, 36, 70],
         ["April", 13.7, 18.0, 9.4, 40, 69],
         ["May", 16.9, 21.1, 12.8, 47, 70],
         ["June", 20.9, 24.9, 16.8, 30, 68],
         ["July", 23.9, 28.0, 19.8, 21, 67],
         ["August", 24.4, 28.5, 20.2, 62, 68],
         ["September", 21.7, 26.0, 17.4, 81, 70],
         ["October", 17.8, 22.1, 13.5, 91, 73],
         ["November", 13.0, 17.3, 8.6, 59, 71],
         ["December", 10.0, 14.3, 5.7, 40, 69]]
df = pd.DataFrame(meteo)
#df.rename(columns = {0:"mes", 1: "Temp. media mes", 2: "Temp. máxima media", 3: "Temp. mínima media", 4: "Media lluvias mensual", 5:"Humedad media rel"}, inplace = True)
df.rename(columns = {0:"month", 1: "Mean monthly temperature", 2: "Max. monthly temperature", 3: "Min. monthly temperature", 4: "Mean monthly rainfall", 5:"Mean relative humidity"}, inplace = True)

# Setting the positions and width for the bars
pos = list(range(len(df)))
num_col = len(df.columns) - 1
width = 0.95 / num_col

fig, ax = plt.subplots(figsize=(16,10))

bar_colors = ['#feb24c', '#f03b20', '#ffeda0', '#43a2ca', '#a8ddb5']
bar_labels = df.columns[1:]

for i, (colname, color, lbl) in enumerate(zip(df.columns[1:], bar_colors, bar_labels)):
    delta_p = 0.125 + width*i
    plt.bar([p + delta_p for p in pos],
            df[colname], width, color=color, label=lbl)
    for j in range(len(df)):
        ax.annotate("°C" if i < 3 else "mm" if i == 3 else "%",
                    xy=(pos[j] + delta_p, df[colname][j] + 1),
                    ha='center')

ax.set_ylabel('Amount')
ax.set_title('Temperatures, Rain and Humidity')
ax.set_xticks(pos)

def update_ticks(x, pos):
    return df['month'][pos]

ax.xaxis.set_major_formatter(ticker.NullFormatter())
ax.xaxis.set_minor_formatter(ticker.FuncFormatter(update_ticks))
ax.xaxis.set_minor_locator(ticker.FixedLocator([p+0.5 for p in pos]))
for tick in ax.xaxis.get_minor_ticks():
    tick.tick1line.set_markersize(0)
    tick.tick2line.set_markersize(0)
    tick.label1.set_horizontalalignment('center')
plt.xlim(min(pos), max(pos)+1)
plt.ylim([0, 10+max([max(df[colname]) for colname in df.columns[1:]])])
plt.legend()
plt.grid()
plt.show()

[Image: resulting plot]
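The overlap in the question comes from the third, fourth and fifth plt.bar calls all using the same offset, width*2, so those three series are drawn in the same place. The loop above avoids this by giving every series its own offset. A minimal sketch of that idea follows; bar_offsets is a hypothetical helper, not part of matplotlib, and it centers the bars slightly differently from the 0.125 + width*i used above.

```python
def bar_offsets(n_bars, group_width=0.95):
    """Return (bar_width, offsets) so that n_bars sit side by side.

    Each offset is the center of one bar, measured from the left edge
    of its group; group_width is the share of one x unit the group fills.
    """
    bar_width = group_width / n_bars
    offsets = [bar_width * (i + 0.5) for i in range(n_bars)]
    return bar_width, offsets


bar_width, offsets = bar_offsets(5)

# Every series gets a distinct offset, so no bar hides another.
print(bar_width, offsets)
```

With these values, series i would be drawn with plt.bar([p + offsets[i] for p in pos], df[colname], bar_width).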

You don't need that many plot calls. You can do it in one go.

>>> import matplotlib.pyplot as plt
>>> ax = df.plot.bar(x='mes', y=[c for c in df.columns if c != 'mes'])
>>> plt.show()

Regarding displaying a value above each bar, you can refer to this post, where I explain how to add text on top of the bars of a histogram. The same approach works for a bar plot.

How can I add the counts to the histogram plot?
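For the units specifically, here is a minimal sketch that assumes matplotlib 3.4 or newer, where Axes.bar_label is available. It uses three months and three of the question's columns for brevity; the series dictionary and its layout are assumptions made for this illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

# Subset of the question's data: column name -> (values, unit).
months = ["January", "February", "March"]
series = {
    "Temp_media_anual_mes": ([9.2, 9.9, 11.8], "°C"),
    "Media_lluvias_mensual": ([37, 35, 36], "mm"),
    "Humedad_media_rel": ([70, 70, 70], "%"),
}

fig, ax = plt.subplots()
width = 0.8 / len(series)
positions = range(len(months))

for i, (name, (values, unit)) in enumerate(series.items()):
    # Shift each series by its own multiple of the bar width.
    bars = ax.bar([p + i * width for p in positions], values, width, label=name)
    # One label string per bar in the container; here it is just the unit.
    ax.bar_label(bars, labels=[unit] * len(values), padding=2)

# Put one tick under the middle of each group of bars.
ax.set_xticks([p + width * (len(series) - 1) / 2 for p in positions])
ax.set_xticklabels(months)
ax.legend()
```

To show the value together with the unit, pass something like labels=[f"{v} {unit}" for v in values] instead.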


 