How to get pandas to plot on the same graph with the same y axis range

I am trying to plot multiple bar charts vertically on top of each other. There should be one labelled x axis (with the days of the week). The code I have so far is:

import pandas as pd
import matplotlib.pyplot as plt
import calendar

df = pd.read_csv("health.csv", header = None, names = ['Physical', 'Emotional'])
# Get Dayofweek index number (start with 6 for sunday) 6,0,1....
df['DayOfTheWeek'] = [(i+6) % 7  for i in range(len(df))]

# Get a map to translate to day of week
d = dict(zip(range(7),list(calendar.day_name)))
df['DayOfTheWeek'] = df['DayOfTheWeek'].map(d)

# Loop through the df (splitting week by week)
for i in range(int(round(len(df)/7))):
    plt.ylim([0,10])
    df.iloc[i*7:(i+1)*7].set_index('DayOfTheWeek').plot(kind='bar')
plt.show()

This has the following problems:

  1. For some reason the first graph produced is blank.
  2. I would like subplots on the same graph separated vertically rather than lots of separate plots.
  3. My dataframe has 39 rows but the method above doesn't plot the last 4 points at all.

The full input data is:

5,5
6,7
6,9
6,7
5,6
7,9
5,9
6,7
7,6
7,4
7,5
6,7
7,9
7,9
5,6
8,7
9,9
7,7
7,6
7,8
7,9
7,9
7,6
7,8
6,6
6,6
6,7
6,6
6,5
6,6
7,5
7,5
7,5
7,6
7,5
8,6
7,6
7,7
6,6

1. For some reason the first graph produced is blank.

When you call plt.ylim(), it will "set the y-limits of the current axes". It does this by calling plt.gca under the hood, which will "Get the current Axes instance (...), or create one". In the first iteration of your loop no Axes exists yet, so it creates a new one. pandas.DataFrame.plot then proceeds to create its own figure, ignoring the existing one. That's how you get an empty first plot.

The fix is simple: swap the order of plt.ylim([0,10]) and the following line, or set the limits directly with .plot(kind='bar', ylim=(0, 10)).
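A minimal sketch of the effect described above, using a small made-up DataFrame (demo and the helper figures_created are illustrative names, not part of the question's code). It counts how many figures pyplot has open after each ordering; the Agg backend is chosen only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so this runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample in the same shape as the question's data
demo = pd.DataFrame({"Physical": [5, 6, 6], "Emotional": [5, 7, 9]})


def figures_created(set_limits_first):
    """Return how many figures exist after plotting demo once."""
    plt.close("all")
    if set_limits_first:
        plt.ylim([0, 10])      # no Axes exists yet, so pyplot creates one
        demo.plot(kind="bar")  # pandas then opens a figure of its own
    else:
        demo.plot(kind="bar", ylim=(0, 10))  # limits set on the plot itself
    return len(plt.get_fignums())


original_order = figures_created(True)
fixed_order = figures_created(False)
print(original_order, fixed_order)
```

The first count includes the empty figure that plt.ylim created, the second does not.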

2. I would like subplots on the same graph separated vertically rather than lots of separate plots

Perhaps plt.subplots() is what you're looking for?

n_weeks = 6  # See pt 3 for an elaboration on this
fig, axs = plt.subplots(n_weeks, 1, figsize=(5, 12), sharex=True)

# Record the names of the first 7 days in the dataset
weekdays = df.head(7)['DayOfTheWeek'].values
for weekno, ax in enumerate(axs):
    week = df.iloc[weekno*7:(weekno+1)*7]
    week = week.set_index('DayOfTheWeek')
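    # The final week is incomplete and will mess up our plot unless
    # we force it to contain all the weekdays. reindex fills the missing
    # days with NaN; .loc raises a KeyError for missing labels in pandas >= 1.0.
    week = week.reindex(weekdays)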
    week.plot(kind='bar', ylim=(0, 10), ax=ax, legend=False)
# Only draw the legend on the final Axes
ax.legend()

# Force tight layout
fig.tight_layout()
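To illustrate what padding the incomplete final week looks like, here is a small sketch using DataFrame.reindex on the last four rows of the question's data. The names partial and padded are illustrative, and the weekday names assume the default English locale.

```python
import calendar
import pandas as pd

# Sunday-first weekday order, matching the question's data
weekdays = [calendar.day_name[(i + 6) % 7] for i in range(7)]

# The last four rows of the question's data: Sunday through Wednesday
partial = pd.DataFrame(
    {"Physical": [8, 7, 7, 6], "Emotional": [6, 6, 7, 6]},
    index=pd.Index(weekdays[:4], name="DayOfTheWeek"),
)

# reindex keeps the existing rows and adds NaN rows for the missing days
padded = partial.reindex(weekdays)
print(padded)
```

The NaN rows are drawn as empty slots, so the bars of the partial week line up with the same weekdays as in the complete weeks.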

3. My dataframe has 39 rows but the method above doesn't plot the last 4 points at all.

Try printing the ranges you select in your loop, and you should be able to spot the error. It is an off-by-one error :-)

Spoiler/solution below!

for i in range(int(round(len(df)/7))):
    print(df.iloc[i*7:(i+1)*7])

shows that you are only selecting complete weeks.

Note: In copying the data from the question, I apparently missed a row! There should be 39. The remarks still stand, though.

Let's inspect what happens! len(df) is 38, len(df) / 7 is 5.43, and round(len(df) / 7) is 5. You are rounding down to the nearest complete week. Had your data contained one more day, it would round up to 6 as you expect. However, that is somewhat brittle behaviour: sometimes it rounds up, sometimes down, but you always want to see the last incomplete week. So rather than doing that, I'll introduce you to two nice features: the // operator, which is floor division (always rounding down), and divmod, a built-in function that does floor division and gives you the remainder at the same time.

My suggested solution uses divmod to count any incomplete weeks:

n_weeks, remaining_days = divmod(len(df), 7)
n_weeks += min(1, remaining_days)

for i in range(n_weeks):
    ...
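To compare the approaches side by side, this short sketch prints the week count produced by round, by //, and by the divmod recipe for a few row counts. The helper name weeks_needed is made up for illustration.

```python
def weeks_needed(n_days):
    """Number of 7-day subplots needed, counting a trailing partial week."""
    full_weeks, remaining_days = divmod(n_days, 7)
    return full_weeks + min(1, remaining_days)


# Columns: rows, round(), floor division, divmod recipe
for n_days in (35, 38, 39, 42):
    print(n_days, int(round(n_days / 7)), n_days // 7, weeks_needed(n_days))
```

Only the divmod version gives 6 for both 38 and 39 rows while still giving 5 for exactly 35.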

You can do this by first setting up your figure layout, then passing an explicit axes object to the pandas plot method. I then show the x axis labels only on the last plot. I also removed the mapping to the names of the days; the labels are now set on the plot directly. The mapping can of course be put back in if it is needed for other reasons.

import pandas as pd
import matplotlib.pyplot as plt
import calendar

df = pd.read_csv("health.csv", header = None, names = ['Physical', 'Emotional'])
# Get Dayofweek index number (start with 6 for sunday) 6,0,1....
df['DayOfTheWeek'] = [(i+6) % 7  for i in range(len(df))]

df_calendar = calendar.Calendar(firstweekday=6)

weeks = (len(df) + 6) // 7  # ceiling division, so a trailing partial week also gets a subplot
fig, axes = plt.subplots(weeks, 1, figsize=(6, weeks*3))

# Loop through the df (splitting week by week)
for i in range(weeks):
    ax = axes[i]
    df.iloc[i*7:(i+1)*7].set_index('DayOfTheWeek').plot(kind='bar', ax=ax)
    ax.set_ylim([0,10])
    ax.set_xlim([-0.5,6.5])
    ax.set_xticks(range(7))

    if i == 0:
        ax.legend().set_visible(True)
    else:
        ax.legend().set_visible(False)

    if i == weeks-1:
        ax.set_xticklabels([calendar.day_name[weekday] for weekday in df_calendar.iterweekdays()])
        ax.set_xlabel("Day of the week")
    else:
        ax.set_xticklabels([])
        ax.set_xlabel("")

plt.savefig("health.png")
plt.show()
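As a possible variation, plt.subplots also accepts sharey=True, in which case the y limits only need to be set on one Axes. The sketch below uses the first 14 rows of the question's data; the name sample and the two-week layout are assumptions made for illustration, and the Agg backend is chosen only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so this runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# First two weeks of the question's data
sample = pd.DataFrame({
    "Physical":  [5, 6, 6, 6, 5, 7, 5, 6, 7, 7, 7, 6, 7, 7],
    "Emotional": [5, 7, 9, 7, 6, 9, 9, 7, 6, 4, 5, 7, 9, 9],
})

fig, axes = plt.subplots(2, 1, sharex=True, sharey=True, figsize=(6, 6))
for i, ax in enumerate(axes):
    week = sample.iloc[i * 7:(i + 1) * 7].reset_index(drop=True)
    week.plot(kind="bar", ax=ax, legend=False)

# With sharey=True, setting the limits once applies to every subplot
axes[0].set_ylim(0, 10)
limits = [ax.get_ylim() for ax in axes]
print(limits)
```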
