How to plot data per hour, grouped by days?

Background: from a large DataFrame I filtered the entries for year=2013, month=June, and the week of the 3rd to the 9th (Monday to Sunday). I then grouped the data by day, hour, and user_type, and pivoted the table to get a DataFrame that looks like:

   Day  Hour  Casual  Registered  Casual_percentage
0  3    0     14      19          42.42
1  3    1     8       8           50.00
2  3    2     1       3           25.00
3  3    3     2       1           66.67
4  3    4     1       3           25.00
5  3    5     1       17          5.56
.  .    .     .       .           .

Each day has 24 hourly rows, so the data for day 4 (Tuesday) starts like:

.  .    .     .       .           .  
21 3    21    32      88          26.67
22 3    22    26      64          28.89
23 3    23    23      30          43.40
24 4    0     10      11          47.62
25 4    1     1       5           16.67
26 4    2     1       1           50.00
.  .    .     .       .           .
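
For reference, the grouping and pivoting step described in the background could look roughly like the sketch below. The raw frame `trips` and its one-row-per-trip layout are assumptions made for illustration; only the column names Day, Hour, Casual, Registered, and Casual_percentage come from the table above.

```python
import pandas as pd

# Hypothetical raw data: one row per trip (layout assumed, values made up).
trips = pd.DataFrame({
    "Day": [3, 3, 3, 3, 4],
    "Hour": [0, 0, 0, 1, 0],
    "user_type": ["Casual", "Registered", "Registered", "Casual", "Registered"],
})

# Count trips per (Day, Hour, user_type), then pivot user_type into columns.
counts = (
    trips.groupby(["Day", "Hour", "user_type"])
    .size()
    .unstack("user_type", fill_value=0)
    .reset_index()
)

# Share of casual trips in each hour, as a percentage.
total = counts["Casual"] + counts["Registered"]
counts["Casual_percentage"] = (100 * counts["Casual"] / total).round(2)
```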

How can I plot the Casual and Registered variables per Hour, for each of the 7 Days? Would I need to create 7 different plots and align them in 1 figure?

Current code. I feel I'm way off. I also tried to create a second x-axis (for Days) using the documentation.

import matplotlib.pyplot as plt

def make_patch_spines_invisible(ax):
    ax.set_frame_on(True)
    ax.patch.set_visible(False)
    for sp in ax.spines.values():
        sp.set_visible(False)

fig, ax1 = plt.subplots(figsize=(10, 5))
ax1.set(xlabel='Hours', ylabel='Total # of trips started')

ax1.plot(data.Hour, data.Casual, color='g')
ax1.plot(data.Hour, data.Registered, color='b')


# This part is trying to create the 2nd x-axis (Days)
ax2 = ax1.twinx()
# offset the bottom spine
ax2.spines['bottom'].set_position(('axes', -.5))
make_patch_spines_invisible(ax2)
# show the bottom spine
ax2.spines['bottom'].set_visible(True)
ax2.set_xlabel("Days")


plt.show()

Output: [image]

End goal

I think this would be easier if you worked with datetime objects rather than separate Day and Hour values.
That way, you can use date tick locators and formatters along with major and minor ticks.

Although you didn't mention it, I assume you can use pandas to handle the dataframes.
I created a new dataframe by copying the data you provided several times and trimming some of it (the exact values are not important).
Here I rebuilt the dates from the information you provided, but I suggest working directly with the original dates (I suppose the original dataframe has some kind of date-like field in it).

import pandas as pd
import matplotlib.pyplot as plt 
import matplotlib.dates as mdates

df = pd.read_csv("mydataframe.csv")
df["timestamp"] = "2013-06-" + df["Day"].astype(str).str.zfill(2) + "-" + df["Hour"].astype(str).str.zfill(2)
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d-%H")


fig, ax1 = plt.subplots(figsize=(10, 5))
ax1.set(xlabel='', ylabel='Total # of trips started')
ax1.plot(df["timestamp"], df.Casual, color='g')
ax1.plot(df["timestamp"], df.Registered, color='b')

ax1.xaxis.set(
    major_locator=mdates.DayLocator(),
    major_formatter=mdates.DateFormatter("\n\n%A"),
    minor_locator=mdates.HourLocator((0, 12)),
    minor_formatter=mdates.DateFormatter("%H"),
)
plt.show()

Output:

[image: formatted dataframe]
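
As an alternative to concatenating strings, pandas can assemble the timestamps from numeric columns. The sketch below is only an illustration: the four rows are copied from the question's table, and the year and month are hard-coded as 2013 and 6 based on the question's description.

```python
import pandas as pd

# A few rows shaped like the question's table.
df = pd.DataFrame({
    "Day": [3, 3, 4, 4],
    "Hour": [0, 1, 0, 1],
    "Casual": [14, 8, 10, 1],
    "Registered": [19, 8, 11, 5],
})

# pd.to_datetime accepts a DataFrame whose columns are named
# year, month, day (and optionally hour, minute, ...).
parts = pd.DataFrame({
    "year": 2013,   # assumed constant, taken from the question
    "month": 6,     # assumed constant, taken from the question
    "day": df["Day"],
    "hour": df["Hour"],
})
df["timestamp"] = pd.to_datetime(parts)
```

The resulting timestamp column can be passed to ax1.plot in the same way as in the code above.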

Assuming your data is ordered by index (e.g., 0 - 23 is day 3, 24 - 47 is day 4, etc.), you can plot the index values rather than the hours in your code:

ax1.plot(data.index.values, data.Casual, color='g')
ax1.plot(data.index.values, data.Registered, color='b')

This will yield a graph similar to what you're looking for as an end product (note I used fake data):

[image]
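
When plotting against the index, the x-axis shows raw row numbers. One possible way to make it readable is to put a tick at the first row of each day and label it with the day number. This is a sketch on made-up data, assuming the rows are sorted by day and then hour with 24 rows per day; the legend and axis labels are additions for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Made-up data: 7 days x 24 hours, sorted by Day then Hour (an assumption).
rng = np.random.default_rng(0)
days = np.repeat(np.arange(3, 10), 24)
hours = np.tile(np.arange(24), 7)
data = pd.DataFrame({
    "Day": days,
    "Hour": hours,
    "Casual": rng.integers(0, 50, size=days.size),
    "Registered": rng.integers(0, 150, size=days.size),
})

fig, ax1 = plt.subplots(figsize=(10, 5))
ax1.plot(data.index.values, data.Casual, color='g', label='Casual')
ax1.plot(data.index.values, data.Registered, color='b', label='Registered')

# One tick at the first row (Hour == 0) of each day, labelled with the day.
day_starts = data.index[data.Hour == 0].to_numpy()
day_labels = data.loc[day_starts, "Day"].astype(str).tolist()
ax1.set_xticks(day_starts)
ax1.set_xticklabels(day_labels)
ax1.set(xlabel='Day of June 2013', ylabel='Total # of trips started')
ax1.legend()
```

Calling plt.show() afterwards displays the figure.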
