
Matplot line graph of pandas dataframe with double y axis scale and datetime on x axis

I have a log which describes my home ADSL speeds. Log entries are in the following format, where the fields are datetime;level;downspeed;upspeed;loss;testhost:

2020-01-06 18:09:45;INFO;211.5;29.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-06 18:14:39;WARNING;209.9;28.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-08 10:51:27;INFO;211.6;29.4;0;host:spd-pub-rm-01-01.fastwebnet.it

(A full sample file to download for the code below: https://www.dropbox.com/s/tfmj9ozxe5millx/test.log?dl=0)

I wish to plot a matplotlib figure with the download speeds on the left axis, the upload speeds (which span a smaller and lower range of values) on the right axis, and shortened datetimes under the x tick marks, possibly at a 45-degree angle.

"""Plots the adsl-log generated log."""
import matplotlib.pyplot as plt
# import matplotlib.dates as mdates
import pandas as pd

# set field delimiter and set column names which will also cause reading from row 1
data = pd.read_csv("test.log", sep=';', names=[
                   'datetime', 'severity', 'down', 'up', 'loss', 'server'])

#  we need to filter out ERROR records (with 0 speeds)
indexNames = data[data['severity'] == 'ERROR'].index
data.drop(indexNames, inplace=True)

# convert datetime pandas object to datetime64
data['datetime'] = pd.to_datetime(data['datetime'])

# use a dataframe with just the data I need; cleaner
speeds_df = data[['datetime', 'down', 'up']]
speeds_df.info() # this shows datetime column is really a datetime64 value now
# now let's plot
fig, ax = plt.subplots()
y1 = speeds_df.plot(ax=ax, x='datetime', y='down', grid=True, label="DL", legend=True, linewidth=2,ylim=(100,225))
y2 = speeds_df.plot(ax=ax, x='datetime', y='up', secondary_y=True, label="UL", legend=True, linewidth=2, ylim=(100,225))

plt.show()

I am now obtaining the plot I need but would appreciate some clarification about the roles of the ax, y1 and y2 axes in the above code.

First, assigning y1 and y2 objects is unnecessary as you will never use them later on. Also, legend=True is the default.

Therefore, you are first initializing the figure and its axes objects with plt.subplots (which defaults to a single axes, nrows=1 and ncols=1), and then passing that axes to the pandas plots. Normally, plotting twice with ax=ax would draw both lines on the same axes and the same y scale, but since you employ a secondary y-axis, the second plot is overlaid on the first with its own scale on the right:

# INITIALIZE FIG DIMENSION AND AXES OBJECTS
fig, axs = plt.subplots(figsize=(8,4))

# ASSIGN AXES OBJECTS ACCORDINGLY
speeds_df.plot(ax=axs, x='datetime', y='down', grid=True, label="DL", linewidth=2, ylim=(100,225))
speeds_df.plot(ax=axs, x='datetime', y='up', secondary_y=True, label="UL", linewidth=2, ylim=(100,225))

plt.show()

[Image: single plot]
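
To make the roles of ax, y1 and y2 concrete, here is a minimal sketch. It reads the three sample log lines from the question out of an in-memory string instead of test.log (an assumption made only so the snippet is self-contained). It suggests that the first call hands back the very axes that was passed in, while the call with secondary_y=True hands back a separate twin axes, which pandas also attaches to the original as ax.right_ax.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# The three sample entries from the question, used in place of test.log
SAMPLE_LOG = """\
2020-01-06 18:09:45;INFO;211.5;29.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-06 18:14:39;WARNING;209.9;28.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-08 10:51:27;INFO;211.6;29.4;0;host:spd-pub-rm-01-01.fastwebnet.it
"""

data = pd.read_csv(io.StringIO(SAMPLE_LOG), sep=';',
                   names=['datetime', 'severity', 'down', 'up', 'loss', 'server'])
data['datetime'] = pd.to_datetime(data['datetime'])
speeds_df = data[['datetime', 'down', 'up']]

fig, ax = plt.subplots()
y1 = speeds_df.plot(ax=ax, x='datetime', y='down', label="DL")
y2 = speeds_df.plot(ax=ax, x='datetime', y='up', secondary_y=True, label="UL")

print(y1 is ax)           # is the primary plot's return value the axes we passed in?
print(y2 is ax)           # is the secondary plot's return value that same axes?
print(ax.right_ax is y2)  # is the secondary axes reachable from the primary one?
```

In other words, y1 is just another name for ax, and y2 is the right-hand axes, which is why keeping the second return value (as ax2, say) is useful when you want to label or rescale the right axis.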


To illustrate how axes objects can be extended, see below with multiple (non-overlaid) plots.

Example of multiple subplots using nrows=2 :

# INITIALIZE FIG DIMENSION AND AXES OBJECTS
fig, axs = plt.subplots(nrows=2, figsize=(8,4))

# ASSIGN AXES OBJECTS WITH INDEXING AND NO Y LIMITS
speeds_df.plot(ax=axs[0], x='datetime', y='down', grid=True, label="DL", linewidth=2)
plt.subplots_adjust(hspace = 1)
speeds_df.plot(ax=axs[1], x='datetime', y='up', label="UL", linewidth=2)

plt.show()

[Image: two-row subplots]


Example of multiple plots using ncols=2 :

# INITIALIZE FIG DIMENSION AND AXES OBJECTS
fig, axs = plt.subplots(ncols=2, figsize=(12,4))

# ASSIGN AXES OBJECTS WITH INDEXING AND NO Y LIMITS
speeds_df.plot(ax=axs[0], x='datetime', y='down', grid=True, label="DL", linewidth=2)
speeds_df.plot(ax=axs[1], x='datetime', y='up', label="UL", linewidth=2)

plt.show()

[Image: two-column subplots]


You can even use subplots=True after setting date/time field as index:

# INITIALIZE FIG DIMENSION AND AXES OBJECTS
fig, axs = plt.subplots(figsize=(8,4))

# ASSIGN AXES OBJECT PLOTTING ALL COLUMNS
speeds_df.set_index('datetime').plot(ax=axs, subplots=True, grid=True, label="DL", linewidth=2)

plt.show()

[Image: pandas subplots output]
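
As a variation on the subplots=True approach, the sketch below lets pandas create the figure and axes itself instead of receiving a single pre-built axes. When one axes is passed together with subplots=True, pandas may warn and clear the figure in order to build one axes per column. The in-memory sample data is an assumption made to keep the snippet self-contained.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# The three sample entries from the question, used in place of test.log
SAMPLE_LOG = """\
2020-01-06 18:09:45;INFO;211.5;29.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-06 18:14:39;WARNING;209.9;28.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-08 10:51:27;INFO;211.6;29.4;0;host:spd-pub-rm-01-01.fastwebnet.it
"""

data = pd.read_csv(io.StringIO(SAMPLE_LOG), sep=';',
                   names=['datetime', 'severity', 'down', 'up', 'loss', 'server'])
data['datetime'] = pd.to_datetime(data['datetime'])
speeds_df = data[['datetime', 'down', 'up']]

# No ax= argument: pandas builds one subplot per column and returns them as an array
axes = speeds_df.set_index('datetime').plot(
    subplots=True, figsize=(8, 4), grid=True, linewidth=2)

for column, axis in zip(['down', 'up'], axes):
    axis.set_ylabel(column)  # each returned axes can be customised individually
```

The returned array holds one axes per plotted column, so it can be indexed in the same way as axs[0] and axs[1] in the nrows=2 example.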

So thanks to @Parfait I hope I understood things correctly. Here is the working code:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
###### Prepare the data to plot
# set field delimiter and set column names which will also cause reading from row 1
data = pd.read_csv('test.log', sep=';', names=[
                   'datetime', 'severity', 'down', 'up', 'loss', 'server'])
#  we need to filter out ERROR records (with 0 speeds)
indexNames = data[data['severity'] == 'ERROR'].index
data.drop(indexNames, inplace=True)
# convert datetime pandas object to datetime64
data['datetime'] = pd.to_datetime(data['datetime'])
# use a dataframe with just the data I need; cleaner
speeds_df = data[['datetime', 'down', 'up']]

# now plot the graph
fig, ax = plt.subplots()

color = 'tab:green'
ax.set_xlabel('thislabeldoesnotworkbutcolordoes', color=color)
ax.tick_params(axis='x', labelcolor=color)

color = 'tab:red'
speeds_df.plot(ax=ax, x='datetime', y='down', label="DL", legend=True, linewidth=2, color=color)
ax.set_ylabel('DL', color=color)
ax.tick_params(axis='y', labelcolor=color)

color = 'tab:blue'
ax2 = speeds_df.plot(ax=ax, x='datetime', y='up', secondary_y=True, label="UL", legend=True, linewidth=2, color=color)
ax2.set_ylabel('UL', color=color)
ax2.tick_params(axis='y', labelcolor=color)
# using ylim in the plot command params does not work the same
# cannot show a grid since the two scales are different
ax.set_ylim(10, 225)
ax2.set_ylim(15, 50)

plt.show()

Which gives: [Image: output of the code above]

What I still don't get is: a) why the x-axis label only seems to honour the color but not the string value, and b) why the ylim=(n,m) parameter in the DataFrame plot call does not work well, so that I have to use ax.set_ylim instead.
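
One way to sidestep both open points is to draw with matplotlib directly, so that labels and limits are set only by your own calls. The sketch below does that, and also covers the shortened datetimes at 45 degrees asked for in the question. The label text 'Time', the date format string, and the in-memory sample data are assumptions chosen for illustration. A plausible explanation for a), not verified against every pandas version, is that DataFrame.plot(x='datetime', ...) sets the x label from the x column name after drawing, which replaces text set beforehand but leaves its color alone. If so, calling ax.set_xlabel after the pandas plot calls should work as well.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# The three sample entries from the question, used in place of test.log
SAMPLE_LOG = """\
2020-01-06 18:09:45;INFO;211.5;29.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-06 18:14:39;WARNING;209.9;28.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-08 10:51:27;INFO;211.6;29.4;0;host:spd-pub-rm-01-01.fastwebnet.it
"""

data = pd.read_csv(io.StringIO(SAMPLE_LOG), sep=';',
                   names=['datetime', 'severity', 'down', 'up', 'loss', 'server'])
data['datetime'] = pd.to_datetime(data['datetime'])
data = data[data['severity'] != 'ERROR']  # drop ERROR records (0 speeds)

fig, ax = plt.subplots(figsize=(8, 4))

# Left axis: download speed
ax.plot(data['datetime'], data['down'], color='tab:red', linewidth=2, label='DL')
ax.set_xlabel('Time')  # hypothetical label text
ax.set_ylabel('DL', color='tab:red')
ax.tick_params(axis='y', labelcolor='tab:red')
ax.set_ylim(10, 225)

# Right axis: upload speed on its own scale, sharing the x axis
ax2 = ax.twinx()
ax2.plot(data['datetime'], data['up'], color='tab:blue', linewidth=2, label='UL')
ax2.set_ylabel('UL', color='tab:blue')
ax2.tick_params(axis='y', labelcolor='tab:blue')
ax2.set_ylim(15, 50)

# Shortened datetimes under the x ticks, rotated by 45 degrees
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m %H:%M'))
ax.tick_params(axis='x', labelrotation=45)

fig.tight_layout()
```

Replacing the in-memory string with 'test.log' in read_csv and finishing with plt.show() (on an interactive backend) would apply the same layout to the full log.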
