Plotting pandas DataFrame with matplotlib
Here is a sample of the code I am using, which works perfectly well.
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
# Data
df=pd.DataFrame({'x': np.arange(10), 'y1': np.random.randn(10), 'y2': np.random.randn(10)+
range(1,11), 'y3': np.random.randn(10)+range(11,21) })
print(df)
# multiple line plot
plt.plot( 'x', 'y1', data=df, marker='o', markerfacecolor='blue', markersize=12, color='skyblue', linewidth=4)
plt.plot( 'x', 'y2', data=df, marker='', color='olive', linewidth=2)
plt.plot( 'x', 'y3', data=df, marker='', color='olive', linewidth=2, linestyle='dashed', label="y3")
plt.legend()
plt.show()
The values in column 'x' actually refer to a 10-hour period of the day, with 0 standing for 6 AM, 1 for 7 AM, and so on. Is there any way I could replace those x-axis values in my figure with the time periods, e.g. replace the 0 with 6 AM?
It's always a good idea to store time or datetime information as a pandas datetime datatype.
In your example, if you only want to keep the time information:
df['time'] = (df.x + 6) * pd.Timedelta(1, unit='h')
Output
   x        y1        y2         y3     time
0  0 -0.523190  1.681115  11.194223 06:00:00
1  1 -1.050002  1.727412  13.360231 07:00:00
2  2  0.284060  4.909793  11.377206 08:00:00
3  3  0.960851  2.702884  14.054678 09:00:00
4  4 -0.392999  5.507870  15.594092 10:00:00
5  5 -0.999188  5.581492  15.942648 11:00:00
6  6 -0.555095  6.139786  17.808850 12:00:00
7  7 -0.074643  7.963490  18.486967 13:00:00
8  8  0.445099  7.301115  19.005115 14:00:00
9  9 -0.214138  9.194626  20.432349 15:00:00
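As a side note, the same column can probably be built with pd.to_timedelta, which some readers find easier to read than multiplying by a Timedelta. The sketch below is only an illustration under the assumption that 'x' counts whole hours after midnight minus 6; the variable name `same` is made up for the comparison.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's DataFrame: only the 'x' column is needed here.
df = pd.DataFrame({'x': np.arange(10)})

# Offset 0 means 6 AM, so add 6 and interpret the result as hours.
df['time'] = pd.to_timedelta(df['x'] + 6, unit='h')

# The construction used in the answer, for comparison.
same = (df['x'] + 6) * pd.Timedelta(1, unit='h')

# Both approaches should agree element by element.
print((df['time'] == same).all())
```

Either form should work; which one to use is mostly a matter of taste.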
If you have a starting date:
start_date='2018-07-29' # change this date appropriately
df['datetime'] = pd.to_datetime(start_date) + (df.x + 6) * pd.Timedelta(1, unit='h')
Output
   x        y1        y2         y3     time            datetime
0  0 -0.523190  1.681115  11.194223 06:00:00 2018-07-29 06:00:00
1  1 -1.050002  1.727412  13.360231 07:00:00 2018-07-29 07:00:00
2  2  0.284060  4.909793  11.377206 08:00:00 2018-07-29 08:00:00
3  3  0.960851  2.702884  14.054678 09:00:00 2018-07-29 09:00:00
4  4 -0.392999  5.507870  15.594092 10:00:00 2018-07-29 10:00:00
5  5 -0.999188  5.581492  15.942648 11:00:00 2018-07-29 11:00:00
6  6 -0.555095  6.139786  17.808850 12:00:00 2018-07-29 12:00:00
7  7 -0.074643  7.963490  18.486967 13:00:00 2018-07-29 13:00:00
8  8  0.445099  7.301115  19.005115 14:00:00 2018-07-29 14:00:00
9  9 -0.214138  9.194626  20.432349 15:00:00 2018-07-29 15:00:00
Now the time and datetime columns have special datatypes:
print(df.dtypes)
Out[5]:
x                     int32
y1                  float64
y2                  float64
y3                  float64
time        timedelta64[ns]
datetime     datetime64[ns]
dtype: object
These datatypes have a lot of nice properties, including automatic string formatting, which you will find very useful in later parts of your project.
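To give a rough idea of what those properties look like, here is a minimal sketch using the .dt accessor on a tiny made-up Series. The names `hours`, `stamps` and `elapsed` are only for illustration and do not appear in the original code.

```python
import pandas as pd

# Three hour offsets, standing in for the first rows of column 'x'.
hours = pd.Series([0, 1, 2])

# Timedelta and datetime versions, built the same way as in the answer.
elapsed = pd.to_timedelta(hours + 6, unit='h')
stamps = pd.to_datetime('2018-07-29') + elapsed

# Pull out the hour of day as plain integers.
print(stamps.dt.hour.tolist())               # [6, 7, 8]

# Format each timestamp as a string.
print(stamps.dt.strftime('%H:%M').tolist())  # ['06:00', '07:00', '08:00']

# Convert the timedeltas to a plain number of hours.
print((elapsed.dt.total_seconds() / 3600).tolist())  # [6.0, 7.0, 8.0]
```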
Finally, to plot using matplotlib:
# multiple line plot
plt.plot( df.datetime.dt.hour, df['y1'], marker='o', markerfacecolor='blue', markersize=12, color='skyblue', linewidth=4, label="y1")
plt.plot( df.datetime.dt.hour, df['y2'], marker='', color='olive', linewidth=2, label="y2")
plt.plot( df.datetime.dt.hour, df['y3'], marker='', color='olive', linewidth=2, linestyle='dashed', label="y3")
plt.legend()
plt.show()
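Plotting against the hour puts 6, 7, ... 15 on the x-axis. If you want the ticks to read literally "6 AM", "7 AM" and so on, one possible approach is to set the tick labels yourself. The sketch below assumes the same 'datetime' column as above; `hour_label` is a hypothetical helper written for this example, not a pandas or matplotlib function.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt


def hour_label(hour):
    """Turn an hour of day (0-23) into a 12-hour label such as '6 AM'."""
    suffix = 'AM' if hour < 12 else 'PM'
    return f"{(hour - 1) % 12 + 1} {suffix}"


# Same layout as in the question, reduced to one data column.
df = pd.DataFrame({'x': np.arange(10), 'y1': np.random.randn(10)})
start_date = '2018-07-29'  # assumed date, as in the answer
df['datetime'] = pd.to_datetime(start_date) + (df.x + 6) * pd.Timedelta(1, unit='h')

hours = df['datetime'].dt.hour
labels = [hour_label(h) for h in hours]

fig, ax = plt.subplots()
ax.plot(hours, df['y1'], marker='o', color='skyblue', label='y1')

# Put one tick at every hour and replace the number with the text label.
ax.set_xticks(hours)
ax.set_xticklabels(labels)
ax.legend()
```

Call plt.show() afterwards to display the figure. Building the labels by hand avoids any dependence on locale settings for the AM/PM text.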