Error with x-axis labels when plotting multi-index dataframe using Matplotlib
I've got a timeseries dataframe and I've calculated a season column from the datetime column. I've then indexed the dataframe by 'Season' and 'Year' and want to plot the result. Code below:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
dates = pd.date_range('20070101',periods=1000)
df = pd.DataFrame(np.random.randn(1000), columns =list ('A'))
df['date'] = dates
def get_season(row):
    if row['date'].month >= 3 and row['date'].month <= 5:
        return 'spring'
    elif row['date'].month >= 6 and row['date'].month <= 8:
        return 'summer'
    elif row['date'].month >= 9 and row['date'].month <= 11:
        return 'autumn'
    else:
        return 'winter'
df['Season'] = df.apply(get_season, axis=1)
df['Year'] = df['date'].dt.year
df.loc[df['date'].dt.month == 12, 'Year'] += 1
df = df.set_index(['Year', 'Season'], inplace=False)
df.head()
fig,ax = plt.subplots()
df.plot(x_compat=True,ax=ax)
ax.xaxis.set_tick_params(reset=True)
ax.xaxis.set_major_locator(mdates.YearLocator(1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))
plt.show()
Unfortunately, this gives me the following error when plotting the x-axis labels:
  File "C:\Users\myname\AppData\Local\Continuum\Anaconda\lib\site-packages\matplotlib\dates.py", line 225, in _from_ordinalf
    dt = datetime.datetime.fromordinal(ix)
ValueError: ordinal must be >= 1
I want to see only the year as the x-axis label, not the year and the season.
I'm sure it's something simple that I'm doing wrong, but I can't figure out what...
EDIT:
Changing the df.plot call slightly plots the dates a bit better, but it still shows months. I'd prefer to have only the year, although this is slightly better than before.
New code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
dates = pd.date_range('20070101',periods=1000)
df = pd.DataFrame(np.random.randn(1000), columns =list ('A'))
df['date'] = dates
def get_season(row):
    if row['date'].month >= 3 and row['date'].month <= 5:
        return 'spring'
    elif row['date'].month >= 6 and row['date'].month <= 8:
        return 'summer'
    elif row['date'].month >= 9 and row['date'].month <= 11:
        return 'autumn'
    else:
        return 'winter'
df['Season'] = df.apply(get_season, axis=1)
df['Year'] = df['date'].dt.year
df.loc[df['date'].dt.month == 12, 'Year'] += 1
df = df.set_index(['Year', 'Season'], inplace=False)
df.head()
fig,ax = plt.subplots()
df.plot(x='date', y = 'A', x_compat=True,ax=ax)
Unfortunately, the marriage between pandas and the matplotlib time locators/formatters is never a happy one. The most consistent way is to have the datetime data in a numpy array of datetime objects and plot that directly in matplotlib. pandas does provide a nice .to_pydatetime() method:
fig,ax = plt.subplots()
plt.plot(dates.to_pydatetime(), df.A)
years = mdates.YearLocator() # every year
months = mdates.MonthLocator() # every month
yearsFmt = mdates.DateFormatter('%Y')
ax.xaxis.set_major_locator(years)
ax.xaxis.set_major_formatter(yearsFmt)
ax.xaxis.set_minor_locator(months)
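For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It keeps the ('Year', 'Season') MultiIndex from the question but passes the dates to matplotlib explicitly, so the index never reaches the date locator. The month-to-season dictionary is my own shorthand for the get_season function, and the Agg backend is only an assumption so that the script runs without a display; drop that line and call plt.show() for interactive use.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range("20070101", periods=1000)
df = pd.DataFrame({"A": np.random.randn(1000), "date": dates})

# Shorthand for get_season: map each month number to its season name.
season_by_month = {12: "winter", 1: "winter", 2: "winter",
                   3: "spring", 4: "spring", 5: "spring",
                   6: "summer", 7: "summer", 8: "summer",
                   9: "autumn", 10: "autumn", 11: "autumn"}
df["Season"] = df["date"].dt.month.map(season_by_month)
df["Year"] = df["date"].dt.year
df.loc[df["date"].dt.month == 12, "Year"] += 1  # December counts toward next winter
df = df.set_index(["Year", "Season"])

fig, ax = plt.subplots()
# Pass real datetime objects as x instead of relying on the MultiIndex.
ax.plot(dates.to_pydatetime(), df["A"].to_numpy())
ax.xaxis.set_major_locator(mdates.YearLocator())           # one major tick per year
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))   # label with the year only
ax.xaxis.set_minor_locator(mdates.MonthLocator())          # unlabeled monthly ticks

fig.canvas.draw()  # force tick labels to be computed
year_labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(year_labels)
```

The printed labels should be plain four-digit years; which years appear at the edges depends on the axis margins matplotlib picks.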