
Highlight maximum point in Python plot by date

I know this question is very close to many others that have already been answered, but every previous answer gives me the same traceback.

I have a straightforward time series and I'm trying to highlight its maximum point. I'm having trouble manipulating a pandas DataFrame to obtain the maximum y value for plotting on a graph. I think I'm nearly there, but I suspect the parse_dates parameter of pd.read_csv is interfering with my indexing.

When I import the dataset, I have a datetime column and a wind_speed column. When I resample for the daily average, the title of the variable column disappears and the datetime column can no longer be accessed by name.

Before taking the daily average:

In[12]: weather.head()
Out[12]:
                     wind_speed
d_stamp_t_stamp
2017-07-26 00:05:09        1.31
2017-07-26 00:35:13        1.62
2017-07-26 01:05:05        1.50
...

After taking the daily average:

wind_avg = weather.wind_speed.resample('D').mean()

d_stamp_t_stamp
2017-09-01    3.870625
2017-09-02    4.386875
2017-09-03    5.426739
2017-09-04    2.718750
2017-09-05    3.407708

The label for the wind_speed column goes away, and I can't seem to access that data anymore.

This is the code for the time series I have so far:

import pandas as pd

## Import weather data.
weather = pd.read_csv('/Users/regina/university_projects/Themo_Data/Weather0717-0618.csv',
                      parse_dates=[[0,1]], index_col=0)
wind_avg = weather.wind_speed.resample('D').mean()

## Wind Speed graph
windplot = wind_avg.plot(title="Wind Speed", figsize=(12,8),
                         fontsize=12, marker='o', markersize=7)
windplot.set_xlabel("Date")
windplot.set_ylabel("Wind Speed in m/s")

This gives me a graph with the average wind speed on the y axis.

The problem comes when I try to annotate the maximum wind speed.

y0 = max(wind_avg.wind_speed)
xpos = wind_avg.wind_speed.index(y0)
x0 = (wind_avg.d_stamp_t_stamp[xpos])

windplot.annotate(
    "Max Speed", xy=(x0,y0), ha='right',
    va='bottom', textcoords='offset points',
    bbox=dict(BoxStyle='Round, pad=0.5', fc='yellow', alpha=0.5),
    arrowprops=dict(facecolor='black', shrink=0.05))

I get an AttributeError like this:

Traceback (most recent call last):

  File "<ipython-input-15-5e45876c5ebc>", line 5, in <module>
    y0 = max(wind_avg.wind_speed)

  File "/Users/regina/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 4372, in __getattr__
    return object.__getattribute__(self, name)

AttributeError: 'Series' object has no attribute 'wind_speed'

Is there something about the way I'm resampling the wind_speed column that removes its label? Thank you all so much!

In the line

wind_avg = weather.wind_speed.resample('D').mean()

you apply resample to a single pandas Series, namely the wind_speed column of your DataFrame, so you get a Series as the return value:

type(wind_avg)
Out: pandas.core.series.Series
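The column label is not lost in that case: it becomes the name of the Series, and the dates become its index. A minimal sketch with made-up data (the readings and the small weather frame below are invented for illustration and are not the asker's CSV):

```python
import pandas as pd

# Hypothetical half-hourly readings standing in for the real CSV (assumption).
idx = pd.date_range("2017-07-26", periods=96, freq="30min")
idx.name = "d_stamp_t_stamp"
weather = pd.DataFrame(
    {"wind_speed": [float(i % 7) for i in range(96)]}, index=idx
)

# Selecting one column first means resample() works on a Series.
wind_avg = weather.wind_speed.resample("D").mean()

print(type(wind_avg).__name__)  # class name of the result
print(wind_avg.name)            # the former column label lives here
print(wind_avg.index.name)      # the datetime label lives on the index
```

So on a Series the values are reached directly (wind_avg.max()) and the dates through wind_avg.index, rather than through column attributes.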

Try

weather_avg = weather.resample('D').mean()
type(weather_avg)
Out: pandas.core.frame.DataFrame

and you'll get your whole weather dataset resampled by day, as a DataFrame that still has a wind_speed column.
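From there, one way to locate the maximum and its date is max() together with idxmax(), which returns the index label of the largest value. The sketch below assumes invented daily means in place of the real resampled data, and the text offset and styling values are arbitrary choices:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical daily means standing in for weather.resample('D').mean()
idx = pd.date_range("2017-09-01", periods=5, freq="D")
weather_avg = pd.DataFrame(
    {"wind_speed": [3.87, 4.39, 5.43, 2.72, 3.41]}, index=idx
)

y0 = weather_avg.wind_speed.max()     # largest daily mean
x0 = weather_avg.wind_speed.idxmax()  # date (index label) of that maximum

windplot = weather_avg.wind_speed.plot(title="Wind Speed", marker="o")
windplot.annotate(
    "Max Speed",
    xy=(x0, y0),                 # point being annotated
    xytext=(-20, 20),            # label offset in points (arbitrary)
    textcoords="offset points",
    ha="right",
    va="bottom",
    bbox=dict(boxstyle="round,pad=0.5", fc="yellow", alpha=0.5),
    arrowprops=dict(facecolor="black", shrink=0.05),
)
```

The same two calls work on the Series from the question (wind_avg.max() and wind_avg.idxmax()), so switching to a DataFrame is optional.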
