
How to group a time series dataframe by day of the month and plot the x axis as the day of the month?

I'm analyzing the NOAA Global Historical Climatology Network Daily dataset, which is stored in BigQuery. I want to see whether max temperatures on the same day of the year have changed from year to year, as a way to understand climate change (i.e. 'can we see a subtle rise in temps from August 25th 1970 vs. August 25th 1980' and so on).

I'm able to get the data pulled fine using the BigQuery Colab Client.

import pandas as pd

frames = []

for year in range(1991, 2010):
    # One table per year; value is converted from tenths of a degree Celsius to Fahrenheit
    sql = f"SELECT date, element, (value/10 * 1.8) + 32 AS temp_f, EXTRACT(YEAR FROM date) AS yearstring FROM `bigquery-public-data.ghcn_d.ghcnd_{year}` WHERE id = 'USC00040693' AND DATE(date) BETWEEN DATE('{year}-08-26') AND DATE('{year}-09-03') AND element = 'TMAX' ORDER BY date ASC"
    frames.append(client.query(sql).to_dataframe())

# DataFrame.append was removed in pandas 2.0, so concatenate the yearly frames once
dfall = pd.concat(frames, ignore_index=True)

This creates a dataframe that looks like so:

(Image: the resulting dataframe)

I tried plotting it like so:

dfall.set_index('date').plot()

(Image: the resulting chart)

This plots the data along a continuous year-by-year timeline, even though I'm only focused on a specific stretch of 15-20 days. I'd like to show only those specific days: for example, the 1st of September (with all of the bars for that day across many years), then the 2nd, and so on.

How do I group on a day of the year or a specific month?

If you need to filter between September 1 and September 20 for all years, use:

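# Make sure the date column is a real datetime so the index exposes .month and .day
dfall['date'] = pd.to_datetime(dfall['date'])
s = dfall.set_index('date')['temp_f']
# After set_index the dates live in the index, so filter on s.index rather than s.dt
s = s[(s.index.month == 9) & (s.index.day >= 1) & (s.index.day <= 20)]

s.plot()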
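
To put the calendar day on the x axis with one bar per year, one possible approach is to pivot the data so that each row is a month-day and each column is a year. The sketch below assumes `dfall` has the `date` and `temp_f` columns from the question; the helper name `pivot_by_day` and the small sample frame are hypothetical and only stand in for the real BigQuery result.

```python
import pandas as pd


def pivot_by_day(df):
    """Return a table with one row per calendar day (MM-DD) and one column per year."""
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"])
    # Zero-padded month-day labels sort correctly as strings within one calendar year
    out["month_day"] = out["date"].dt.strftime("%m-%d")
    out["year"] = out["date"].dt.year
    # The mean only matters if a day has more than one reading for the same year
    return out.pivot_table(
        index="month_day", columns="year", values="temp_f", aggfunc="mean"
    )


# Hypothetical sample data standing in for dfall
sample = pd.DataFrame(
    {
        "date": ["1991-08-31", "1991-09-01", "1992-08-31", "1992-09-01"],
        "temp_f": [88.0, 90.0, 91.0, 93.0],
    }
)
table = pivot_by_day(sample)
```

With matplotlib installed, `table.plot(kind="bar")` should then draw one group of bars per day, with one bar per year inside each group. On the real data that would be `pivot_by_day(dfall).plot(kind="bar")`.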
