I am relatively new to Python. I don't think this is a duplicate because I didn't find the answer I was looking for.
I have the following dataframe, consisting of 'Date' in datetime64 format and the average daily temperature in Celsius as float64. I have 29 years (1990 to 2018) of daily recordings, and I need to find the highest temperature for each of those years.
Date Average Daily Value
0 1990-01-01 8.88330
1 1990-01-02 9.11045
2 1990-01-03 10.93545
3 1990-01-04 3.69165
4 1990-01-05 6.03955
... ... ...
10567 2018-12-27 6.20830
10568 2018-12-28 7.05420
10569 2018-12-29 2.68330
10570 2018-12-30 14.49580
10571 2018-12-31 4.74170
years = set(df['Date'].dt.year.to_list()); years = sorted(years)
years = [1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008,
2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018]
I have managed to make a list of the years, and I am hoping to use it to iterate through the data, but I am not sure how. I tried using a for loop, but it just returns the highest value for the whole data set, not for each year.
Any help would be great. Thanks.
You need to first group by year and then fetch the maximum:
Example:
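import pandas as pd

# Parse the 'date' column as datetime while reading the CSV
df = pd.read_csv('test.csv', converters={'date': pd.to_datetime})
# Extract the year so it can be used as the grouping key
df['years'] = df['date'].dt.year
grouped_df = df.groupby('years')
# Select 'temp' explicitly; max() does not take a column name as an argument
max_temp = grouped_df[['temp']].max()
max_temp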
Output with my test set:
temp
years
2018 14
2019 12
2020 11
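Applied to the column names from the question ('Date' and 'Average Daily Value'), the same idea might look like the sketch below. The four-row dataframe is made-up stand-in data, and the names yearly_max, hottest_rows and loop_max are only illustrative. The loop at the end is roughly what the for-loop attempt from the question would need: filter to one year first, then take the maximum.

```python
import pandas as pd

# Small synthetic frame standing in for the asker's data (values are made up).
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["1990-01-01", "1990-07-15", "1991-02-03", "1991-08-20"]
    ),
    "Average Daily Value": [8.8833, 21.5, 3.2, 19.75],
})

# Group directly on the year extracted from 'Date'; no helper column is needed.
yearly_max = df.groupby(df["Date"].dt.year)["Average Daily Value"].max()
yearly_max.index.name = "Year"

# Optional: recover the full row (including the date) of each yearly maximum.
hottest_rows = df.loc[
    df.groupby(df["Date"].dt.year)["Average Daily Value"].idxmax()
]

# Loop-based equivalent: restrict to one year at a time before calling max().
loop_max = {
    int(year): df.loc[df["Date"].dt.year == year, "Average Daily Value"].max()
    for year in sorted(df["Date"].dt.year.unique())
}
```

The groupby version is usually preferable on roughly 10,000 rows because it makes a single pass over the data, whereas the loop rescans the whole dataframe once per year.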