Calculate day of year in dataframe from a datetime column to another column in Python
I have a dataframe with a datetime column in it, holding values like 2014-01-01, 2016-06-05, etc. Now I want to add a column to the dataframe containing the day of year (for that given year).
I did find some hints on this forum, but I'm struggling with the types and the dataframe handling. So this works fine:
from datetime import datetime
day_to_calc = datetime.today()
day_of_year = day_to_calc.timetuple().tm_yday
day_of_year
But my day_to_calc is not today, but df['Date']. However, if I try this
df['DOY'] = df['Date'].timetuple().tm_yday
I get
AttributeError: 'Series' object has no attribute 'timetuple'
OK, so I guess I need a map function perhaps? So I'm trying something like..
df['DOY'] = map (datetime.timetuple().tm_yday,df['Date'])
And surely you guys see how stupid that is ;-) (but I'm still learning Python)
TypeError: descriptor 'timetuple' of 'datetime.datetime' object needs an argument
So that sort of makes sense, because I need to pass the date as a parameter, so trying
df['DOY'] = datetime.timetuple(df['Date']).tm_yday
TypeError: descriptor 'timetuple' requires a 'datetime.datetime' object but received a 'Series'
There must be a simple way, but I just can't figure out the syntax :-(
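Both TypeErrors and the AttributeError come from the same cause: timetuple() is a method of a single datetime value, while df['Date'] is a Series holding many values. Before computing anything, it can help to check what the column actually contains. The following is a minimal sketch under the assumption that the dates arrive as plain strings; the sample frame is made up for illustration.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical sample data: dates stored as plain strings
df = pd.DataFrame({'Date': ['2014-01-01', '2016-06-05']})

# A string column is not a datetime column yet
is_dt_before = is_datetime64_any_dtype(df['Date'])

# Convert the strings to real datetimes with an explicit format
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
is_dt_after = is_datetime64_any_dtype(df['Date'])

# Each element of the converted column is a pandas Timestamp
first_value_type = type(df['Date'].iloc[0]).__name__

print(is_dt_before, is_dt_after, first_value_type)
```

Once the column has a datetime dtype, date-related operations can be applied to the whole column or to each element in turn.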
Use the dt.dayofyear property of the datetime column:
import pandas as pd
# first convert date string to datetime with a proper format string
df = pd.DataFrame({'Date':pd.to_datetime(['2014-01-01', '2016-06-05'], format='%Y-%m-%d')})
# calculate day of year
df['DOY'] = df['Date'].dt.dayofyear
print(df)
Output:
        Date  DOY
0 2014-01-01    1
1 2016-06-05  157
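For comparison, the element-wise idea attempted in the question can also be made to work by passing map a function that receives one date at a time. This is only a sketch with made-up sample dates (the column names DOY_map and DOY_dt are chosen for illustration); the vectorized dt.dayofyear version is usually the simpler choice.

```python
import pandas as pd

# Hypothetical sample data, including the last day of a leap year
df = pd.DataFrame({'Date': pd.to_datetime(['2014-01-01', '2016-06-05', '2016-12-31'])})

# Element-wise: call timetuple() on each individual Timestamp
df['DOY_map'] = df['Date'].map(lambda d: d.timetuple().tm_yday)

# Vectorized: the .dt accessor works on the whole column at once
df['DOY_dt'] = df['Date'].dt.dayofyear

print(df)
```

Both columns hold the same values, and 2016-12-31 comes out as 366 because 2016 is a leap year.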
I noticed the above answer does not go into great detail, so I've provided a more explanatory answer below.
Try the following:
import pandas as pd
# Create a pandas datetime range for the year 2022
passed_2022 = pd.date_range('2022-01-01', '2022-12-31')
# Convert the datetime range to a list of strings in the format 'YYYY-MM-DD'
passed_2022_list = [i.strftime('%Y-%m-%d') for i in passed_2022]
# Create a DataFrame
data = pd.DataFrame({'datetime': passed_2022_list})
# Filter the data DataFrame to only include dates in the passed_2022 list
data = data[data['datetime'].isin(passed_2022_list)]
# Count the number of rows in the filtered DataFrame
num_days_passed = len(data)
# Create a new DataFrame with 'datetime' and 'DAYS OF YEAR' columns
result = pd.DataFrame({'datetime': passed_2022_list,
                       'DAYS OF YEAR': range(1, num_days_passed + 1)})
# Print the result of the DataFrame
print(result)
Output:
       datetime  DAYS OF YEAR
0    2022-01-01             1
1    2022-01-02             2
2    2022-01-03             3
3    2022-01-04             4
4    2022-01-05             5
..          ...           ...
360  2022-12-27           361
361  2022-12-28           362
362  2022-12-29           363
363  2022-12-30           364
364  2022-12-31           365
[365 rows x 2 columns]
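Note that numbering the rows with range() matches the day of year only because this frame holds every date of a single year, in order, starting on January 1. For arbitrary dates, a sketch along the following lines derives the number from each date itself; the sample values are made up and deliberately unsorted and from different years.

```python
import pandas as pd

# Hypothetical data: unsorted dates from different years, stored as strings
data = pd.DataFrame({'datetime': ['2022-03-01', '2020-03-01', '2022-12-31']})

# Parse the strings, then read the day of year from each date
parsed = pd.to_datetime(data['datetime'], format='%Y-%m-%d')
data['DAYS OF YEAR'] = parsed.dt.dayofyear

print(data)
```

Here March 1 is day 60 in 2022 but day 61 in the leap year 2020, which a simple row count could not reproduce.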