Pandas extract week of year and year from date

I ran into this scenario and don't know how to solve it. I have a data frame where I am adding "week_of_year" and "year" columns based on the "date" column, and that part works fine.

import pandas as pd
df = pd.DataFrame({'date': ['2018-12-31', '2019-01-01', '2019-12-31', '2020-01-01']})
df['date'] = pd.to_datetime(df['date'])
df['week_of_year'] = df['date'].apply(lambda x: x.weekofyear)
df['year'] = df['date'].apply(lambda x: x.year)
print(df)

Current Output

       date       week_of_year    year
0    2018-12-31      1            2018
1    2019-01-01      1            2019
2    2019-12-31      1            2019
3    2020-01-01      1            2020

Expected Output

What I expect is this: the last dates of 2018 and 2019 fall in the first week of the following year (2019 and 2020 respectively). So I want to add logic to the year column: where the week is 1 but the date belongs to the previous year, the year column should show the new year, as in the expected output.

           date       week_of_year    year
    0    2018-12-31      1            2019
    1    2019-01-01      1            2019
    2    2019-12-31      1            2020
    3    2020-01-01      1            2020

Try:

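df['date'] = pd.to_datetime(df['date'])
# Series.dt.weekofyear was removed in pandas 2.0; isocalendar().week replaces it
df['week_of_year'] = df['date'].dt.isocalendar().week
# Shift each date to the Sunday of its week, then take that date's year
df['year'] = (df['date'] + pd.to_timedelta(6 - df['date'].dt.weekday, unit='D')).dt.year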

Outputs:

        date  week_of_year  year
0 2018-12-31             1  2019
1 2019-01-01             1  2019
2 2019-12-31             1  2020
3 2020-01-01             1  2020

A few things: generally avoid .apply(..).

For datetime columns you can work with the date directly through the df[col].dt accessor.

Then, to get the last day of the week, add 6 - weekday days to the date, where weekday ranges from 0 (Monday) to 6 (Sunday).
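As a rough sketch of the step just described, the week-ending date can be kept in its own column to see where the year comes from. The names days_to_sunday and week_end are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'date': ['2018-12-31', '2019-01-01', '2019-12-31', '2020-01-01']})
df['date'] = pd.to_datetime(df['date'])

# weekday is 0 for Monday through 6 for Sunday, so 6 - weekday is the
# number of days remaining until the Sunday that closes the week.
days_to_sunday = pd.to_timedelta(6 - df['date'].dt.weekday, unit='D')

# 'week_end' is a hypothetical helper column, added only for illustration.
df['week_end'] = df['date'] + days_to_sunday
df['year'] = df['week_end'].dt.year
print(df)
```

Everything here is a vectorized column operation, so no .apply(..) call is needed.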

TLDR CODE

To get the week number as a series:

df['DATE'].dt.isocalendar().week

To store the week in a new column, use the same call and assign the returned series to the column:

df['WEEK'] = df['DATE'].dt.isocalendar().week

TLDR EXPLANATION

Use pd.Series.dt.isocalendar().week to get the week for a given Series object.

Note:

  • column "DATE" must be stored as a datetime column
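Since the question also asks for the year, here is a minimal sketch that takes both values from a single isocalendar() call. It assumes pandas 1.1 or later, where Series.dt.isocalendar() is available. The column names follow the question, and the extra date 2021-01-01 is an added illustration, not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({'date': ['2018-12-31', '2019-01-01', '2019-12-31',
                            '2020-01-01', '2021-01-01']})
# The column must be datetime before the .dt accessor can be used.
df['date'] = pd.to_datetime(df['date'])

# isocalendar() returns a DataFrame with 'year', 'week' and 'day' columns,
# indexed like the original series.
iso = df['date'].dt.isocalendar()
df['week_of_year'] = iso['week']
df['year'] = iso['year']
print(df)
```

For the four dates from the question this yields week 1 with years 2019, 2019, 2020 and 2020. For 2021-01-01 (a Friday) it yields week 53 of ISO year 2020, whereas taking the year of that week's Sunday would give 2021, so the two approaches can disagree near a year boundary.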
