How to create a Pandas DataFrame column based on two other columns that hold dates?

Question

I have a pandas DataFrame with two date columns (A and B), and I would like to create a third column (C) that holds dates built from the year and month of column A and the day of column B. Obviously the day would need to be adjusted for months in which it doesn't exist: if we tried to create 31 Feb 2020, it would need to become 29 Feb 2020.

For example:

import pandas as pd
df = pd.DataFrame({'A': ['2020-02-21', '2020-03-21', '2020-03-21'], 
                   'B': ['2020-01-31', '2020-02-11', '2020-02-01']})
for c in df.columns:
    df[c] = pd.to_datetime(df[c])

Then I want to create a new column C holding a new datetime built from:

year = df.A.dt.year

month = df.A.dt.month

day = df.B.dt.day

I don't know how to create this column. Can you please help?

Answer 1

Here is one way to do it, using pandas' time series functionality:

import pandas as pd

# your example data
df = pd.DataFrame({'A': ['2020-02-21', '2020-03-21', '2020-03-21'], 
                   'B': ['2020-01-31', '2020-02-11', '2020-02-01']})
for c in df.columns:
    # keep using the same dataframe here
    df[c] = pd.to_datetime(df[c])

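# move every date in A back to the end of the previous month,
# then add the day-of-month from B as a number of days
# (pd.to_timedelta is used because the `unit` argument of
# pd.TimedeltaIndex is deprecated in recent pandas versions)
df['C'] = df.A - pd.offsets.MonthEnd() + pd.to_timedelta(df.B.dt.day, unit='D')

# in a Jupyter notebook, display(df) works as well
print(df)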

Result:

             A           B           C
0   2020-02-21  2020-01-31  2020-03-02
1   2020-03-21  2020-02-11  2020-03-11
2   2020-03-21  2020-02-01  2020-03-01

As you can see in row 0, this does not handle the "February 31st" case quite as you suggested, but it is still logical: the surplus days spill over into the following month, giving 2020-03-02 instead of 2020-02-29.
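
If the day should instead be capped at the last day of A's month, as the question asks, one possible sketch is to limit B's day to the `days_in_month` of A before assembling the date from its components. This variant is not part of the answer above, and the column name `C_clamped` is made up for illustration.

```python
import numpy as np
import pandas as pd

# same example data as above
df = pd.DataFrame({'A': ['2020-02-21', '2020-03-21', '2020-03-21'],
                   'B': ['2020-01-31', '2020-02-11', '2020-02-01']})
for c in df.columns:
    df[c] = pd.to_datetime(df[c])

# cap the day taken from B at the number of days in A's month
day = np.minimum(df['B'].dt.day, df['A'].dt.days_in_month)

# assemble the new dates from year/month/day components
# ('C_clamped' is a hypothetical column name, not from the original answer)
df['C_clamped'] = pd.to_datetime(pd.DataFrame({'year': df['A'].dt.year,
                                               'month': df['A'].dt.month,
                                               'day': day}))

print(df)
```

With the example data, row 0 should come out as 2020-02-29 rather than 2020-03-02, while the other two rows stay at 2020-03-11 and 2020-03-01.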

Solution 1 (2020-03-24 11:55:54)
