How to create a pandas DataFrame column based on two other columns that hold dates?

I have a pandas DataFrame with two date columns (A and B), and I would like to create a third column (C) holding dates built from the month and year of column A and the day of column B. Where that day does not exist in the month, the day would need to be adjusted: for example, an attempt to create 31 Feb 2020 should become 29 Feb 2020.

For example:

import pandas as pd
df = pd.DataFrame({'A': ['2020-02-21', '2020-03-21', '2020-03-21'], 
                   'B': ['2020-01-31', '2020-02-11', '2020-02-01']})
for c in df.columns:
    df[c] = pd.to_datetime(df[c])

Then I want to create a new column C containing datetimes built from:

year = df.A.dt.year

month = df.A.dt.month

day = df.B.dt.day

I don't know how to create this column. Can you please help?

Here is one way to do it, using pandas' time series functionality:

import pandas as pd

# your example data
df = pd.DataFrame({'A': ['2020-02-21', '2020-03-21', '2020-03-21'], 
                   'B': ['2020-01-31', '2020-02-11', '2020-02-01']})
for c in df.columns:
    # keep using the same dataframe here
    df[c] = pd.to_datetime(df[c])

# set back every date from A to the end of the previous month,
# then add the number of days from the date in B
df['C'] = df.A - pd.offsets.MonthEnd() + pd.to_timedelta(df.B.dt.day, unit='D')

print(df)

Result:

             A           B           C
0   2020-02-21  2020-01-31  2020-03-02
1   2020-03-21  2020-02-11  2020-03-11
2   2020-03-21  2020-02-01  2020-03-01

As you can see in row 0, this does not clamp "February 31st" to February 29th as you suggested. Instead, the extra days roll over into the following month, giving 2020-03-02, which is still a logical result.
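If the day should instead be capped at the last day of A's month, as the question asks, one possible sketch is to limit B's day to `df.A.dt.days_in_month` and then assemble the date from its year, month and day parts with `pd.to_datetime`. The column name `C_clamped` is only an illustrative choice, not something from the question.

```python
import pandas as pd

# same example data as above
df = pd.DataFrame({'A': ['2020-02-21', '2020-03-21', '2020-03-21'],
                   'B': ['2020-01-31', '2020-02-11', '2020-02-01']})
for c in df.columns:
    df[c] = pd.to_datetime(df[c])

# cap the day taken from B at the number of days in A's month,
# so "31" becomes 29 for February 2020
day = df.B.dt.day.clip(upper=df.A.dt.days_in_month)

# pd.to_datetime can assemble dates from a frame with
# 'year', 'month' and 'day' columns
parts = pd.DataFrame({'year': df.A.dt.year,
                      'month': df.A.dt.month,
                      'day': day})
df['C_clamped'] = pd.to_datetime(parts)  # hypothetical column name

print(df)
```

With the example data, row 0 becomes 2020-02-29, while rows 1 and 2 stay at 2020-03-11 and 2020-03-01.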
