Date Offset depending on other column in pandas df

Hi, I am new to Python, switching from R, and I am having a hard time with a seemingly simple task: changing a date based on another column of a pandas DataFrame. I have read several other questions on this and hope someone can quickly solve my issue, since I have nobody else to ask but the internet.

I think I have all the ingredients (functions), but I really struggle with pandas DataFrames compared to what I am used to in R.

import datetime
from datetime import datetime
import pandas as pd
import numpy as np

today=pd.to_datetime(datetime.today().strftime('%Y-%m-%d'))

d={"Start_Date":[today,today]}
df=pd.DataFrame(data=d)
n=len(df)
df["Distance"]=np.round(np.random.uniform(low=1, high=14, size=n)).astype(int)

df.loc[:,"FutureDate"]=""
for index, row in df.iterrows():
    print(row["Start_Date"]+pd.DateOffset(row["Distance"]))  
    row["FutureDate"]=row["Start_Date"]+pd.DateOffset(row["Distance"])

Why is my FutureDate column empty if the print statement works? Is there a more elegant solution than using a loop? I am used to data.table, where I would write this in one line.

Try this code. It worked for me:

from datetime import datetime
import pandas as pd
import numpy as np

# Today's date at midnight
today = pd.to_datetime(datetime.today().strftime('%Y-%m-%d'))

d = {"Start_Date": [today, today]}
df = pd.DataFrame(data=d)
n = len(df)
# Random whole number of days between 1 and 14
df["Distance"] = np.round(np.random.uniform(low=1, high=14, size=n)).astype(int)

df.loc[:, "FutureDate"] = ""
for index, row in df.iterrows():
    print(row["Start_Date"] + pd.DateOffset(row["Distance"]))
    # Assign through df.loc so the DataFrame itself is updated, not the row object
    df.loc[index, "FutureDate"] = row["Start_Date"] + pd.DateOffset(row["Distance"])

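In fact, the values need to be assigned on df itself, using the correct index and column. The row object handed out by iterrows() is generally a copy rather than a view, so writing to it has no effect on the DataFrame.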
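The difference between the two loops can be reproduced with fixed inputs. This is a minimal sketch; the dates and distances are made-up values for illustration, not data from the question, and DateOffset is called with an explicit days keyword for readability.

```python
import pandas as pd

# Illustrative fixed data (assumed values, not from the question)
df = pd.DataFrame({
    "Start_Date": pd.to_datetime(["2024-01-01", "2024-01-01"]),
    "Distance": [3, 10],
})
df["FutureDate"] = ""

# Writing to `row` only changes the temporary Series built by iterrows()
for index, row in df.iterrows():
    row["FutureDate"] = row["Start_Date"] + pd.DateOffset(days=int(row["Distance"]))

# True if the DataFrame column still holds only empty strings
unchanged = (df["FutureDate"] == "").all()

# Writing through df.loc targets the DataFrame itself
for index, row in df.iterrows():
    df.loc[index, "FutureDate"] = row["Start_Date"] + pd.DateOffset(days=int(row["Distance"]))

print(unchanged)
print(df)
```

Note that the column filled this way started out as strings, so it keeps an object dtype; pd.to_datetime(df["FutureDate"]) can convert it afterwards if a datetime column is needed.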

EDIT:

A more elegant way:

import datetime
import pandas as pd
import numpy as np

# Today's date at midnight
today = pd.to_datetime(datetime.datetime.today().strftime('%Y-%m-%d'))

d = {"Start_Date": [today, today]}
df = pd.DataFrame(data=d)
n = len(df)
df["Distance"] = np.round(np.random.uniform(low=1, high=14, size=n)).astype(int)

# Vectorized: convert Distance to a timedelta in days and add it to the whole column
df["FutureDate"] = df["Start_Date"] + pd.to_timedelta(df["Distance"], unit="D")
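To see what the one-line version produces, here is a small sketch with fixed inputs (the dates and distances are assumed for illustration only). Because the addition happens on whole columns, the result keeps a proper datetime dtype without any loop.

```python
import pandas as pd

# Illustrative fixed data (assumed values, not from the question)
df = pd.DataFrame({
    "Start_Date": pd.to_datetime(["2024-01-01", "2024-02-27"]),
    "Distance": [3, 5],
})

# Convert the integer column to day-based timedeltas and add element-wise
df["FutureDate"] = df["Start_Date"] + pd.to_timedelta(df["Distance"], unit="D")

print(df)
print(df["FutureDate"].dtype)
```

The second row crosses the end of February in a leap year, which is handled by the datetime arithmetic itself.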

Hope this helps.
