Create weekly time series by interpolating dates between two existing columns
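How can I use pandas to convert the "source DataFrame" into the "target DataFrame"?
In the source DataFrame, DateFrom and DateTo define a date range. I want to split each range into weekly date ranges, as shown in the target DataFrame.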
Source DataFrame:
DateFrom    DateTo      Catalog  Score
2017-05-01  2017-05-21  ABC      20
2017-05-22  2017-06-04  WXY      30
Target DataFrame:
DateFrom    DateTo      Catalog  Score
2017-05-01  2017-05-07  ABC      20
2017-05-08  2017-05-14  ABC      20
2017-05-15  2017-05-21  ABC      20
2017-05-22  2017-05-28  WXY      30
2017-05-29  2017-06-04  WXY      30
Use melt to stack DateFrom and DateTo into a single column, then groupby("Catalog") and resample on DateTo with a forward fill. Finally, rebuild DateFrom using Timedelta.
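import pandas as pd

# Stack DateFrom and DateTo into one "DateTo" column (two rows per Catalog)
melted = pd.melt(df, id_vars=["Catalog", "Score"], var_name="x", value_name="DateTo")
df2 = (
    melted.set_index(pd.to_datetime(melted.DateTo))
    # axis given by keyword; the positional form drop(labels, 1) fails on pandas 2.0+
    .drop(columns=["x", "DateTo"])
    .groupby("Catalog", as_index=False)
    .resample("W")  # weekly bins that end on Sunday
    .ffill()
    .reset_index(level=1)
)
# Each week starts 6 days before its Sunday end date
df2["DateFrom"] = df2.DateTo - pd.Timedelta("6 days")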
Output:
df2[df.columns]
                   Catalog  Score
Catalog date
ABC     2017-05-07     ABC     20
        2017-05-14     ABC     20
        2017-05-21     ABC     20
WXY     2017-05-28     WXY     30
        2017-06-04     WXY     30
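The subtraction of 6 days works because of how pandas anchors weekly frequencies. The following is a small sketch of that behaviour, using the first date range from the question; the variable names are only illustrative.

```python
import pandas as pd

# "W" is shorthand for "W-SUN": weekly points anchored on Sundays.
week_ends = pd.date_range("2017-05-01", "2017-05-21", freq="W")
# "W-MON" anchors the weekly points on Mondays instead.
week_starts = pd.date_range("2017-05-01", "2017-05-21", freq="W-MON")

# dayofweek uses Monday=0 ... Sunday=6
print(week_ends.dayofweek.tolist())
print(week_starts.dayofweek.tolist())

# A Sunday week end minus 6 days is the Monday that starts the same week.
print(list(week_ends - pd.Timedelta("6 days")) == list(week_starts))
```

This only holds when each range starts on a Monday and ends on a Sunday, as the sample data does.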
Data:
df
     DateFrom      DateTo Catalog  Score
0  2017-05-01  2017-05-21     ABC     20
1  2017-05-22  2017-06-04     WXY     30
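Building on an answer to a similar question about expanding a pandas DataFrame whose columns hold a date range, you can iterate over each row and expand the DataFrame as follows: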
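import pandas as pd
from datetime import timedelta

# Assumes df.DateFrom and df.DateTo already hold datetimes (e.g. via pd.to_datetime).
# Both date ranges must have the same length per row, which holds when each
# range starts on a Monday and ends on a Sunday.
newdf = pd.concat(
    [
        pd.DataFrame(
            {
                # Week starts: every Monday within the row's range
                'DateFrom':
                pd.date_range(row.DateFrom, row.DateTo, freq='W-MON'),
                # Week ends: every Sunday from DateFrom + 6 days onward
                'DateTo':
                pd.date_range(
                    row.DateFrom + timedelta(days=6),
                    row.DateTo + timedelta(days=6),
                    freq='W'),
                'Catalog':
                row.Catalog,
                'Score':
                row.Score
            },
            columns=['DateFrom', 'DateTo', 'Catalog', 'Score'])
        for i, row in df.iterrows()
    ],
    ignore_index=True)
This prints the following output:
newdf
    DateFrom     DateTo Catalog  Score
0 2017-05-01 2017-05-07     ABC     20
1 2017-05-08 2017-05-14     ABC     20
2 2017-05-15 2017-05-21     ABC     20
3 2017-05-22 2017-05-28     WXY     30
4 2017-05-29 2017-06-04     WXY     30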
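As a variation on the same idea, the per-row DataFrame construction can be replaced by repeating each row once per week. This is only a sketch, not part of the original answers: the names starts and weekly are made up for illustration, it steps in fixed 7-day increments from DateFrom rather than anchoring on weekdays, and it assumes the date columns are already datetimes.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "DateFrom": pd.to_datetime(["2017-05-01", "2017-05-22"]),
        "DateTo": pd.to_datetime(["2017-05-21", "2017-06-04"]),
        "Catalog": ["ABC", "WXY"],
        "Score": [20, 30],
    }
)

# One DatetimeIndex of week starts per source row, stepping 7 days at a time.
starts = [
    pd.date_range(start, end, freq="7D")
    for start, end in zip(df["DateFrom"], df["DateTo"])
]

# Repeat each source row once per week it covers, then overwrite DateFrom.
weekly = df.loc[df.index.repeat([len(s) for s in starts])].reset_index(drop=True)
weekly["DateFrom"] = [day for s in starts for day in s]

# Each week ends 6 days after it starts, but never later than the original DateTo.
week_end = weekly["DateFrom"] + pd.Timedelta(days=6)
weekly["DateTo"] = week_end.where(week_end <= weekly["DateTo"], weekly["DateTo"])
print(weekly)
```

Unlike the two-range version, this trims a final partial week to the original DateTo instead of raising on mismatched lengths.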