[英]Reshaping pandas data frame based on date information
I have the following data frame.我有以下数据框。
df df
Current_date ID Current_date+1 Current_Date+2
19/10/2021 11 0.9 0.8
19/10/2021 11 0.7 0.6
19/10/2021 12 1.0 1.5
19/10/2021 13 0.8 0.5
I would like to rearrange this and get the following table.我想重新排列它并得到下表。 To make it clear, to get the
Date
column values if it is Current_date+1
just and 1 day on the first column (``Current_date ) and if it is
Current_date+2``` add two days on the current date.为了清楚
Current_date+1
,要获取Date
列值,如果它是Current_date+1
并且在第一列 (``Current_date ) and if it is
上有 1 天) and if it is
Current_date+2``` 在当前日期上添加两天。 Then create a new column to assign the values.然后创建一个新列来分配值。
Current_date ID Date Value
19/10/2021 11 20/10/2021 0.9
19/10/2021 11 20/10/2021 0.7
19/10/2021 12 20/10/2021 1.0
19/10/2021 13 20/10/2021 0.8
19/10/2021 11 21/10/2021 0.8
19/10/2021 11 21/10/2021 0.6
19/10/2021 12 21/10/2021 1.5
19/10/2021 13 21/10/2021 0.5
Can anyone help with this in Pandas Python?任何人都可以在 Pandas Python 中帮助解决这个问题吗?
Use DataFrame.melt
with add timedeltas by extracted by last intgers in columns by Series.str.extract
and to_timedelta
:使用
DataFrame.melt
并通过Series.str.extract
和to_timedelta
列中的最后一个整数提取添加to_timedelta
:
df = df.melt(['Current_date','ID'], var_name='Date', value_name='Value')
td = pd.to_timedelta(df['Date'].str.extract('(\d+)$', expand=False).astype(int), unit='D')
df['Date']=pd.to_datetime(df['Current_date'],dayfirst=True).add(td).dt.strftime('%d/%m/%Y')
print (df)
Current_date ID Date Value
0 19/10/2021 11 20/10/2021 0.9
1 19/10/2021 11 20/10/2021 0.7
2 19/10/2021 12 20/10/2021 1.0
3 19/10/2021 13 20/10/2021 0.8
4 19/10/2021 11 21/10/2021 0.8
5 19/10/2021 11 21/10/2021 0.6
6 19/10/2021 12 21/10/2021 1.5
7 19/10/2021 13 21/10/2021 0.5
One option is with pivot_longer from pyjanitor :一种选择是使用pyjanitor的pivot_longer :
# pip install pyjanitor
import pandas as pd
import janitor
df['Current_date'] = pd.to_datetime(df.Current_date)
(df.rename(columns = {"Current_date+1":"Value+1",
"Current_Date+2":"Value+2"})
.pivot_longer(index=['Current_date', 'ID'],
names_to=(".value", "Date"),
names_sep="+")
.assign(Date = lambda df: df.Current_date +
pd.to_timedelta(df.Date.astype(int), unit='D'))
)
Current_date ID Date Value
0 2021-10-19 11 2021-10-20 0.9
1 2021-10-19 11 2021-10-20 0.7
2 2021-10-19 12 2021-10-20 1.0
3 2021-10-19 13 2021-10-20 0.8
4 2021-10-19 11 2021-10-21 0.8
5 2021-10-19 11 2021-10-21 0.6
6 2021-10-19 12 2021-10-21 1.5
7 2021-10-19 13 2021-10-21 0.5
You could also achieve this with pd.wide_to_long :您也可以使用pd.wide_to_long实现此目的:
df = df.rename(columns = {"Current_date+1":"Value+1",
"Current_Date+2":"Value+2"})
(pd.wide_to_long(df.reset_index(),
stubnames = 'Value',
i = ['Current_date', 'ID', 'index'],
j= 'Date',
sep='+')
.droplevel('index')
.reset_index()
.assign(Date = lambda df: df.Current_date +
pd.to_timedelta(df.Date.astype(int), unit='D'))
)
Current_date ID Date Value
0 2021-10-19 11 2021-10-20 0.9
1 2021-10-19 11 2021-10-21 0.8
2 2021-10-19 11 2021-10-20 0.7
3 2021-10-19 11 2021-10-21 0.6
4 2021-10-19 12 2021-10-20 1.0
5 2021-10-19 12 2021-10-21 1.5
6 2021-10-19 13 2021-10-20 0.8
7 2021-10-19 13 2021-10-21 0.5
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.