Filtering and extending a time series pandas DataFrame

Question

This question is a follow-up to this question: filter multi-indexed grouped pandas dataframe

For each individual id, I want to get the first timestamp after date at which value becomes greater than zero, and store it in a new column new_date.

Sample input data:

id timestamp  date       value
1  2001-01-01 2001-05-01 1
1  2001-10-01 2001-05-01 0
1  2001-10-02 2001-05-01 1
1  2001-10-03 2001-05-01 0
1  2001-10-04 2001-05-01 1
2  2001-01-01 2001-05-01 1
2  2001-10-01 2001-05-01 0
2  2001-10-02 2001-05-01 0
2  2001-10-03 2001-05-01 0
2  2001-10-04 2001-05-01 1
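
To reproduce the sample input, one possible sketch is to parse the table above from a string. The variable names raw and df are only illustrative, and the snippet assumes the columns are separated by single spaces:

```python
import io

import pandas as pd

# Sample input from the question, with columns separated by single spaces.
raw = """id timestamp date value
1 2001-01-01 2001-05-01 1
1 2001-10-01 2001-05-01 0
1 2001-10-02 2001-05-01 1
1 2001-10-03 2001-05-01 0
1 2001-10-04 2001-05-01 1
2 2001-01-01 2001-05-01 1
2 2001-10-01 2001-05-01 0
2 2001-10-02 2001-05-01 0
2 2001-10-03 2001-05-01 0
2 2001-10-04 2001-05-01 1
"""

# parse_dates converts both date-like columns to datetime while reading.
df = pd.read_csv(io.StringIO(raw), sep=" ", parse_dates=["timestamp", "date"])
print(df.dtypes)
```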

Desired output data:

id timestamp  date       value new_date
1  2001-01-01 2001-05-01 1     2001-10-02
1  2001-10-01 2001-05-01 0     2001-10-02
1  2001-10-02 2001-05-01 1     2001-10-02
1  2001-10-03 2001-05-01 0     2001-10-02
1  2001-10-04 2001-05-01 1     2001-10-02
2  2001-01-01 2001-05-01 1     2001-10-04
2  2001-10-01 2001-05-01 0     2001-10-04
2  2001-10-02 2001-05-01 0     2001-10-04
2  2001-10-03 2001-05-01 0     2001-10-04
2  2001-10-04 2001-05-01 1     2001-10-04

Answer 1

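A simpler solution, which also works when some group has no matching row: first build a boolean mask that combines two conditions with bitwise AND, namely timestamp greater than date (Series.gt) and value greater than 0. Then filter the DataFrame with that mask, keep only the first row per id with DataFrame.drop_duplicates, turn the result into a Series of timestamps indexed by id, and finally map it back onto the id column with Series.map: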

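import pandas as pd

# Make sure both columns are datetimes, then sort so the earliest row comes first per id.
df['timestamp'] = pd.to_datetime(df['timestamp'])
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(['id', 'timestamp'])

# Rows that lie after date and have a positive value.
m = df['timestamp'].gt(df['date']) & df['value'].gt(0)

# First matching timestamp per id, as a Series indexed by id.
s = df[m].drop_duplicates('id').set_index('id')['timestamp']

# Broadcast the per-id timestamp to every row; ids without a match get NaT.
df['new_date'] = df['id'].map(s)
print(df)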
   id  timestamp       date  value   new_date
0   1 2001-01-01 2001-05-01      1 2001-10-02
1   1 2001-10-01 2001-05-01      0 2001-10-02
2   1 2001-10-02 2001-05-01      1 2001-10-02
3   1 2001-10-03 2001-05-01      0 2001-10-02
4   1 2001-10-04 2001-05-01      1 2001-10-02
5   2 2001-01-01 2001-05-01      1 2001-10-04
6   2 2001-10-01 2001-05-01      0 2001-10-04
7   2 2001-10-02 2001-05-01      0 2001-10-04
8   2 2001-10-03 2001-05-01      0 2001-10-04
9   2 2001-10-04 2001-05-01      1 2001-10-04
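
The claim that this also works when a group has no match can be checked with a small sketch. The helper name add_new_date and the extra id 3 (which never has a positive value after date) are assumptions made for illustration and are not part of the original data:

```python
import pandas as pd


def add_new_date(df):
    """Hypothetical wrapper around the steps above; returns a new DataFrame."""
    out = df.copy()
    out["timestamp"] = pd.to_datetime(out["timestamp"])
    out["date"] = pd.to_datetime(out["date"])
    out = out.sort_values(["id", "timestamp"])
    # Rows after date with a positive value.
    mask = out["timestamp"].gt(out["date"]) & out["value"].gt(0)
    # First matching timestamp per id.
    first = out[mask].drop_duplicates("id").set_index("id")["timestamp"]
    # ids missing from `first` are mapped to a missing value (NaT).
    out["new_date"] = out["id"].map(first)
    return out


# id 3 is invented for this example: its only positive value is before date.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 3, 3],
    "timestamp": ["2001-01-01", "2001-10-01", "2001-10-02",
                  "2001-01-01", "2001-10-04",
                  "2001-01-01", "2001-10-01"],
    "date": ["2001-05-01"] * 7,
    "value": [1, 0, 1, 1, 1, 1, 0],
})

result = add_new_date(df)
print(result)
```

Because Series.map leaves unmatched keys as missing values, the rows of id 3 are kept and simply carry a missing new_date, rather than being dropped or raising an error.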


Solution 1
1 Accepted 2020-04-29 13:44:05
