Consolidating time period data in pandas

Question

How do I consolidate time period data in Python pandas?

I want to transform data from

person  start       end
1       2001-1-8   2002-2-14
1       2002-2-14  2003-3-1
2       2001-1-5   2002-2-16
2       2002-2-17  2003-3-9

to

person  start       end
1       2001-1-8   2003-3-1
2       2001-1-5   2003-3-9

I first want to check whether the previous end and the next start are within 1 day of each other. If they are, consolidate the two rows into one; if not, keep the original rows unchanged.
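To make the rule concrete, here is a minimal sketch of the check on the sample data above. It assumes the dates are parsed as datetimes and that "within 1 day" means a gap of at most one day between a row's start and the previous row's end for the same person; the names `gap` and `mergeable` are illustrative only.

```python
import pandas as pd

# Sample data from the question, with dates written in ISO form.
df = pd.DataFrame({
    "person": [1, 1, 2, 2],
    "start": pd.to_datetime(["2001-01-08", "2002-02-14", "2001-01-05", "2002-02-17"]),
    "end": pd.to_datetime(["2002-02-14", "2003-03-01", "2002-02-16", "2003-03-09"]),
})

# Gap between each period's start and the previous period's end, per person.
# The first row of each person has no previous row, so its gap is NaT.
gap = df["start"] - df.groupby("person")["end"].shift()

# Assumption: a row may be merged into the previous one when the gap
# is at most one day. Comparisons against NaT evaluate to False.
mergeable = gap <= pd.Timedelta(days=1)
print(mergeable.tolist())
```

For this data the second row of each person is flagged as mergeable (gaps of 0 and 1 day respectively).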

Answer 1

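import pandas as pd
from datetime import timedelta

# The date columns must be datetimes for the subtraction below to work.
df["start"] = pd.to_datetime(df["start"])
df["end"] = pd.to_datetime(df["end"])
df.sort_values(["person", "start", "end"], inplace=True)

def condense(df):
    df = df.copy()  # do not modify the group handed in by groupby
    # End date of the previous row for this person.
    df['prev_end'] = df["end"].shift(1)
    # True where the gap to the previous row exceeds one day, i.e. a new run starts.
    df['dont_condense'] = (abs(df['prev_end'] - df['start']) > timedelta(days=1))
    # The running count of run starts labels each run of mergeable rows.
    df["group"] = df['dont_condense'].fillna(False).cumsum()
    return df.groupby("group").apply(lambda x: pd.Series({"person": x.iloc[0].person, 
                                               "start": x.iloc[0].start, 
                                               "end": x.iloc[-1].end}))

result = df.groupby("person").apply(condense).reset_index(drop=True)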
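The same run-labelling idea can also be written without the nested `apply`. The following is only a sketch under stated assumptions, not the answerer's code: `consolidate`, `max_gap`, and the temporary `run` key are hypothetical names, the date columns are assumed to be datetimes already, and the merged end is taken from the last row of each run, mirroring `x.iloc[-1].end` above.

```python
import pandas as pd

def consolidate(df, max_gap=pd.Timedelta(days=1)):
    """Merge consecutive periods per person whose gap is at most max_gap."""
    df = df.sort_values(["person", "start", "end"])
    # End date of the previous row for the same person (NaT on the first row).
    prev_end = df.groupby("person")["end"].shift()
    # A new run starts on each person's first row or where the gap is too large.
    new_run = prev_end.isna() | ((df["start"] - prev_end).abs() > max_gap)
    # The cumulative sum turns the run starts into run labels.
    run_id = new_run.cumsum().rename("run")
    return (
        df.groupby(["person", run_id])
          .agg(start=("start", "first"), end=("end", "last"))
          .reset_index()
          .drop(columns="run")
    )

# Usage with three people; person 3 has a 3-day gap and is left as two rows.
sample = pd.DataFrame({
    "person": [1, 1, 2, 2, 3, 3],
    "start": pd.to_datetime(["2001-01-08", "2002-02-14", "2001-01-05",
                             "2002-02-17", "2001-01-02", "2002-02-17"]),
    "end": pd.to_datetime(["2002-02-14", "2003-03-01", "2002-02-16",
                           "2003-03-09", "2002-02-14", "2003-03-10"]),
})
out = consolidate(sample)
print(out)
```

Because runs are labelled with a cumulative sum, this works for any number of consecutive rows per person, not just pairs.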

Answer 2

If each group contains only 2 rows, the data are already sorted, and rows should be merged when the difference is 0 or 1 days, you can use:

print (df)
   person      start        end
0       1   2001-1-8  2002-2-14
1       1  2002-2-14   2003-3-1
2       2   2001-1-5  2002-2-16
3       2  2002-2-17   2003-3-9
4       3   2001-1-2  2002-2-14
5       3  2002-2-17  2003-3-10

df.start = pd.to_datetime(df.start)
df.end = pd.to_datetime(df.end)

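def f(x):
    # Difference between each row's start and the previous row's end in the group.
    # If only a difference of 0 days should be merged, use instead:
    # a = (x['start'] - x['end'].shift()) == pd.Timedelta(days=0)
    a = (x['start'] - x['end'].shift()).isin([pd.Timedelta(days=1), pd.Timedelta(days=0)])
    if a.any():
        # Pull the next row's end up; the last row then holds NaT
        # and is removed by dropna() after the groupby.
        x['end'] = x['end'].shift(-1)
    return x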

df1 = df.groupby('person').apply(f).dropna().reset_index(drop=True)
print (df1)
   person      start        end
0       1 2001-01-08 2003-03-01
1       2 2001-01-05 2003-03-09
2       3 2001-01-02 2002-02-14
3       3 2002-02-17 2003-03-10

Answer 1
0, accepted, 2017-03-01 05:56:56

Answer 2
0, 2017-03-01 06:57:28
