
Pandas groupby: increment datetime to make it unique within each group

I have a DataFrame that looks something like this:

    from datetime import datetime
    import pandas as pd

    df = pd.DataFrame({'State': ['Texas', 'Texas', 'Florida', 'Florida'],
                       'a': [4, 5, 1, 3], 'b': [6, 10, 3, 11]})
    df['ts'] = datetime.utcnow()

The table looks like this:

     State  a   b                ts
0    Texas  4   6 2022-09-06 15:33:31
1    Texas  5  10 2022-09-06 15:33:31
2  Florida  1   3 2022-09-06 15:33:31
3  Florida  3  11 2022-09-06 15:33:31

What I want is for 'ts' to be unique within each group, so every value after the first in a group should be incremented by one more second than the previous one. The output DataFrame would look like this:

     State  a   b                ts
0    Texas  4   6 2022-09-06 15:33:31
1    Texas  5  10 2022-09-06 15:33:32
2  Florida  1   3 2022-09-06 15:33:31
3  Florida  3  11 2022-09-06 15:33:32

With groupby and transform I am able to get the series for each group, but I can't get any further:

    df['ts'] = df['ts'].groupby(df['State']).transform(lambda x: increment_ms(x))

How can I achieve the above output?

You can use groupby().cumcount() with pd.to_timedelta:

    df['ts'] += pd.to_timedelta(df.groupby('State').cumcount(), unit='s')

Output:

     State  a   b                         ts
0    Texas  4   6 2022-09-06 15:40:46.429416
1    Texas  5  10 2022-09-06 15:40:47.429416
2  Florida  1   3 2022-09-06 15:40:46.429416
3  Florida  3  11 2022-09-06 15:40:47.429416
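
To make the mechanics easier to follow, here is a minimal self-contained sketch of the same approach, split into steps. The fixed timestamp and the names offsets and is_unique are assumptions introduced for illustration so that the result is reproducible; they are not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({'State': ['Texas', 'Texas', 'Florida', 'Florida'],
                   'a': [4, 5, 1, 3], 'b': [6, 10, 3, 11]})
# Assumption: a fixed timestamp instead of "now", so the result is reproducible.
df['ts'] = pd.Timestamp('2022-09-06 15:33:31')

# cumcount() numbers the rows of each group 0, 1, 2, ... in their original order.
offsets = df.groupby('State').cumcount()

# Turn the per-group counter into a timedelta in seconds and add it to 'ts'.
df['ts'] = df['ts'] + pd.to_timedelta(offsets, unit='s')

# Check that no (State, ts) pair is repeated.
is_unique = not df.duplicated(['State', 'ts']).any()
```

Note that this relies on all rows in a group starting with the same timestamp, as in the question. If the timestamps within a group already differ, adding the counter could produce new collisions, so a check like is_unique is worth keeping.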
