How to materialize the group key for each row of the original dataframe? ('by' is a pandas Grouper)

I would like to materialize, for each row of a dataframe, the group key that row would get if I ran a groupby operation with a pandas Grouper.

import pandas as pd

# Test data
ts = [pd.Timestamp('2022/03/01 09:00'),
      pd.Timestamp('2022/03/01 10:00'),
      pd.Timestamp('2022/03/01 10:30'),
      pd.Timestamp('2022/03/01 15:00')]
df = pd.DataFrame({'a':range(len(ts)), 'ts': ts})

grouper = pd.Grouper(key='ts', freq='2H', sort=False, origin='start_day')

Is there any way to get the corresponding group key for each row? The result I am looking for could be a list, a pandas Series or Index, or a numpy array, with the same length as the initial dataframe, and it would contain the following values.

result = pd.Series([pd.Timestamp('2022-03-01 08:00:00'),
                    pd.Timestamp('2022-03-01 10:00:00'),
                    pd.Timestamp('2022-03-01 10:00:00'),
                    pd.Timestamp('2022-03-01 14:00:00')])

Thanks for your help! Best regards

Similar idea to @Andrej's, but this creates a table with the group key as a new column:

pd.concat(g.assign(grouper_val = i) for i,g in df.groupby(grouper))


Not directly using the groupby, but you can use:

df['ts'].dt.floor('2H')

With the groupby:

df.groupby(grouper)['ts'].transform(lambda g: g.name)

Output:

0   2022-03-01 08:00:00
1   2022-03-01 10:00:00
2   2022-03-01 10:00:00
3   2022-03-01 14:00:00
Name: ts, dtype: datetime64[ns]
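
The two approaches above can be compared side by side. The following is a minimal sketch, not part of the original answers: it rebuilds the test data from the question, uses the lowercase alias '2h' (recent pandas versions deprecate the uppercase 'H'), and stores the key in a column called group_key, a hypothetical name chosen only for illustration. One caveat to keep in mind: dt.floor aligns bins to multiples of the frequency counted from the Unix epoch, so it matches origin='start_day' here because 2-hour bins divide a day evenly; for a frequency such as '5h' the two may disagree.

```python
import pandas as pd

# Test data from the question.
ts = [pd.Timestamp('2022/03/01 09:00'),
      pd.Timestamp('2022/03/01 10:00'),
      pd.Timestamp('2022/03/01 10:30'),
      pd.Timestamp('2022/03/01 15:00')]
df = pd.DataFrame({'a': range(len(ts)), 'ts': ts})

grouper = pd.Grouper(key='ts', freq='2h', sort=False, origin='start_day')

# Approach 1: floor each timestamp to the start of its 2-hour bin.
floored = df['ts'].dt.floor('2h')

# Approach 2: broadcast each group's key (exposed as g.name) to its rows.
via_groupby = df.groupby(grouper)['ts'].transform(lambda g: g.name)

# 'group_key' is a hypothetical column name used only for this sketch.
df['group_key'] = via_groupby

# Check that both approaches agree row by row.
print(floored.tolist() == via_groupby.tolist())
```

Keeping the key as a column makes it easy to inspect next to the original timestamps, and the transform variant follows whatever origin the Grouper specifies.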

Given:

   a                  ts
0  0 2022-03-01 09:00:00
1  1 2022-03-01 10:00:00
2  2 2022-03-01 10:30:00
3  3 2022-03-01 15:00:00

Doing:

pd.Series(df.resample('2H', origin='start_day', on='ts').groups)

Output:

2022-03-01 08:00:00    1
2022-03-01 10:00:00    3
2022-03-01 12:00:00    3
2022-03-01 14:00:00    4
dtype: int64
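
The output above is indexed by bin label rather than by row, so on its own it does not give one key per row. One way to map bin labels back to rows is a sorted lookup. The sketch below is an illustration under stated assumptions, not the answerer's method: it rebuilds the 2-hour bin edges with pd.date_range starting at midnight of the first day (mirroring origin='start_day'), and the names bins, idx and row_keys are made up for the example.

```python
import numpy as np
import pandas as pd

# Test data from the question.
ts = [pd.Timestamp('2022/03/01 09:00'),
      pd.Timestamp('2022/03/01 10:00'),
      pd.Timestamp('2022/03/01 10:30'),
      pd.Timestamp('2022/03/01 15:00')]
df = pd.DataFrame({'a': range(len(ts)), 'ts': ts})

# Left edges of the 2-hour bins, starting at midnight of the first day.
bins = pd.date_range(start=df['ts'].min().normalize(),
                     end=df['ts'].max(), freq='2h')

# For each timestamp, find the position of the last bin edge <= timestamp.
idx = np.searchsorted(bins.to_numpy(), df['ts'].to_numpy(), side='right') - 1

# One bin label per row, aligned with the original index.
row_keys = pd.Series(bins[idx], index=df.index, name='ts')
print(row_keys)
```

Using side='right' means a timestamp that falls exactly on an edge, such as 10:00, is assigned to the bin starting at that edge, which corresponds to left-closed bins.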

