Conditional aggregation on pandas dataframe columns, combining 'n' rows into 1 row

I have the following pandas dataframe:

START   NAME
5.11    name1
9.1     name1
10.86   name1
12.61   name2
14.86   name2
23.11   name2
25.36   name1
26.61   name1
28.36   name2
31.61   name2
32.86   name1
35.61   name1
44.61   name1
46.36   name2

I would like this merged by name as follows:

START   END     NAME
5.11    12.61   name1
12.61   25.36   name2
26.61   28.36   name1
28.36   32.86   name2
32.86   46.36   name1
46.36   total   name2

I tried something like this:

df2 = df.copy()
df2 = df2.rename({"name": "temp"}).reset_index()
grp = (df2['name'] != df2['name'].shift()).cumsum().rename('group')
df2 = df2.groupby(['name', grp], sort=False)

But this does not produce the desired output. Any help is appreciated.

Thanks.

  1. Use shift to compare each row's NAME with the previous row's NAME.
  2. Keep only the rows whose NAME differs from the previous row's NAME, i.e. the first row of each run.
  3. Use shift(-1) on the kept rows to take the next run's START as each run's END.
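# True on the first row of each run of identical NAME values
cond = (df['NAME'] != df['NAME'].shift(1))
dfn = df[cond].copy()
# END is the START of the next run; the last run has none, so it gets 'total'
dfn['END'] = dfn['START'].shift(-1).fillna('total')
dfn[['START', 'END', 'NAME']]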
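
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question; the function name `merge_runs` and its `last_end` parameter are made up for illustration and are not part of pandas. The cast to object before `fillna` is my own assumption, added so that the string placeholder can share a column with the floats.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "START": [5.11, 9.1, 10.86, 12.61, 14.86, 23.11, 25.36,
              26.61, 28.36, 31.61, 32.86, 35.61, 44.61, 46.36],
    "NAME": ["name1"] * 3 + ["name2"] * 3 + ["name1"] * 2
            + ["name2"] * 2 + ["name1"] * 3 + ["name2"],
})


def merge_runs(frame, last_end="total"):
    """Collapse consecutive rows sharing a NAME into one START/END row.

    `merge_runs` and `last_end` are hypothetical names used for illustration.
    """
    # Keep only the first row of each run of identical NAME values.
    is_run_start = frame["NAME"] != frame["NAME"].shift(1)
    out = frame.loc[is_run_start].copy()
    # Each run ends where the next run starts; the last run has no successor.
    # Cast to object first so the string placeholder can sit next to floats.
    out["END"] = out["START"].shift(-1).astype(object).fillna(last_end)
    return out[["START", "END", "NAME"]].reset_index(drop=True)


result = merge_runs(df)
print(result)
```

One thing to be aware of: on the sample data the third merged row starts at 25.36 (the first row of the second name1 run), whereas the expected table in the question shows 26.61 there. If 26.61 is really what is wanted, the rule for picking START would have to be different from "first row of each run".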
