Pandas groupby based on column value
I have the following dataframe, dfgeo:
              x            y         z  zt  n  k  pv  geometry                               dist
0   6574878.210  4757530.610  1152.588   1  8  4  90  POINT (6574878.210 4757530.610)    0.000000
1   6574919.993  4757570.314  1174.724   0            POINT (6574919.993 4757570.314)   57.638760
2   6575020.518  4757665.839  1177.339   0            POINT (6575020.518 4757665.839)  138.673362
3   6575239.548  4757873.972  1160.156   1  8  4  90  POINT (6575239.548 4757873.972)  302.148120
4   6575351.603  4757980.452  1202.418   0            POINT (6575351.603 4757980.452)  154.577856
5   6575442.780  4758067.093  1199.297   0            POINT (6575442.780 4758067.093)  125.777217
6   6575538.217  4758157.782  1192.914   1  8  4  90  POINT (6575538.217 4758157.782)  131.653772
7   6575594.625  4758240.033  1217.442   0            POINT (6575594.625 4758240.033)   99.735096
8   6575738.820  4758450.289  1174.477   0            POINT (6575738.820 4758450.289)  254.950551
9   6575850.937  4758613.772  1123.852   1  8  4  90  POINT (6575850.937 4758613.772)  198.234490
10  6575984.323  4758647.118  1131.761   0            POINT (6575984.323 4758647.118)  137.491020
11  6576204.312  4758702.115  1119.407   0            POINT (6576204.312 4758702.115)  226.759410
12  6576303.976  4758727.031  1103.064   0            POINT (6576303.976 4758727.031)  102.731300
13  6576591.496  4758798.910   1114.06   0            POINT (6576591.496 4758798.910)  296.368590
14  6576736.965  4758835.277  1120.285   1  8  4  90  POINT (6576736.965 4758835.277)  149.945952
I am trying to group by the zt values and sum the dist column within each group. I have tried this:
def summarize(group):
    s = group['zt'].eq(1).cumsum()
    return group.groupby(s).agg(
        D=('dist', 'sum')
    )

dfzp = dfgeo.apply(summarize)
But I get the following error on the last line of code:
s = group['zt'].eq(1).cumsum()
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py", line 871, in __getitem__
result = self.index.get_value(self, key)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 4405, in get_value
return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
File "pandas\_libs\index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 135, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index_class_helper.pxi", line 109, in pandas._libs.index.Int64Engine._check_type
KeyError: 'zt'
Any help in resolving this is appreciated.
If you need to pass the whole DataFrame to the function, call it directly:
dfzp=summarize(dfgeo)
Or use DataFrame.pipe:
dfzp=dfgeo.pipe(summarize)
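As a minimal sketch, both calls can be checked on a small stand-in frame. The frame below is an assumption made for brevity: it keeps only the zt and dist columns and uses rounded values from the first seven rows of the question's data.

```python
import pandas as pd

# Hypothetical stand-in for dfgeo: only the two columns the function uses.
dfgeo = pd.DataFrame({
    "zt":   [1, 0, 0, 1, 0, 0, 1],
    "dist": [0.0, 57.6, 138.7, 302.1, 154.6, 125.8, 131.7],
})

def summarize(group):
    # The running count of zt == 1 labels each segment:
    # a new segment starts at every row where zt is 1.
    s = group["zt"].eq(1).cumsum()
    # Sum dist within each segment.
    return group.groupby(s).agg(D=("dist", "sum"))

direct = summarize(dfgeo)      # plain function call
piped = dfgeo.pipe(summarize)  # the same call written as a method
```

Both variables should hold the same three-row result, one row per segment, because pipe simply passes the whole DataFrame to the function.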
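If you use DataFrame.apply, the function is called once per column, or once per row if axis=1. Each call receives a Series rather than the DataFrame, so group['zt'] looks for the label 'zt' in that Series' index and raises the KeyError.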
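A small sketch of what DataFrame.apply hands to the function, assuming a tiny made-up frame; the probe helper and the seen list are hypothetical names used only for illustration.

```python
import pandas as pd

# Tiny made-up frame with the same two column names as in the question.
df = pd.DataFrame({"zt": [1, 0, 1], "dist": [0.0, 57.6, 138.7]})

# Record the type and name of every object apply passes in.
seen = []

def probe(obj):
    seen.append((type(obj).__name__, obj.name))
    return 0

df.apply(probe)          # default axis=0: one call per column
df.apply(probe, axis=1)  # axis=1: one call per row

# A column Series is indexed by the row labels, so 'zt' is not a key in it.
zt_is_a_key = "zt" in df["dist"].index
```

Every entry recorded in seen should be a Series, named after a column for the first call and after a row label for the second. That is why indexing with 'zt' inside the function fails under apply but works when the whole DataFrame is passed.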