What is the pandas way of doing the following?
data.groupby('id').duration.max().index[data.groupby('id').duration.max() > 365]
I want to group by id, filter using the groups, and return the ids where the condition was true.
Using the group.filter function returns the original DataFrame.
You can rewrite your code using boolean indexing on the Series returned by the aggregation function max and on its index:
s = data.groupby('id').duration.max()
idx = s.index[s > 365]
# alternative
# idx = s[s > 365].index
You can also check the filtered values of the Series:
print(s[s > 365])
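As a minimal sketch of the steps above, assuming a small made-up DataFrame (the ids and durations are hypothetical sample data, not taken from the question):

```python
import pandas as pd

# Hypothetical sample data: three ids, some with several durations
data = pd.DataFrame({
    'id': [1, 1, 2, 2, 3],
    'duration': [100, 400, 50, 60, 366],
})

# Maximum duration per id; the result is a Series indexed by id
s = data.groupby('id').duration.max()

# The boolean mask s > 365 picks the ids whose maximum exceeds 365
idx = s.index[s > 365]
print(idx.tolist())  # [1, 3]
```

Computing s once and reusing it also avoids running the groupby twice, as the original one-liner does.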
But if you want to filter the original DataFrame by the max values per group, use GroupBy.transform, which returns a Series with the same size as the original DataFrame:
data[data.groupby('id').duration.transform('max') > 365]
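To illustrate the row-level filter, here is a sketch assuming a small made-up DataFrame (hypothetical sample data). It also shows GroupBy.filter, which the question mentions: that method returns rows of the original DataFrame rather than group keys, so on this data it selects the same rows as the transform mask.

```python
import pandas as pd

# Hypothetical sample data: three ids, some with several durations
data = pd.DataFrame({
    'id': [1, 1, 2, 2, 3],
    'duration': [100, 400, 50, 60, 366],
})

# transform('max') broadcasts each group's maximum back onto every row
mask = data.groupby('id').duration.transform('max') > 365
filtered = data[mask]

# GroupBy.filter keeps all rows of each group for which the function is True
same_rows = data.groupby('id').filter(lambda g: g.duration.max() > 365)

# The matching ids can still be recovered from the filtered rows
print(filtered['id'].unique().tolist())  # [1, 3]
```

The transform version is usually preferable on large data because it avoids calling a Python function once per group.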