
Merging a selection from a pandas series

To avoid too many tiny slices in a pie chart, I need to merge/sum all elements in a series below a certain threshold. So far this is what I came up with:

from pandas import Series
import numpy as np

ser = Series(np.random.randint(100, size=10), index=list('abcdefghij')).sort_values(ascending=False)

thresh = 20
# Note: Series.append was removed in pandas 2.0, so this line needs an older pandas.
cleaned = ser[ser>=thresh].append(Series([ser[ser<thresh].sum()],
                                         index=["below {}".format(thresh)]))

This delivers the correct result, but the use of append bothers me and does not strike me as particularly pandas-like.

Is there a more appealing way to achieve the same result?
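
For readers on pandas 2.0 or later, the same construction can be sketched with pd.concat in place of append. The sample values below are made up for illustration and replace the random data so the result is reproducible; the names small and label are not from the original post.

```python
import pandas as pd

# Illustrative fixed data (an assumption; the original uses random integers).
ser = pd.Series(
    {"a": 53, "b": 12, "c": 97, "d": 24, "e": 61,
     "f": 88, "g": 49, "h": 60, "i": 37, "j": 9}
).sort_values(ascending=False)

thresh = 20
label = "below {}".format(thresh)

# Everything under the threshold is collapsed into one labelled entry.
small = ser[ser < thresh]
cleaned = pd.concat([ser[ser >= thresh], pd.Series([small.sum()], index=[label])])
print(cleaned)
```

This keeps the logic of the original snippet unchanged, so it still builds a second one-element Series just to attach the merged slice.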

Update:

This is a solution based on the comment by IanS below.

ser.index = list(map(lambda item: item[0] if item[1] >= thresh else "below {}".format(thresh),
                     ser.items()))

or

ser.index = [x if y >= thresh else "below {}".format(thresh) for x, y in ser.items()]

and then

ser.groupby(ser.index).sum()
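
The relabel-then-group idea can also be sketched without overwriting ser.index, by building the labels separately and passing them to groupby. This is only one possible variant: the sample values are made up for illustration, and the names label, labels and merged are not from the original post.

```python
import pandas as pd

# Illustrative fixed data (an assumption; the original uses random integers).
ser = pd.Series(
    {"a": 53, "b": 12, "c": 97, "d": 24, "e": 61,
     "f": 88, "g": 49, "h": 60, "i": 37, "j": 9}
).sort_values(ascending=False)

thresh = 20
label = "below {}".format(thresh)

# Keep the original label where the value reaches the threshold,
# otherwise substitute the shared "below ..." label.
labels = ser.index.where((ser >= thresh).to_numpy(), label)

# sort=False keeps groups in order of first appearance, so the
# descending order of the large values is preserved.
merged = ser.groupby(labels, sort=False).sum()
print(merged)
```

Because ser itself is left untouched, the original labels remain available if a different threshold needs to be tried later.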

You can try this:

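import pandas as pd

# Split into values at or above the threshold and values below it,
# then collapse the "below" group into a single labelled entry.
df = ser.groupby(ser >= thresh).apply(lambda x:
                               x if (x >= thresh).all()
                               else pd.Series(x.sum(),
                                              index=["below {}".format(thresh)])
                              ).reset_index().set_index("level_1"
                                                        ).iloc[:,1:][0].copy()

df.name = None
df.index.name = None
df = df.sort_values(ascending=False)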
df
c           97
f           88
e           61
h           60
a           53
g           49
i           37
d           24
below 20    21
dtype: int64

But I'm not sure it's better than your solution.

