I am trying to match several data frames on an interval column produced by pd.cut(). However, the matching doesn't work because pd.cut() does not always produce the same intervals.
For example, when cutting a series of floats into the bins [15, 16, 17, 18], pd.cut() sometimes produces the following intervals (option A):
(15, 16], (16, 17], (17, 18]
and sometimes the following intervals (option B):
(15.0, 16.0], (16.0, 17.0], (17.0, 18.0]
Changing parameters such as precision doesn't help. Oddly, when you group by the option B intervals, the group names are shown as in option A: (15, 16], (16, 17], (17, 18]
Which parameters should I pass to pd.cut()?
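One way to check what differs between the two results is to look at the dtype of the interval categories. The sketch below assumes, without confirmation from the post, that option A comes from integer bin edges and option B from float edges; the sample values are made up for illustration.

```python
import pandas as pd

# Illustrative data only; not taken from the original post.
values = pd.Series([15.2, 16.4, 17.9])

# Same edges, passed once as integers and once as floats.
cut_int = pd.cut(values, bins=[15, 16, 17, 18])
cut_float = pd.cut(values, bins=[15.0, 16.0, 17.0, 18.0])

# The categories form an IntervalIndex; its dtype records the edge type.
print(cut_int.cat.categories.dtype)
print(cut_float.cat.categories.dtype)

# Interval equality compares endpoints and closed side, not the printed text.
same_intervals = list(cut_int.cat.categories) == list(cut_float.cat.categories)
print(same_intervals)
```

If the two dtypes differ, the columns may print differently even though the individual intervals compare equal.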
Yes, that works. One possible solution is to pass the interval labels to pd.cut() manually:
df['a_groups'] = pd.cut(df.a, bins=[15, 16, 17, 18], labels=['(15, 16]', '(16, 17]', '(17, 18]'])
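A slightly fuller sketch of the same idea: build the string labels from the bin edges once and reuse them for every data frame, so the key column holds identical labels regardless of how the edges are typed. The frames `left` and `right` and the column `b` are hypothetical names used only for illustration.

```python
import pandas as pd

bins = [15, 16, 17, 18]
# One label per bin, built from consecutive edges: "(15, 16]", "(16, 17]", ...
labels = [f"({lo}, {hi}]" for lo, hi in zip(bins[:-1], bins[1:])]

# Hypothetical frames with made-up values.
left = pd.DataFrame({"a": [15.5, 16.5, 17.5]})
right = pd.DataFrame({"b": [15.1, 16.9, 17.2]})

# Integer edges on one side, float edges on the other; the labels are shared.
left["a_groups"] = pd.cut(left["a"], bins=bins, labels=labels)
right["a_groups"] = pd.cut(right["b"], bins=[float(b) for b in bins], labels=labels)

# Both key columns now carry the same string labels, so the merge lines up.
merged = left.merge(right, on="a_groups")
print(merged)
```

Because the labels are plain strings, the match no longer depends on how pandas stores or prints the interval edges.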