Below is my array data: wkx_old['Sales point'].values
array([ 2, 2, 2, 4, 4, 3, 1, 4, 2, 1, 3, 4, 1, 1, 4, 7, 4, 1, 1, 2, 4, 3, 4, 3, 3, 2, 5, 2, 3, 2, 3, 4, 2, 10, 4, 4, 6, 3, 3, 1, 1, 2, 1, 3, 2, 4, 5, 2, 4, 3, 2, 3, 4, 3, 1, 1, 6, 3, 6, 5, 7, 2, 1, 1, 6, 5, 1, 1, 1, 2, 2, 1, 2, 2, 4, 4, 1, 5, 7, 2, 1, 2, 1, 5, 3, 1, 1, 2, 3, 3, 5, 4, 4, 6, 1, 4, 4, 1, 3, 4, 4, 5, 4, 4, 1, 1, 3, 1, 2, 1, 3, 7, 2, 1, 1, 3, 3, 6, 1, 6, 2, 3, 7, 1])
I am trying to run the code below:
names=['D','C','B','A']
wkx_old['Rankings'] = pd.qcut(wkx_old['Sales point'],q=4,labels=names)
The error I am getting: ValueError: Bin edges must be unique: array([ 1., 1., 3., 4., 10.]). You can drop duplicate edges by setting the 'duplicates' kwarg
qcut does not handle duplicated data well and raises an error when the same value lands on more than one bin edge. Imagine running qcut on [1]*100: what is the 50th percentile? It is 1, the same as the minimum and the maximum, so the bin edges coincide.
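To see the [1]*100 case concretely, here is a minimal sketch. The series s and the choice of q=2 are illustrative assumptions, not part of the original question.

```python
import pandas as pd

# Hypothetical sample: one hundred identical values.
s = pd.Series([1] * 100)

# Minimum, median and maximum are all the same number,
# so every candidate bin edge is 1.
edges = s.quantile([0.0, 0.5, 1.0]).tolist()

try:
    pd.qcut(s, q=2)  # asks for two bins from edges [1, 1, 1]
    message = None
except ValueError as exc:
    message = str(exc)

print(edges)
print(message)
```

The printed message should be the same "Bin edges must be unique" error as in the question.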
You can try rank(pct=True) to calculate the actual percentile of each value, then cut:
wkx_old['Rankings'] = pd.cut(wkx_old['Sales point'].rank(pct=True),
bins=4, labels=names)
Output:
0 C
1 C
2 C
3 B
4 B
..
119 A
120 C
121 C
122 A
123 D
Length: 124, dtype: category
Categories (4, object): ['D' < 'C' < 'B' < 'A']
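The same rank-then-cut approach can be traced on a small sample where the intermediate percentiles are easy to check by hand. This is only a sketch: the eight-value series sales is made up for illustration and stands in for wkx_old['Sales point'].

```python
import pandas as pd

names = ['D', 'C', 'B', 'A']

# Hypothetical sample with many ties at the low end.
sales = pd.Series([1, 1, 1, 1, 2, 3, 4, 10])

# Tied values share the average of their ranks, expressed as a fraction of n.
pct = sales.rank(pct=True)

# bins=4 splits the observed range of pct into four equal-width intervals.
rankings = pd.cut(pct, bins=4, labels=names)

print(pd.DataFrame({'sales': sales, 'pct': pct, 'rank': rankings}))
```

Note that with bins=4, cut divides the range between the smallest and largest percentile into equal widths, so the boundaries are not necessarily 0.25, 0.5 and 0.75. If fixed quartile boundaries matter, explicit edges can be passed to bins instead.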
There are two problems with your code:
qcut tries to size the bins so that each one holds approximately the same number of elements. Because there are a lot of 1s in your data, it computes these bin edges: array([ 1., 1., 3., 4., 10.]), as per the error message. The first two edges are identical, which leads to the error that you see. To fix this, add the parameter duplicates='drop' to qcut:
pd.qcut(wkx_old['Sales point'], q=4, duplicates='drop')
The names list is 4 elements long, which matches q=4 (five edges, four bins). Once the duplicate edge is dropped, however, only four unique edges remain (1, 3, 4, 10), which gives 3 bins, and the number of labels must equal the number of bins. To fix this, pass three labels:
names = ['C', 'B', 'A']
pd.qcut(wkx_old['Sales point'], q=4, duplicates='drop', labels=names)
This should then work.
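Since duplicates='drop' can change the number of bins, one way to avoid a label-count mismatch is to ask qcut for the edges first and size the label list from them. The following is a sketch under assumptions: the series sales is a made-up stand-in for wkx_old['Sales point'], and all_names is a hypothetical helper list.

```python
import pandas as pd

# Hypothetical sample; its quartile edges are [1, 1, 1.5, 3.25, 10].
sales = pd.Series([1, 1, 1, 1, 2, 3, 4, 10])
all_names = ['D', 'C', 'B', 'A']  # lowest to highest

# First pass: retbins=True also returns the edges that survive the drop.
_, edges = pd.qcut(sales, q=4, duplicates='drop', retbins=True)
n_bins = len(edges) - 1

# Keep only as many labels as there are bins (here the top ones).
names = all_names[-n_bins:]

# Second pass: the label count now matches the bin count.
rankings = pd.qcut(sales, q=4, duplicates='drop', labels=names)
print(n_bins, names)
print(rankings)
```

Which labels to keep when a bin disappears is a judgment call; this sketch keeps the highest ones, so the merged bottom bin gets 'C' rather than 'D'.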