pandas: how to get the index after using pandas.Series.value_counts?
How to get pandas Series value counts ordered by count, with ties kept in original index order
Here is an example:
a = ['Ibrutinib', 'Ibrutinib', 'Ibrutinib',
'Ibrutinib-containing product', 'Ibrutinib 140 MG',
'Ibrutinib Oral Product',
'Ibrutinib-containing product in oral dose form', 'Ibrutinib Pill',
'Ibrutinib Oral Capsule', 'Ibrutinib 140 MG Oral Capsule',
'Ibrutinib 140 MG [Imbruvica]',
'Ibrutinib Oral Capsule [Imbruvica]',
'Ibrutinib 140 MG Oral Capsule [Imbruvica]']
pd.Series(a).value_counts()
Output
Ibrutinib 3
Ibrutinib-containing product in oral dose form 1
Ibrutinib Pill 1
Ibrutinib Oral Product 1
Ibrutinib 140 MG Oral Capsule [Imbruvica] 1
Ibrutinib 140 MG Oral Capsule 1
Ibrutinib Oral Capsule 1
Ibrutinib-containing product 1
Ibrutinib 140 MG [Imbruvica] 1
Ibrutinib 140 MG 1
Ibrutinib Oral Capsule [Imbruvica] 1
dtype: int64
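I want to see "Ibrutinib 140 MG" in the 3rd position, because it appears earlier in the original series.
To sort by the original list order, convert the result to a DataFrame, then create a rank column to sort by.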
import pandas as pd
a = ['Ibrutinib', 'Ibrutinib', 'Ibrutinib',
'Ibrutinib-containing product', 'Ibrutinib 140 MG',
'Ibrutinib Oral Product',
'Ibrutinib-containing product in oral dose form', 'Ibrutinib Pill',
'Ibrutinib Oral Capsule', 'Ibrutinib 140 MG Oral Capsule',
'Ibrutinib 140 MG [Imbruvica]',
'Ibrutinib Oral Capsule [Imbruvica]',
'Ibrutinib 140 MG Oral Capsule [Imbruvica]']
s = pd.Series(a).value_counts()
df = s.rename_axis('value').reset_index(name='count') # convert to dataframe
df["rank"] = df['value'].apply(lambda x : a.index(x)) # create rank column, ranked by list index
dfsrt = df.sort_values(by='rank') # sort by rank
print(dfsrt[['value','count']].to_string(index=False, justify='left', # display value and count
formatters={'value':'{{:<{}s}}'.format(dfsrt['value'].str.len().max()).format}))
Output
value count
Ibrutinib 3
Ibrutinib-containing product 1
Ibrutinib 140 MG 1
Ibrutinib Oral Product 1
Ibrutinib-containing product in oral dose form 1
Ibrutinib Pill 1
Ibrutinib Oral Capsule 1
Ibrutinib 140 MG Oral Capsule 1
Ibrutinib 140 MG [Imbruvica] 1
Ibrutinib Oral Capsule [Imbruvica] 1
Ibrutinib 140 MG Oral Capsule [Imbruvica] 1
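A shorter variant of the same idea, as a rough sketch: instead of building a rank column, reindex the counts by the values in order of first appearance. The list here is a shortened copy of the one above, and the names `counts` and `ordered` are just for illustration. Note that this orders purely by first appearance (like the rank column above), so it only puts the highest count first when that value also appears first.

```python
import pandas as pd

# Shortened copy of the list from the question (assumption: same kind of data)
a = ['Ibrutinib', 'Ibrutinib', 'Ibrutinib',
     'Ibrutinib-containing product', 'Ibrutinib 140 MG',
     'Ibrutinib Oral Product']

s = pd.Series(a)
counts = s.value_counts()

# Series.unique() returns values in order of first appearance,
# so reindexing by it restores the original order of the labels.
ordered = counts.reindex(s.unique())
print(ordered)
```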
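Try:
df = pd.DataFrame(a)
# group without sorting to keep the order of first appearance, then sort by count with a stable algorithm
df = df.groupby(0, sort=False).size()\
    .sort_values(ascending=False, kind='mergesort')
By default, value_counts sorts with quicksort, which is not guaranteed to be stable.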
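To see why stability matters, here is a minimal sketch with made-up sample data (`values` is hypothetical, not from the question) where the most frequent value does not come first. It swaps in NumPy's stable argsort on the negated counts rather than relying on how a descending pandas sort treats ties, which I believe has varied between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 'a' is the most frequent value but not the first one
values = ['b', 'a', 'c', 'a', 'd', 'a', 'c']
s = pd.Series(values)

# sort=False keeps the groups in order of first appearance: b, a, c, d
counts = s.groupby(s, sort=False).size()

# Stable ascending sort on the negated counts: highest count first,
# and equal counts stay in their order of first appearance.
order = np.argsort(-counts.to_numpy(), kind='stable')
result = counts.iloc[order]
print(result)
```

With an unstable sort, 'b' and 'd' (both with count 1) could come out in either order.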