
Access the index of a pandas series

I am trying to find the most frequent word in a pandas DataFrame (df_temp in my code). So far I have this:

 l = df_temp['word'].value_counts()

l is then a pandas Series whose first row corresponds to the most frequent value (in my case the most frequent word) in df_temp['word']. Although I can see the word in my console, I cannot retrieve it programmatically. The only way I have found so far is to convert the Series into a dictionary:

dl = dict(l)

and then I can retrieve the word after sorting the dictionary. This does the job, but I am pretty sure there is a smarter solution, as this one is dirty and inelegant.

The index of the Series returned by value_counts() holds your values:

l.index

This gives you the values that were counted.

Example:

In [163]:
df = pd.DataFrame({'a':['hello','world','python','hello','python','python']})
df

Out[163]:
        a
0   hello
1   world
2  python
3   hello
4  python
5  python

In [165]:    
df['a'].value_counts()

Out[165]:
python    3
hello     2
world     1
Name: a, dtype: int64

In [164]:    
df['a'].value_counts().index

Out[164]:
Index(['python', 'hello', 'world'], dtype='object')

So you can get the count for a specific word by indexing the Series with that word:

In [167]:
l = df['a'].value_counts()
l['hello']

Out[167]:
2
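
Since value_counts() sorts its result in descending order of count by default, the most frequent word can also be read straight from the first position of the index. A minimal sketch, reusing the example DataFrame above (the names counts, most_common_word and most_common_count are only illustrative):

```python
import pandas as pd

# Same example data as above.
df = pd.DataFrame({'a': ['hello', 'world', 'python', 'hello', 'python', 'python']})

# value_counts() sorts by count, highest first (its default behaviour).
counts = df['a'].value_counts()

# First index label is the most frequent word; first value is its count.
most_common_word = counts.index[0]
most_common_count = counts.iloc[0]

print(most_common_word, most_common_count)
```

This relies on the default sort order; if value_counts() is called with sort=False, the first entry is no longer guaranteed to be the most frequent one.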

Using Pandas you can find the most frequent value in the word column:

df['word'].value_counts().idxmax()

and the following gives you the count for that value, i.e. the highest count in that column:

df['word'].value_counts().max()
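
Putting the two calls together might look like the sketch below. The sample data is made up for illustration, and the tie-handling part using Series.mode() is an addition that is not part of the answer above: idxmax() returns only one label, so mode() is one way to see every value that shares the highest count.

```python
import pandas as pd

# Hypothetical sample data with a 'word' column.
df = pd.DataFrame({'word': ['hello', 'world', 'python', 'hello',
                            'python', 'world', 'python']})

# Count once and reuse the result instead of calling value_counts() twice.
counts = df['word'].value_counts()
top_word = counts.idxmax()   # label with the highest count
top_count = counts.max()     # the highest count itself

# With ties, idxmax() reports a single label only.
# Series.mode() returns all most frequent values, sorted.
tied = pd.Series(['a', 'b', 'a', 'b', 'c'])
tied_modes = tied.mode().tolist()

print(top_word, top_count, tied_modes)
```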

