
How to convert pandas dataframe so that index is the unique set of values and data is the count of each value?

I have a dataframe of answers to multiple-choice questions, formatted like so:

      Sex Qu1  Qu2  Qu3
Name
Bob    M   1    2    1
John   M   3    3    5
Alex   M   4    1    2
Jen    F   3    2    4
Mary   F   4    3    4

The data is a rating from 1 to 5 for each of the 3 multiple-choice questions. I want to rearrange the data so that the index is range(1,6), where 1='bad', 2='poor', 3='ok', 4='good', 5='excellent', the columns stay the same, and the data is the count of occurrences of each value (excluding the Sex column). This is basically a histogram with fixed bin sizes and an x-axis labeled with strings. I like the output of df.plot() much better than df.hist() for this, but I can't figure out how to rearrange the table to give me a histogram of the data. Also, how do you change the x-labels to be strings?

Series.value_counts gives you the histogram you're looking for:

In [9]: df['Qu1'].value_counts()
Out[9]: 
4    2
3    2
1    1

So, apply this function to each of those 3 columns:

In [13]: table = df[['Qu1', 'Qu2', 'Qu3']].apply(lambda x: x.value_counts())

In [14]: table
Out[14]: 
   Qu1  Qu2  Qu3
1    1    1    1
2  NaN    2    1
3    2    2  NaN
4    2  NaN    2
5  NaN  NaN    1

In [15]: table = table.fillna(0)

In [16]: table
Out[16]: 
   Qu1  Qu2  Qu3
1    1    1    1
2    0    2    1
3    2    2    0
4    2    0    2
5    0    0    1

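Use table.reindex or table.loc[some_array] to rearrange the rows (the older table.ix indexer was removed in pandas 1.0). reindex also accepts labels that are not yet in the index and fills their rows with NaN.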
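As a rough end-to-end sketch of the steps so far, assuming a recent pandas and the sample data from the question typed in by hand, the count table can be pinned to the full 1 to 5 range like this. The variable names are just for illustration:

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame(
    {
        "Sex": ["M", "M", "M", "F", "F"],
        "Qu1": [1, 3, 4, 3, 4],
        "Qu2": [2, 3, 1, 2, 3],
        "Qu3": [1, 5, 2, 4, 4],
    },
    index=pd.Index(["Bob", "John", "Alex", "Jen", "Mary"], name="Name"),
)

# Count each rating per question column.
counts = df[["Qu1", "Qu2", "Qu3"]].apply(lambda col: col.value_counts())

# Force the index to 1..5 in order, so a rating that nobody chose
# still gets a row, then replace the missing counts with 0.
table = counts.reindex(range(1, 6)).fillna(0).astype(int)

print(table)
```

The reindex step matters mostly for other data sets: here every rating happens to appear in at least one column, but a rating that never appears anywhere would otherwise be missing from the index entirely.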

To transform to strings, use table.rename:

In [17]: table.rename(index=str)
Out[17]: 
   Qu1  Qu2  Qu3
1    1    1    1
2    0    2    1
3    2    2    0
4    2    0    2
5    0    0    1

In [18]: table.rename(index=str).index[0]
Out[18]: '1'
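rename also accepts a dict, which is one way to get the 'bad' ... 'excellent' labels from the question instead of '1' ... '5'. A minimal sketch, with the count table typed in directly and the names labels and labeled chosen only for this example:

```python
import pandas as pd

# Count table from the output above, entered by hand.
table = pd.DataFrame(
    {
        "Qu1": [1, 0, 2, 2, 0],
        "Qu2": [1, 2, 2, 0, 0],
        "Qu3": [1, 1, 0, 2, 1],
    },
    index=range(1, 6),
)

# Map each numeric rating to the label given in the question.
labels = {1: "bad", 2: "poor", 3: "ok", 4: "good", 5: "excellent"}
labeled = table.rename(index=labels)

print(labeled)
```

With matplotlib installed, labeled.plot(kind="bar") should then draw one group of bars per rating and use these strings as the x-axis tick labels.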
