How to get row percentages with pandas crosstab in a three-way table?

I know the solution in "How to make a pandas crosstab with percentages?", but it does not work with three-way tables.

Consider the following table:

import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'three'] * 6,
                   'B': ['A', 'B', 'C'] * 8,
                   'C': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4})

pd.crosstab(df.A, [df.B, df.C], colnames=['topgroup', 'bottomgroup'])
Out[89]: 
topgroup      A       B       C    
bottomgroup bar foo bar foo bar foo
A                                  
one           2   2   2   2   2   2
three         2   0   0   2   2   0
two           0   2   2   0   0   2

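Here, I would like to get the row percentages within each topgroup (A, B and C), so that every row sums to 1 inside each group.

Using apply(lambda x: x / x.sum(), axis=1) does not give this, because it normalizes over the whole row, whereas the percentages have to sum to 1 within each topgroup.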
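A quick way to see the problem is to normalize each row by its overall total and then add the shares up within each topgroup. The sketch below assumes the crosstab is stored in a variable called `table` (a name introduced here for illustration) and sums the groups on the transposed table, which is only one of several ways to do it.

```python
import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'three'] * 6,
                   'B': ['A', 'B', 'C'] * 8,
                   'C': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4})

# `table` is an illustrative name for the three-way crosstab
table = pd.crosstab(df.A, [df.B, df.C], colnames=['topgroup', 'bottomgroup'])

# Normalize every row by its overall total (all topgroups together)
whole_row = table.apply(lambda x: x / x.sum(), axis=1)

# Add the shares up within each topgroup (first column level),
# grouping on the transposed table and transposing back
per_group = whole_row.T.groupby(level=0).sum().T
print(per_group)
```

With this data every row has the same count in each of the three topgroups, so each group should come out at about 1/3 per row rather than 1.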

Any ideas?

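If I understand your question correctly, you can divide the table by the row sums taken within each top-level column group. Note that this relies on the level argument of sum, which was deprecated and then removed in pandas 2.0: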

>>> table = pd.crosstab(df.A,[df.B,df.C], colnames=['topgroup','bottomgroup'])
>>> table / table.sum(axis=1, level=0)

topgroup       A         B         C     
bottomgroup  bar  foo  bar  foo  bar  foo
A                                        
one          0.5  0.5  0.5  0.5  0.5  0.5
three        1.0  0.0  0.0  1.0  1.0  0.0
two          0.0  1.0  1.0  0.0  0.0  1.0
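On pandas versions where sum no longer accepts a level argument, the same idea can be sketched with groupby. This is an assumption about how one might adapt the answer, not part of the original: the names `group_totals`, `row_pct` and `check` are illustrative, and the grouping is done on the transposed table so that no axis=1 groupby is needed.

```python
import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'three'] * 6,
                   'B': ['A', 'B', 'C'] * 8,
                   'C': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4})
table = pd.crosstab(df.A, [df.B, df.C], colnames=['topgroup', 'bottomgroup'])

# For every cell, the total of its topgroup in that row. transform keeps
# the full (topgroup, bottomgroup) labels, so the result has the same
# shape and columns as `table`.
group_totals = table.T.groupby(level='topgroup').transform('sum').T

# Row percentages within each topgroup
row_pct = table / group_totals
print(row_pct)

# Sanity check: within each topgroup, every row should sum to 1
check = row_pct.T.groupby(level='topgroup').sum().T
print(check)
```

This should reproduce the table of proportions shown above. If a row had no observations at all in some topgroup, the division would be 0/0 and those cells would come out as NaN, so a fillna step might be needed for sparser data.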
