
Create cross-tabulation in python pandas showing which values are present

Given the following data:

import pandas as pd

df = pd.DataFrame(dict(
    name=['a', 'a', 'a', 'b', 'b', 'b'],
    vals=[1, 2, 3, 99, 3, 4]
))

which looks like this:

  name  vals
0    a     1
1    a     2
2    a     3
3    b    99
4    b     3
5    b     4

I'm wondering how to create the following:

     1      2      3     4      99
a  true   true   true  false  false
b  false  false  true  true   true

Note: the exact true and false values above aren't so important. I just don't know how to go about creating a table of this type at the moment.

Try pd.crosstab:

s = pd.crosstab(df['name'], df['vals']).astype(bool)
s
Out[38]: 
vals     1      2     3      4      99
name                                  
a      True   True  True  False  False
b     False  False  True   True   True
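
As a self-contained sketch of the crosstab approach (the variable names `counts` and `present` are illustrative, not from the answer): `pd.crosstab` first counts how often each (name, vals) pair occurs, and casting to bool turns every non-zero count into True.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["a", "a", "a", "b", "b", "b"],
    "vals": [1, 2, 3, 99, 3, 4],
})

# Rows are the distinct names, columns are the distinct vals,
# and each cell holds the number of matching rows in df.
counts = pd.crosstab(df["name"], df["vals"])

# Any non-zero count means "this value is present for this name".
present = counts.astype(bool)

print(present)
```

If the counts themselves are useful, keep `counts` and skip the cast.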

You could also use get_dummies and then aggregate by name:

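pd.get_dummies(df.set_index('name').vals).groupby(level=0).any()
# .any(level=0) was removed in pandas 2.0, so group on the index level instead
# use .max() instead of .any() for 1/0 dummies (with dtype=int in get_dummies)
# use .sum() instead of .any() for counts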
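
The get_dummies approach might look like this end to end. This is a minimal sketch: the extra `(b, 4)` row is hypothetical, added only so that the counts differ from the flags, and `dtype=int` is passed explicitly because recent pandas versions return boolean dummies by default. It aggregates with `groupby(level=0)`, since the `level=` argument of `DataFrame.any`/`max`/`sum` was deprecated in pandas 1.3 and removed in 2.0.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["a", "a", "a", "b", "b", "b", "b"],
    # The final (b, 4) row is a hypothetical duplicate, not in the question's data.
    "vals": [1, 2, 3, 99, 3, 4, 4],
})

# One indicator column per distinct value, indexed by name.
dummies = pd.get_dummies(df.set_index("name")["vals"], dtype=int)

# Collapse the repeated index labels into one row per name.
grouped = dummies.groupby(level=0)

present = grouped.any()  # True/False: is the value present at all?
flags = grouped.max()    # 1/0 version of the same table
counts = grouped.sum()   # how many times each value occurs

print(present)
print(counts)
```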


 