Group and count unique values in Python pandas dataframe

I have a dataframe of over 33,000 rows which I'd like to simplify:

                   Crime type
GeographyCode                              
E01006687          Burglary
E01007229          Anti-social behaviour
E01007229          Anti-social behaviour
E01007229          Anti-social behaviour
E01007229          Burglary
E01007229          Other theft
E01007229          Other theft
E01007229          Shoplifting
E01007229          Theft from the person
E01007230          Anti-social behaviour
E01007230          Anti-social behaviour
E01007230          Anti-social behaviour
E01007230          Anti-social behaviour
E01007230          Anti-social behaviour
E01007230          Anti-social behaviour
...

There are 207 unique values of 'GeographyCode' and 12 unique values of 'Crime type'.

I'd like to make a new dataframe with 207 rows and 12 columns, plus 'GeographyCode' as the index. Each column would represent one crime type and contain the number of occurrences of that crime type within each GeographyCode.

Something like this:

                Burglary   Anti-social    Theft   Shoplifting   etc...
GeographyCode
E01006687       1          3              9       5             ...
E01007229       1          3              2       1             ...
E01007230       0          6              12      5             ...
...

I've tried a few things, but because the data contains no numeric values, I'm finding it really difficult to get what I need.

You could use crosstab to compute this:

>>> pd.crosstab(df.index, df['Crime type'])
Crime type      Anti-social behaviour  Burglary  Other theft  Shoplifting  ...

E01006687                           0         1            0            0
E01007229                           3         1            2            1
E01007230                           6         0            0            0
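
For anyone who wants to try this without the full dataset, here is a minimal runnable sketch. The sample frame is rebuilt by hand from the first rows of the excerpt in the question (an assumption, not the real 33,000-row data), and the names `counts` and `alt` are only illustrative. It also shows a groupby/size/unstack variant, which is an alternative to crosstab rather than part of the answer above.

```python
import pandas as pd

# Hand-built sample from the question's excerpt (not the real dataset).
df = pd.DataFrame(
    {
        "Crime type": [
            "Burglary",
            "Anti-social behaviour",
            "Anti-social behaviour",
            "Anti-social behaviour",
            "Burglary",
            "Other theft",
            "Other theft",
            "Shoplifting",
            "Theft from the person",
            "Anti-social behaviour",
            "Anti-social behaviour",
        ]
    },
    index=pd.Index(
        ["E01006687"] + ["E01007229"] * 8 + ["E01007230"] * 2,
        name="GeographyCode",
    ),
)

# crosstab counts each (index value, crime type) pair; pairs that never
# occur are filled with 0. rownames labels the resulting index, which
# would otherwise be called "row_0" because a plain Index is passed in.
counts = pd.crosstab(df.index, df["Crime type"], rownames=["GeographyCode"])

# Alternative: group by the index level and the column, count the rows in
# each group, then pivot the crime types into columns.
alt = df.groupby(["GeographyCode", "Crime type"]).size().unstack(fill_value=0)

print(counts)
```

Both approaches should give one row per GeographyCode and one column per crime type. crosstab is the shorter spelling; the groupby form can be handier when further aggregation steps are needed.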
