
Counting levels for categorical features in a data frame

I am trying to count how many levels each categorical feature in a data frame has. Here is an example:

import pandas as pd
df = pd.DataFrame([['green','M',10.1,'class1'],['red','L',13.5,'class2'],['blue','XL',15.3,'class1'],['red', 'M', 9, 'class1']], columns=['A','B','C','D'])

The desired output:
A 3
B 3
D 2

Filter the columns with select_dtypes, then call DataFrame.nunique:

df.select_dtypes([object]).nunique()

A    3
B    3
D    2
dtype: int64
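
As a self-contained sketch of the same idea, the snippet below rebuilds the frame from the question and counts distinct values per non-numeric column. It assumes that excluding numeric dtypes with exclude='number' is an acceptable stand-in for naming the string dtype explicitly; the variable name counts is only illustrative.

```python
import pandas as pd

# Frame from the question: A, B and D hold labels, C is numeric.
df = pd.DataFrame(
    [['green', 'M', 10.1, 'class1'],
     ['red', 'L', 13.5, 'class2'],
     ['blue', 'XL', 15.3, 'class1'],
     ['red', 'M', 9, 'class1']],
    columns=['A', 'B', 'C', 'D'],
)

# Drop the numeric columns, then count distinct values in each remaining one.
counts = df.select_dtypes(exclude='number').nunique()
print(counts)
```

Note that nunique ignores missing values by default; pass dropna=False if NaN should count as its own level.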

If the columns have the category dtype rather than object, this stricter filtering step is preferable:

# Categorical column conversion.
df = df.astype(dict.fromkeys('AB', 'category'))

df.dtypes    
A    category
B    category
C     float64
D      object
dtype: object

df.select_dtypes('category').nunique()
A    3
B    3
dtype: int64
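
One caveat for category columns: nunique counts the values that actually occur, while the dtype can declare levels that no longer appear in the data. The following is a minimal sketch of that difference, using a hypothetical Series (s, trimmed, observed and declared are illustrative names, not part of the original answer):

```python
import pandas as pd

# A categorical Series with three declared levels: blue, green, red.
s = pd.Series(['green', 'red', 'blue', 'red'], dtype='category')

# Filtering rows does not remove 'green' from the declared categories.
trimmed = s[s != 'green']

observed = trimmed.nunique()            # distinct values still present
declared = len(trimmed.cat.categories)  # levels declared on the dtype

print(observed, declared)
```

If the declared count is the one needed, len(col.cat.categories) can be applied per column; calling cat.remove_unused_categories() first makes the two counts agree.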
