What is the fastest and/or most idiomatic way to find out whether an object column in pandas has multiple dtypes?

Question

I have a dataframe with a column like this:

df.Chromosome
# 0        1
# 1        1
# 2        1
# 3        1
# 4        1
#         ..
# 94391    Y
# 94392    Y
# 94393    Y
# 94394    Y
# 94395    Y
# Name: Chromosome, Length: 94396, dtype: object

Running df.Chromosome.apply(type).drop_duplicates() shows that the column contains two types of data:

0        <class 'int'>
65536    <class 'str'>
Name: Chromosome, dtype: object

Is there a faster and more idiomatic way of checking whether a column consists of multiple dtypes?

Answer 1

I think your solution is fine. Some alternatives:

df.Chromosome.map(type).unique()

set(df.Chromosome.map(type))

It is also possible to drop duplicate values first to improve performance:

df.Chromosome.drop_duplicates().apply(type).drop_duplicates()
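
To make the variants above easy to try, here is a minimal sketch on a toy column. The Series `chromosome` and its values are made up for illustration and only stand in for df.Chromosome; the real column is much longer.

```python
import pandas as pd

# Hypothetical stand-in for df.Chromosome: an object column holding ints and strs.
chromosome = pd.Series([1, 1, 2, "X", "Y", "Y"], dtype=object, name="Chromosome")

# Variant 1: array of the distinct Python types, in order of first appearance.
types_unique = chromosome.map(type).unique()

# Variant 2: the same information as a plain Python set.
types_set = set(chromosome.map(type))

# Variant 3: drop repeated values first, so type() runs on fewer elements.
types_dedup = chromosome.drop_duplicates().apply(type).drop_duplicates()

# The column is mixed if more than one distinct type was found.
has_multiple_types = len(types_set) > 1
```

Whether dropping duplicates first actually pays off probably depends on how many distinct values the column has, so it is worth timing on the real data.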

Answer 2

You can also check every column of the dataframe at once:

df.applymap(type).drop_duplicates()
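
Note that `DataFrame.applymap` is deprecated as of pandas 2.1 in favour of `DataFrame.map`. A version-independent sketch of the same idea, counting the distinct Python types per column, might look like this. The frame and the column `Start` are invented for illustration.

```python
import pandas as pd

# Hypothetical frame: one mixed object column and one plain integer column.
df = pd.DataFrame(
    {
        "Chromosome": pd.Series([1, 2, "X", "Y"], dtype=object),
        "Start": [10, 20, 30, 40],
    }
)

# For each column, map every element to its Python type and count distinct types.
type_counts = df.apply(lambda col: col.map(type).nunique())

# Names of the columns that hold more than one Python type.
mixed_columns = type_counts[type_counts > 1].index.tolist()
```

This avoids `applymap` entirely and returns one number per column rather than a deduplicated frame of types.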

Answer 3

Another alternative:

{type(_) for _ in set(df.Chromosome.value_counts().index)}

This is quite slow.
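
Not among the answers above, but possibly relevant: pandas also ships `pandas.api.types.infer_dtype`, which inspects the values of an object column and returns a descriptive label. A minimal sketch, with made-up Series, assuming a label starting with "mixed" is a good enough signal:

```python
import pandas as pd
from pandas.api.types import infer_dtype

# Hypothetical columns: one mixing ints and strs, one holding only strs.
mixed = pd.Series([1, 1, "X", "Y"], dtype=object)
only_str = pd.Series(["1", "X", "Y"], dtype=object)

# infer_dtype returns a string label describing the values it finds.
mixed_kind = infer_dtype(mixed)
str_kind = infer_dtype(only_str)

# Labels such as "mixed-integer" or "mixed" indicate heterogeneous content.
looks_mixed = mixed_kind.startswith("mixed")
```

It returns a label rather than the set of Python types, so it answers "is this column mixed?" but not "which types does it contain?".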

Answer 1: score 4, accepted, 2019-09-10 13:02:24
Answer 2: score 1, 2019-09-10 13:06:28
Answer 3: score 0, 2019-09-10 13:59:43
