Why do I get different results for Pandas Series.apply and DataFrame.applymap?

Question

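I want to check whether all values have the same type as the values in the first row. df.applymap and series.apply do not behave the way I expected.

The dataset is from the IMDB sentiment analysis task on Kaggle.

print(df.head())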

         id  sentiment                                             review
0  "5814_8"          1  "With all this stuff going down at the moment ...
1  "2381_9"          1  "\"The Classic War of the Worlds\" by Timothy ...
2  "7759_3"          0  "The film starts with a manager (Nicholas Bell...
3  "3630_4"          0  "It must be assumed that those who praised thi...
4  "9495_8"          1  "Superbly trashy and wondrously unpretentious ...

Each row appears to be str, int, str, so everything looks fine.

print(df.applymap(type))

              id      sentiment         review
0  <class 'str'>  <class 'int'>  <class 'str'>
1  <class 'str'>  <class 'int'>  <class 'str'>
2  <class 'str'>  <class 'int'>  <class 'str'>
3  <class 'str'>  <class 'int'>  <class 'str'>
4  <class 'str'>  <class 'int'>  <class 'str'>

Calling apply on the Series looks a bit different: sentiment is numpy.int64 instead of int.

print(df.iloc[0].apply(type))

id                   <class 'str'>
sentiment    <class 'numpy.int64'>
review               <class 'str'>
Name: 0, dtype: object

Maybe they are still the same, so I compared the types.

print(df.applymap(type) == df.iloc[0].apply(type))

     id  sentiment  review
0  True      False    True
1  True      False    True
2  True      False    True
3  True      False    True
4  True      False    True

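The result is unexpected. At least the first row should be True, True, True. applymap on a DataFrame should work element-wise, and apply on a Series should be element-wise too. So why are the results not equal?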

Answer 1

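It took me a while to understand jpp's comment, but I think I can now answer my own question.

df.iloc[0] returns a pandas Series backed by a numpy array, so the number stored in it is a numpy type as well: it comes back as numpy.int64.

The values that applymap sees in the DataFrame appear to be native Python types, and a Python int is clearly not the same type as a numpy.int64.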
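A minimal sketch of the mismatch described above, using a tiny made-up frame in place of the Kaggle data (the rows and the `elementwise` helper are assumptions for illustration only). Since `DataFrame.applymap` is deprecated in favour of `DataFrame.map` in recent pandas releases, the helper picks whichever one the installed version offers.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the IMDB frame: str, int, str columns.
df = pd.DataFrame({
    "id": ["5814_8", "2381_9"],
    "sentiment": [1, 1],
    "review": ["With all this stuff ...", "The Classic War ..."],
})

# Use DataFrame.map where it exists, otherwise fall back to applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap

# Types as seen when a single row is pulled out as a Series.
row_types = df.iloc[0].apply(type)

# Types as seen when the function is applied to every cell of the frame.
frame_types = elementwise(type)

print(row_types["sentiment"])           # type of the integer in the row Series
print(frame_types.loc[0, "sentiment"])  # type reported per cell of the frame
print(row_types["sentiment"] is frame_types.loc[0, "sentiment"])
```

The two type objects for `sentiment` are different classes, which is why the element-wise `==` in the question yields False for that column while the str columns compare equal.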

The comparison I was originally trying to make should look like this:

df.applymap(type) == df.head(1).applymap(type).iloc[0]
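As a rough illustration of that comparison, here is a sketch on a small made-up frame (the rows and the `elementwise` helper are assumptions, not part of the original data). Both sides of the `==` now come from the same element-wise call, so the type objects are produced the same way and can be compared directly.

```python
import pandas as pd

# Hypothetical stand-in for the IMDB frame.
df = pd.DataFrame({
    "id": ["5814_8", "2381_9", "7759_3"],
    "sentiment": [1, 1, 0],
    "review": ["With all this stuff ...", "The Classic War ...", "The film starts ..."],
})

# Use DataFrame.map where it exists, otherwise fall back to applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap

# Type of every cell, computed once for the whole frame.
types = elementwise(type)

# Types of the first row, taken from the same result.
first_row_types = types.iloc[0]

# DataFrame == Series matches the Series index against the column labels.
same_as_first = types == first_row_types

# Collapse to a single answer: does every cell match the first row's type?
all_match = bool(same_as_first.all().all())
print(same_as_first)
print(all_match)
```

Reducing with `.all().all()` turns the boolean frame into one yes/no answer, which is what the original check was after.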

Answer 1: 0, accepted, 2018-10-03 18:14:08
