np.isreal behavior different in pandas.DataFrame and numpy.array
I have an array like the one below:
np.array(["hello","world",{"a":5,"b":6,"c":8},"usa","india",{"d":9,"e":10,"f":11}])
and a pandas DataFrame like the one below:
df = pd.DataFrame({'A': ["hello","world",{"a":5,"b":6,"c":8},"usa","india",{"d":9,"e":10,"f":11}]})
When I apply np.isreal to the DataFrame:
df.applymap(np.isreal)
Out[811]:
A
0 False
1 False
2 True
3 False
4 False
5 True
When I call np.isreal on the numpy array:
np.isreal( np.array(["hello","world",{"a":5,"b":6,"c":8},"usa","india",{"d":9,"e":10,"f":11}]))
Out[813]: array([ True, True, True, True, True, True], dtype=bool)
I must be using np.isreal in the wrong use case, but can you help me understand why the results are different?
A partial answer is that isreal is only intended to be used on an array-like as the first argument. You want to use isrealobj on each element to get the behavior you see here:
In [11]: a = np.array(["hello","world",{"a":5,"b":6,"c":8},"usa","india",{"d":9,"e":10,"f":11}])
In [12]: a
Out[12]:
array(['hello', 'world', {'a': 5, 'b': 6, 'c': 8}, 'usa', 'india',
{'d': 9, 'e': 10, 'f': 11}], dtype=object)
In [13]: [np.isrealobj(aa) for aa in a]
Out[13]: [True, True, True, True, True, True]
In [14]: np.isreal(a)
Out[14]: array([ True, True, True, True, True, True], dtype=bool)
That does leave the question: what does np.isreal do on something that isn't array-like? E.g.:
In [21]: np.isrealobj("")
Out[21]: True
In [22]: np.isreal("")
Out[22]: False
In [23]: np.isrealobj({})
Out[23]: True
In [24]: np.isreal({})
Out[24]: True
It turns out this stems from .imag, since the test that isreal does is:
return imag(x) == 0 # note imag == np.imag
and that's it.
In [31]: np.imag(a)
Out[31]: array([0, 0, 0, 0, 0, 0], dtype=object)
In [32]: np.imag("")
Out[32]:
array('',
dtype='<U1')
In [33]: np.imag({})
Out[33]: array(0, dtype=object)
This looks up the .imag attribute on the array.
In [34]: np.asanyarray("").imag
Out[34]:
array('',
dtype='<U1')
In [35]: np.asanyarray({}).imag
Out[35]: array(0, dtype=object)
I'm not sure why this isn't set in the string case yet...
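To make the .imag explanation concrete, here is a minimal sketch. The helper name isreal_sketch is hypothetical and only mirrors the one-line test quoted above; it is not NumPy's actual source. It assumes that .imag on a non-complex array (object dtype included) is a zero-filled array of the same dtype, which is what the outputs above show.

```python
import numpy as np

def isreal_sketch(x):
    # Hypothetical stand-in for the one-line test quoted above.
    return np.imag(x) == 0

# On a genuinely complex array the test does what you would expect.
nums = np.array([1 + 0j, 2 + 3j, 4.0])
print(isreal_sketch(nums))   # [ True False  True]

# On an object array the elements are never inspected: .imag is just zeros,
# so even the complex entry is reported as real.
objs = np.array(["hello", {"a": 5}, 2 + 3j], dtype=object)
print(np.imag(objs))
print(np.isreal(objs))
```

So for dtype=object the result says nothing about the elements themselves, which is why the whole-array call in the question returns all True.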
I think this is a small bug in NumPy, to be honest. Here pandas is just looping over each item in the column and calling np.isreal() on it. E.g.:
>>> np.isreal("a")
False
>>> np.isreal({})
True
I think the paradox here has to do with how np.real() treats inputs of dtype=object. My guess is it's taking the object pointer and treating it like an int, so of course np.isreal(<some object>) returns True. Over an array of mixed types like np.array(["A", {}]), the array is of dtype=object, so np.isreal() is treating all the elements (including the strings) the way it would anything with dtype=object.
To be clear, I think the bug is in how np.isreal() treats arbitrary objects in a dtype=object array, but I haven't confirmed this explicitly.
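If the underlying goal is to flag which elements are real numbers (that is an assumption about the question's intent), a plain type check sidesteps the dtype=object behavior entirely. This is only a sketch: is_real_number is a hypothetical helper, and the sample list is made up for illustration.

```python
import numbers

items = ["hello", "world", {"a": 5, "b": 6, "c": 8}, 3.5, 7, 2 + 1j]

def is_real_number(x):
    """Hypothetical helper: True only for real numeric scalars (bool excluded)."""
    return isinstance(x, numbers.Real) and not isinstance(x, bool)

flags = [is_real_number(x) for x in items]
print(flags)  # [False, False, False, True, True, False]
```

Unlike np.isreal, this reports False for strings and dicts alike, because it looks at each element's type rather than at an .imag attribute.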
There are a couple of things going on here. The first, pointed out by the previous answers, is that np.isreal acts strangely when passed objects. However, I think you are also confused about what applymap is doing. "Difference between map, applymap and apply methods in Pandas" is always a great reference.
In this case what you think you are doing is actually:
df.apply(np.isreal, axis=1)
Which essentially calls np.isreal(df), whereas df.applymap(np.isreal) is essentially calling np.isreal on each individual element of df. E.g.:
np.isreal(df.A)
array([ True, True, True, True, True, True], dtype=bool)
np.array([np.isreal(x) for x in df.A])
array([False, False, True, False, False, True], dtype=bool)
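The same contrast can be reproduced without relying on how strings are handled. The sketch below uses a made-up column containing a float, a complex number and a dict, and uses Series.map for the element-wise call (note that in pandas 2.1 and later, DataFrame.applymap is deprecated in favor of DataFrame.map).

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the question: a float, a complex number, a dict.
df = pd.DataFrame({"A": pd.Series([1.5, 2 + 3j, {"a": 5}], dtype=object)})

# Whole column at once: NumPy sees one dtype=object array, so all True.
whole = np.isreal(df["A"])

# Element by element: each value is tested on its own.
elementwise = df["A"].map(np.isreal)

print(list(whole))
print(elementwise.tolist())
```

Here the complex entry is only caught by the element-wise call, which is the same whole-array versus per-element difference seen in the question.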