Inconsistent pandas results with missing values

Question

Why does NumPy return a different result for data with missing values when it is given a pandas Series rather than the Series' underlying values, as in the following?

import pandas as pd
import numpy as np

data = pd.DataFrame(dict(a=[1, 2, 3, np.nan, np.nan, 6]))
np.sum(data['a'])

#12.0

np.sum(data['a'].values)
#nan

Answer 1

Calling np.sum on a pandas Series delegates to Series.sum, which ignores NaNs by default when computing the sum. The .values attribute, by contrast, is a plain NumPy array, so np.sum uses NumPy's own reduction and the NaN propagates.

data['a'].sum()
# 12.0

np.sum(data['a'])
# 12.0
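
If the goal is to make the two calls agree, one option is to tell pandas not to skip NaNs, and another is to tell NumPy to skip them. A minimal sketch reusing the data from the question; the variable names are only illustrative, while skipna and np.nansum are the existing pandas and NumPy options:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(dict(a=[1, 2, 3, np.nan, np.nan, 6]))

# Series.sum skips NaN by default (skipna=True), and np.sum on a Series
# ends up calling Series.sum.
series_default = np.sum(data['a'])            # 12.0

# Ask pandas to propagate NaN, matching the plain ndarray behaviour.
series_strict = data['a'].sum(skipna=False)   # nan

# Ask NumPy to ignore NaN on the raw ndarray, matching the pandas default.
array_nansum = np.nansum(data['a'].values)    # 12.0

# Plain ndarray reduction: NaN propagates.
array_default = np.sum(data['a'].values)      # nan

print(series_default, series_strict, array_nansum, array_default)
```

Which of the two is appropriate depends on whether a missing value should invalidate the total or simply be left out of it.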

You can see this from the source code of np.sum:

np.sum??

def sum(a, axis=None, dtype=None, out=None, keepdims=np._NoValue, initial=np._NoValue):
    ...
    return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
                          initial=initial)

Taking a look at the source code for _wrapreduction, we see:

np.core.fromnumeric._wrapreduction??

def _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs):
    ...
    if type(obj) is not mu.ndarray:
        try:
            reduction = getattr(obj, method)   # method is 'sum', so this is Series.sum

reduction is then called at the end of the function:

            return reduction(axis=axis, out=out, **passkwargs)
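
The same delegation can be observed with any non-ndarray object that has a sum method. The sketch below uses a hypothetical class, SkipNaNBox, which is not part of NumPy or pandas and exists only to show that np.sum hands control to the object's own sum:

```python
import numpy as np


class SkipNaNBox:
    """Hypothetical container used only to illustrate the dispatch."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)

    def sum(self, axis=None, out=None, **kwargs):
        # np.sum forwards axis= and out= as keyword arguments.
        # This toy method ignores them and always skips NaN.
        return float(np.nansum(self.values))


box = SkipNaNBox([1, 2, 3, np.nan, np.nan, 6])

# Not an ndarray, so np.sum looks up box.sum and calls it.
delegated = np.sum(box)

# A real ndarray takes the np.add.reduce path, so NaN propagates.
direct = np.sum(box.values)

print(delegated, direct)
```

Because the object's method decides how NaN is treated, np.sum can give different answers for the same numbers depending on the container they arrive in, which is what the question observed with a Series.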

Solution 1 (5, accepted), 2019-03-12 20:25:04
