Possible pandas bug?

Question

I'm looking at some strange behavior in Python/pandas.

I know the setup is convoluted; I was doing some... challenges.

import pandas as pd

def lucas_n(n):
    '''Return the first n Lucas numbers modulo 1_000_007'''
    my_list = [1,3]
    while len(my_list) < n:
        my_list.append((my_list[-1]+my_list[-2])%1_000_007)
    return my_list

def f(seq):
    '''Look up https://projecteuler.net/problem=739'''
    
    df = pd.Series(seq)
    
    for i in range(len(seq)-1):
        df = df.iloc[1:].cumsum()
        
    return df.iloc[0]

x = lucas_n(10_000)

f(x)

>>> -8402283173942682253

In short, x is a sequence of positive integers, and f applies consecutive .iloc[1:].cumsum() operations.

And the output is negative...

Is this a bug? A data type issue?

Answer 1

It appears that you have an integer overflow. In Python itself, integers have arbitrary precision, but pandas/numpy use fixed-width C data types (such as int64) by default, so a running sum that grows past the 64-bit range wraps around silently and can come out negative.
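A small sketch of the wraparound, assuming pandas infers int64 for a list of Python integers (its usual default). The value 2**62 is chosen only for illustration: three of them do not fit in a signed 64-bit integer.

```python
import pandas as pd

big = 2**62

# Plain Python integers have arbitrary precision, so this sum is exact.
python_total = big + big + big

# pandas stores the same values as fixed-width int64.
s = pd.Series([big, big, big])
pandas_total = s.cumsum().iloc[-1]

print(s.dtype)        # int64
print(python_total)   # 13835058055282163712
print(pandas_total)   # -4611686018427387904, wrapped around
```

No exception or warning is raised for the wrapped sum, which is why the negative result shows up only at the end.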

To solve the issue, you can cast the data to Python integers by storing them with the object dtype:

def f(seq):
    '''Look up https://projecteuler.net/problem=739'''
    
    # object dtype keeps the values as Python ints, which have arbitrary precision
    # (astype('int') would give a fixed-width numpy integer and still overflow)
    df = pd.Series(seq).astype(object)
    
    for i in range(len(seq)-1):
        df = df.iloc[1:].cumsum()
        
    return df.iloc[0]

This solves the overflow issue in my testing.
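One way to double-check a result like this is to repeat the computation with plain Python integers, which cannot overflow. The following is a minimal sketch using itertools.accumulate; the name f_exact is hypothetical and not part of the original code.

```python
from itertools import accumulate

def f_exact(seq):
    '''Same drop-first-then-cumsum steps as f, using only Python ints.'''
    values = [int(v) for v in seq]
    for _ in range(len(values) - 1):
        # Drop the first element, then take the running sum of the rest.
        values = list(accumulate(values[1:]))
    return values[0]

print(f_exact([1, 3, 4, 7]))  # 21
```

For the short input above the steps are [3, 7, 14], then [7, 21], then [21]. If the pandas version returns a different value on a longer input, a fixed-width dtype is the likely cause.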

Answer 1 (3, accepted, 2020-12-24 00:53:08)