Pearsonr: TypeError: No loop matching the specified signature and casting was found for ufunc add

Question

I have a time-series Pandas DataFrame named "df". It has one column and the following shape: (2000, 1). The head of the DataFrame, shown below, illustrates its structure:

            Weight
Date    
2004-06-01  1.9219
2004-06-02  1.8438
2004-06-03  1.8672
2004-06-04  1.7422
2004-06-07  1.8203

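Goal

I am trying to use a for loop to calculate the correlation between percentage changes in the "Weight" variable over different time frames or time lags. The aim is to assess the impact of holding livestock over periods of differing lengths.

The loop can be found below: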

from scipy.stats import pearsonr

# Loop for producing combinations of different timelags and holddays 
# and calculating the pearsonr correlation and p-value of each combination 

for timelags in [1, 5, 10, 25, 60, 120, 250]:
    for holddays in [1, 5, 10, 25, 60, 120, 250]:
        weight_change_lagged = df.pct_change(periods=timelags)
        weight_change_future = df.shift(-holddays).pct_change(periods=holddays)

        if (timelags >= holddays):
            indepSet=range(0, weight_change_lagged.shape[0], holddays)
        else:
            indepSet=range(0, weight_change_lagged.shape[0], timelags)

        weight_change_lagged = weight_change_lagged.iloc[indepSet]
        weight_change_future = weight_change_future.iloc[indepSet]

        not_na = (weight_change_lagged.notna() & weight_change_future.notna()).values

        (correlation, p_value)=pearsonr(weight_change_lagged[not_na], weight_change_future[not_na])
        print('%4i %4i %7.4f %7.4f' % (timelags, holddays, correlation, p_value))

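The loop itself runs fine, but it fails when computing the Pearson correlation and p-value, that is, on this line:

(correlation, p_value)=pearsonr(weight_change_lagged[not_na], weight_change_future[not_na])

It produces this error:

TypeError: No loop matching the specified signature and casting was found for ufunc add

Does anyone know how to solve this? I have looked through the forum and found no answer that fits my exact case.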

Answer 1

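Through trial and error, I managed to solve my problem as follows:

scipy's pearsonr function only accepts arrays or array-like inputs. This means:

NumPy arrays work as input variables.
Pandas Series work as input variables.

However, a full Pandas DataFrame does not work, even if it contains only one column.

I therefore edited the problematic code segment as follows: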

# Define an object containing observations that are not NA
not_na = (weight_change_lagged.notna() & weight_change_future.notna()).values

# Remove NA values before passing the data to the pearsonr function (not within the call, as I had done):
weight_change_lagged = weight_change_lagged[not_na]
weight_change_future = weight_change_future[not_na]

# Pass the Pandas Series of the future and lagged variables to the function
(correlation, p_value)=pearsonr(weight_change_lagged['Weight'], weight_change_future['Weight'])

With this small modification, the code runs without errors.
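For reference, the whole loop can be written against the "Weight" Series from the start, so that every input to pearsonr is one-dimensional. This is only a sketch under stated assumptions: the data is a synthetic random walk standing in for the real series, and the names weight, step, and results are introduced here for illustration.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

# Synthetic stand-in for the question's data (an assumption, not the real series):
# a strictly positive random walk with 2000 business-day observations.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2004-06-01", periods=2000)
df = pd.DataFrame(
    {"Weight": 1.9 * np.exp(np.cumsum(0.01 * rng.standard_normal(2000)))},
    index=dates,
)

weight = df["Weight"]  # single brackets give a 1-D Series

results = []
for timelags in [1, 5, 10, 25, 60, 120, 250]:
    for holddays in [1, 5, 10, 25, 60, 120, 250]:
        lagged = weight.pct_change(periods=timelags)
        future = weight.shift(-holddays).pct_change(periods=holddays)

        # Sample every min(timelags, holddays) rows, as the original if/else does.
        step = min(timelags, holddays)
        lagged = lagged.iloc[::step]
        future = future.iloc[::step]

        # 1-D boolean mask: keep rows where both changes are defined.
        not_na = lagged.notna() & future.notna()

        correlation, p_value = pearsonr(lagged[not_na], future[not_na])
        results.append((timelags, holddays, correlation, p_value))
        print('%4i %4i %7.4f %7.4f' % (timelags, holddays, correlation, p_value))
```

Selecting the column once, before the loop, means the mask and both inputs stay one-dimensional throughout, so no ['Weight'] lookup is needed at the call site.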

Note:

If you use double square brackets, as shown below, you are passing a pandas DataFrame, not a Series, and the pearsonr function will throw an error:

weight_change_future[['Weight']]
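The difference between the two bracket styles can be checked directly by inspecting type and shape. A minimal sketch, assuming a small hypothetical DataFrame built from the head of the data shown in the question:

```python
import pandas as pd

# Hypothetical stand-in for the question's data: one "Weight" column.
df = pd.DataFrame({"Weight": [1.9219, 1.8438, 1.8672, 1.7422, 1.8203]})

as_series = df["Weight"]    # single brackets: pandas Series, 1-D
as_frame = df[["Weight"]]   # double brackets: pandas DataFrame, 2-D

print(type(as_series).__name__, as_series.shape, as_series.ndim)  # Series (5,) 1
print(type(as_frame).__name__, as_frame.shape, as_frame.ndim)     # DataFrame (5, 1) 2
```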

Thanks to everyone who tried to help; your questions led me to the answer.

Answer 2

In my case it was not a data type problem; the cause was a wrong dimension. Thanks to the article https://programmersought.com/article/67803965109/

Answer 3

You can run into this error even when you pass NumPy arrays to the function. It turns out that an "extra" dimension in the NumPy array causes the problem:

np_data.shape
>> (391, 1)

This (.., 1) is the root of the problem. You can remove this dimension with np.squeeze(np_data) to extract only the array's values, since

np.squeeze(np_data).shape
>> (391,)

In short, the solution is to use:

pearson, pvalue = pearsonr(np.squeeze(np_data_a), np.squeeze(np_data_b))
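The squeeze approach can be tried end to end on made-up data. The sketch below assumes two random (391, 1) arrays in place of the real np_data_a and np_data_b; flat_a and flat_b are names introduced here, and ndarray.ravel() is shown as an alternative way to flatten.

```python
import numpy as np
from scipy.stats import pearsonr

# Made-up column vectors standing in for the answer's arrays (an assumption).
rng = np.random.default_rng(0)
np_data_a = rng.standard_normal((391, 1))
np_data_b = 0.5 * np_data_a + rng.standard_normal((391, 1))

flat_a = np.squeeze(np_data_a)  # drops every axis of length 1
flat_b = np_data_b.ravel()      # always returns a 1-D array
print(flat_a.shape, flat_b.shape)  # (391,) (391,)

pearson, pvalue = pearsonr(flat_a, flat_b)

# Cross-check the coefficient against NumPy's correlation matrix.
assert np.isclose(pearson, np.corrcoef(flat_a, flat_b)[0, 1])
```

One difference worth knowing: np.squeeze on an array of shape (1, 1) returns a zero-dimensional array, while ravel() still returns a 1-D array of length 1, so ravel() may be the safer choice when the number of rows can vary.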

Answer 1
2 Accepted 2020-05-21 09:22:54

Answer 2
0 2021-06-01 16:47:18

Answer 3
0 2022-03-20 11:16:50
