Function works on each row of a DataFrame, but not with df.apply

Question

I have this pandas DataFrame, in which each row contains two samples, X and Y:

import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [np.random.normal(0, 1, 10),
                         np.random.normal(0, 1, 10),
                         np.random.normal(0, 1, 10)],
                   'Y': [np.random.normal(0, 1, 10),
                         np.random.normal(0, 1, 10),
                         np.random.normal(0, 1, 10)]})

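I want to apply the function ttest_ind() (a statistical test that takes two samples as input) to each row and keep the first element of the result (the function returns two elements).

If I do this for a single row (for example, the first row), it works: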

from scipy import stats
stats.ttest_ind(df['X'][0], df['Y'][0], equal_var = False)[0]
# Returns a float

However, if I use apply to run it on every row, I get an error:

df.apply(lambda x: stats.ttest_ind(x['X'], x['Y'], equal_var = False)[0])
# Throws the following error:
Traceback (most recent call last):
  File "pandas\\_libs\\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\\_libs\\hashtable_class_helper.pxi", line 759, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required

During handling of the above exception, another exception occurred:
...
KeyError: ('X', 'occurred at index X')

What am I doing wrong?

Answer 1

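You just need to specify the axis along which the function is applied. See the relevant documentation for apply(). In short, axis=1 means "apply the function to each row of my DataFrame". The default is axis=0, which tries to apply the function to each column. With the default, the lambda receives a whole column as a Series indexed by the row labels 0, 1, 2, so the lookup x['X'] finds no such label and raises the KeyError shown in the question.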

df.apply(lambda x: stats.ttest_ind(x['X'], x['Y'], equal_var = False)[0], axis=1)

0    0.985997
1   -0.197396
2    0.034277
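
To make the difference between the two axes concrete, here is a minimal sketch. It assumes a recent pandas and SciPy; the seeded generator, the helper name t_stat, and the variable names are illustrative choices that do not come from the question. It looks at what apply() passes to the function in each mode, then cross-checks the axis=1 result against a plain loop over the rows.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Seeded generator so the sketch is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame({'X': [rng.normal(0, 1, 10) for _ in range(3)],
                   'Y': [rng.normal(0, 1, 10) for _ in range(3)]})

# With axis=0 (the default), apply() passes one column at a time:
# a Series whose index holds the row labels.
column = df['X']
print(list(column.index))   # row labels

# With axis=1, apply() passes one row at a time:
# a Series whose index holds the column labels.
row = df.iloc[0]
print(list(row.index))      # column labels


def t_stat(row):
    """Welch's t statistic for the two samples stored in one row (hypothetical helper)."""
    return stats.ttest_ind(row['X'], row['Y'], equal_var=False)[0]


# The default axis fails because a column Series has no label 'X'.
try:
    df.apply(t_stat)
    default_axis_failed = False
except KeyError:
    default_axis_failed = True

# Row-wise application, as in the answer.
per_row = df.apply(t_stat, axis=1)

# Cross-check against an explicit loop over the paired samples.
expected = [stats.ttest_ind(x, y, equal_var=False)[0]
            for x, y in zip(df['X'], df['Y'])]
print(np.allclose(per_row.to_numpy(dtype=float), expected))
```

Passing a named function instead of a lambda is only a readability choice here; the behaviour is the same as the one-liner above.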

Answer 1 (accepted, score 3), 2018-05-15 18:09:06
