Scipy Stats ttest_1samp hypothesis testing for comparing previous performance to a sample

Question

The problem I am trying to solve

I have 11 months of performance data:

        Month  Branded  Non-Branded  Shopping  Grand Total
0    2/1/2015     1330          334       161         1825
1    3/1/2015     1344          293       197         1834
2    4/1/2015      899          181       190         1270
3    5/1/2015      939          208       154         1301
4    6/1/2015     1119          238       179         1536
5    7/1/2015      859          238       170         1267
6    8/1/2015      996          340       183         1519
7    9/1/2015     1138          381       172         1691
8   10/1/2015     1093          395       176         1664
9   11/1/2015     1491          426       199         2116
10  12/1/2015     1539          530       156         2225

Suppose it is now February 1, 2016, and I ask: "Are the January results statistically different from the past 11 months?"

       Month  Branded  Non-Branded  Shopping  Grand Total
11  1/1/2016     1064          408       106         1578

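I came across a blog...

I came across iaingallagher's blog. I will reproduce it here (in case the blog goes down).

1-sample t-test

The 1-sample t-test is used when we want to compare a sample mean to a population mean (which we already know). The average British man is 175.3 cm tall. A survey recorded the heights of 10 British men, and we want to know whether the mean of the sample differs from the population mean.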

# 1-sample t-test
from scipy import stats
one_sample_data = [177.3, 182.7, 169.6, 176.3, 180.3, 179.4, 178.5, 177.2, 181.8, 176.5]

one_sample = stats.ttest_1samp(one_sample_data, 175.3)

print("The t-statistic is %.3f and the p-value is %.3f." % (one_sample.statistic, one_sample.pvalue))

Result:

The t-statistic is 2.296 and the p-value is 0.047.
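
As a sanity check, the t-statistic above can be reproduced by hand from its definition, t = (sample mean - population mean) / (s / sqrt(n)), with n - 1 degrees of freedom. The following is a minimal sketch; the variable names (t_manual, p_manual) are illustrative and not from the blog.

```python
import math
from scipy import stats

one_sample_data = [177.3, 182.7, 169.6, 176.3, 180.3, 179.4, 178.5, 177.2, 181.8, 176.5]
popmean = 175.3  # known population mean from the blog example

n = len(one_sample_data)
sample_mean = sum(one_sample_data) / n
# Sample standard deviation with the n - 1 (Bessel) correction
sample_sd = math.sqrt(sum((x - sample_mean) ** 2 for x in one_sample_data) / (n - 1))

# t-statistic: distance of the sample mean from popmean in standard-error units
t_manual = (sample_mean - popmean) / (sample_sd / math.sqrt(n))
# Two-sided p-value from Student's t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

print("t = %.3f, p = %.3f" % (t_manual, p_manual))
```

The standard error s / sqrt(n) shrinks as the sample grows, which is why the test needs an array of observations as its first argument.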

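Finally, on to my question...

In iaingallagher's example, he knows the population mean and compares a sample (one_sample_data) against it. In my example, I want to see whether 1/1/2016 is statistically different from the previous 11 months. So in my case, the previous 11 months are an array (instead of a single population mean) and my sample is a single data point (instead of an array)... so it is somewhat backwards.

Question

If I focus on the Shopping column data:

Will scipy.stats.ttest_1samp([161,197,190,154,179,170,183,172,176,199,156], 106) produce a valid result, even though my sample (the first parameter) is a list of previous results and I am comparing it to a popmean that is not a population mean but a single sample?

If this is not the correct stats function, is there any recommendation on what to use for this hypothesis-testing situation?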
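
For reference, the call in the question runs without error; what is at issue is how to read its output. A minimal sketch, assuming only the Shopping values listed above: ttest_1samp treats its second argument as a fixed constant, so it tests whether the mean of the 11 prior months equals 106. It does not account for the month-to-month variability of a single observation such as January. The names shopping_history and january_shopping are illustrative.

```python
from scipy import stats

# Shopping values for the 11 prior months (from the table in the question)
shopping_history = [161, 197, 190, 154, 179, 170, 183, 172, 176, 199, 156]
january_shopping = 106  # single January 2016 observation, used here as popmean

# Null hypothesis: the mean of shopping_history equals january_shopping
result = stats.ttest_1samp(shopping_history, january_shopping)

history_mean = sum(shopping_history) / len(shopping_history)
print("mean of prior months: %.2f" % history_mean)
print("t = %.3f, p = %.3g" % (result.statistic, result.pvalue))
```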

Answer 1

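If you are only interested in the "Shopping" column, try creating a .xlsx or .csv file that contains only the data from the "Shopping" column.

This way you can import the data and use pandas to run the same t-test on each column individually.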

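import pandas as pd
from scipy import stats

# Load the spreadsheet and keep only the "Shopping" column
data = pd.read_excel("datafile.xlsx")
one_sample_data = data["Shopping"]

# Compare the column mean against the January 2016 Shopping value (106)
one_sample = stats.ttest_1samp(one_sample_data, 106)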
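
If no spreadsheet is at hand, the same per-column test can be sketched by building the DataFrame directly from the table in the question. This is an illustration under assumptions: the names history, january, and results are hypothetical, and each column is compared with its own January 2016 value rather than with a known population mean.

```python
import pandas as pd
from scipy import stats

# Prior 11 months, copied from the table in the question
history = pd.DataFrame({
    "Branded": [1330, 1344, 899, 939, 1119, 859, 996, 1138, 1093, 1491, 1539],
    "Non-Branded": [334, 293, 181, 208, 238, 238, 340, 381, 395, 426, 530],
    "Shopping": [161, 197, 190, 154, 179, 170, 183, 172, 176, 199, 156],
    "Grand Total": [1825, 1834, 1270, 1301, 1536, 1267, 1519, 1691, 1664, 2116, 2225],
})
# January 2016 row, used as the comparison value for each column
january = {"Branded": 1064, "Non-Branded": 408, "Shopping": 106, "Grand Total": 1578}

results = {}
for column in history.columns:
    # One 1-sample t-test per column against that column's January value
    test = stats.ttest_1samp(history[column], january[column])
    results[column] = (test.statistic, test.pvalue)
    print("%-12s t = %7.3f  p = %.3g" % (column, test.statistic, test.pvalue))
```

Looping over the columns avoids creating a separate file per metric, at the cost of running several tests on related data.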
