
T-test between groups of numpy arrays

I have two groups of numpy arrays (A, B) and I want to compare both groups using a statistical t-test (two-sample t-test). The result should again be an array of the same dimensions, providing e.g. the p-value or another statistical index.

Here are two groups of example arrays I want to compare:

import numpy

A1= numpy.random.normal(1,1,100)
A2= numpy.random.normal(1,1,100)
A3= numpy.random.normal(1,1,100)
A4= numpy.random.normal(1,1,100)
A5= numpy.random.normal(1,1,100)


B1= numpy.random.normal(3,1,100)
B2= numpy.random.normal(3,1,100)
B3= numpy.random.normal(3,1,100)
B4= numpy.random.normal(3,1,100)
B5= numpy.random.normal(3,1,100)

Is that possible with a standard function of NumPy/SciPy? Or do I have to loop over each element of the array?

Since the five arrays in each group are i.i.d., you want to compare the concatenated A = [A1 A2 A3 A4 A5] to B = [B1 B2 B3 B4 B5]. You could equivalently have generated A = numpy.random.normal(1,1,500) and B = numpy.random.normal(3,1,500).

Then calculate the mean and standard deviation of both (numpy.mean, numpy.std) and compute Student's t statistic. Or use scipy.stats.ttest_ind.
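As a rough sketch of the two routes just described, the snippet below concatenates each group, computes the pooled-variance Student's t statistic by hand, and compares it with scipy.stats.ttest_ind. The seed, the variable names, and the use of numpy.random.default_rng are illustrative assumptions, not part of the original question.

```python
import numpy as np
from scipy import stats

# Assumption: a seeded generator so the sketch is reproducible.
rng = np.random.default_rng(0)
groups_a = [rng.normal(1, 1, 100) for _ in range(5)]
groups_b = [rng.normal(3, 1, 100) for _ in range(5)]

# Pool the five arrays of each group into one sample of 500 values.
A = np.concatenate(groups_a)
B = np.concatenate(groups_b)

# Manual Student's t statistic with pooled (equal) variance.
n_a, n_b = A.size, B.size
var_a = A.var(ddof=1)  # sample variance, hence ddof=1
var_b = B.var(ddof=1)
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
t_manual = (A.mean() - B.mean()) / np.sqrt(pooled_var * (1 / n_a + 1 / n_b))

# Same test via SciPy (equal_var=True is the default).
t_scipy, p_value = stats.ttest_ind(A, B)

print(t_manual, t_scipy, p_value)
```

Note that this yields a single t statistic and p-value for the pooled groups, not one value per array element.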

I think I found the right solution to get a t-test for each element of the array:

It involves stacking the arrays before performing the t-test:

import numpy
from scipy import stats


A1= numpy.random.normal(1,1,10).reshape(5, 2)
A2= numpy.random.normal(1,1,10).reshape(5, 2)
A3= numpy.random.normal(1,1,10).reshape(5, 2)
A4= numpy.random.normal(1,1,10).reshape(5, 2)
A5= numpy.random.normal(1,1,10).reshape(5, 2)
A = numpy.dstack((A1,A2,A3,A4,A5))

B1= numpy.random.normal(3,1,10).reshape(5, 2)
B2= numpy.random.normal(3,1,10).reshape(5, 2)
B3= numpy.random.normal(3,1,10).reshape(5, 2)
B4= numpy.random.normal(3,1,10).reshape(5, 2)
B5= numpy.random.normal(3,1,10).reshape(5, 2)
B = numpy.dstack((B1,B2,B3,B4,B5))

# t-test along the stacking axis; [1] selects the p-values, giving an array of shape (5, 2)
C = stats.ttest_ind(B, A, axis=2)[1]

There remains just a small side question: how to calculate e.g. the scipy.stats.mannwhitneyu() test (for non-normal data), where it is not possible to specify an array dimension as for the t-test.
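One possible way to handle this side question, sketched here under assumptions, is to loop over the element positions and run the Mann-Whitney U test on the values along the stacking axis. The array shapes mirror the stacked example above; the seed and variable names are illustrative. Newer SciPy releases (1.7 and later, as far as I know) also accept an axis argument in mannwhitneyu, which would make the loop unnecessary; the loop does not depend on it.

```python
import numpy as np
from scipy import stats

# Assumption: stacked data of shape (5, 2, 5), i.e. five 5x2 arrays per group.
rng = np.random.default_rng(0)
A = rng.normal(1, 1, size=(5, 2, 5))
B = rng.normal(3, 1, size=(5, 2, 5))

# One p-value per element position (i, j).
p_values = np.empty(A.shape[:2])
for idx in np.ndindex(*A.shape[:2]):
    # A[idx] and B[idx] are the five observations at this position.
    result = stats.mannwhitneyu(A[idx], B[idx], alternative="two-sided")
    p_values[idx] = result.pvalue

print(p_values.shape)
```

With only five observations per group the test has little power, so the p-values here mainly illustrate the mechanics.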
