How to get a p-value for each row of two columns in a pandas DataFrame?

Question

I would like to ask for suggestions on how to calculate a p-value for each row of my pandas DataFrame. My DataFrame looks like this: there are columns with the means of Data1 and Data2, and also columns with the standard errors of those means. Each row represents one atom. Thus I need to calculate a p-value for each row (that is, e.g., compare the mean of atom 1 from Data1 with the mean of atom 1 from Data2).

    SEM-DATA1   MEAN-DATA1  SEM-DATA2   MEAN-DATA2  
0   0.001216    0.145842    0.000959    0.143103    
1   0.002687    0.255069    0.001368    0.250505    
2   0.005267    0.321345    0.003722    0.305767    
3   0.027265    0.906731    0.033637    0.731638    
4   0.029974    0.773725    0.150025    0.960804

I found here on Stack that many people recommend using scipy, but I don't know how to apply it in the way I need. Is it possible? Thank you.

Answer 1

You are comparing two samples, df['MEAN...1'] and df['MEAN...2'], so you should do this:

from scipy import stats
stats.ttest_ind(df['MEAN-DATA1'],df['MEAN-DATA2'])

which returns:

Ttest_indResult(statistic=0.01001479441863673, pvalue=0.9922547232600507)

or, if you only want the p-value:

a = stats.ttest_ind(df['MEAN-DATA1'],df['MEAN-DATA2'])
a[1]

which gives:

0.9922547232600507

EDIT

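A clarification is in order here. A t-test (or the acquisition of a "p-value") is aimed at finding out whether two samples come from the same population. The call above therefore treats each column of means as one sample and returns a single p-value for the whole DataFrame, not one per row. Testing two single values against each other will give NaN, because one observation per group carries no information about its spread.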
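If a p-value per row is still wanted, one possible route (not part of the original answer) is to use the standard errors already present in the table. The sketch below shows two variants. The first is a two-sided z-test on the difference of means, which assumes the samples behind each mean are large enough for a normal approximation. The second is Welch's t-test via scipy.stats.ttest_ind_from_stats, which needs the number of observations behind each mean. The question does not state that number, so `n_obs = 30` is a purely hypothetical placeholder, and the output column names are made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Data copied from the table in the question.
df = pd.DataFrame({
    "SEM-DATA1":  [0.001216, 0.002687, 0.005267, 0.027265, 0.029974],
    "MEAN-DATA1": [0.145842, 0.255069, 0.321345, 0.906731, 0.773725],
    "SEM-DATA2":  [0.000959, 0.001368, 0.003722, 0.033637, 0.150025],
    "MEAN-DATA2": [0.143103, 0.250505, 0.305767, 0.731638, 0.960804],
})

mean1 = df["MEAN-DATA1"].to_numpy()
mean2 = df["MEAN-DATA2"].to_numpy()
sem1 = df["SEM-DATA1"].to_numpy()
sem2 = df["SEM-DATA2"].to_numpy()

# Variant 1: z-test per row (ASSUMPTION: large samples, normal approximation).
# The standard error of the difference combines both SEMs in quadrature.
se_diff = np.sqrt(sem1 ** 2 + sem2 ** 2)
z = (mean1 - mean2) / se_diff
df["P-VALUE-Z"] = 2 * stats.norm.sf(np.abs(z))  # two-sided p-value

# Variant 2: Welch's t-test per row from summary statistics.
# ASSUMPTION: n_obs is hypothetical; replace it with the real sample size.
n_obs = 30
welch = stats.ttest_ind_from_stats(
    mean1=mean1, std1=sem1 * np.sqrt(n_obs), nobs1=n_obs,  # std = SEM * sqrt(n)
    mean2=mean2, std2=sem2 * np.sqrt(n_obs), nobs2=n_obs,
    equal_var=False,  # Welch's test: do not assume equal variances
)
df["P-VALUE-WELCH"] = welch.pvalue

print(df[["P-VALUE-Z", "P-VALUE-WELCH"]])
```

Both variants are vectorised, so no explicit loop over rows is needed. With several atoms tested at once, a multiple-comparison correction may also be worth considering.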
