
Two-Way ANOVA using Python

I am trying to run a two-way ANOVA to find the importance of two variables (B and M) for the classification of samples (given by the parameter C).

I am trying to reshape the data frame to make it suitable for the statsmodels package. However, using pd.melt I have only been able to include one variable at a time (either B or M).

Any suggestions on how I can use the values of both variables to perform the two-way ANOVA (along the lines of the last two lines of the code below) would be a great help.

The values of B, M and C:

B : [10.,4.,4.,6.,5.]
M : [9.,6.,8.,4.,6.]
C : [1.,2.,2.,3.,1.]

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
d = pd.read_csv("/Users/Hrihaan/Desktop/Data.txt", sep=r"\s+")
d_melt = pd.melt(d, id_vars=['C'], value_vars=['B'])
#model = ols('C ~ C(B) + C(M) + C(B):C(M)', data=d_melt).fit()
#anova_table = sm.stats.anova_lm(model, typ=2)

You were close to the answer:

B = [10.,4.,4.,6.,5.]
M = [9.,6.,8.,4.,6.]
C = [1.,2.,2.,3.,1.]

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

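# Build a wide data frame with one column per variable
d = pd.DataFrame({"B": B, "M": M, "C": C})
# B and M enter as numeric predictors; B:M is their interaction
model = ols("C ~ B + M + B:M", data=d).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)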

Create a data frame with one column per variable, specify the model formula, then run the ANOVA on the fitted model. No pd.melt step is needed.

