inference() function insists on ANOVA instead of a two-sided hypothesis test; R / RStudio

Question

I'm trying to use a custom function called inference(), as shown in the code below. There's no documentation for the function, but it comes from my DASI class on Coursera. According to the feedback I have received, I am using the function properly. I'm trying to run a two-sided hypothesis test of my wordsum variable across my class variable, that is, on the difference between the means of two categories, lower class and working class: the average wordsum for working class minus the average wordsum for lower class. However, the function (or R / RStudio) keeps insisting that I do an ANOVA test. That doesn't work for me, since I'm trying to reject the null hypothesis and build a confidence interval for the difference between two independent means. I've looked at the function's source, but as I'm no R expert, I don't see anything out of the ordinary. Any help is greatly appreciated.

Code:

load(url("http://bit.ly/dasi_gss_ws_cl"))
source("http://bit.ly/dasi_inference")

summary(gss)
by(gss$wordsum, gss$class, mean)
boxplot(gss$wordsum ~ gss$class)

gss_clean = na.omit(subset(gss, class == "WORKING" | class =="LOWER"))

inference(y = gss_clean$wordsum, x = gss_clean$class, est = "mean", type = "ht", 
          null = 0, alternative = "twosided", method = "theoretical")

Returns:

Response variable: numerical, Explanatory variable: categorical
Error: Use alternative = 'greater' for ANOVA or chi-square test.
In addition: Warning message:
Ignoring null value since it's undefined for ANOVA.

Answer 1

You need

gss_clean <- droplevels(gss_clean)

Then your inference() call works:

Response variable: numerical, Explanatory variable: categorical
Difference between two means
Summary statistics:
n_LOWER = 41, mean_LOWER = 5.0732, sd_LOWER = 2.2404
n_WORKING = 407, mean_WORKING = 5.7494, sd_WORKING = 1.8652
Observed difference between means (LOWER-WORKING) = -0.6762
H0: mu_LOWER - mu_WORKING = 0 
HA: mu_LOWER - mu_WORKING != 0 
Standard error = 0.362 
Test statistic: Z =  -1.868 
p-value =  0.0616
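
As a sanity check, the standard error, test statistic, and p-value in this output can be recomputed by hand from the printed summary statistics. This is only a sketch: it assumes inference() uses the unpooled standard error and a normal reference distribution for this test (its source is not shown here), and all variable names are made up for illustration.

```r
# Summary statistics copied from the inference() output above.
n_lower   <- 41;  mean_lower   <- 5.0732; sd_lower   <- 2.2404
n_working <- 407; mean_working <- 5.7494; sd_working <- 1.8652

# Observed difference (LOWER - WORKING), as in the output.
obs_diff <- mean_lower - mean_working

# Unpooled standard error of the difference between two independent means
# (assumption: this is the formula inference() uses).
se <- sqrt(sd_lower^2 / n_lower + sd_working^2 / n_working)

# Z statistic against H0: difference = 0, and the two-sided p-value.
z <- (obs_diff - 0) / se
p_value <- 2 * pnorm(-abs(z))

# Approximate 95% confidence interval for the difference (normal critical value).
ci <- obs_diff + c(-1, 1) * qnorm(0.975) * se

print(c(se = se, z = z, p_value = p_value))
print(ci)
```

The values should agree with the printed ones up to rounding. The interval at the end is the confidence interval for the difference between two independent means that the question asks about, under the same assumptions.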

The problem is that unless you drop the unused levels of the factor, the internal machinery of inference() thinks you have a 4-level categorical variable, so it can't do a t-test or an equivalent two-group test: it has to do a one-way ANOVA or its analogue.
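
The effect is easy to see on a small made-up data frame. The sketch below uses a hypothetical toy data set in place of gss (the level names MIDDLE and UPPER and all values are assumptions for illustration) and shows that subset() filters rows but leaves the factor's full set of levels in place until droplevels() is called.

```r
# Toy stand-in for gss: a 4-level factor and a numeric response (made-up values).
toy <- data.frame(
  class   = factor(c("LOWER", "WORKING", "MIDDLE", "UPPER", "WORKING", "LOWER"),
                   levels = c("LOWER", "WORKING", "MIDDLE", "UPPER")),
  wordsum = c(4, 6, 7, 8, 5, 5)
)

# Same filtering step as in the question.
toy_clean <- subset(toy, class == "WORKING" | class == "LOWER")

# The rows are gone, but the factor still remembers all four levels.
levels_before <- nlevels(toy_clean$class)
print(table(toy_clean$class))   # MIDDLE and UPPER appear with count 0

# droplevels() removes the levels that no longer occur in the data.
toy_clean <- droplevels(toy_clean)
levels_after <- nlevels(toy_clean$class)
print(table(toy_clean$class))
```

Any code that decides between a two-group test and ANOVA by counting factor levels, as inference() appears to do, will see four groups before the droplevels() call and two after it.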

Answer 1: score 2, accepted, 2014-10-05 04:52:04
