
In R, how to use one table to define the columns used for a two-way ANOVA in another table?

I have two tables, m and epi. The epi table contains names of columns in m.

  head(m[,1:6])
         Geno    11DPW      8266         80647        146207    146227
1 SB002XSB012 0.87181895    G/G           C/C          G/G        A/A
2 SB002XSB018         Na    G/G           C/T          G/G        A/A
3 SB002XSB044   1.057744    G/G           C/C          G/G        A/A
4 SB002XSB051 1.64736814    G/G           C/C          G/G        A/A
5 SB002XSB067 0.69987475    A/G           C/C          G/G        A/G
6 SB002XSB073 0.60552177    A/G           C/C          G/G        A/G

    > dim(m)

[1]   167 28234
and 
head(epi)
       SNP1      SNP2
1  7789543   12846898
2 12846898  7789543
3 24862913  4603896
4  4603896   24862913
5 50592569  7789543
6 27293494   57162585

    dim(epi)

[1] 561   2

I want to take each row of epi and run a two-way ANOVA of 11DPW in m on the two columns of m that the row names. I tried:

f<-function (x) {
 anova(lm (as.numeric(m$"11DPW")~ m[,epi[x,1]]*m[,epi[x,2]]))
 }
apply(epi,1,f)

and got this error:

Error in `[.data.frame`(m, , epi[x, 1]) : undefined columns selected

Any suggestions? Thanks, Imri

Putting aside for a moment the complications from using integers as column names (that is, assuming that this issue is handled correctly):

You will still get the "undefined columns selected" error if a column named in epi does not exist in m.

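# TRUE wherever a value in epi is not a column name of m
offendingElements <- !sapply(epi, "%in%", colnames(m))

# since an offending element likely disqualifies the row from the anova test, identify the whole row
# (rowSums avoids the modulo arithmetic, which returns 0 for the last row)
offendingRows <- which(rowSums(offendingElements) > 0)

# perform your apply statement over the remaining rows
# (setdiff keeps every row when offendingRows is empty, unlike epi[-offendingRows, ])
epi[setdiff(seq_len(nrow(epi)), offendingRows), ]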
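
To see the row check in action, here is a minimal sketch on toy data. The tables m_toy and epi_toy are made-up stand-ins for m and epi, not the asker's data, and 99999 is a deliberately missing column name.

```r
# Toy stand-ins for m and epi (hypothetical data, for illustration only)
m_toy <- data.frame(Geno    = c("a", "b", "c"),
                    "8266"  = c("G/G", "A/G", "G/G"),
                    "80647" = c("C/C", "C/T", "C/C"),
                    check.names = FALSE, stringsAsFactors = FALSE)
epi_toy <- data.frame(SNP1 = c(8266, 80647, 99999),
                      SNP2 = c(80647, 8266, 8266))

# TRUE wherever a value in epi_toy is not a column name of m_toy
missing_toy <- !sapply(epi_toy, "%in%", colnames(m_toy))

# rows of epi_toy with at least one missing column
bad_rows <- which(rowSums(missing_toy) > 0)
bad_rows
```

Only the third row refers to a column that m_toy lacks, so it is the only row flagged.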



CLEANING UP THE FUNCTION USED IN APPLY

When you use apply(epi, 1, f), what you pass to each call of f is an entire row of epi, not a row index. Therefore epi[x, 1] does not give you the result you want. For example, on the 7th iteration of the apply statement, x is the equivalent of epi[7, ]. To get the first column, you just need to index x directly. So, in your function:

Instead of       epi[x, 1]   and    epi[x, 2]
You want to use  x[[1]]      and    x[[2]]
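
A small sketch of what the function actually receives, assuming a made-up two-row table epi_toy (hypothetical values, not from the question):

```r
# Hypothetical two-row table, for illustration only
epi_toy <- data.frame(SNP1 = c(8266, 80647), SNP2 = c(80647, 146207))

# x is the row itself, so x[[1]] is that row's SNP1 value
first_snp  <- apply(epi_toy, 1, function(x) x[[1]])
second_snp <- apply(epi_toy, 1, function(x) x[[2]])

# the row vector also carries the column names of epi_toy
row_names_seen <- apply(epi_toy, 1, function(x) paste(names(x), collapse = ","))
```

Here first_snp reproduces the SNP1 column, which shows that no second lookup into epi is needed inside the function.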

That is the first part. Second, we need to deal with integers as column names. VERY IMPORTANT: if you use m[, 7823], you get the 7823rd column of m. You have to convert the integers to strings to indicate that you want the column NAMED "7823", NOT (necessarily) the 7823rd column.

Use as.character for this:

   m[, as.character(x[[1]])]
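
The difference between position and name is easy to check on a toy data frame. This is a sketch with invented column names "3" and "2", chosen so that position and name disagree:

```r
# Hypothetical data frame whose numeric-looking names do not match positions
m_toy <- data.frame(Geno = c("a", "b"),
                    "3"  = c("G/G", "A/G"),
                    "2"  = c("C/C", "C/T"),
                    check.names = FALSE, stringsAsFactors = FALSE)

by_position <- m_toy[, 2]                # the 2nd column, which is named "3"
by_name     <- m_toy[, as.character(2)]  # the column NAMED "2", the 3rd column
```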

PUTTING IT ALL TOGETHER

offendingElements <- !sapply(epi, "%in%", colnames(m))
offendingRows <- which(rowSums(offendingElements) > 0)
# setdiff keeps every row when offendingRows is empty
keepRows <- setdiff(seq_len(nrow(epi)), offendingRows)

apply(epi[keepRows, ], 1, function (x) 
   anova( lm ( as.numeric(m$"11DPW") ~ m[, as.character(x[[1]]) ] * m[, as.character(x[[2]]) ] ))
)
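
For a version you can run end to end, here is a sketch on simulated data. Everything in it is an assumption made for illustration: m_toy, epi_toy, the random phenotype values and the genotype patterns are invented, and it loops over row indices with lapply instead of apply, which sidesteps the conversion of epi to a matrix. If your 11DPW column was read in as a factor, as.numeric(as.character(...)) would be needed instead of as.numeric(...).

```r
set.seed(1)  # reproducible toy phenotype
n <- 40
m_toy <- data.frame(
  Geno     = paste0("line", seq_len(n)),
  "11DPW"  = as.character(round(rnorm(n, mean = 1), 3)),
  "8266"   = rep(c("G/G", "A/G"), each = n / 2),
  "80647"  = rep(c("C/C", "C/T"), times = n / 2),
  "146207" = rep(c("G/G", "G/G", "A/G", "A/G"), times = n / 4),
  check.names = FALSE, stringsAsFactors = FALSE
)
# the third pair names a column (99999) that m_toy does not have
epi_toy <- data.frame(SNP1 = c(8266, 80647, 99999),
                      SNP2 = c(80647, 146207, 8266))

# keep only the pairs whose columns both exist in m_toy
missing_toy <- !sapply(epi_toy, "%in%", colnames(m_toy))
keep <- rowSums(missing_toy) == 0

pheno <- as.numeric(m_toy[["11DPW"]])

# one two-way ANOVA (with interaction) per remaining pair
results <- lapply(which(keep), function(i) {
  snp1 <- factor(m_toy[[as.character(epi_toy$SNP1[i])]])
  snp2 <- factor(m_toy[[as.character(epi_toy$SNP2[i])]])
  anova(lm(pheno ~ snp1 * snp2))
})
names(results) <- paste(epi_toy$SNP1[keep], epi_toy$SNP2[keep], sep = " x ")
```

Each element of results is an ANOVA table with rows for snp1, snp2, their interaction and the residuals, so results[["8266 x 80647"]] prints the table for the first pair.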




There is an alternative way of dealing with the names; the simplest is to turn them into syntactically valid strings:

# clean up the elements in epi
epi.clean <- sapply(epi, make.names)

# clean up m's column names (for example, "11DPW" becomes "X11DPW")
colnames(m) <- make.names(colnames(m))

# use epi.clean in your apply statement. Don't forget offendingRows,
# computed as above before the columns of m were renamed
keepRows <- setdiff(seq_len(nrow(epi.clean)), offendingRows)
apply(epi.clean[keepRows, ], 1, function (x) 
   anova( lm ( as.numeric(m$X11DPW) ~ m[, x[[1]] ] * m[, x[[2]] ] ))
)
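
To check what make.names does to names like these, a quick sketch (the input vector is just a sample of names from the question):

```r
# make.names prefixes "X" to names that do not start with a letter or a dot
cleaned <- make.names(c("11DPW", "8266", "Geno"))
cleaned

# numeric input is converted to character first
cleaned_num <- make.names(8266)
```

This is why the response column has to be referred to as X11DPW once the column names of m have been cleaned.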

I suspect your values in epi are numbers, but what you want to use are their character equivalents, since the column names in m are character strings (even though these strings are made up of numerals). Try this instead:

m[[as.character(epi[x, 1])]] (etc)

The [[ operator is quirky but very cool.
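
A short sketch of that behaviour on an invented one-SNP data frame (m_toy is hypothetical):

```r
m_toy <- data.frame(Geno   = c("a", "b"),
                    "8266" = c("G/G", "A/G"),
                    check.names = FALSE, stringsAsFactors = FALSE)

col_vec <- m_toy[["8266"]]  # the column itself, as a plain vector
col_df  <- m_toy["8266"]    # a one-column data frame

# one quirk: a length-2 index is applied recursively,
# so this is element 2 of column 2, not two columns
nested <- m_toy[[c(2, 2)]]
```

That recursive behaviour is the reason to pass [[ a single column name at a time.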
