R: Using tapply with the LD function

I am trying to perform a linkage disequilibrium calculation using the LD() function from the genetics package. For those who don't know, it is written as follows:

library(genetics)
g1 <- genotype(a)
g2 <- genotype(b)
LD(g1, g2)

where a and b are character values.

Given that, I have a data frame with 4 columns and a large number of rows, and I'm trying to find the LD of 2 of the columns. Assuming df$col3 and df$col4 represent a and b from the above example, how would I go about performing the calculation?

I was considering using tapply, as a for loop would take forever:

tapply(df$col3,df$col4,function)

The problem is that I can't figure out how to set the following using only the values from the specific rows involved:

g1=genotype(row "n", col3)
g2=genotype(row "m", col4)

I know that "row 'n'" is not valid code; I just didn't know how else to describe it.

In the end, I plan on running the LD calculations once I can set g1 and g2.

As James states in his comment, you may want mapply. I don't have your data, but this should work:

mapply(
     function(a, b) LD(genotype(a), genotype(b)),
     a = df$col3,
     b = df$col4
)
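To see how mapply differs from tapply here, the following base R sketch may help. It uses a made-up data frame, and paste() stands in for LD(genotype(a), genotype(b)); the genotype strings and the pair_rows helper are assumptions for illustration only, not part of the genetics package. mapply walks the two columns in parallel, one pair per row, whereas tapply splits the first vector into groups defined by the second.

```r
# Hypothetical data; the genotype strings are made up for illustration.
df <- data.frame(
  col3 = c("A/A", "A/T", "T/T"),
  col4 = c("C/C", "C/G", "C/C"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for function(a, b) LD(genotype(a), genotype(b)).
pair_rows <- function(a, b) paste(a, b, sep = " | ")

# mapply calls the function once per row, pairing col3[i] with col4[i].
by_row <- mapply(pair_rows, a = df$col3, b = df$col4, USE.NAMES = FALSE)
print(by_row)

# tapply instead groups the col3 values by the distinct values of col4.
by_group <- tapply(df$col3, df$col4, function(x) paste(x, collapse = ","))
print(by_group)
```

With the real function in place of pair_rows, mapply would most likely return a list with one LD result per row, since each call returns an object rather than a single value.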

I made this community wiki because the answer is based on a comment that isn't mine.
