How to use the grep function in R to find the correlation based on sample names
I have two data frames that contain information about genes. Both data frames have the same dimensions (20,000 rows x 50 columns). I also have a file called info that contains the matched subject names between those data frames. I want to grep the names from the info file to find the correlation coefficient between the matched subjects. Here is an example of those files:
df1
gene_name loc1 loc2 ......... loc50
gene1 1 23 25
gene2 24 15 67
df2
gene_name loc1 loc2 ......... loc50
gene1 21 31 55
gene2 2 65 89
info file
subject loc_in_df1 loc_in_df2
1 loc1 loc2
2 loc3 loc46
Try something like the following. First build up a data frame with the columns pulled from df1 and df2 according to the info file:
df <- cbind(df1[, info$loc_in_df1],df2[, info$loc_in_df2])
and then compute the correlation for each row:
cor = apply(df, MARGIN = 1, FUN = function(x) return(cor.test(x[1:50], x[51:100])$estimate))
The 1:50 and 51:100 ranges assume that you have 50 pairings in your info file, but this is all just guessing, as you didn't provide a reproducible sample.
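Since the question has no reproducible sample, here is a minimal sketch on made-up toy data (5 genes, 4 samples, 3 pairings; every value is hypothetical). It assumes that the names in info match the column names of df1 and df2 exactly, so plain indexing by name is enough and grep is not needed. It also assumes that both data frames list the genes in the same row order. The number of pairings is taken from nrow(info) rather than hard-coded as 50, and cor() is used instead of cor.test()$estimate because it returns the same Pearson coefficient. It is not clear from the question whether the correlation is wanted per gene or per subject, so both readings are shown.

```r
# Toy data standing in for the real 20,000 x 50 data frames (hypothetical values)
set.seed(1)
genes   <- paste0("gene", 1:5)
samples <- paste0("loc", 1:4)
m1 <- matrix(rnorm(20), nrow = 5, dimnames = list(NULL, samples))
m2 <- matrix(rnorm(20), nrow = 5, dimnames = list(NULL, samples))
df1 <- data.frame(gene_name = genes, m1, stringsAsFactors = FALSE)
df2 <- data.frame(gene_name = genes, m2, stringsAsFactors = FALSE)

# Pairing table: which column of df1 matches which column of df2
info <- data.frame(subject    = 1:3,
                   loc_in_df1 = c("loc1", "loc3", "loc4"),
                   loc_in_df2 = c("loc2", "loc4", "loc1"),
                   stringsAsFactors = FALSE)

# as.character() guards against factor columns, which would index by integer code
x <- as.matrix(df1[, as.character(info$loc_in_df1)])
y <- as.matrix(df2[, as.character(info$loc_in_df2)])

# Reading 1: one correlation per gene, computed across the matched subjects
gene_cor <- sapply(seq_len(nrow(x)), function(i) cor(x[i, ], y[i, ]))
names(gene_cor) <- df1$gene_name

# Reading 2: one correlation per subject, computed across all genes
subject_cor <- sapply(seq_len(ncol(x)), function(j) cor(x[, j], y[, j]))
names(subject_cor) <- info$subject
```

gene_cor has one value per row of df1, and subject_cor has one value per row of info. On the real data, a gene whose values are constant across the matched samples yields NA with a warning from cor().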