Replacing for loop with apply
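I have a simple for loop that matches row names against another matrix and pulls values from it. It runs rather slowly on a large number of rows. I am trying to convert it into a function and then use apply, but I do not get the same result as the for loop. Can someone tell me what I am doing wrong? Thanks.
Here is the for loop: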
exp_target_com = structure(list(X06...2239_normal = c(12.2528814946075, 8.25298920937508), X06...2239_tumor = c(12.476021286337, 6.08504757235585), Ensembl_Id = structure(c(NA_integer_,
NA_integer_), .Label = "", class = "factor"), HGNC = structure(c(NA_integer_,
NA_integer_), .Label = "", class = "factor")), .Names = c("X06...2239_normal", "X06...2239_tumor", "Ensembl_Id", "HGNC"), class = "data.frame", row.names = c("A_23_P117082", "A_33_P3246448"))
head(exp_target_com)
#>               X06...2239_normal X06...2239_tumor Ensembl_Id HGNC
#> A_23_P117082          12.252881        12.476021       <NA> <NA>
#> A_33_P3246448          8.252989         6.085048       <NA> <NA>
probe_anno = structure(c("A_23_P117082", "A_33_P3246448", "NM_015987", "NM_080671", "NM_015987", "NM_080671", "ENSG00000013583", "ENSG00000152049",
"HEBP1", "KCNE4"), .Dim = c(2L, 5L), .Dimnames = list(c("44693",
"31857"), c("Probe.ID", "SystematicName", "refseq_biomart", "Ensembl_Id",
"HGNC")))
probe_anno
#>            Probe.ID SystematicName refseq_biomart      Ensembl_Id  HGNC
#> 44693  A_23_P117082      NM_015987      NM_015987 ENSG00000013583 HEBP1
#> 31857 A_33_P3246448      NM_080671      NM_080671 ENSG00000152049 KCNE4
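# The annotation columns in the sample data are factors; convert them to
# character so that new values can be assigned without "invalid factor level"
exp_target_com$Ensembl_Id <- as.character(exp_target_com$Ensembl_Id)
exp_target_com$HGNC <- as.character(exp_target_com$HGNC)
# probe_anno is a character matrix here, so columns are indexed with [, "name"]
for (i in seq_len(nrow(exp_target_com))) {
  pos <- which(probe_anno[, "Probe.ID"] == rownames(exp_target_com)[i])
  if (length(pos) > 0) {
    # Use the first match only
    exp_target_com[i, "Ensembl_Id"] <- probe_anno[pos[1], "Ensembl_Id"]
    exp_target_com[i, "HGNC"] <- probe_anno[pos[1], "HGNC"]
  }
}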
Here is the function and the apply call:
get_anno <- function(data_row, probe_anno) {
pos <- which(as.character(probe_anno$Probe.ID) == rownames(data_row))
if (length(pos) > 0) {
data_row$Ensembl_Id <- as.character(probe_anno$Ensembl_Id)[pos[1]]
data_row$HGNC <- as.character(probe_anno$HGNC)[pos[1]]
}
return(data_row)
}
apply(exp_target_com, c(1,2), FUN = function(x) get_anno(x, probe_anno))
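One likely reason the two results differ: with `MARGIN = c(1, 2)`, `apply` calls the function once per cell rather than once per row. It also coerces the data frame to a matrix first, so the function receives a plain atomic value with no row names to match on. The sketch below illustrates this on a hypothetical toy data frame `df` (not the data from the question).

```r
# Hypothetical toy data, not the asker's data
df <- data.frame(a = c(1.5, 2.5), b = c("x", "y"),
                 row.names = c("r1", "r2"), stringsAsFactors = FALSE)

# MARGIN = c(1, 2): the function sees one cell at a time
cell_lengths <- apply(df, c(1, 2), function(x) length(x))

# MARGIN = 1: the function sees one row as a character vector
# (the data frame is coerced to a matrix first, so numbers become strings)
row_classes <- apply(df, 1, function(x) class(x))

# Row names are not available inside the function
row_names_seen <- apply(df, 1, function(x) is.null(rownames(x)))
```

Because `rownames(data_row)` is `NULL` inside the function, the comparison against `Probe.ID` has nothing to match, so the annotation columns are never filled in.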
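I agree with the comments: this looks like it would be simpler and faster with built-in functions such as merge or the equivalent dplyr functions. Here, I convert the row names to a column and use that column to join with probe_anno.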
library(dplyr)
exp_target_com2 <- exp_target_com %>%
select(-3, -4) %>%
tibble::rownames_to_column("Probe.ID") %>%
left_join(probe_anno %>% as.data.frame(), by = ("Probe.ID"))
> exp_target_com2
       Probe.ID X06...2239_normal X06...2239_tumor SystematicName refseq_biomart      Ensembl_Id  HGNC
1  A_23_P117082         12.252881        12.476021      NM_015987      NM_015987 ENSG00000013583 HEBP1
2 A_33_P3246448          8.252989         6.085048      NM_080671      NM_080671 ENSG00000152049 KCNE4
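For comparison, the same lookup can be sketched in base R with `match`, which vectorizes the row-by-row search the loop performs. This is a minimal sketch and not part of the original answer: `expr` and `anno` are small hypothetical stand-ins shaped like `exp_target_com` and `probe_anno`, and `"A_unmatched"` is an invented probe ID added only to show what happens when no annotation exists.

```r
# Hypothetical stand-ins for exp_target_com and probe_anno
expr <- data.frame(
  normal = c(12.25, 8.25, 5.00),
  tumor  = c(12.48, 6.09, 4.00),
  row.names = c("A_23_P117082", "A_33_P3246448", "A_unmatched")
)
anno <- matrix(
  c("A_33_P3246448", "A_23_P117082",
    "ENSG00000152049", "ENSG00000013583",
    "KCNE4", "HEBP1"),
  nrow = 2,
  dimnames = list(NULL, c("Probe.ID", "Ensembl_Id", "HGNC"))
)

# match() returns, for each row name, the index of its first hit in anno
# (NA when there is none), mirroring pos[1] in the loop
idx <- match(rownames(expr), anno[, "Probe.ID"])

# Indexing with NA yields NA, so unmatched probes stay NA
expr$Ensembl_Id <- anno[idx, "Ensembl_Id"]
expr$HGNC       <- anno[idx, "HGNC"]
```

Unlike the join, this keeps the probe IDs as row names and adds only the two annotation columns, which is closer to what the original loop produces.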