
R - assigning value to one column based on a comparison of two other columns

Suppose I have the following data:

SNP eff_allele A1 A2
rs1000000 A A G
rs10000010 C C T
rs1000002 T T C
rs10000023 G T G

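I would like to create a new variable alt_allele that takes the value of column A1 or A2, depending on the value of column eff_allele. If eff_allele equals A1, alt_allele should get the value of A2, and if eff_allele equals A2, alt_allele should get the value of A1. I made two attempts: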

Attempt 1:

if (myData$eff_allele == myData$A1) {
myData$alt_allele <- myData$A2
}
if (myData$eff_allele == myData$A2) {
myData$alt_allele <- myData$A1
}

Attempt 2:

height_fam$alt_allele[height_fam$eff_allele == height_fam$A1] <- height_fam$A2
height_fam$alt_allele[height_fam$eff_allele == height_fam$A2] <- height_fam$A1

Neither of these works... what am I doing wrong? How can I achieve the following update to the data:

SNP eff_allele A1 A2 alt_allele
rs1000000 A A G G
rs10000010 C C T T
rs1000002 T T C C
rs10000023 G T G T

In R and MATLAB, try to avoid loops; they are comparatively slow. Try to solve your problem with vectors instead.

EDIT: Oh, I misread your question, you weren't using loops anyway :)
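As a minimal sketch of the vectorized approach, and not necessarily what the answer's author had in mind: `if` expects a single TRUE or FALSE, so Attempt 1 cannot test each row separately, whereas `ifelse()` evaluates the comparison for all rows at once. The data frame below is rebuilt inline from the question's table so that the example is self-contained; it assumes every eff_allele matches either A1 or A2.

```r
# Example data copied from the question (built inline for illustration)
myData <- data.frame(
  SNP        = c("rs1000000", "rs10000010", "rs1000002", "rs10000023"),
  eff_allele = c("A", "C", "T", "G"),
  A1         = c("A", "C", "T", "T"),
  A2         = c("G", "T", "C", "G"),
  stringsAsFactors = FALSE
)

# Row by row: where eff_allele equals A1 take A2, otherwise take A1
myData$alt_allele <- ifelse(myData$eff_allele == myData$A1,
                            myData$A2,
                            myData$A1)
```

Note that a row whose eff_allele matches neither A1 nor A2 would silently get A1 here, so it may be worth checking for such rows first.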

a = read.table("a.csv", sep = " ", header = TRUE)
# Number of rows
n = nrow(a)
newcol = rep("", n)
A1 = as.character(a$A1)
A2 = as.character(a$A2)
eff_allele = as.character(a$eff_allele)
# a1_ind is TRUE for rows where eff_allele differs from A1,
# i.e. rows where the new column should take the value of A1
a1_ind = eff_allele != A1
newcol[a1_ind] = A1[a1_ind]
newcol[!a1_ind] = A2[!a1_ind]
a = cbind(a, newcol)

The output will be:

         SNP eff_allele A1 A2 newcol
1  rs1000000          A  A  G      G
2 rs10000010          C  C  T      T
3  rs1000002          T  T  C      C
4 rs10000023          G  T  G      T
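Attempt 2 from the question can probably also be repaired directly. The likely problem is that the left-hand side selects only some rows while the right-hand side supplies the whole column, so the values no longer line up. A sketch that applies the same logical mask to both sides, with the question's data rebuilt inline and the helper names `is_a1` and `is_a2` introduced only for illustration:

```r
# Example data copied from the question (built inline for illustration)
height_fam <- data.frame(
  SNP        = c("rs1000000", "rs10000010", "rs1000002", "rs10000023"),
  eff_allele = c("A", "C", "T", "G"),
  A1         = c("A", "C", "T", "T"),
  A2         = c("G", "T", "C", "G"),
  stringsAsFactors = FALSE
)

# Create the column first so that unmatched rows stay NA
height_fam$alt_allele <- NA_character_

# Logical masks (hypothetical helper names)
is_a1 <- height_fam$eff_allele == height_fam$A1
is_a2 <- height_fam$eff_allele == height_fam$A2

# Subset both sides with the same mask so that rows stay aligned
height_fam$alt_allele[is_a1] <- height_fam$A2[is_a1]
height_fam$alt_allele[is_a2] <- height_fam$A1[is_a2]
```

Unlike the `newcol` approach above, rows where eff_allele matches neither column keep NA, which makes such cases easy to spot.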
