I want to create a new column, "non_coded", from three existing columns: allele_2, allele_1 and A1.
The conditions I want satisfied are:
if allele_2 == A1 then non_coded = allele_1
if allele_2 != A1 then non_coded = allele_2
Thanks in advance,
Rad
SNPID chrom STRAND IMPUTED allele_2 allele_1 MAF CALL_RATE HET_RATE
1 rs1000000 12 + Y A G 0.12160 1.00000 0.2146
2 rs10000009 4 + Y G A 0.07888 0.99762 0.1386
HWP RSQ PHYS_POS A1 M1_FRQ M1_INFO M1_BETA M1_SE M1_P
1 1.0000 0.9817 125456933 A 0.1173 0.9452 -0.0113 0.0528 0.83090
2 0.1164 0.8354 71083542 A 0.9048 0.9017 -0.0097 0.0593 0.87000
Hy_MVA$non_coded <- ifelse(Hy_MVA$allele_2 == Hy_MVA$A1, Hy_MVA$allele_1, Hy_MVA$allele_2)
Result (non_coded comes out as numbers instead of allele letters):
SNPID chrom STRAND IMPUTED allele_2 allele_1 MAF CALL_RATE HET_RATE
1 rs1000000 12 + Y A G 0.12160 1.00000 0.2146
2 rs10000009 4 + Y G A 0.07888 0.99762 0.1386
HWP RSQ PHYS_POS A1 M1_FRQ M1_INFO M1_BETA M1_SE M1_P non_coded
1 1.0000 0.9817 125456933 A 0.1173 0.9452 -0.0113 0.0528 0.83090 3
2 0.1164 0.8354 71083542 A 0.9048 0.9017 -0.0097 0.0593 0.87000 3
What I want:
SNPID chrom STRAND IMPUTED allele_2 allele_1 MAF CALL_RATE HET_RATE
1 rs1000000 12 + Y A G 0.12160 1.00000 0.2146
2 rs10000009 4 + Y G A 0.07888 0.99762 0.1386
HWP RSQ PHYS_POS A1 M1_FRQ M1_INFO M1_BETA M1_SE M1_P non_coded
1 1.0000 0.9817 125456933 A 0.1173 0.9452 -0.0113 0.0528 0.83090 G
2 0.1164 0.8354 71083542 A 0.9048 0.9017 -0.0097 0.0593 0.87000 G
As Chase said, use ifelse(). I guess the code then becomes:
non_coded <- ifelse(allele_2 == A1, allele_1, allele_2)
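After seeing the updated question, it makes sense that you get numbers, because allele_1 and allele_2 are factors: ifelse() drops the factor attributes and returns the underlying integer level codes. Wrapping both columns in as.character() should fix this: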
A1 <- c("A","A","B")
allele_1 <- as.factor(c("A","C","C"))
allele_2 <- as.factor(c("A","B","B"))
non_coded <- ifelse(allele_2 == A1, as.character(allele_1), as.character(allele_2))
non_coded
[1] "A" "B" "C"
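To see why the original attempt printed 3 rather than G, here is a minimal sketch on toy vectors. The level set A, C, G, T is an assumption (the question does not show the actual levels of the columns in Hy_MVA); with those levels, G is the third level, so its integer code is 3.

```r
# Toy stand-ins for the first two rows of the question's data.
# The level set below is an assumption, not taken from the real Hy_MVA.
lv <- c("A", "C", "G", "T")
allele_1 <- factor(c("G", "A"), levels = lv)
allele_2 <- factor(c("A", "G"), levels = lv)
A1 <- c("A", "A")

# Without conversion, ifelse() returns the integer level codes of the factors.
codes <- ifelse(allele_2 == A1, allele_1, allele_2)

# With as.character(), the allele letters survive.
fixed <- ifelse(allele_2 == A1, as.character(allele_1), as.character(allele_2))

print(codes)
print(fixed)
```

If the columns were read in as character rather than factor (the default for data.frame() and read.table() since R 4.0.0), the plain ifelse() call would already return letters.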
Since you want non_coded to be one of two values:
Hy_MVA$non_coded <- Hy_MVA$allele_2
Hy_MVA$non_coded[Hy_MVA$allele_2 == Hy_MVA$A1] <- Hy_MVA$allele_1[Hy_MVA$allele_2 == Hy_MVA$A1]
That starts from the allele_2 values and overwrites them with the allele_1 values only in the rows where allele_2 == A1. It sounds as though you might have a problem with ifelse() converting a factor to its numeric codes.
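The indexed replacement above can be sketched on a small data frame as follows. The two-row Hy_MVA here is a hypothetical stand-in built from the first two rows of the question, and the as.character() calls are an added assumption so that the new column holds plain text even when the two allele columns are factors with different level sets.

```r
# Hypothetical two-row stand-in for Hy_MVA, with the allele columns as factors.
Hy_MVA <- data.frame(
  allele_2 = factor(c("A", "G")),
  allele_1 = factor(c("G", "A")),
  A1       = c("A", "A"),
  stringsAsFactors = FALSE
)

# Rows where allele_2 matches A1 take their value from allele_1.
match_rows <- as.character(Hy_MVA$allele_2) == Hy_MVA$A1

# Start from allele_2 as text, then overwrite only the matching rows.
Hy_MVA$non_coded <- as.character(Hy_MVA$allele_2)
Hy_MVA$non_coded[match_rows] <- as.character(Hy_MVA$allele_1[match_rows])

print(Hy_MVA)
```

Both rows end up with G in non_coded, which matches the output asked for in the question.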