Replace DNA nucleotide at a given position in a DNA sequence using a for loop
In an R data frame, I am trying to replace one nucleotide in each WT.seq sequence with the nucleotide from the mutation column, at the position given in the position column.
This is my data frame:
  transcript position ref mutation         type    WT.seq
1       trx1        5   A        G substitution    ATAAAA
2       trx2        3   C        A substitution    CCCCCC
3       trx3        7   T        C substitution AAAAAATGG
Expected output in the data frame:
  transcript position ref mutation         type    WT.seq
1       trx1        5   A        G substitution    ATAAGA
2       trx2        3   C        A substitution    CCACCC
3       trx3        7   T        C substitution AAAAAACGG
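To make the example reproducible without the sequences.txt file, the input table can be built directly. This is a minimal sketch: the name dat is an assumption (it matches the name used in the answer), and the values are copied from the table in the question.

```r
# Rebuild the example input from the question (values copied from the table).
# `dat` is an assumed name for the data frame.
dat <- data.frame(
  transcript = c("trx1", "trx2", "trx3"),
  position   = c(5L, 3L, 7L),
  ref        = c("A", "C", "T"),
  mutation   = c("G", "A", "C"),
  type       = "substitution",
  WT.seq     = c("ATAAAA", "CCCCCC", "AAAAAATGG"),
  stringsAsFactors = FALSE  # keep sequences as character, not factor
)

# Optional sanity check: the base at `position` should equal `ref` in every row.
all(substr(dat$WT.seq, dat$position, dat$position) == dat$ref)
```

If the sanity check returns FALSE for real data, the position column may be 0-based or refer to a different coordinate system than the sequence.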
Explanation
The WT.seq column contains DNA sequences. For example, the first row of WT.seq holds the sequence ATAAAA, and I have to put the nucleotide G (mutation column, 1st row) at the 5th position of ATAAAA. After replacing the 5th position with G, the sequence becomes ATAAGA. The position number comes from the position column, 1st row. I have to do this for all rows in the data frame. My data frame contains thousands of rows.
In the output above, I did this for the first row using the following code:
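# Read the tab-separated table; read.table() already returns a data frame
DNA_seq <- read.table("sequences.txt", sep = "\t", header = TRUE)
df <- DNA_seq
# Row 1 only: overwrite WT.seq (column 6) at position (column 2) with mutation (column 4)
substring(df[1, 6], first = df[1, 2]) <- df[1, 4]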
I want to run a for loop over the remaining rows so that every mutation nucleotide is substituted into the WT.seq column at the position given in the position column.
You could strsplit each sequence into single characters, replace the character at position with mutation inside Map, and paste the pieces back together.
transform(dat, WT.mut=Map(replace, strsplit(WT.seq, ''), position, mutation) |>
            sapply(paste, collapse=''))
#   transcript position ref mutation         type    WT.seq    WT.mut
# 1       trx1        5   A        G substitution    ATAAAA    ATAAGA
# 2       trx2        3   C        A substitution    CCCCCC    CCACCC
# 3       trx3        7   T        C substitution AAAAAATGG AAAAAACGG
I used an extra column to demonstrate; just replace WT.mut= with WT.seq= to overwrite the original column.
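Since the question asks for a for loop, here is a minimal sketch of that route, next to a vectorized alternative. This is not the method used in the answer above: it swaps in base R's substr<- replacement function, which accepts a vector of start and stop positions. The names dat, mut_loop, and mut_vec are assumptions, and the data frame is rebuilt here from the table in the question so the block runs on its own.

```r
# Example input, copied from the question; `dat` is an assumed name.
dat <- data.frame(
  transcript = c("trx1", "trx2", "trx3"),
  position   = c(5L, 3L, 7L),
  ref        = c("A", "C", "T"),
  mutation   = c("G", "A", "C"),
  type       = "substitution",
  WT.seq     = c("ATAAAA", "CCCCCC", "AAAAAATGG"),
  stringsAsFactors = FALSE
)

# Option 1: explicit for loop, one row at a time.
mut_loop <- dat$WT.seq
for (i in seq_len(nrow(dat))) {
  # Replace exactly one character: start and stop are both `position`.
  substr(mut_loop[i], dat$position[i], dat$position[i]) <- dat$mutation[i]
}

# Option 2: the same replacement without a loop, since substr<- is vectorized.
mut_vec <- dat$WT.seq
substr(mut_vec, dat$position, dat$position) <- dat$mutation

# Both routes should agree.
stopifnot(identical(mut_loop, mut_vec))
mut_vec
# [1] "ATAAGA"    "CCACCC"    "AAAAAACGG"
```

Assigning mut_vec back with dat$WT.seq <- mut_vec overwrites the original column. For a data frame with thousands of rows, the vectorized form avoids the per-row overhead of the loop, though either should be fast at that size.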