Adding column based on the matching value in second dataframe
Hello all,
I'm trying to add a new column next to my REP column based on a condition from a second data frame. For example, if 1 from Block1 is present in GEN (as G1) and its REP is R1, then the new column should contain 1. I can do it with case_when as below:
df<-df%>%mutate(Block = case_when(
GEN=="G132" & REP=="R1" ~ "1",
GEN=="G100" & REP=="R1" ~ "1",
GEN=="G120" & REP=="R1" ~ "1",
GEN=="G58" & REP=="R1" ~ "1",
GEN=="G48" & REP=="R1" ~ "1",
GEN=="G125" & REP=="R1" ~ "1",
GEN=="G1" & REP=="R1" ~ "1",
GEN=="G29" & REP=="R1" ~ "1",
GEN=="G42" & REP=="R1" ~ "1",
TRUE~GEN
))
I was wondering if I could loop it somehow. I have 144 GEN values, each replicated twice, and they fall into 32 blocks.
Thank you
From how you describe your problem, I would go about it differently from what you have illustrated in your example. To avoid having to write a case_when condition for each column in your block data frame, convert your "block" data frame into long format, merge it with the first data frame on the matching GEN values, and then adjust the block values to your conditions.
library(dplyr)
library(tidyr) # for wide to long format
#first data frame
df <- data.frame(GEN = c("G1", "G1", "G2", "G2", "G3", "G3", "G4", "G4", "G5", "G5"),
REP = rep(c("R1", "R2"), 5))
#second data frame
df2 <- data.frame(Block1 = c(132,100,120,58,48,125,1,29,142),
Block2 = c(2,107,113,89,87,75,38,70,81),
Block3 = c(136,134,53,106,143,63,22,40,56),
Block4 = c(112,144,122,32,39,50,130,74,13))
#take the blocks data frame from wide to long format using tidyr library
blocks <- df2 %>% pivot_longer(cols=names(df2),
names_to='Block',
values_to='GEN')
blocks$GEN <- paste0("G", blocks$GEN) #add a G before the number so that values can be matched
blocks$Block <- gsub("Block", "", blocks$Block) # remove "Block" from the string so that only the number is left
res <- merge(df, blocks, by = "GEN", all.x = TRUE) # merge the two data frames, keeping every row of df
# here the other condition is applied: rows with REP "R2" do not count as a match, and rows with no match at all also get the GEN column value
res[is.na(res$Block) | res$REP == "R2", "Block"] <- res[is.na(res$Block) | res$REP == "R2", "GEN"]
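The same idea can probably also be written as a single dplyr/tidyr pipeline. The following is only a minimal sketch, assuming a tidyr version that provides pivot_longer() with names_prefix, and reusing the toy data from above; the name res2 is made up here for illustration. Unlike merge(), left_join() keeps the row order of df.

```r
library(dplyr)
library(tidyr)

# Same toy data as above
df <- data.frame(GEN = c("G1", "G1", "G2", "G2", "G3", "G3", "G4", "G4", "G5", "G5"),
                 REP = rep(c("R1", "R2"), 5))
df2 <- data.frame(Block1 = c(132, 100, 120, 58, 48, 125, 1, 29, 142),
                  Block2 = c(2, 107, 113, 89, 87, 75, 38, 70, 81),
                  Block3 = c(136, 134, 53, 106, 143, 63, 22, 40, 56),
                  Block4 = c(112, 144, 122, 32, 39, 50, 130, 74, 13))

# Wide to long; names_prefix strips "Block" so only the block number remains
blocks <- df2 %>%
  pivot_longer(cols = everything(),
               names_to = "Block",
               names_prefix = "Block",
               values_to = "GEN") %>%
  mutate(GEN = paste0("G", GEN)) # prefix with "G" so the keys match df$GEN

# res2 is a hypothetical name; Block is kept only for matched R1 rows,
# every other row falls back to its GEN value
res2 <- df %>%
  left_join(blocks, by = "GEN") %>%
  mutate(Block = if_else(REP == "R1" & !is.na(Block), Block, GEN))
```

This assumes each GEN value appears in only one block; if a value were listed in several blocks, the join would return one row per match.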
Hope that makes sense, and that I have understood the desired solution correctly!