Add a column to a dataframe based on matching values in a second dataframe

Question

Hello all,

I'm trying to add a new column next to my REP column based on a condition from a second data frame. For example, if 1 from Block1 is present in GEN and its REP is R1, then the new column should contain 1. I can do it with case_when as below:

df<-df%>%mutate(Block = case_when(
  GEN=="G132" & REP=="R1" ~ "1",
  GEN=="G100" & REP=="R1" ~ "1",
  GEN=="G120" & REP=="R1" ~ "1",
  GEN=="G58" & REP=="R1" ~ "1",
  GEN=="G48" & REP=="R1" ~ "1",
  GEN=="G125" & REP=="R1" ~ "1",
  GEN=="G1" & REP=="R1" ~ "1",
  GEN=="G29" & REP=="R1" ~ "1",
  GEN=="G42" & REP=="R1" ~ "1",
  TRUE~GEN
))

I was wondering if I could loop it somehow. I have 144 GEN replicated twice, and they fall into 32 blocks.

Thank you

Answer 1

From how you describe your problem, I would go about it in a different way than what you have illustrated in your example. To avoid having to write a case_when condition for every value in every block, convert your "block" data frame into long format, merge it with the first data frame on the matching GEN values, and then adjust the Block values to your conditions.

library(dplyr) # for the %>% pipe
library(tidyr) # for wide to long format

#first data frame
df <- data.frame(GEN = c("G1", "G1", "G2", "G2", "G3", "G3",  "G4", "G4",  "G5", "G5"), 
                 REP = rep(c("R1", "R2"), 5))

#second data frame
df2 <- data.frame(Block1 = c(132,100,120,58,48,125,1,29,142),
                  Block2 = c(2,107,113,89,87,75,38,70,81),
                  Block3 = c(136,134,53,106,143,63,22,40,56),
                  Block4 = c(112,144,122,32,39,50,130,74,13))

#take the blocks data frame from wide to long format using tidyr library
blocks <- df2 %>% pivot_longer(cols=names(df2),
                               names_to='Block',
                               values_to='GEN')

blocks$GEN <- paste0("G", blocks$GEN) #add a G before the number so that values can be matched
blocks$Block <- gsub("Block", "", blocks$Block) #remove "Block" from the string so that only the number is left

res <- merge(df, blocks, by = "GEN", all.x = TRUE) #left merge: keep every row of df

#rows with REP == "R2" and rows with no matching block fall back to the GEN column value
no_block <- is.na(res$Block) | res$REP == "R2"
res[no_block, "Block"] <- res[no_block, "GEN"]
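
The same idea can also be written as a single dplyr pipeline. This is only a sketch under the same assumptions as the code above: the toy data is reused, R2 rows and unmatched GEN values fall back to the GEN value, and the names lookup and res2 are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Same toy data as above
df <- data.frame(GEN = rep(paste0("G", 1:5), each = 2),
                 REP = rep(c("R1", "R2"), 5),
                 stringsAsFactors = FALSE)

df2 <- data.frame(Block1 = c(132, 100, 120, 58, 48, 125, 1, 29, 142),
                  Block2 = c(2, 107, 113, 89, 87, 75, 38, 70, 81),
                  Block3 = c(136, 134, 53, 106, 143, 63, 22, 40, 56),
                  Block4 = c(112, 144, 122, 32, 39, 50, 130, 74, 13))

# Long lookup table: one row per (Block, GEN) pair
lookup <- df2 %>%
  pivot_longer(cols = everything(), names_to = "Block", values_to = "GEN") %>%
  mutate(GEN = paste0("G", GEN),          # 1 -> "G1", to match df$GEN
         Block = sub("Block", "", Block)) # "Block1" -> "1"

# Left join keeps every row of df; unmatched rows get NA in Block
res2 <- df %>%
  left_join(lookup, by = "GEN") %>%
  mutate(Block = if_else(is.na(Block) | REP == "R2", GEN, Block))
```

One difference from merge() is that left_join() keeps the rows in the original order of df, whereas merge() sorts the result by the key column by default.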

Hope that makes sense, and that I have understood the desired solution correctly!

Solution 1
0 2023-01-03 12:51:19
