How to grep a group based on string in another column that doesn't occur in each observation using R?
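I had to simplify an earlier question that failed.
I want to extract entire groups, identified by "id", that contain a string ("inter" or "high") in another column called "strmatch". The string does not occur in every observation of a group, but if it occurs anywhere in the group, I want to assign the whole group to the corresponding data frame.
The data frame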
df <- data.frame(id = c("a", "a", "b", "b","c", "c","d","d"),
std = c("y", "y","n","n","y","y","n","n"),
strmatch = c("alpha","TMB-inter","beta","TMB-high","gamma","delta","epsilon","TMB-inter"))
It looks like this
id std strmatch
a y alpha
a y TMB-inter
b n beta
b n TMB-high
c y gamma
c y delta
d n epsilon
d n TMB-inter
Expected result
dfa
id std strmatch
a y alpha
a y TMB-inter
d n epsilon
d n TMB-inter
dfb
id std strmatch
b n beta
b n TMB-high
dfc
id std strmatch
c y gamma
c y delta
What I tried
split(df, grepl("high", df$strmatch))
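This gives only two data frames: one containing the rows that match "high" and another containing all the remaining rows, so the groups are not kept together.
Thanks a lot for your help.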
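You could divide this into two parts. First find the values that match "inter|high" and put the groups containing each of them into separate data frames, then find the groups that do not match any of the unique_vals.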
unique_vals <- unique(grep("inter|high", df$strmatch, value = TRUE))
c(lapply(unique_vals, function(x) subset(df, id %in% id[strmatch == x])),
list(subset(df, !id %in% id[strmatch %in% unique_vals])))
#[[1]]
# id std strmatch
#1 a y alpha
#2 a y TMB-inter
#7 d n epsilon
#8 d n TMB-inter
#[[2]]
# id std strmatch
#3 b n beta
#4 b n TMB-high
#[[3]]
# id std strmatch
#5 c y gamma
#6 c y delta
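If a named list is more convenient than positional elements, one possible variation is to give every id a single label and then call split() once. This is only a sketch in base R, not part of the answer above: the names row_label, group_label and res, and the label "none", are made up for illustration. It also assumes each id contains at most one of the two keywords; a group containing both would land only under the first label found, whereas the lapply() approach above would return it in both data frames.

```r
# Sample data from the question
df <- data.frame(id = c("a", "a", "b", "b", "c", "c", "d", "d"),
                 std = c("y", "y", "n", "n", "y", "y", "n", "n"),
                 strmatch = c("alpha", "TMB-inter", "beta", "TMB-high",
                              "gamma", "delta", "epsilon", "TMB-inter"),
                 stringsAsFactors = FALSE)

# Row-level label: "inter", "high", or NA when neither keyword occurs
row_label <- ifelse(grepl("inter", df$strmatch), "inter",
                    ifelse(grepl("high", df$strmatch), "high", NA_character_))

# Group-level label: the first non-NA row label within each id,
# or "none" (hypothetical label) when no row of the group matched
group_label <- ave(row_label, df$id, FUN = function(x) {
  hit <- x[!is.na(x)]
  if (length(hit) > 0) hit[1] else "none"
})

# One data frame per label; whole groups stay together
res <- split(df, factor(group_label, levels = c("inter", "high", "none")))
```

The pieces can then be picked out by name, for example res$inter for the groups a and d, res$high for group b, and res$none for group c.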