Extracting specific values in R
I am a beginner in R and I have the sample data below. I am trying to extract all the entries that have two "d7" and one "d1" for the same Idvalue_number.
Sample name Idvalue_number
a d1 1
f d7 1
b d7 1
s d1 5
g d7 5
r d7 5
z d1 7
y d7 7
d d1 7
Expected output
a d1 1
f d7 1
b d7 1
s d1 5
g d7 5
r d7 5
Here is some code I have tried, which does not give me the desired output:
d1d7 <- data_ %>%
group_by(dvalue_number) %>%
filter(n() >= 3 & any(name == first(name)))
Could someone help me here? Thanks in advance.
One option is to filter on the frequency of 'd1' and 'd7' within each 'Idvalue_number':
library(dplyr)
data_ %>%
group_by(Idvalue_number) %>%
filter(n() >= 3, sum(name == 'd1') == 1, sum(name == "d7")== 2)
# A tibble: 6 x 3
# Groups: Idvalue_number [2]
# Sample name Idvalue_number
# <chr> <chr> <int>
#1 a d1 1
#2 f d7 1
#3 b d7 1
#4 s d1 5
#5 g d7 5
#6 r d7 5
The sample data used above:
data_ <- structure(list(Sample = c("a", "f", "b", "s", "g", "r", "z",
"y", "d"), name = c("d1", "d7", "d7", "d1", "d7", "d7", "d1",
"d7", "d1"), Idvalue_number = c(1L, 1L, 1L, 5L, 5L, 5L, 7L, 7L,
7L)), class = "data.frame", row.names = c(NA, -9L))
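If dplyr is not available, the same group-wise filter can probably be sketched in base R with ave(). This is only a sketch: n_d1, n_d7, keep and result are names made up for illustration, and data_ is rebuilt here so the snippet runs on its own.

```r
# Same sample data as in the question, rebuilt so the snippet is self-contained
data_ <- data.frame(
  Sample = c("a", "f", "b", "s", "g", "r", "z", "y", "d"),
  name = c("d1", "d7", "d7", "d1", "d7", "d7", "d1", "d7", "d1"),
  Idvalue_number = c(1L, 1L, 1L, 5L, 5L, 5L, 7L, 7L, 7L),
  stringsAsFactors = FALSE
)

# For every row, count how many "d1" / "d7" entries its Idvalue_number group has
n_d1 <- ave(as.integer(data_$name == "d1"), data_$Idvalue_number, FUN = sum)
n_d7 <- ave(as.integer(data_$name == "d7"), data_$Idvalue_number, FUN = sum)

# Keep the rows whose group has exactly one "d1" and exactly two "d7"
keep <- n_d1 == 1 & n_d7 == 2
result <- data_[keep, ]
print(result)
```

ave() returns a vector as long as the input, so the counts line up row by row with data_ and can be used directly as a logical index.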
One way that you can do it is shown below.
library(tidyverse)
#create a dataframe for example
df = data.frame(Sample = c("a", "f", "b", "s", "g", "r", "z", "y", "d"),
name = c("d1", "d7", "d7", "d1", "d7", "d7", "d7", "d7","d7"),
Idvalue_number = c(1, 1, 1, 5, 5, 5, 7, 7, 7))
df %>%
  group_by(Idvalue_number, name) %>%
  summarise(total = n()) %>%
  filter((name == "d1" & total == 1) | (name == "d7" & total == 2))
Idvalue_number name total
<dbl> <fct> <int>
1 1 d1 1
2 1 d7 2
3 5 d1 1
4 5 d7 2
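Note that this summary returns counts per (Idvalue_number, name) pair rather than the original rows, and each pair is tested on its own, so an Idvalue_number with one "d1" but three "d7" would still keep its "d1" line. If the original rows are wanted, one possible sketch is to count per Idvalue_number and join back. Here ids, n_d1, n_d7 and result are illustrative names, not part of the answer.

```r
library(dplyr)

# Example data frame from this answer, rebuilt so the snippet is self-contained
df <- data.frame(
  Sample = c("a", "f", "b", "s", "g", "r", "z", "y", "d"),
  name = c("d1", "d7", "d7", "d1", "d7", "d7", "d7", "d7", "d7"),
  Idvalue_number = c(1, 1, 1, 5, 5, 5, 7, 7, 7),
  stringsAsFactors = FALSE
)

# One row per Idvalue_number, holding the counts of "d1" and "d7" in that group;
# keep only the ids that satisfy both conditions at the same time
ids <- df %>%
  group_by(Idvalue_number) %>%
  summarise(n_d1 = sum(name == "d1"), n_d7 = sum(name == "d7")) %>%
  filter(n_d1 == 1, n_d7 == 2)

# semi_join() keeps the rows of df whose Idvalue_number appears in ids
result <- semi_join(df, ids, by = "Idvalue_number")
print(result)
```

Checking both counts on the same summary row is what ties the "one d1" and "two d7" conditions to a single Idvalue_number.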