
How to select data based on two conditions?

Using R, I want to prepare my data for analysis and keep only cows that have been mated with both an X (= breed 1) bull and a Y (= breed 2) bull. For now my data looks as follows:

Cow  Parity  Bullbreed
1    1       X
1    2       X
1    3       Y
2    1       X
2    2       X
2    3       X
3    1       X
3    2       Y
3    3       Y
4    1       Y
4    2       Y
4    3       Y

Cows 1 and 3 have been pregnant by two different bull breeds, whereas cows 2 and 4 have only been pregnant by one bull breed. I therefore want to remove cow 2 and cow 4 (and all other animals that have been pregnant by only one bull breed) from my data so that it looks like this:

Cow  Parity  Bullbreed
1    1       X
1    2       X
1    3       Y
3    1       X
3    2       Y
3    3       Y

In my real dataset I also have only two bull breeds, but the cow numbers are more specific identifiers rather than 1, 2, 3, 4, ..., N.

Is there an easy way to do this selection?

I tried checking 'by hand' which cows were pregnant by only one bull breed, but my data consists of over 600,000 rows. First checking which animals have only been pregnant by breed X or only by breed Y, and then deleting those from the data, therefore takes too long.

Using dplyr::n_distinct you could do:

library(dplyr)

dat |> 
  group_by(Cow) |> 
  filter(n_distinct(Bullbreed) > 1) |> 
  ungroup()
#> # A tibble: 6 × 3
#>     Cow Parity Bullbreed
#>   <int>  <int> <chr>    
#> 1     1      1 X        
#> 2     1      2 X        
#> 3     1      3 Y        
#> 4     3      1 X        
#> 5     3      2 Y        
#> 6     3      3 Y
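
If dplyr is not available, the same idea (count the distinct breeds per cow, keep cows with more than one) can be sketched in base R. This is only an illustrative alternative, not part of the approach above; the names n_breeds, keep_ids and res_base are made up for the example, and the data frame is repeated so the snippet runs on its own.

```r
# Same toy data as in the DATA section of this answer
dat <- data.frame(
  Cow = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L),
  Parity = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L),
  Bullbreed = c("X", "X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y")
)

# Number of distinct bull breeds per cow; the result is named by cow id
n_breeds <- tapply(dat$Bullbreed, dat$Cow, function(x) length(unique(x)))

# Ids (as character strings) of cows mated with more than one breed
keep_ids <- names(n_breeds)[n_breeds > 1]

# Keep only the rows that belong to those cows
res_base <- dat[as.character(dat$Cow) %in% keep_ids, ]
res_base
```

The per-cow count is computed once per group, so this should scale to a few hundred thousand rows without trouble, although that has not been timed on the real dataset.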

DATA

dat <- data.frame(
               Cow = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L),
            Parity = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L),
         Bullbreed = c("X","X","Y","X","X","X",
                       "X","Y","Y","Y","Y","Y")
)
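
One caveat worth keeping in mind: counting distinct values treats a missing breed (NA) as a value of its own, so a cow with one X mating and one unrecorded mating would also pass a "more than one distinct breed" filter. If the real data may contain NAs, an explicit check that both "X" and "Y" occur might be safer. The sketch below assumes the breeds are coded exactly as "X" and "Y"; cow 5 is a hypothetical extra animal added only to illustrate the NA case and is not part of the question's data.

```r
dat <- data.frame(
  Cow = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L),
  Parity = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L),
  Bullbreed = c("X", "X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y")
)

# Hypothetical cow 5 (not in the original data): one X mating, one missing breed
dat2 <- rbind(
  dat,
  data.frame(Cow = c(5L, 5L), Parity = c(1L, 2L), Bullbreed = c("X", NA))
)

# TRUE for a cow only if both "X" and "Y" appear among its matings
has_both <- tapply(dat2$Bullbreed, dat2$Cow, function(x) all(c("X", "Y") %in% x))

# Keep the rows of cows for which both breeds were found
res_both <- dat2[as.character(dat2$Cow) %in% names(has_both)[has_both], ]
res_both
```

Here cow 5 is dropped because "Y" never occurs for it, whereas a plain distinct-value count would have kept it.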
