
Subset a dataframe - grouped and multiple values R

I am trying to subset a data frame by multiple values in one column.

The input is the following:

[image: input]

The output should be:

[image: output]

So I want the new data frame to contain only the rows of shops whose "Auto" column contains both 0 AND 1, with the check done per shop.

I have already tried this, but it doesn't work:

test <- subset(rawdata, Auto == 0 & Auto == 1)
test <- subset(rawdata, min(Auto) == 0 & max(Auto) == 1)
test <- rawdata[which(rawdata$Auto == 0 & rawdata$Auto == 1), ]

Thanks for any help. Regards

It is not very clear from your question what you are trying to do. If I interpreted it correctly, you want to keep every row of the shops in which both 1s and 0s occur.

To do this, one possible solution is to count the number of rows that each shop has and check whether the sum of Auto for that shop equals the row count (meaning all 1s) or equals 0 (meaning all 0s).

If either criterion is met, you want all rows of that shop to be excluded.

Look into the function summarise.
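The count-and-sum idea described above could be sketched roughly as follows. This is only an illustration, not a tested solution for your real data: the data frame and the names shop_summary, n_rows, auto_sum and mixed_shops are made up here, and it assumes Auto contains only 0 and 1 with no missing values.

```r
library(dplyr)

# Made-up example data; Auto is assumed to hold only 0 and 1 (no NA).
df <- data.frame(Shop  = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 Order = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                 Auto  = c(0, 0, 1, 1, 1, 1, 0, 0, 0))

# One row per shop: how many rows it has and the sum of Auto.
shop_summary <- df %>%
  group_by(Shop) %>%
  summarise(n_rows = n(), auto_sum = sum(Auto), .groups = "drop")

# A shop is mixed when the sum is neither 0 (all 0s) nor the row count (all 1s).
mixed_shops <- shop_summary %>%
  filter(auto_sum != 0, auto_sum != n_rows) %>%
  pull(Shop)

# Keep every row of the mixed shops.
result <- df %>% filter(Shop %in% mixed_shops)
```

With this example data only shop 1 has both values, so result holds its three rows.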

Is this what you're looking for?

library(magrittr)
library(dplyr)

#Toy data.
df <- data.frame(Shop = c(1, 1, 1, 2, 2, 2, 3, 3, 3), 
                 Order = c(1, 2, 3, 1, 2, 3, 1 , 2, 3), 
                 Auto = c(0, 0, 1, 1, 1, 1, 0, 0, 0))

#Solution.
df %>% 
  group_by(Shop) %>%
  filter(n_distinct(Auto) > 1) %>%
  ungroup()

# # A tibble: 3 × 3
#    Shop Order  Auto
#   <dbl> <dbl> <dbl>
# 1     1     1     0
# 2     1     2     0
# 3     1     3     1

The key idea here is using dplyr::n_distinct() to count the number of unique values in Auto within each Shop group, and then keeping only those groups that have more than one distinct value.
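To see what the filter condition is based on, it may help to look at the per-shop counts on their own. The sketch below uses made-up data and a hypothetical column name n_auto. Note that a count greater than 1 only means "both 0 and 1 are present" under the assumption that Auto never takes any other value.

```r
library(dplyr)

# Made-up example data; Auto is assumed to hold only 0 and 1.
df <- data.frame(Shop  = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 Order = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                 Auto  = c(0, 0, 1, 1, 1, 1, 0, 0, 0))

# Number of distinct Auto values per shop.
distinct_per_shop <- df %>%
  group_by(Shop) %>%
  summarise(n_auto = n_distinct(Auto), .groups = "drop")

# Shop 1 has two distinct values (0 and 1); shops 2 and 3 have one each.
print(distinct_per_shop)
```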

Please do not add data as images; provide data in a reproducible format.

You can select the shops where both 0 and 1 are present.

library(dplyr)

df %>%
  group_by(Shop) %>%
  filter(all(c(0, 1) %in% Auto)) %>%
  ungroup

#   Shop Order  Auto
#  <dbl> <dbl> <dbl>
#1     1     1     0
#2     1     2     0
#3     1     3     1

data

df <- structure(list(Shop = c(1, 1, 1, 2, 2, 2, 3, 3, 3), Order = c(1, 
2, 3, 1, 2, 3, 1, 2, 3), Auto = c(0, 0, 1, 1, 1, 1, 0, 0, 0)), 
class = "data.frame", row.names = c(NA, -9L))
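If you would rather not depend on dplyr, the same per-shop check can probably be written in base R with ave(). This is a sketch under the same assumptions as above (Auto holds only 0 and 1, no missing values); the names keep and result are made up for illustration.

```r
# Same example data as above, rebuilt here so the snippet runs on its own.
df <- data.frame(Shop  = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 Order = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                 Auto  = c(0, 0, 1, 1, 1, 1, 0, 0, 0))

# For each shop, test whether both 0 and 1 occur in Auto.
# ave() recycles the single TRUE/FALSE across all rows of the shop
# and returns it as numeric (1/0) because Auto is numeric.
keep <- ave(df$Auto, df$Shop, FUN = function(x) all(c(0, 1) %in% x))

# Keep only the rows of shops that passed the test.
result <- df[as.logical(keep), ]
```

Unlike comparing min and max inside subset(), which looks at the whole column at once, ave() evaluates the condition separately for each shop.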
