
Subsetting a dataset in R based on the number of observations that meet criteria [R]

I have a dataset that looks like this:

Employee    Month       CSAT
ABROWN      February    4
ABROWN      January     5
ABROWN      March       3
ABROWN      March       5
JSMITH      February    5
JSMITH      January     3
JSMITH      February    5
JSMITH      March       5
JSMITH      February    5
JSMITH      January     4

Except, of course, it is much larger. I'm trying to run an analysis on Employee by month, but I don't want to include employees who don't have enough observations in a given month.

For instance, let's say I only want to keep observations where an Employee has at least two CSAT scores in the same month. In this case we would filter out observations 1, 2, and 8.

I've messed with this for too long and am at a loss.

We can do this with data.table. Convert the 'data.frame' to a 'data.table' ( setDT(df1) ), group by 'Employee' and 'Month', and if the number of observations in a group ( .N ) is greater than 1, return the Subset of the Data.table ( .SD ) for that group.

library(data.table)
setDT(df1)[, if(.N >1) .SD,  by = .(Employee, Month)]
#   Employee    Month CSAT
#1:   ABROWN    March    3
#2:   ABROWN    March    5
#3:   JSMITH February    5
#4:   JSMITH February    5
#5:   JSMITH February    5
#6:   JSMITH  January    3
#7:   JSMITH  January    4

Or use dplyr with similar logic: apply filter after grouping by 'Employee' and 'Month'.

library(dplyr)
df1 %>%
   group_by(Employee, Month) %>%
   filter(n() >1)
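
One thing to be aware of with the dplyr version is that the result is still grouped by 'Employee' and 'Month'. The sketch below rebuilds the example data from the question and adds ungroup() at the end, which is probably what you want before further analysis. It assumes dplyr is installed; the object name result is only illustrative.

```r
library(dplyr)

# Example data from the question, stored as df1 (the name used above).
df1 <- data.frame(
  Employee = c("ABROWN", "ABROWN", "ABROWN", "ABROWN", "JSMITH",
               "JSMITH", "JSMITH", "JSMITH", "JSMITH", "JSMITH"),
  Month    = c("February", "January", "March", "March", "February",
               "January", "February", "March", "February", "January"),
  CSAT     = c(4, 5, 3, 5, 5, 3, 5, 5, 5, 4),
  stringsAsFactors = FALSE
)

# Keep only Employee/Month groups with at least two rows,
# then drop the grouping so later verbs act on the whole table.
result <- df1 %>%
  group_by(Employee, Month) %>%
  filter(n() >= 2) %>%
  ungroup()

print(result)
```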

Or use base R with ave to create a logical index that filters the rows of 'df1'.

df1[with(df1, ave(CSAT, Employee, Month, FUN=length)>1),]
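
To make the base R version easier to check, here is a self-contained sketch that rebuilds the example data from the question and spells out the intermediate step. The names min_obs, group_size, kept and dropped are only illustrative, and the threshold of 2 is the one assumed in the question.

```r
# Example data from the question, stored as df1 (the name used above).
df1 <- data.frame(
  Employee = c("ABROWN", "ABROWN", "ABROWN", "ABROWN", "JSMITH",
               "JSMITH", "JSMITH", "JSMITH", "JSMITH", "JSMITH"),
  Month    = c("February", "January", "March", "March", "February",
               "January", "February", "March", "February", "January"),
  CSAT     = c(4, 5, 3, 5, 5, 3, 5, 5, 5, 4),
  stringsAsFactors = FALSE
)

# Hypothetical threshold: minimum observations per Employee/Month group.
min_obs <- 2

# ave() returns one value per row: the size of that row's
# Employee/Month group (FUN = length counts the CSAT values in it).
group_size <- ave(df1$CSAT, df1$Employee, df1$Month, FUN = length)

# Rows whose group is large enough, and the row numbers that are dropped.
kept    <- df1[group_size >= min_obs, ]
dropped <- which(group_size < min_obs)

print(kept)
print(dropped)
```

With the example data, the dropped rows should be observations 1, 2, and 8, as described in the question.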
