Subset a dataset in R based on the number of observations that meet a criterion [R]

Question

I have a dataset that looks like this:

Employee    Month       CSAT
ABROWN      February    4
ABROWN      January     5
ABROWN      March       3
ABROWN      March       5
JSMITH      February    5
JSMITH      January     3
JSMITH      February    5
JSMITH      March       5
JSMITH      February    5
JSMITH      January     4

Except, of course, much larger. I'm trying to run an analysis on Employee by month, but I don't want to include employees for whom there aren't enough observations in a given month.

For instance, let's say I only want to keep observations where an Employee has at least two CSAT scores in the same month. In this case we would filter out observations 1, 2, and 8.

I've messed with this for too long and am at a loss.

Answer 1

We can do this with data.table. Convert the 'data.frame' to a 'data.table' by reference ( setDT(df1) ), group by 'Employee' and 'Month', and if the number of observations in a group ( .N ) is greater than 1, return the Subset of Data.table ( .SD ) for that group.

library(data.table)
setDT(df1)[, if(.N >1) .SD,  by = .(Employee, Month)]
#   Employee    Month CSAT
#1:   ABROWN    March    3
#2:   ABROWN    March    5
#3:   JSMITH February    5
#4:   JSMITH February    5
#5:   JSMITH February    5
#6:   JSMITH  January    3
#7:   JSMITH  January    4
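
A variation on the same idea, offered as a rough sketch rather than as part of the original answer: store the group size in a column and filter on it. This keeps the rows in their original order and makes the threshold easy to change. The names min_obs, n_obs, dt and kept are made up for illustration, and df1 is rebuilt from the sample data in the question.

```r
library(data.table)

# Sample data from the question, under the name df1 used in the answer
df1 <- data.frame(
  Employee = c(rep("ABROWN", 4), rep("JSMITH", 6)),
  Month    = c("February", "January", "March", "March", "February",
               "January", "February", "March", "February", "January"),
  CSAT     = c(4, 5, 3, 5, 5, 3, 5, 5, 5, 4)
)

min_obs <- 2  # hypothetical threshold: minimum scores per Employee/Month

dt <- as.data.table(df1)                     # works on a copy; df1 is untouched
dt[, n_obs := .N, by = .(Employee, Month)]   # group size stored on every row
kept <- dt[n_obs >= min_obs]                 # rows from groups that are large enough
```

Unlike setDT(df1), as.data.table(df1) leaves the original data.frame unchanged, which may be preferable if df1 is used elsewhere.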

Or using dplyr, with similar logic in filter after grouping by 'Employee' and 'Month':

library(dplyr)
df1 %>%
   group_by(Employee, Month) %>%
   filter(n() >1)
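
One thing to keep in mind with the dplyr version is that the result is still grouped by 'Employee' and 'Month'. The sketch below, which rebuilds df1 from the question's sample data and uses the made-up name kept, adds ungroup() so later steps are not silently applied per group. Treat it as an illustration under those assumptions.

```r
library(dplyr)

# Sample data from the question, under the name df1 used in the answer
df1 <- data.frame(
  Employee = c(rep("ABROWN", 4), rep("JSMITH", 6)),
  Month    = c("February", "January", "March", "March", "February",
               "January", "February", "March", "February", "January"),
  CSAT     = c(4, 5, 3, 5, 5, 3, 5, 5, 5, 4)
)

kept <- df1 %>%
  group_by(Employee, Month) %>%   # one group per employee/month pair
  filter(n() >= 2) %>%            # keep groups with at least two scores
  ungroup()                       # drop the grouping before further analysis
```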

Or using base R, with ave to create a logical index that filters the rows of 'df1':

df1[with(df1, ave(CSAT, Employee, Month, FUN=length)>1),]
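
To see what ave is doing here, it can help to split the one-liner into two steps. The sketch below rebuilds df1 from the question's sample data; n_obs and kept are made-up names. ave returns a vector as long as CSAT in which each element is the size of that row's Employee/Month group.

```r
# Sample data from the question, under the name df1 used in the answer
df1 <- data.frame(
  Employee = c(rep("ABROWN", 4), rep("JSMITH", 6)),
  Month    = c("February", "January", "March", "March", "February",
               "January", "February", "March", "February", "January"),
  CSAT     = c(4, 5, 3, 5, 5, 3, 5, 5, 5, 4)
)

# Size of each row's Employee/Month group, aligned with the rows of df1
n_obs <- ave(df1$CSAT, df1$Employee, df1$Month, FUN = length)
# n_obs is 1 1 2 2 3 2 3 1 3 2

# Rows 1, 2 and 8 have a group size of 1 and are dropped
kept <- df1[n_obs >= 2, ]
```

Note that length counts NA scores as observations too, so rows with missing CSAT values would still contribute to the group size.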

Answer 1
0 Accepted 2016-09-19 14:56:41
