Create groups in a data frame in R

Hello, I have the following data frame:


n<-c(2,8,9,3,7,5,7,6,3,8,2,9,10,1)
tab<-data.frame("note"=n)

I need to add a new column that classifies each number: if it is less than 3 it will be group 1, if it is greater than 5 it will be group 2, from 5 to 7 it will be group 3, and from 7 to 10 group 4, as shown below.

[Image: expected result]

One option is to use case_when to define the groups:

library(dplyr)

tab %>%
  mutate(groups = case_when(note < 3 ~ 1,
                           note >= 3 & note < 7 ~ 2,
                           note == 7 ~ 3,
                           TRUE ~ 4))

Or another option using cut:

tab %>%
  mutate(groups = cut(note, breaks = c(0, 2, 6, 7, 10), labels = 1:4))

Output

   note groups
1     2     1
2     8     4
3     9     4
4     3     2
5     7     3
6     5     2
7     7     3
8     6     2
9     3     2
10    8     4
11    2     1
12    9     4
13   10     4
14    1     1
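
The break points c(0, 2, 6, 7, 10) give these groups because cut uses right-closed intervals by default (right = TRUE), so each value is assigned to the interval whose upper bound it does not exceed. The following is a minimal sketch that leaves out the labels argument so the intervals themselves are visible; the variable name with_intervals is only illustrative.

```r
note <- c(2, 8, 9, 3, 7, 5, 7, 6, 3, 8, 2, 9, 10, 1)

# With the default right = TRUE, each interval is (lower, upper],
# so 2 falls in (0, 2], 6 in (2, 6], 7 in (6, 7] and 8 in (7, 10].
with_intervals <- cut(note, breaks = c(0, 2, 6, 7, 10))
print(table(with_intervals))

# A value equal to the lowest break is outside every interval and
# becomes NA unless include.lowest = TRUE is set.
at_lowest_default <- cut(0, breaks = c(0, 2, 6, 7, 10))
at_lowest_included <- cut(0, breaks = c(0, 2, 6, 7, 10), include.lowest = TRUE)
print(at_lowest_default)
print(at_lowest_included)
```

This edge case does not affect the data in the question, since the smallest note is 1, but it matters if a 0 can appear.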

Base R (borrowing heavily from latemail and AndrewGB) with a reusable function:

# Function to group the numeric data:
# group_numeric_data => function()
group_numeric_data <- function(num_vec, break_points){

  # Compute the group values: group_vals => integer vector
  group_vals <- seq_along(break_points)[-length(break_points)]

  # Compute the groups: res => factor vector
  res <- cut(
    num_vec,
    breaks = break_points,
    labels = group_vals
  )

  # Explicitly define returned object: factor vector => env
  return(res)

}

# Define the break points: break_points => numeric vector
break_points <- c(-Inf, 2, 6, 7, 10)

# Apply the function: groups => factor vector
tab$groups <- group_numeric_data(
  tab$note, 
  break_points
)
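
Note that cut returns a factor, whereas the case_when answer produces a numeric column. If a plain integer column is wanted, the factor can be converted, as in this sketch. It rebuilds the data from the question so it runs on its own; the name groups_factor is only illustrative.

```r
tab <- data.frame(note = c(2, 8, 9, 3, 7, 5, 7, 6, 3, 8, 2, 9, 10, 1))

# cut() yields a factor whose labels are "1" to "4"
groups_factor <- cut(tab$note, breaks = c(-Inf, 2, 6, 7, 10), labels = 1:4)

# Convert via character so the labels, not the internal level codes,
# are turned into integers. Here the two coincide, but they would
# differ if the labels were, say, c(10, 20, 30, 40).
tab$groups <- as.integer(as.character(groups_factor))

str(tab)
```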
