Create groups in a dataframe in R
Hello, I have the following data frame:
n<-c(2,8,9,3,7,5,7,6,3,8,2,9,10,1)
tab<-data.frame("note"=n)
I need to add a new column that classifies each number: if it is less than 3 it will be group 1, if it is greater than 5 it will be group 2, from 5 to 7 it will be group 3, and from 7 to 10 group 4, as shown below.
One option is to use case_when to define the groups:
library(dplyr)

tab %>%
  mutate(groups = case_when(note < 3 ~ 1,
                            note >= 3 & note < 7 ~ 2,
                            note == 7 ~ 3,
                            TRUE ~ 4))
Or another option using cut:
tab %>%
  mutate(groups = cut(note, breaks = c(0, 2, 6, 7, 10), labels = 1:4))
Output
   note groups
1     2      1
2     8      4
3     9      4
4     3      2
5     7      3
6     5      2
7     7      3
8     6      2
9     3      2
10    8      4
11    2      1
12    9      4
13   10      4
14    1      1
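To double-check where the boundaries fall in the cut version, it can help to leave out labels and look at the interval names cut generates. This is only a small sketch in base R; the variable name intervals is mine, not from the answer above.

```r
# Rebuild the example data from the question
n <- c(2, 8, 9, 3, 7, 5, 7, 6, 3, 8, 2, 9, 10, 1)
tab <- data.frame(note = n)

# Without `labels`, cut() names each level after its interval.
# By default intervals are open on the left and closed on the right.
intervals <- cut(tab$note, breaks = c(0, 2, 6, 7, 10))

# Show the four intervals that correspond to groups 1 to 4
levels(intervals)

# Count how many notes fall into each interval
table(intervals)
```

Because the intervals are right-closed, a note of exactly 0, or anything above 10, would come back as NA with these breaks. Passing include.lowest = TRUE to cut would pull 0 into the first interval.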
Base R (borrowing heavily from latemail and AndrewGB) with a reusable function:
# Function to group the numeric data:
# group_numeric_data => function()
group_numeric_data <- function(num_vec, break_points){
  # Compute the group values: group_vals => integer vector
  group_vals <- seq_along(break_points)[-length(break_points)]
  # Compute the groups: res => factor vector
  res <- cut(
    num_vec,
    breaks = break_points,
    labels = group_vals
  )
  # Explicitly define returned object: factor vector => env
  return(res)
}

# Define the break points: break_points => numeric vector
break_points <- c(-Inf, 2, 6, 7, 10)

# Apply the function: groups => factor vector
tab$groups <- group_numeric_data(
  tab$note,
  break_points
)
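Note that cut returns a factor. If plain integer group numbers are preferred, one possible alternative is base R's findInterval. This is a sketch rather than part of the answers above; inner_breaks and groups_int are names I made up for illustration.

```r
# Rebuild the example data from the question
n <- c(2, 8, 9, 3, 7, 5, 7, 6, 3, 8, 2, 9, 10, 1)
tab <- data.frame(note = n)

# Interior cut points only. With left.open = TRUE the intervals are
# right-closed: (-Inf, 2], (2, 6], (6, 7], (7, Inf), like cut()'s default.
inner_breaks <- c(2, 6, 7)

# findInterval() returns 0 for the first interval, so add 1 to get groups 1 to 4
tab$groups_int <- findInterval(tab$note, inner_breaks, left.open = TRUE) + 1L

tab$groups_int
```

One difference to keep in mind: with this approach a value above 10 lands in group 4 instead of becoming NA, since there is no upper break point.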