I have a dataframe with several columns. One of them is the column participant, where different participant codes are listed. These are all either in the 100 range, the 200 range or the 500 range. For example: 101, 203, 209, 504, 103, 512 and so on.
I want to create an extra column in the dataframe called group with 3 possible values: 100, 200 and 500. Thus, depending on the number a participant code starts with, it will be assigned one of these 3 labels.
I have tried using a combination of startsWith() and ifelse() statements, but I can't make it work.
data$group = ifelse(startsWith(as.character(data$participant), "1"), "100",
((ifelse(startsWith(as.character(data$participant), "2"), "200",
(ifelse(startsWith(as.character(data$participant), "5"), "500")), NULL)))
A simple tidyverse solution (similar to s__'s solution):
library(tibble)

tibble(
  participant = c(101, 203, 209, 504, 103, 512),
  group = round(participant, -2)
)
# A tibble: 6 x 2
  participant group
        <dbl> <dbl>
1         101   100
2         203   200
3         209   200
4         504   500
5         103   100
6         512   500
Based on your examples and comments, it looks like you want to divide a numeric value into ranges and assign a character label.
case_when provides a straightforward option. It takes longer to type, but it may be more readable for people unfamiliar with cut or more mathematical approaches.
library(dplyr)

tibble(old = c(101, 203, 209, 504, 103, 512)) %>%
  mutate(
    # conditions are checked in order; the first TRUE one wins
    new = case_when(
      old < 100 ~ NA_character_,
      old < 200 ~ "100",
      old < 300 ~ "200",
      old < 400 ~ "300",
      old < 500 ~ "400",
      old < 600 ~ "500",
      TRUE ~ NA_character_
    )
  )
Result:
# A tibble: 6 x 2
    old new
  <dbl> <chr>
1   101 100
2   203 200
3   209 200
4   504 500
5   103 100
6   512 500
That said, the cut function was designed to do precisely what you described, and has an option to specify the output labels.
old <- c(101, 203, 209, 504, 103, 512)
new <- cut(
  x = old,
  breaks = seq(from = 100, to = 600, by = 100),
  labels = seq(from = 100, to = 500, by = 100),
  right = FALSE  # intervals closed on the left, so 100 maps to "100" and 200 to "200"
)
as.character(new)
Result
[1] "100" "200" "200" "500" "100" "500"
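One detail worth checking with cut is how it treats values that sit exactly on a break. The sketch below is only an illustration: the vector codes and the variable names are made up for this example and are not part of the question's data. It compares the default right = TRUE with right = FALSE on a few boundary values.

```r
# Hypothetical boundary values, not taken from the question's data
codes  <- c(100, 150, 199, 200, 599)
breaks <- seq(from = 100, to = 600, by = 100)
labels <- seq(from = 100, to = 500, by = 100)

# Default (right = TRUE): intervals are (100,200], (200,300], ...
# so 100 falls outside every interval and 200 lands in the "100" group
right_closed <- as.character(cut(codes, breaks = breaks, labels = labels))

# right = FALSE: intervals are [100,200), [200,300), ...
# so each code is labelled by its leading hundred
left_closed <- as.character(
  cut(codes, breaks = breaks, labels = labels, right = FALSE)
)

right_closed
left_closed
```

If the participant codes can never be exact multiples of 100, both settings give the same labels, so this only matters for codes such as 100 or 200.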
Maybe this can be done more easily with integer division:
(data$participant %/% 100) * 100
#[1] 100 200 200 500 100 500
In the OP's code, the last 'no' argument should be NA_character_ and not NULL, because NULL has a length of 0. For example:
v1 <- c(10, 20, 5, 2, 40)
ifelse(v1 > 50, 3, NULL)
Error in ans[npos] <- rep(no, length.out = len)[npos]:
  replacement has length zero
In addition: Warning message:
In rep(no, length.out = len): 'x' is NULL so the result will be NULL
ifelse(v1 > 50, 3, NA)
#[1] NA NA NA NA NA
data <- structure(list(participant = c(101, 203, 209, 504, 103, 512)),
class = "data.frame", row.names = c(NA, -6L))
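Putting that fix back into the nested ifelse from the question could look like the sketch below. It is one possible arrangement, not the only one: the intermediate variable code is introduced here for readability, and the extra participant 999 is a hypothetical value added to show what happens to a code outside the three expected ranges.

```r
# Example data as in the question, plus one hypothetical out-of-range code (999)
data <- data.frame(participant = c(101, 203, 209, 504, 103, 512, 999))

# Convert once instead of repeating as.character() in every condition
code <- as.character(data$participant)

# Each 'no' branch holds the next ifelse(); the innermost 'no' is
# NA_character_ (length 1) rather than NULL (length 0)
data$group <- ifelse(startsWith(code, "1"), "100",
              ifelse(startsWith(code, "2"), "200",
              ifelse(startsWith(code, "5"), "500", NA_character_)))

data$group
```

With this arrangement, any code that does not start with 1, 2 or 5 gets NA instead of raising an error.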
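You can also manage it with round(), as long as the last two digits of every code are below 50 (otherwise the value rounds up; for example, round(160, -2) gives 200):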
x <- c(101, 203, 209, 504, 103, 512)
round(x, -2)
[1] 100 200 200 500 100 500
In your case:
data$group <- round(data$participant, -2)
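If codes such as 151 or 199 can occur, rounding to the nearest hundred no longer matches the leading digit. A minimal sketch of the difference, assuming a made-up vector codes that is not part of the question's data, compares round() with rounding down via floor():

```r
# Hypothetical codes, including some whose last two digits exceed 50
codes <- c(101, 149, 151, 199, 203, 560)

# Nearest hundred: 151, 199 and 560 move up to the next hundred
rounded <- round(codes, -2)

# Always round down to the hundred the code starts with
floored <- floor(codes / 100) * 100

rounded
floored
```

For the six codes listed in the question both approaches agree, since none of them has last two digits of 50 or more.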
Using ifelse:
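# ranges are closed on the left so that 100 and 200 fall into their own group;
# anything outside 100-299 is labelled 500
data$group <- ifelse(data$participant >= 100 & data$participant < 200, 100,
              ifelse(data$participant >= 200 & data$participant < 300, 200, 500))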
Result:
data
  participant group
1         101   100
2         203   200
3         209   200
4         504   500
5         103   100
6         512   500
Another option you can try, using data.table:
library(data.table)
df <- data.table(participants=c(101, 203, 209, 504, 103, 512))
df[,groups:= (participants - participants%%100)]
   participants groups
1:          101    100
2:          203    200
3:          209    200
4:          504    500
5:          103    100
6:          512    500
Not exactly what you asked for, but you can use the cut function too. In data.table, for instance, it may look like this:
library(data.table)
df <- data.table(participants = c(101, 203, 209, 504, 103, 512))
df[, groups:=cut(participants, seq(100,600,100))]
   participants    groups
1:          101 (100,200]
2:          203 (200,300]
3:          209 (200,300]
4:          504 (500,600]
5:          103 (100,200]
6:          512 (500,600]
It's rather verbose but it's just another way:
library(dplyr)
participant <- c(101, 203, 209, 504, 103, 512)
df <- tibble(participant)
df %>%
  mutate(group = case_when(
    participant %in% 100:199 ~ 100,
    participant %in% 200:299 ~ 200,
    participant %in% 500:599 ~ 500
  ))
# A tibble: 6 x 2
  participant group
        <dbl> <dbl>
1         101   100
2         203   200
3         209   200
4         504   500
5         103   100
6         512   500