I have a dataframe from a survey in which the order of the answer choices was randomized for each respondent. For example:
example_df <- data.frame(
  Choice  = c('A', 'B', 'A', 'C'),
  Option1 = c('A', 'A', 'C', 'B'),
  Option2 = c('B', 'C', 'A', 'A'),
  Option3 = c('C', 'B', 'B', 'C'),
  stringsAsFactors = FALSE
)
which looks like this:
Choice  Option1  Option2  Option3
A       A        B        C
B       A        C        B
A       C        A        B
C       B        A        C
What I'd like is a new column whose value, for each row, is the name of the column in which that row's Choice value appears.
For the example dataframe above, the new column would be:
Option1
Option3
Option2
Option3
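We can use max.col, which returns, for each row of a matrix, the index of the column holding the maximum value. Applied to the logical matrix of matches, that is the column whose option equals Choice: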
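# ties.method = "first" keeps the result deterministic; with the default
# ("random"), a row with several matches or none gets a randomly chosen column.
example_df$newcol <- names(example_df)[-1][
  max.col(example_df[-1] == example_df$Choice, ties.method = "first")
]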
Output:
example_df
#   Choice Option1 Option2 Option3  newcol
# 1      A       A       B       C Option1
# 2      B       A       C       B Option3
# 3      A       C       A       B Option2
# 4      C       B       A       C Option3
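To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps. The names option_cols, hits and idx are mine, not part of the original answer, and the NA handling for rows without a match is an added assumption about what you would want in that case.

```r
# Same data as in the question.
example_df <- data.frame(
  Choice  = c("A", "B", "A", "C"),
  Option1 = c("A", "A", "C", "B"),
  Option2 = c("B", "C", "A", "A"),
  Option3 = c("C", "B", "B", "C"),
  stringsAsFactors = FALSE
)

# Select the option columns by name rather than by position.
option_cols <- grep("^Option", names(example_df), value = TRUE)

# Logical matrix: TRUE where an option cell equals that row's Choice.
# Choice is recycled down each column, so the comparison stays row-aligned.
hits <- as.matrix(example_df[option_cols]) == example_df$Choice

# Column index of the first TRUE in each row.
idx <- max.col(hits, ties.method = "first")

# Assumed safeguard (not in the original answer): a row with no matching
# option would otherwise be reported as column 1, so mark it NA instead.
idx[rowSums(hits) == 0] <- NA

example_df$newcol <- option_cols[idx]
```

Selecting the columns with grep instead of [-1] means the result does not depend on Choice being the first column.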
Or with the tidyverse (dplyr), matching row by row:
library(dplyr)

example_df %>%
  rowwise() %>%
  mutate(newcol = names(.)[-1][match(Choice, c_across(starts_with('Option')))]) %>%
  ungroup()
Output:
# A tibble: 4 x 5
#   Choice Option1 Option2 Option3 newcol
#   <chr>  <chr>   <chr>   <chr>   <chr>
# 1 A      A       B       C       Option1
# 2 B      A       C       B       Option3
# 3 A      C       A       B       Option2
# 4 C      B       A       C       Option3
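The row-wise match() logic can also be sketched in base R without dplyr, which shows what happens when a Choice is not among the options. The dataframe toy_df below is hypothetical: it reuses the question's layout and adds a row whose Choice ('D') matches nothing. The names opts and pos are likewise only illustrative.

```r
# Hypothetical data: the question's layout plus one row whose Choice ("D")
# matches none of its options.
toy_df <- data.frame(
  Choice  = c("A", "B", "D"),
  Option1 = c("A", "A", "C"),
  Option2 = c("B", "C", "A"),
  Option3 = c("C", "B", "B"),
  stringsAsFactors = FALSE
)

option_cols <- grep("^Option", names(toy_df), value = TRUE)
opts <- as.matrix(toy_df[option_cols])

# For each row, match() gives the position of Choice among that row's options,
# or NA when it is absent.
pos <- vapply(
  seq_len(nrow(toy_df)),
  function(i) match(toy_df$Choice[i], opts[i, ]),
  integer(1)
)

# Indexing with NA yields NA, so unmatched rows get a missing value.
toy_df$newcol <- option_cols[pos]
```

Because match() returns NA for a missing value, unmatched rows end up as NA in newcol instead of pointing at an arbitrary column.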