
Create column which tells source of number between the first two columns

I have a data frame with two columns, to which I add a third column of random numbers:

df <- structure(list(lowage = c(45, 15, 9, 51, 22, 45, 4, 4, 9, 25), 
    highage = c(50, 21, 14, 60, 24, 50, 8, 8, 14, 30)), .Names = c("lowage", 
"highage"), row.names = c(NA, 10L), class = "data.frame")

df$random_number <- apply(df, 1, function(x) sample(seq(x[1], x[2]), 1))

I want to create a fourth column that tells us where the random_number comes from. For example, in the first row, lowage = 45 and highage = 50. Say the random number generated is 50. I'd like the fourth column to contain the label 'highage', since the value matches the highage column. And so on...

If the solution can be in dplyr, that would be great!

Is this what you want?

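library(dplyr)

# Label each row by which bound random_number matches;
# conditions are checked in order, and TRUE is the fallback.
df %>% 
  mutate(newcol = 
           case_when(random_number == lowage ~ "lowage", 
                     random_number == highage ~ "highage", 
                     TRUE ~ "between"))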

#    lowage highage random_number  newcol
# 1      45      50            47 between
# 2      15      21            18 between
# 3       9      14            13 between
# 4      51      60            57 between
# 5      22      24            23 between
# 6      45      50            49 between
# 7       4       8             4  lowage
# 8       4       8             6 between
# 9       9      14             9  lowage
# 10     25      30            27 between
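
If you would rather not depend on dplyr, the same labelling can probably be done in base R. This is only a sketch: it rebuilds the data frame with the random_number values copied from the printed output above (so it does not depend on the random seed), and uses nested ifelse() calls in the same order as the case_when() conditions.

```r
# Rebuild the data, using the random_number values from the output above
# so the result is reproducible without a seed.
df <- data.frame(
  lowage        = c(45, 15, 9, 51, 22, 45, 4, 4, 9, 25),
  highage       = c(50, 21, 14, 60, 24, 50, 8, 8, 14, 30),
  random_number = c(47, 18, 13, 57, 23, 49, 4, 6, 9, 27)
)

# Nested ifelse() mirrors the case_when() order:
# check lowage first, then highage, otherwise "between".
df$newcol <- ifelse(df$random_number == df$lowage, "lowage",
             ifelse(df$random_number == df$highage, "highage", "between"))

print(df)
```

As with case_when(), the first matching condition wins, so a row where lowage and highage were equal would be labelled "lowage".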


 