In R, split a column into 2 columns

I am trying to split the group column of my data frame into 2 columns, based on percentage.

  group                    percentage
 0 hired                     60%  
 0 hired next_month          65% 
 0 or 1 hired                68% 
 0 or 1 hired next_month     70%  
 1 hired                     79% 
 1 or 2 employee             80% 
 2 retired                   85%
 2 or 3 fired                92%
 3 not-retired               96%

I want 2 columns, group and decision, in place of the current group column. The percentage column and the decision text should stay as they are. The new group should be 0 if percentage is between 60% and 69% (as in the 3rd row), 1 if it is between 70% and 79% (as in the 4th row), 2 if it is between 80% and 89%, and 3 if it is between 90% and 99%. The output should be:

  group   decision         percentage
    0     hired              60% 
    0     hired next_month   65% 
    0     hired              68% 
    1     hired next_month   70% 
    1     hired              79% 
    2     employee           80% 
    2     retired            85%
    3     fired              92% 
    3     not-retired        96% 

My code:

df1 <- structure(list(
           group = c("0 hired", "0 hired next_month ", "0 or 1 hired", 
            "0 or 1 hired next_month", "1 hired", "1 or 2 employee",
            "2 retired", "2 or 3 fired", "3 not-retired"), 
           percentage = c("60%", "65%", "68%", "70%", "79%", "80%", "89%", "90%", "96%") ), 
         .Names = c("group", "percentage"), class = "data.frame", row.names = c(NA, -9L))

library(dplyr)   # %>%, mutate
library(tidyr)   # extract
library(readr)   # parse_number

df2 <- df1 %>%
  extract(group, into = c('group', 'decision'),
          "^(\\d+).*(hired|hired next_month|employee|retired|fired|not-retired)") %>%
  mutate(group = replace(group, parse_number(percentage) >= 100, 3))

Can anyone help? Thanks in advance.

You can do this in base R like this:

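# Keep the percentage column unchanged
df2 = data.frame(percentage = df1$percentage)
# Drop everything up to the last digit (and any spaces after it) to get the decision
df2$decision = sub(".*\\d\\s*", "", df1$group)
# Bin the numeric percentage into (59,69], (69,79], (79,89], (89,100], then shift to 0..3
df2$group = as.numeric(cut(as.numeric(sub("%", "", df1$percentage)), 
    breaks = c(59, 69, 79, 89, 100))) - 1
# Reorder the columns to group, decision, percentage
df2 = df2[,3:1]
df2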
  group          decision percentage
1     0             hired        60%
2     0 hired next_month         65%
3     0             hired        68%
4     1  hired next_month        70%
5     1             hired        79%
6     2          employee        80%
7     2           retired        89%
8     3             fired        90%
9     3       not-retired        96%
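
To see why these breaks produce the requested groups, the intermediate values can be inspected step by step. The following is only a sketch under stated assumptions: it rebuilds the question's df1 so that it runs on its own, and the names pct, bins, group_cut, group_div and decision are introduced here for illustration. Because each group spans exactly ten percentage points, integer division is a possible alternative to cut(), assuming every percentage lies between 60 and 99. The trimws() call is an optional extra that removes the trailing space in "0 hired next_month ".

```r
# Rebuild the question's data so the sketch is self-contained
df1 <- data.frame(
  group = c("0 hired", "0 hired next_month ", "0 or 1 hired",
            "0 or 1 hired next_month", "1 hired", "1 or 2 employee",
            "2 retired", "2 or 3 fired", "3 not-retired"),
  percentage = c("60%", "65%", "68%", "70%", "79%", "80%", "89%", "90%", "96%"),
  stringsAsFactors = FALSE
)

# Strip the "%" sign and convert to numbers (pct is an illustrative name)
pct <- as.numeric(sub("%", "", df1$percentage))

# cut() uses right-closed intervals by default:
# (59,69], (69,79], (79,89], (89,100]
bins <- cut(pct, breaks = c(59, 69, 79, 89, 100))
group_cut <- as.numeric(bins) - 1

# Alternative, ASSUMING every percentage lies in 60..99:
# integer division by 10 gives 6..9, so subtracting 6 gives 0..3
group_div <- pct %/% 10 - 6

# Remove everything up to the last digit, then trim surrounding whitespace
decision <- trimws(sub(".*\\d\\s*", "", df1$group))

print(data.frame(group = group_cut, decision = decision,
                 percentage = df1$percentage))
```

Both group_cut and group_div should agree on this data. The cut() version is easier to adapt if the ranges ever stop being evenly spaced, while the integer-division version is shorter but relies on the ten-point spacing.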

