
Finding and replacing text in R

I recently started learning R and am trying to explore it further by automating some of my work. Below is the sample data. I'm trying to create a new column by finding and replacing particular text within the labels (column name: Designations).

Since I'll be getting loads of new data for this work, I would like to automate it in R rather than use Excel formulas.

Dataset:

strings<-c("Zonal Manager","Department Manager","Network Manager","Head of Sales","Account Manager","Alliance Manager","Additional Manager","Senior Vice President","General manager","Senior Analyst", "Solution Architect","AGM")

R code I used:

t<-data.frame(strings,stringsAsFactors = FALSE)
colnames(t)[1]<-"Designations"
y<-sub(".*Manager*","Manager",strings,ignore.case = TRUE)

Challenge:
This only turns designations into "Manager", but I need to replace the other designations with their main themes as well.

I tried ifelse, grep, grepl, str, sub, etc., but I didn't get what I'm looking for.

I can't rely on the first, second, or last word (splitting on a delimiter), since the position of the main theme varies, e.g. Chief Information Officer, Commercial Finance Manager, or AGM.
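To see why the sub() call above does not produce themes for the other designations, here is a small check on a subset of the sample data. It is only a sketch: the pattern ".*Manager*" means "anything, then Manage, then zero or more r", so sub() rewrites only the strings that contain "Manage" and returns every other string unchanged.

```r
# Subset of the sample designations from the question
strings <- c("Zonal Manager", "Head of Sales", "Senior Vice President",
             "General manager", "AGM")

# Same call as in the question: only strings containing "Manage"
# (case-insensitively) are replaced; the rest pass through untouched.
y <- sub(".*Manager*", "Manager", strings, ignore.case = TRUE)

print(data.frame(Designations = strings, y = y))
```

So a single sub() call can express only one theme; mapping several themes needs either an alternation with a capture group or a lookup table.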

Excel Work:
I have already coded 300 main themes, such as:

Manager (for all GM, Asst. Manager, Sales Manager, etc.)
Architect (Solution Arch, Sr. Arch, etc.)
Director (Senior Director, Director, Asst. Director, etc.)
Senior Analyst
Analyst
Head (for Head of Sales)

What I'm looking for: I need to create a new column in R that replaces each designation with the relevant main theme, as I did in Excel.

I'm fine with reusing the main themes I have already coded in Excel and matching against them in R (like VLOOKUP in Excel).

Expected result: [image not preserved]

Thanks in advance for your help!

Yes, that is exactly what I'm expecting. Thanks! But when I tried the same approach after loading the new dataset (an Excel file) with

df %>% 
   mutate(theme=gsub(".*(Manager|Lead|Director|Head|Administrator|Executive|Executive|VP|President|Consultant|CFO|CTO|CEO|CMO|CDO|CIO|COO|Cheif Executive Officer|Chief Technological Officer|Chief Digital Officer|Chief Financial Officer|Chief Marketing Officer|Chief Digital Officer|Chief Information Officer,Chief Operations Officer)).*","\\1",Designations,ignore.case = TRUE))

it didn't work. Should I correct something else?
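Two things in that pattern look suspicious. The group is closed with "))" although only one "(" was opened, which I would expect R to reject as an invalid regular expression. Also, "Chief Information Officer,Chief Operations Officer" is joined by a comma instead of "|", so it is treated as one literal alternative. One way to avoid such typos is to build the alternation from a vector. The following is a minimal sketch in base R; the themes vector is a hypothetical, shortened list, not the full set of themes.

```r
# Hypothetical, shortened theme list; extend it with your own themes.
themes <- c("Manager", "Lead", "Director", "Head", "Executive", "President",
            "Chief Information Officer", "Chief Operations Officer")

# Build ".*(A|B|C).*" programmatically so the separators and the
# parentheses cannot be mistyped.
pattern <- paste0(".*(", paste(themes, collapse = "|"), ").*")

designations <- c("Zonal Manager", "Head of Sales", "Senior Vice President",
                  "Chief Information Officer")

# Keep only the captured theme; non-matching strings are returned unchanged.
theme <- gsub(pattern, "\\1", designations, ignore.case = TRUE)
print(data.frame(Designations = designations, theme = theme))
```

The same pattern string can be passed to gsub() inside mutate() exactly as in the dplyr call above.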

data:

strings<-c("Zonal Manager","Department Manager","Network Manager","Head of Sales","Account Manager",
           "Alliance Manager","Additional Manager","Senior Vice President","General manager","Senior Analyst", "Solution Architect","AGM")

You need to prepare a good lookup table (this one is only a start; complete it so that it covers all of your themes):

lu_table <- data.frame(new = c("Manager", "Architect", "Director"), old = c("Manager|GM", "Architect|Arch", "Director"), stringsAsFactors = FALSE)

Then you can let mapply do the job:

mapply(function(new,old) {ans <- strings; ans[grepl(old,ans)]<-new; strings <<- ans; return(NULL)}, new = lu_table$new, old = lu_table$old)

Now look at strings:

> strings
 [1] "Manager"               "Manager"               "Manager"               "Head of Sales"         "Manager"               "Manager"              
 [7] "Manager"               "Senior Vice President" "General manager"       "Senior Analyst"        "Architect"             "Manager" 

Please note:

This solution uses <<- to modify strings from inside the function, so it might not be the best possible solution, but it works in this case.
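If you would rather avoid <<-, the same lookup can be written as a plain loop that fills a new vector. This is a sketch of an alternative, not the code above: it matches against the original strings and writes the result to a separate theme column, so the input is left untouched.

```r
strings <- c("Zonal Manager", "Department Manager", "Network Manager",
             "Head of Sales", "Account Manager", "Alliance Manager",
             "Additional Manager", "Senior Vice President", "General manager",
             "Senior Analyst", "Solution Architect", "AGM")

# Same (incomplete) lookup table as above
lu_table <- data.frame(new = c("Manager", "Architect", "Director"),
                       old = c("Manager|GM", "Architect|Arch", "Director"),
                       stringsAsFactors = FALSE)

# Start from the original text; unmatched designations stay as they are.
theme <- strings
for (i in seq_len(nrow(lu_table))) {
  hit <- grepl(lu_table$old[i], strings)  # case-sensitive, as in the answer
  theme[hit] <- lu_table$new[i]           # later rows win if several match
}

result <- data.frame(Designations = strings, theme = theme,
                     stringsAsFactors = FALSE)
print(result)
```

Adding ignore.case = TRUE to grepl() would also catch "General manager", which the case-sensitive match leaves unchanged.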

Do you mean something like this?

library(dplyr)
strings <-
  c(
    "Zonal Manager",
    "Department Manager",
    "Network Manager",
    "Head of Sales",
    "Account Manager",
    "Alliance Manager",
    "Additional Manager",
    "Senior Vice President",
    "General manager",
    "Senior Analyst",
    "Solution Architect",
    "AGM"
  )

df = data.frame(Designations = strings)


df %>%
  mutate(
    theme = gsub(
      ".*(manager|head|analyst|architect|agm|director|president).*",
      "\\1",
      Designations,
      ignore.case = TRUE
    )
  )
#>             Designations     theme
#> 1          Zonal Manager   Manager
#> 2     Department Manager   Manager
#> 3        Network Manager   Manager
#> 4          Head of Sales      Head
#> 5        Account Manager   Manager
#> 6       Alliance Manager   Manager
#> 7     Additional Manager   Manager
#> 8  Senior Vice President President
#> 9        General manager   manager
#> 10        Senior Analyst   Analyst
#> 11    Solution Architect Architect
#> 12                   AGM       AGM

Created on 2018-10-04 by the reprex package (v0.2.1)
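Note that the captured text keeps its original spelling, so the output above contains "manager" in lower case and "AGM" as its own value. If those should collapse into one theme, the captured keyword can be looked up in a named vector, much like VLOOKUP. The sketch below assumes a hypothetical theme_map; the keyword-to-theme assignments are illustrative only.

```r
designations <- c("Zonal Manager", "Head of Sales", "General manager",
                  "AGM", "Senior Analyst", "Intern")

# Step 1: extract the keyword, as in the gsub() call above.
keyword <- gsub(".*(manager|head|analyst|architect|agm|director|president).*",
                "\\1", designations, ignore.case = TRUE)

# Step 2: hypothetical keyword -> theme map; names are lower-case keywords.
theme_map <- c(manager = "Manager", agm = "Manager", head = "Head",
               analyst = "Analyst", architect = "Architect",
               director = "Director", president = "President")

# Lower-case the keyword before the lookup; designations without a
# known keyword (here "Intern") come back as NA.
theme <- unname(theme_map[tolower(keyword)])
print(data.frame(Designations = designations, theme = theme))
```

Rows with NA are then easy to find with is.na(theme) and can be used to extend the map.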

