
Return a string between two characters '.'

I have column names similar to the following

names(df_woe)

# [1] "A_FLAG" "woe.ABCD.binned" "woe.EFGHIJ.binned"       
 ...

I would like to rename the columns by removing the "woe." and ".binned" sections, so that the following will be returned

names(df_woe)
# [1] "A_FLAG" "ABCD" "EFGHIJ"       
 ...

I have tried substr(names(df_woe), start, stop) but I am unsure how to set variable start/stop arguments.

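Another possible and readable regex is to create capture groups and return the second group, i.e. the text after the first dot and before the second. Names that do not contain two dots do not match the pattern and are returned unchanged:

gsub("(.*\\.)(.*)\\..+", "\\2", names(df_woe))
# [1] "A_FLAG" "ABCD"   "EFGHIJ"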

Alternatively, remove the two fixed pieces directly:

nam <- c("A_FLAG", "woe.ABCD.binned", "woe.EFGH.binned")
gsub("woe\\.|\\.binned", "", nam)
# [1] "A_FLAG" "ABCD"   "EFGH"
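
To actually rename the columns, the cleaned vector has to be assigned back with names(). A minimal sketch, assuming a small made-up df_woe (the values are hypothetical; only the column names come from the question):

```r
# Hypothetical data frame; only the column names are taken from the question
df_woe <- data.frame(
  A_FLAG = c(0, 1),
  woe.ABCD.binned = c(0.12, -0.30),
  woe.EFGHIJ.binned = c(0.05, 0.41)
)

# Strip the fixed pieces and write the result back as the new column names
names(df_woe) <- gsub("woe\\.|\\.binned", "", names(df_woe))

print(names(df_woe))
# [1] "A_FLAG" "ABCD"   "EFGHIJ"
```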

EDIT: a solution that deals with weirder cases such as woe..binned.binned, by anchoring "woe." to the start of the string and ".binned" to the end:

gsub("^woe\\.|\\.binned$", "", nam)
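
The difference between the two patterns is easiest to see side by side. The sketch below uses a made-up input vector that includes the awkward case mentioned above; the variable names unanchored and anchored are just for illustration:

```r
# Hypothetical inputs, including the awkward case from the edit
nam <- c("A_FLAG", "woe.ABCD.binned", "woe..binned.binned")

# Unanchored: removes every occurrence of "woe." or ".binned"
unanchored <- gsub("woe\\.|\\.binned", "", nam)

# Anchored: removes "woe." only at the start and ".binned" only at the end
anchored <- gsub("^woe\\.|\\.binned$", "", nam)

print(unanchored)
# [1] "A_FLAG" "ABCD"   ""
print(anchored)
# [1] "A_FLAG"  "ABCD"    ".binned"
```

With the anchored pattern, a ".binned" that sits in the middle of the name survives, which is usually what is wanted when the middle part is arbitrary.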

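Another solution, using the stringr package. The dots must be escaped, because an unescaped . in a regex matches any character:

library(stringr)
str_replace_all("woe.ABCD.binned", pattern = "woe\\.|\\.binned", replacement = "")
# [1] "ABCD"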
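
For completeness, the substr() approach from the question can also work if the start and stop positions are computed from the dot locations. This is only a sketch in base R; the helper variable names (dots, between_dots) are made up for illustration, and it assumes the wanted part always lies between the first and second dot:

```r
nam <- c("A_FLAG", "woe.ABCD.binned", "woe.EFGHIJ.binned")

# Positions of every literal dot in each name (-1 when there is none)
dots <- gregexpr(".", nam, fixed = TRUE)

between_dots <- mapply(function(x, d) {
  # Leave names with fewer than two dots untouched
  if (length(d) < 2 || d[1] == -1) return(x)
  # Keep the characters strictly between the first and second dot
  substr(x, d[1] + 1, d[2] - 1)
}, nam, dots, USE.NAMES = FALSE)

print(between_dots)
# [1] "A_FLAG" "ABCD"   "EFGHIJ"
```

The regex solutions above are shorter, but this version makes the variable start/stop arguments explicit.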
