I have a data frame which looks like this:
df <- data.frame(c = c('X.Int.2', 'BI', 'X.Int..4', 'BI.4', 'X.Int.6'),
d = sample(1:5, replace=T))
I am trying to remove all special characters, the 'X' and the numbers from column c.
I have tried
df %>%
mutate(c = gsub("\\s[0-9()]+", '', c))
and
df %>%
mutate(c = str_extract_all(c, "field:[a-zA-Z]+"))
Neither throws an error, but the first doesn't change the df and the second empties the column.
I'm clearly missing something obvious.
I'm hoping for -
c <- c('Int', 'BI', 'Int', 'BI', 'Int')
In base R, you can try with gsub:
gsub('[X.0-9]', '', df$c)
#> [1] "Int" "BI" "Int" "BI" "Int"
This removes the character "X", the character "." and all digits from the c column.
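As a quick check of why the first attempt in the question left the column unchanged, here is a minimal sketch using grepl on the same values (the vector name x is just for illustration). The pattern "\\s[0-9()]+" requires a whitespace character before the digits, and none of the strings contains one:

```r
x <- c('X.Int.2', 'BI', 'X.Int..4', 'BI.4', 'X.Int.6')

# The question's pattern needs whitespace before the digits,
# so it matches none of these strings and gsub changes nothing.
no_match <- grepl("\\s[0-9()]+", x)
no_match
#> [1] FALSE FALSE FALSE FALSE FALSE

# The character class matches any single "X", "." or digit.
has_match <- grepl("[X.0-9]", x)
has_match
#> [1]  TRUE FALSE  TRUE  TRUE  TRUE
```

Only 'BI' has nothing to remove, which is why it is the one FALSE in the second result.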
With stringr, remove "X", "." and digits using str_remove_all:
library(stringr)
str_remove_all(df$c, "[X.]|[:digit:]")
#> [1] "Int" "BI" "Int" "BI" "Int"
The same call inside mutate (requires dplyr):
df %>%
mutate(c = str_remove_all(c, "[X.]|[:digit:]"))
#> c d
#> 1 Int 4
#> 2 BI 1
#> 3 Int 2
#> 4 BI 3
#> 5 Int 5
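Another option with gsub, using the Perl-style shorthand \d for digits. The + goes outside the brackets, because inside a character class it would be a literal plus sign:
gsub("[X.\\d]+", "", df$c, perl = TRUE)
#> [1] "Int" "BI" "Int" "BI" "Int"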
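Note that the character-class patterns strip every "X", "." and digit wherever it occurs, so a label that contains an X of its own would lose it too. A minimal sketch of a stricter variant, assuming the unwanted parts are always a leading "X." and a trailing run of dots and digits (the value 'X.Xylem.2' is a made-up example, not from the question):

```r
x <- c('X.Int.2', 'BI', 'X.Int..4', 'BI.4', 'X.Int.6')

# Remove only a leading "X." and a trailing run of dots/digits.
clean <- gsub("^X\\.|[.0-9]+$", "", x)
clean
#> [1] "Int" "BI"  "Int" "BI"  "Int"

# Hypothetical value with an inner "X": the anchored pattern keeps it.
kept <- gsub("^X\\.|[.0-9]+$", "", "X.Xylem.2")
kept
#> [1] "Xylem"
```

Whether the anchored or the character-class version is preferable depends on how uniform the real column values are.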