I have a data set that I have imported into R, but I need to get rid of everything in the column names from "(" onward. I've tried the string.split(), sub(), and grepl()
functions, but with no success. Any and all help would be appreciated!
I would like the following to become this:
Fruit => Fruit
Vegetables (Few) => Vegetables
Bread Crumbs => Bread Crumbs
Cheese (Cheddar) => Cheese
Yogurt (Plain%) => Yogurt
Using base R:
items <- c('Fruit', 'Vegetables (Few)', 'Bread Crumbs', 'Cheese (Cheddar)', 'Yogurt (Plain%)')
items_simplified <- trimws(gsub('\\(.*', '', items))
> items_simplified
[1] "Fruit" "Vegetables" "Bread Crumbs" "Cheese" "Yogurt"
You could also use the stringr package from the tidyverse:
library(stringr)
items_stringr <- str_trim(str_extract(items, '[^(]*'))
> items_stringr
[1] "Fruit" "Vegetables" "Bread Crumbs" "Cheese" "Yogurt"
trimws (base R) and str_trim (stringr) strip the leading and trailing whitespace from the items, which removes the space left behind in front of the "(".
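Since the question is about column names rather than a plain character vector, the same base R call can be applied to names() directly. This is a minimal sketch: the data frame df and its contents are hypothetical stand-ins for the imported data set.

```r
# Hypothetical data frame standing in for the imported data set.
df <- data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8, e = 9:10)
names(df) <- c('Fruit', 'Vegetables (Few)', 'Bread Crumbs',
               'Cheese (Cheddar)', 'Yogurt (Plain%)')

# Drop everything from the first "(" onward, then trim the leftover space.
names(df) <- trimws(sub('\\(.*', '', names(df)))

print(names(df))
```

sub() is enough here because the pattern runs to the end of the string, so there is at most one match per name.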
Use a regular expression, like: /\(.+\)/g
And remove everything that it matches.
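In R there are no surrounding slashes or g flag, and a literal parenthesis has to be escaped with a double backslash inside a string. The sketch below contrasts an unescaped pattern with an escaped one; the variable names are only illustrative, and the leading \\s* is an addition of mine to swallow the space before the "(".

```r
items <- c('Fruit', 'Vegetables (Few)', 'Bread Crumbs',
           'Cheese (Cheddar)', 'Yogurt (Plain%)')

# Unescaped parentheses are a capture group, so '(.+)' matches the whole
# string and every item is replaced by an empty string.
unescaped <- sub('(.+)', '', items)

# Escaped parentheses match literally; gsub() plays the role of the g flag.
# '\\s*' also removes any whitespace right before the "(".
escaped <- gsub('\\s*\\(.+\\)', '', items)

print(unescaped)
print(escaped)
```

With these inputs, unescaped is five empty strings, while escaped contains "Fruit", "Vegetables", "Bread Crumbs", "Cheese", and "Yogurt".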