Removing all characters after Special Character in Column Name

I have a data set that I have imported into R, but I need to get rid of everything in the column names from "(" onward. I've tried the strsplit(), sub(), and grepl() functions, without success. Any and all help would be appreciated!

I would like the following to become this:

Fruit => Fruit

Vegetables (Few) => Vegetables

Bread Crumbs => Bread Crumbs

Cheese (Cheddar) => Cheese

Yogurt (Plain%) => Yogurt

Using base R:

items <- c('Fruit', 'Vegetables (Few)', 'Bread Crumbs', 'Cheese (Cheddar)', 'Yogurt (Plain%)')
items_simplified <- trimws(gsub('\\(.*', '', items))

> items_simplified
[1] "Fruit"        "Vegetables"   "Bread Crumbs" "Cheese"       "Yogurt"   

You could also use stringr, which is part of the tidyverse:

library(stringr)
items_stringr <- str_trim(str_extract(items, '[^(]*'))

> items_stringr
[1] "Fruit"        "Vegetables"   "Bread Crumbs" "Cheese"       "Yogurt"      

trimws and str_trim remove leading and trailing whitespace from the items, which here is the space that sat in front of the "(".
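Since the question is about column names rather than a plain character vector, the same base R pattern can be applied to names() directly. This is only a minimal sketch: the data frame df and its values are made up here to stand in for the imported data set, and check.names = FALSE is used so that the example keeps the spaces and parentheses in the names.

```r
# Hypothetical data frame standing in for the imported data set
df <- data.frame(
  "Fruit" = 1:2,
  "Vegetables (Few)" = 3:4,
  "Bread Crumbs" = 5:6,
  "Cheese (Cheddar)" = 7:8,
  "Yogurt (Plain%)" = 9:10,
  check.names = FALSE  # keep spaces and parentheses in the column names
)

# Drop everything from the first "(" onward, then trim the leftover space
names(df) <- trimws(sub("\\(.*", "", names(df)))

names(df)
# [1] "Fruit"        "Vegetables"   "Bread Crumbs" "Cheese"       "Yogurt"
```

sub() is enough here instead of gsub(), because the pattern already runs to the end of the string and so can match at most once per name.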

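Use a regular expression.

Something like \(.* (written '\\(.*' in an R string, because the backslash itself has to be escaped). The parenthesis must be escaped; an unescaped (.+) is a capture group that matches the whole string.

Then remove everything the pattern matches, for example with sub().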
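One thing to watch with a regex approach in R: an unescaped ( opens a capture group instead of matching a literal parenthesis. The short comparison below is a sketch that reuses a few of the sample values from the question; the variable names unescaped and escaped are only for illustration.

```r
items <- c('Fruit', 'Vegetables (Few)', 'Cheese (Cheddar)')

# Unescaped: '(.+)' is a capture group around '.+', so it matches the
# entire string and every item is reduced to an empty string
unescaped <- sub('(.+)', '', items)

# Escaped: '\\(' matches a literal "(", and '.*' takes the rest of the string
escaped <- trimws(sub('\\(.*', '', items))

unescaped
# [1] "" "" ""
escaped
# [1] "Fruit"      "Vegetables" "Cheese"
```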
