I have a vector of character strings:
cities <- c("London", "001 London", "Stockholm", "002 Stockholm")
I need to erase everything in each string that precedes the first letter, so that I end up with:
cities <- c("London", "London", "Stockholm", "Stockholm")
I've tried this, for example:
cities <- sub("^.*?[a-zA-Z]", "", cities)
but that erases the first letter too, which I don't want to happen.
Use a negated character class to match all the non-alphabetic characters at the start of the string.
cities <- sub("^[^a-zA-Z]*", "", cities)
or
Use a capturing group to capture the first letter and put it back in the replacement.
cities <- sub("^.*?([a-zA-Z])", "\\1", cities)
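The two patterns above agree on the sample data but differ when a string contains no letters at all. A minimal sketch of that difference, assuming a hypothetical extra element "12345" that is not in the original vector:

```r
# "12345" is a made-up element added only to show the edge case.
cities <- c("London", "001 London", "Stockholm", "002 Stockholm", "12345")

# Negated class: strips every leading non-letter,
# so a string with no letters ends up empty.
negated <- sub("^[^a-zA-Z]*", "", cities)

# Capturing group: the pattern needs a letter in order to match,
# so a string with no letters is left unchanged.
captured <- sub("^.*?([a-zA-Z])", "\\1", cities)

print(negated)
print(captured)
```

Which behavior is preferable depends on whether letterless strings should be blanked out or kept as they are.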
Use
cities <- c("London", "001 London", "Stockholm", "002 Stockholm")
gsub("^\\P{L}*", "", cities, perl=TRUE)
The ^\\P{L}* regex means:
^ - Assert the beginning of the string
\\P{L}* - 0 or more characters other than a letter.
This solution is preferable if you have city names starting with Unicode letters.
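To illustrate the Unicode point, here is a small sketch using a hypothetical city name that starts with a non-ASCII letter ("Örebro", written with the \u00d6 escape so the string is marked as UTF-8). It assumes R's PCRE was built with Unicode property support, which is the usual case:

```r
# Hypothetical input, not from the original vector.
cities_unicode <- c("003 \u00d6rebro", "002 Stockholm")

# \P{L} means "not a Unicode letter", so the leading "Ö" is treated as a letter and kept.
unicode_aware <- sub("^\\P{L}*", "", cities_unicode, perl = TRUE)

# The ASCII-only class does not list "Ö", so it may be stripped along with the digits.
ascii_only <- sub("^[^a-zA-Z]*", "", cities_unicode)

print(unicode_aware)
print(ascii_only)
```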
Delete the digits (note that this leaves the space that followed them):
gsub('\\d+','',cities)
[1] "London" " London" "Stockholm" " Stockholm"
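The leading space in that output can be avoided by also consuming the whitespace after the digits. A minimal sketch, assuming the numbers only ever appear at the start of the string:

```r
cities <- c("London", "001 London", "Stockholm", "002 Stockholm")

# Remove a leading run of digits together with any whitespace that follows it.
# Anchoring with ^ means digits elsewhere in the string are left alone.
cleaned <- sub("^\\d+\\s*", "", cities)

print(cleaned)
```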