How to remove leading digits and dot from character vector?

I have a dataframe which only has one column. In that column there are two types of data:

  1. only character
  2. "number.character"

I want to find the second type of data and delete the number and the dot. I first converted the data from factors to characters, then used 'strsplit' to split the second type of data, but it did not work.

An example of my data:

df <- data.frame(Col1 = c("ab","12.cd","cc","dd","34.af"), stringsAsFactors=FALSE)

I want to find "12.cd" and "34.af" and turn them into "cd" and "af".

Could anyone help me solve this?

We can match one or more digits ( [0-9]+ ) followed by a literal . ( \\. ) at the start ( ^ ) of the string and replace the match with an empty string ( "" ). Strings without such a prefix are returned unchanged.

df$Col1 <- sub("^[0-9]+\\.", "", df$Col1)
df$Col1
#[1] "ab" "cd" "cc" "dd" "af"
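
The question also asks how to find the "number.character" entries, not only how to clean them. A minimal sketch that reuses the same pattern with grepl; the names pattern, is_prefixed and cleaned are illustrative and not part of the original answer.

```r
df <- data.frame(Col1 = c("ab", "12.cd", "cc", "dd", "34.af"), stringsAsFactors = FALSE)

# Same pattern as above: one or more digits, then a literal dot, at the start
pattern <- "^[0-9]+\\."

# Logical vector marking the "number.character" entries
is_prefixed <- grepl(pattern, df$Col1)
df$Col1[is_prefixed]
#[1] "12.cd" "34.af"

# Strip the prefix; entries without a match come back unchanged
cleaned <- sub(pattern, "", df$Col1)
cleaned
#[1] "ab" "cd" "cc" "dd" "af"
```

If the column is still a factor, wrapping it in as.character() first should give the same result.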

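Another option is to match the first run of non-alphabetic characters and replace it with an empty string. This pattern is not anchored to the start, so it gives the same result on this data but would also strip digits or punctuation that appear later in a string.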

sub("[^[:alpha:]]+", "", df$Col1)
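
The two patterns agree on the example data but can differ on other input. A small sketch of the difference, using a made-up vector x that is not from the original question:

```r
# Hypothetical inputs chosen to show where the two patterns diverge
x <- c("12.cd", "a1b", "3.5.ef", "x.y")

# Anchored pattern: removes only digits plus a dot at the very start
anchored <- sub("^[0-9]+\\.", "", x)
anchored
#[1] "cd"   "a1b"  "5.ef" "x.y"

# Unanchored pattern: removes the first run of non-letters wherever it occurs
unanchored <- sub("[^[:alpha:]]+", "", x)
unanchored
#[1] "cd" "ab" "ef" "xy"
```

The anchored version is likely the safer choice when only a leading "number." prefix should be dropped.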

data

df <- data.frame(Col1 = c("ab","12.cd","cc","dd","34.af"), stringsAsFactors=FALSE)
