Extracting and Splitting numbers and characters from string in R

I am trying to extract and split numbers and characters from strings. I also want to remove the few characters and numbers at the end of each string. For example, I have the following strings.

dm<-c("2December2005MOMENT55", "3December2005ROYALS56", "1July2012ANGELS57")

I want to make them as

Day Month    Year
2   December 2005
3   December 2005
1   July     2012

That is, I want to split each string, extract the values, and put them into separate variables.

I tried this with the strsplit command but could not get far, and I am sorry that I have no code to show.

Any command or code suggestions would be appreciated. Thank you!

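  1. Convert the strings to Date objects with the format '%d%B%Y' (for the provided example, as.Date ignores the trailing characters once the format is consumed).
  2. Use year, mday and month (from lubridate; data.table provides functions with the same names) to build the data.frame you want.

library(lubridate)  # provides mday(), month() and year()

# Parse day, full month name and four-digit year from the start of each string
df <- data.frame(string = dm, date = as.Date(dm, format = '%d%B%Y'))
# month() returns the month number, so look up its name in month.name
df[c('Day','Month','Year')] <- with(df, list(mday(date), 
                                             month.name[month(date)],
                                             year(date)))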
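
If you would rather not load an extra package, the same two steps can be sketched in base R with format(). This is only an illustration under the assumptions of the example above: the variable names dates and result are made up here, and %B is locale-dependent, so parsing English month names assumes an English (or C) time locale.

```r
dm <- c("2December2005MOMENT55", "3December2005ROYALS56", "1July2012ANGELS57")

# Step 1: parse the leading day, month name and year; trailing text is ignored
dates <- as.Date(dm, format = "%d%B%Y")

# Step 2: pull the components back out with format()
result <- data.frame(
  Day   = as.integer(format(dates, "%d")),
  Month = format(dates, "%B"),   # month name in the current locale
  Year  = as.integer(format(dates, "%Y")),
  stringsAsFactors = FALSE
)
print(result)
```

Strings that do not start with a valid date come back as NA rather than raising an error, so it is worth checking anyNA(dates) on real data.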

Here is a regex solution:

library(stringr)
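# Groups: 1-2 digit day, letters for the month, 4-digit year.
# [A-Za-z] is used instead of [A-z], which also matches [ \ ] ^ _ and `.
str_match(dm, "^([0-9]{1,2})([A-Za-z]+)([0-9]{4})")[, 2:4]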
##      [,1] [,2]       [,3]  
## [1,] "2"  "December" "2005"
## [2,] "3"  "December" "2005"
## [3,] "1"  "July"     "2012"
