
Subset a string in R using gsub and regexpr

I need to change the following:

test <- c("August 08, 2016, Hour 23",
          "June 26, 2016, Hour 14",
          "November 26, 2016, Hour 01")

test1 <- c("Wednesday:8pm-12pm:31days",
"Tuesday:7pm-10pm:6days|Today:7AM-6PM:7days")

Edit: In test1, I don't really care about the day of the week; I am more interested in the timestamp. I would like to see 8PM-12PM converted into 24-hour format as 2000. A string is fine as output, since I require a 4-digit number. (Anything before 10 AM would need a leading zero, i.e. 0x.)

into two datasets as:

a$date <- as.Date(c("08/08/2016","06/26/2016","11/26/2016"), format="%m/%d/%Y") # all in date class
a$hour <- c(23, 14 , 01) #all should be numeric


b$time <- c("2000","1922","0718") #can be character
b$days <- c(31,6,7)  #needs to be numeric

The logic for the hour and days cases would be similar. I'm looking to use gsub and regexpr in R.
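To illustrate why the hour and days cases are similar, here is a minimal base-R sketch, not a definitive solution. It assumes every string in test ends in "Hour NN" and every schedule in test1 ends in "NNdays"; the variable names hour_match, parts, hour and days are only illustrative.

```r
# Sample inputs copied from the question
test <- c("August 08, 2016, Hour 23",
          "June 26, 2016, Hour 14",
          "November 26, 2016, Hour 01")
test1 <- c("Wednesday:8pm-12pm:31days",
           "Tuesday:7pm-10pm:6days|Today:7AM-6PM:7days")

# Hour: locate the digits that follow "Hour " (Perl-style lookbehind),
# then extract the matched text with regmatches()
hour_match <- regexpr("(?<=Hour )[0-9]+", test, perl = TRUE)
hour <- as.numeric(regmatches(test, hour_match))

# Days: split on the literal "|" so each schedule is its own element,
# then keep only the digits that sit between the last ":" and "days"
parts <- unlist(strsplit(test1, "|", fixed = TRUE))
days <- as.numeric(gsub("^.*:([0-9]+)days$", "\\1", parts))
```

In both cases the pattern isolates a run of digits next to a fixed keyword, and as.numeric() turns the captured text into a number (so "01" becomes 1).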

My current process for the date section is too long and tedious:

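library(stringr)  # provides str_replace_all() used below

# split each string on spaces: V1 = month, V2 = day, V3 = year, V4 = "Hour", V5 = hour
mat <- as.data.frame(matrix(unlist(strsplit(test," ")),ncol=5,byrow=TRUE))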

mat$V6 <-  str_replace_all(paste(as.numeric(str_replace_all(mat$V2,"[[:punct:]]","")),
                          "-",as.character(mat$V1),
                          "-",as.numeric(str_replace_all(mat$V3,"[[:punct:]]",""))),
                          "[[:space:]]","")


mat$V7 <- as.Date(mat$V6, format="%d-%B-%Y")

class(mat$V7)

mat$V8 <- as.numeric(as.character(mat$V5))

Any suggestions for using gsub and regexpr in both cases would be appreciated.
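For the test1 timestamps, one possible approach is sketched below. The helper to_24h is hypothetical (it is not part of the question), and it assumes each time is a whole hour followed by am or pm in either case. Note that it follows the usual convention that 12pm is noon, so the first element comes out as "2012" rather than the "2000" shown in the question; if 12pm is meant as midnight there, the helper would need adjusting.

```r
test1 <- c("Wednesday:8pm-12pm:31days",
           "Tuesday:7pm-10pm:6days|Today:7AM-6PM:7days")

# Hypothetical helper: "8pm" -> "20", "7AM" -> "07"
to_24h <- function(x) {
  h  <- as.numeric(gsub("[^0-9]", "", x))     # keep the digits only
  pm <- grepl("pm", x, ignore.case = TRUE)    # TRUE for pm or PM
  h  <- ifelse(pm & h < 12, h + 12, h)        # 1pm..11pm -> 13..23
  h  <- ifelse(!pm & h == 12, 0, h)           # 12am -> 0
  sprintf("%02d", as.integer(h))              # zero-pad to two digits
}

# One element per schedule, then the field between the two colons
parts <- unlist(strsplit(test1, "|", fixed = TRUE))
span  <- gsub("^[^:]*:([^:]*):.*$", "\\1", parts)   # e.g. "7pm-10pm"

# Start and end hours, pasted into a 4-character string
start <- sub("-.*$", "", span)
end   <- sub("^.*-", "", span)
time  <- paste0(to_24h(start), to_24h(end))
```

Keeping the result as character preserves the leading zero for times before 10 AM.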

This does the same thing as your mat line. Go ahead and try it.

library(reshape2)
mat <- colsplit(test," ", c("M","D","YYYY","HR","Time"))

I think this is your best bet, instead of using gsub or regexpr.

mat$Len <- paste(mat$D,mat$M,mat$YYYY)
mat$Len <- gsub(",","",gsub(" ","-",mat$Len))

I am not a fan of nested gsub calls, but they serve a purpose here and keep this a bit more concise. This should take care of the mat$V6 line.
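The hyphenated strings still need as.Date() with format "%d-%B-%Y" to become Date values, as in the question's own V7 line. As an alternative, here is a base-R sketch that skips the column split entirely. It assumes an English locale, because %B matches the locale's full month names; date_txt and date are illustrative names.

```r
test <- c("August 08, 2016, Hour 23",
          "June 26, 2016, Hour 14",
          "November 26, 2016, Hour 01")

# Drop the trailing ", Hour NN" so only "August 08, 2016" remains
date_txt <- gsub(", Hour [0-9]+$", "", test)

# Parse "Month day, year" directly; %B is the full month name
date <- as.Date(date_txt, format = "%B %d, %Y")

# Display in the month/day/year layout used in the question
format(date, "%m/%d/%Y")
```

This replaces the split, paste and nested gsub steps with one substitution and one parse, at the cost of depending on the month names matching the session locale.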
