Extracting mixed date from string in R

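I have a character vector like the one in the table below, and I want to extract the dates from it and convert them with as.Date. For example, the first row should give 09-11-2021 (9 November 2021). The digits that follow the year belong to the comment count, not to the date.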

   <chr>                                                                       
 1 By Leigh-Ann Butler, Shannon Cobb, Michael R. DonaldsonNov 9, 20213 Comments
 2 By Leigh-Ann Butler, Shannon Cobb, Michael R. DonaldsonNov 8, 20212 Comments
 3 By Rick AndersonNov 4, 202114 Comments                                      
 4 By Victoria Ficarra, Rob JohnsonNov 3, 20215 Comments                       
 5 By Roger C. SchonfeldNov 1, 202123 Comments                                 
 6 By Joseph EspositoOct 29, 20211 Comment                                     
 7 By Brigitte ShullOct 20, 20216 Comments

example.data <- c("By Leigh-Ann Butler, Shannon Cobb, Michael R. DonaldsonNov 9, 20213 Comments",
"By Leigh-Ann Butler, Shannon Cobb, Michael R. DonaldsonNov 8, 20212 Comments",
"By Rick AndersonNov 4, 202114 Comments",                                      
"By Victoria Ficarra, Rob JohnsonNov 3, 20215 Comments")

You can use

as.Date(gsub(".+(\\w{3}\\s\\d{1,2},\\s\\d{4}).*", "\\1", example.data), format = "%b %d, %Y")

#> [1] "2021-11-09" "2021-11-08" "2021-11-04" "2021-11-03"

strcapture(".*(\\D{3})\\s+(\\d{1,2}),\\s+(\\d{4}).*", example.data,
           proto = list(mon = "", day = 0L, year = 0L)) |>
  transform(date = as.Date(paste(mon, day, year), format = "%b %d %Y"))
#   mon day year       date
# 1 Nov   9 2021 2021-11-09
# 2 Nov   8 2021 2021-11-08
# 3 Nov   4 2021 2021-11-04
# 4 Nov   3 2021 2021-11-03
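
One caveat with the `%b` format: it reads abbreviated month names in the current locale, so "Nov" may come back as NA on a non-English system. The following is a minimal sketch of a locale-independent variant, not part of the original answers. It assumes the months are always written as the English three-letter abbreviations in R's built-in `month.abb`, and that every string contains a date. The names `x`, `pattern`, `parts`, `iso` and `dates` are illustrative only.

```r
# Same four strings as example.data above, repeated so the block runs on its own.
x <- c("By Leigh-Ann Butler, Shannon Cobb, Michael R. DonaldsonNov 9, 20213 Comments",
       "By Leigh-Ann Butler, Shannon Cobb, Michael R. DonaldsonNov 8, 20212 Comments",
       "By Rick AndersonNov 4, 202114 Comments",
       "By Victoria Ficarra, Rob JohnsonNov 3, 20215 Comments")

# Build "(Jan|Feb|...|Dec) <1-2 digits>, <4 digits>" from the built-in month.abb.
pattern <- paste0("(", paste(month.abb, collapse = "|"), ") (\\d{1,2}), (\\d{4})")

# regexec() + regmatches() give the full match plus each capture group per string.
# Assumption: every element matches; a non-matching element would yield
# character(0) and break the rbind() step.
parts <- do.call(rbind, regmatches(x, regexec(pattern, x)))

# Columns of parts: 1 = full match, 2 = month, 3 = day, 4 = year.
# The month number comes from a lookup in month.abb, so no locale is involved.
iso <- sprintf("%s-%02d-%02d",
               parts[, 4],
               match(parts[, 2], month.abb),
               as.integer(parts[, 3]))

dates <- as.Date(iso)
dates
```

Because the year is captured with exactly four digits (`\\d{4}`), the comment count that is glued onto it (for example the "14" in "202114") is left out of the match.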
