How to split 2 dates that are in one column into 2 columns in R
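I have a date column in which each value contains two dates (for example, "start date 1/1/13 to end date 12/31/13"), and some rows use different formats, such as "1/1/13 to 12/31/13" or "October to May".
I want to standardize the format as MM/DD/YYYY and split the start date and end date into two columns.
How can I strip out the extra characters, separate the two dates, and put them into two separate columns, as shown in the attached image?
Can this be done in R?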
You can use regular expressions with stringr and lubridate:
df <- data.frame(range = c("1/1/13 to 12/31/13",
                           "5/5/15 to 10/27/15"))
# Lookahead: keep everything before " to "
df$from <- lubridate::mdy(stringr::str_extract(df$range, "^.*?(?= to )"))
# Lookbehind: keep everything after " to "
df$to <- lubridate::mdy(stringr::str_extract(df$range, "(?<= to ).*$"))
df
#>                range       from         to
#> 1 1/1/13 to 12/31/13 2013-01-01 2013-12-31
#> 2 5/5/15 to 10/27/15 2015-05-05 2015-10-27
Created on 2020-09-20 by the reprex package (v0.3.0)
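The question asks for MM/DD/YYYY output, while the code above returns Date values that print as YYYY-MM-DD. A minimal base R sketch of one way to get that format is shown below, assuming every row uses the two-digit-year "m/d/yy to m/d/yy" form; the sample data and the intermediate names (`parts`, `from_chr`, `to_chr`) are made up for illustration. Note that `format()` returns character strings, not Date objects.

```r
# Hypothetical example data, mirroring the ranges used above.
df <- data.frame(range = c("1/1/13 to 12/31/13",
                           "5/5/15 to 10/27/15"),
                 stringsAsFactors = FALSE)

# Split each range on the literal " to " separator.
parts <- strsplit(df$range, " to ", fixed = TRUE)
from_chr <- vapply(parts, `[`, character(1), 1)
to_chr   <- vapply(parts, `[`, character(1), 2)

# Parse the two-digit-year dates, then write them back out as MM/DD/YYYY.
df$from <- format(as.Date(from_chr, format = "%m/%d/%y"), "%m/%d/%Y")
df$to   <- format(as.Date(to_chr,   format = "%m/%d/%y"), "%m/%d/%Y")
df
```

If you still need to sort or do date arithmetic, it is probably better to keep the Date columns and only call `format()` when displaying or exporting.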
Or, without converting to dates:
library(dplyr)
df <- data.frame(range = c("1/1/13 to 12/31/13",
                           "5/5/15 to 10/27/15",
                           "October to November"))
df %>% mutate(from = stringr::str_extract(range, "^.*?(?= to)"),
              to = stringr::str_extract(range, "(?<=to ).*?$"))
#>                 range    from       to
#> 1  1/1/13 to 12/31/13  1/1/13 12/31/13
#> 2  5/5/15 to 10/27/15  5/5/15 10/27/15
#> 3 October to November October November
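Rows such as "October to November" have no day or year, so they cannot be turned into real dates. One possible way to handle the mixed column, sketched here in base R, is to keep the text columns and add parsed columns that are `NA` wherever parsing fails. The column names `from_date`, `to_date`, and `is_date_range` are hypothetical, and the sketch assumes the date rows use the "m/d/yy" form.

```r
# Hypothetical example data with one row that is not a real date range.
df <- data.frame(range = c("1/1/13 to 12/31/13",
                           "5/5/15 to 10/27/15",
                           "October to November"),
                 stringsAsFactors = FALSE)

# Text on either side of the " to " separator.
from_chr <- sub(" to .*$", "", df$range)
to_chr   <- sub("^.* to ", "", df$range)

# as.Date() with an explicit format yields NA for strings that do not match it.
df$from_date <- as.Date(from_chr, format = "%m/%d/%y")
df$to_date   <- as.Date(to_chr,   format = "%m/%d/%y")

# Flag the rows where both ends were parsed successfully.
df$is_date_range <- !is.na(df$from_date) & !is.na(df$to_date)
df
```

The flag column makes it easy to review the rows that need manual handling, for example with `df[!df$is_date_range, ]`.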
You should be able to do this in R with just a regular expression and backreferences.
dat <- data.frame(datestring = c("November to March", "11/1/2001 to 12/8/2001"))
dat$from <- gsub("(.*) to (.*)", "\\1", dat$datestring) # creates a new column 'from' that takes the first thing (before the 'to')
dat$to <- gsub("(.*) to (.*)", "\\2", dat$datestring) # creates a new column 'to' that takes the second thing (after the 'to')
dat
              datestring      from        to
1      November to March  November     March
2 11/1/2001 to 12/8/2001 11/1/2001 12/8/2001
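One caveat with the backreference approach: when the pattern does not match, `sub()` and `gsub()` return the input string unchanged, so a row without " to " would be copied into both new columns. A small sketch of one way to guard against that follows; the "TBD" row and the variable names are made up for illustration.

```r
# Hypothetical input, including one value that has no " to " separator.
x <- c("November to March", "11/1/2001 to 12/8/2001", "TBD")

pattern <- "^(.*) to (.*)$"

# TRUE only for strings that actually contain a "<start> to <end>" range.
has_range <- grepl(pattern, x)

# Extract each side via backreferences; use NA where there is no range.
from <- ifelse(has_range, sub(pattern, "\\1", x), NA_character_)
to   <- ifelse(has_range, sub(pattern, "\\2", x), NA_character_)

data.frame(datestring = x, from = from, to = to)
```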