R Removing words from a string in a dataframe

Let's say I have the following dataset:

Date_Received = c("Addition 1/2/2018", "Swimming Pool 1/8/2018", "Abandonment 1/9/2018", "Existing Approval 3/14/2018", "Holding Tank 5/11/2018")

Date_Approved = c("1/2/2018", "1/8/2018", "1/9/2018", "SB 3/21/2018", "JW 5/11/2018")

I want to remove the characters before the date in the Date_Received column so that I can later convert it to a date type using lubridate.

I tried the code below, but it only removes the first uppercase letter of each entry.

How can I fix this?

Desired Output:

Date_Received Date_Approved 
1/2/2018      1/2/2018
1/8/2018      1/8/2018
1/9/2018      1/9/2018
3/14/2018     SB 3/21/2018
5/11/2018     JW 5/11/2018

Code

library(tidyverse)

df = data.frame(Date_Received, Date_Approved)

df = df %>%
  mutate(Date.Received = trimws(Date_Received, whitespace = "[A-Z]*\\s*")) %>%
  filter(nzchar(Date.Received))

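We can use trimws, whose whitespace argument (which you already used in your code) takes a regular expression describing the characters to strip. Passing "\\D" with which = "left" removes every leading non-digit character: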

library(dplyr)

df %>% 
  mutate(Date_Received = trimws(Date_Received, "left", "\\D"))

Or with str_replace_all:

library(stringr)

df %>% 
  mutate(Date_Received = str_replace_all(Date_Received, "^\\D+", ""))

Output

  Date_Received Date_Approved
1      1/2/2018      1/2/2018
2      1/8/2018      1/8/2018
3      1/9/2018      1/9/2018
4     3/14/2018  SB 3/21/2018
5     5/11/2018  JW 5/11/2018
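As for why the original attempt removed only the first capital letter: as far as I can tell from its source, trimws pastes "^" in front of the whitespace pattern and "+" after it, so with "[A-Z]*\\s*" the match covers only the leading run of uppercase letters plus any whitespace directly after it, and it stops at the first lowercase letter. The sketch below uses base R only; the vector x is just a two-element sample of the question's data, and the character class in the second call is one possible fix, not the only one.

```r
x <- c("Addition 1/2/2018", "Swimming Pool 1/8/2018")

# Original pattern: only the leading run of uppercase letters is matched,
# so stripping stops at the first lowercase letter.
original <- trimws(x, whitespace = "[A-Z]*\\s*")
print(original)

# Putting letters and whitespace in one character class lets the "+"
# that trimws appends repeat over the whole text prefix.
fixed <- trimws(x, which = "left", whitespace = "[A-Za-z\\s]")
print(fixed)
```

The "\\D" pattern used above is shorter and also covers punctuation, which is why it is the more robust choice here.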

Another option using sub:

df$Date_Received <- sub("^\\D+", "", df$Date_Received)
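Since the stated goal is a date column, here is a minimal sketch of the follow-up conversion. It assumes the strings are in month/day/year order (an inference from values such as 3/14/2018, not something the question states) and uses base as.Date so that it runs without extra packages; with lubridate loaded, lubridate::mdy(cleaned) should be the equivalent call.

```r
# Two sample values taken from the question's data
Date_Received <- c("Addition 1/2/2018", "Existing Approval 3/14/2018")

# Strip the leading non-digit text, as in the sub() answer above
cleaned <- sub("^\\D+", "", Date_Received)

# Parse as month/day/year (assumed order)
parsed <- as.Date(cleaned, format = "%m/%d/%Y")
print(parsed)
```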

Keep life simple:

Date_Received = c("Addition 1/2/2018", "Swimming Pool 1/8/2018", "Abandonment 1/9/2018", "Existing Approval 3/14/2018", "Holding Tank 5/11/2018")
stringr::word(Date_Received, -1)
[1] "1/2/2018"  "1/8/2018"  "1/9/2018"  "3/14/2018" "5/11/2018"
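Note that word(x, -1) relies on the date being the last space-separated token. If that cannot be guaranteed, matching the date pattern itself is a possible alternative. The sketch below uses base R; the second input string is a hypothetical example with trailing text, not part of the question's data. stringr::str_extract(x, pattern) would do the same job and return NA for entries without a match, whereas regmatches silently drops them.

```r
# Second element is a made-up entry with text after the date
x <- c("Holding Tank 5/11/2018", "Addition 1/2/2018 (revised)")

# One or two digits, slash, one or two digits, slash, four digits
pattern <- "[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}"

# regexpr() locates the first match in each string; regmatches() extracts it
dates <- regmatches(x, regexpr(pattern, x))
print(dates)
```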
