
Using the tidyr separate function to split by \ (backslash)

I would like to split text in a column by '\' using the separate function in tidyr. Given this example data...

library(tidyr) 
df1 <- structure(list(Parent.objectId = 1:2, Attachment.path = c("photos_attachments\\photos_image-20220602-192146.jpg", 
    "photos_attachments\\photos_image-20220602-191635.jpg")), row.names = 1:2, class = "data.frame")

And I've tried multiple variations of this...

df2 <- df1 %>%
  separate(Attachment.path,c("a","b","c"),sep="\\",remove=FALSE,extra="drop",fill="right")

Which doesn't result in an error, but it doesn't split the string into two columns, likely because I'm not using the correct regular expression for the single backslash.

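The backslash needs to be escaped twice: once for the R string literal and once for the regular expression. To match a single literal backslash, sep therefore has to be "\\\\".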

library(tidyr)
separate(df1, Attachment.path,c("a","b","c"),
        sep= "\\\\", remove=FALSE, extra="drop", fill="right")
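To see the two escaping layers separately, here is a minimal base R sketch. It uses strsplit() rather than separate() so it runs without extra packages; the variable names are only for illustration.

```r
# One of the example paths from the question
path <- "photos_attachments\\photos_image-20220602-192146.jpg"

# "\\" in R source is a string holding ONE backslash character
one_backslash <- "\\"
n_literal <- nchar(one_backslash)

# "\\\\" is a string holding TWO backslashes, i.e. the regex \\ ,
# which matches one literal backslash
pattern <- "\\\\"
n_pattern <- nchar(pattern)

# Regex split, using the same pattern that is passed to separate()
parts_regex <- strsplit(path, pattern)[[1]]

# Fixed-string split skips the regex layer, so one level of escaping is enough
parts_fixed <- strsplit(path, "\\", fixed = TRUE)[[1]]

print(parts_regex)
print(parts_fixed)
```

Both calls should give the same two pieces: the folder name and the file name.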

According to ?separate

sep -... The default value is a regular expression that matches any sequence of non-alphanumeric values.
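As a rough illustration of why sep has to be set explicitly here, the sketch below applies a "non-alphanumeric" pattern to one of the example paths with base R's strsplit(). The pattern "[^[:alnum:]]+" is assumed to correspond to the default described in ?separate.

```r
path <- "photos_attachments\\photos_image-20220602-192146.jpg"

# Assumed equivalent of the default sep: any run of non-alphanumeric characters
default_like <- "[^[:alnum:]]+"

# Underscores, the backslash, hyphens and the dot all count as separators,
# so the path is broken into many more pieces than intended
pieces <- strsplit(path, default_like)[[1]]
print(pieces)
```

With that kind of pattern the underscore and hyphens are split on as well, which is why a pattern matching only the backslash is needed.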

If the reason for splitting on \ is to get the folder and file names, try these two functions instead. Note that basename() and dirname() treat \ as a path separator only on Windows:

#get filenames
basename(df1$Attachment.path)
# [1] "photos_image-20220602-192146.jpg" "photos_image-20220602-191635.jpg"

#get foldernames
basename(dirname(df1$Attachment.path))
# [1] "photos_attachments" "photos_attachments"
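On platforms other than Windows, base R does not treat \ as a path separator, so the calls above return the strings unchanged. One possible workaround, sketched here under that assumption, is to convert the backslashes to forward slashes first. The variable names are only for illustration.

```r
paths <- c("photos_attachments\\photos_image-20220602-192146.jpg",
           "photos_attachments\\photos_image-20220602-191635.jpg")

# Replace every literal backslash with a forward slash;
# fixed = TRUE means no regex escaping is needed
normalized <- gsub("\\", "/", paths, fixed = TRUE)

# Forward slashes are treated as path separators on every platform
file_names   <- basename(normalized)
folder_names <- basename(dirname(normalized))

print(file_names)
print(folder_names)
```

The results can then be assigned to new columns of df1 in the usual way, for example df1$file <- file_names.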
