
Separating cells with several delimiters (splitstackshape)

I am working with a dataset whose values need to be split on several delimiters. The most common are a semicolon and a period followed by a slash: './'.

How do I complete the code in order to apply both delimiters?

library(tidyverse)
library(splitstackshape)

values <- c("cat; dog; mouse", "cat ./ dog ./ mouse")
data <- data.frame(cbind(values))

separated <- cSplit(data.frame(data), "values", sep = ";", drop = TRUE)

I tried a vector solution but without much success.

I'm not exactly sure what your final output structure should be, but one approach is to start with tidyr::separate, which puts each animal in its own column. Its sep argument is a regular expression, so both delimiters can go in one pattern:

df <- tidyr::separate(data, col = values,
                      into = c("Animal1", "Animal2", "Animal3"),
                      # regex: ";" or a literal "./", plus any surrounding whitespace
                      sep = "\\s*(;|\\./)\\s*")

#   Animal1 Animal2 Animal3
#1     cat     dog   mouse
#2     cat     dog   mouse
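
One caveat with regular-expression separators: an unescaped `.` matches any character, so a pattern containing `./` treats any character followed by a slash as a delimiter. A minimal base R sketch of the difference, using a made-up input string that is not part of the original data:

```r
# Hypothetical input: contains a plain "/" that should NOT be a delimiter.
x <- "cat/dog ./ mouse"

# Unescaped dot: "." matches any character, so "t/" is consumed as a delimiter.
loose <- strsplit(x, ";|./")[[1]]

# Escaped dot: only ";" or a literal "./" splits the string.
strict <- strsplit(x, ";|\\./")[[1]]

print(loose)   # "ca"       "dog "     " mouse"
print(strict)  # "cat/dog " " mouse"
```

With the sample data in the question both patterns happen to give the same pieces, but the escaped form is the safer choice if the values may contain other slashes.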

If the number of elements in each string is not known in advance, you could instead convert every delimiter to a single one and let cSplit work out the number of columns:

# Add in a third value to data with only 2 animals
values <- c("cat; dog; mouse", "cat ./ dog ./ mouse", "frog; squirrel")
data <- data.frame(values)

# Normalize every delimiter to ";" (the dot is escaped so it matches a literal ".")
data_clean <- gsub(";|\\./", ";", data$values)
separated <- splitstackshape::cSplit(data.frame(values = data_clean),
                                     "values", sep = ";", drop = TRUE)

#    values_1 values_2 values_3
# 1:      cat      dog    mouse
# 2:      cat      dog    mouse
# 3:     frog squirrel     <NA>
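
If you would rather avoid the extra package, roughly the same table can be built in base R. This is only a sketch under the assumption that the two delimiters are ";" and "./"; the names `parts`, `padded`, and `separated_base` are made up for illustration:

```r
values <- c("cat; dog; mouse", "cat ./ dog ./ mouse", "frog; squirrel")

# Split on ";" or a literal "./", absorbing any whitespace around the delimiter.
parts <- strsplit(trimws(values), "\\s*(;|\\./)\\s*", perl = TRUE)

# Pad every row with NA up to the length of the longest row.
width <- max(lengths(parts))
padded <- lapply(parts, function(p) c(p, rep(NA_character_, width - length(p))))

# Bind the padded rows into a data frame with cSplit-style column names.
separated_base <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
names(separated_base) <- paste0("values_", seq_len(width))

print(separated_base)
```

Unlike cSplit, this returns a plain data.frame rather than a data.table, and it does no type conversion on the resulting columns.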
