I have a data frame like this:
data.frame(text = c("separate1: and: more","another 20: 42"))
How can I split each row at the first : only? Expected output:
data.frame(text1 = c("separate1","another 20"), text2 = c("and: more","42"))
In base R you can use regexpr to find the position of the first :, which can then be used to extract the substrings, and trimws to remove the surrounding whitespace.
x <- c("separate1: and: more","another 20: 42")
i <- regexpr(":", x)
data.frame(text1 = trimws(substr(x, 1, i-1)), text2 = trimws(substring(x, i+1)))
# text1 text2
#1 separate1 and: more
#2 another 20 42
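One edge case the snippet above does not cover: regexpr returns -1 for a string that contains no :, so substr(x, 1, i-1) gives an empty string and substring(x, i+1) gives back the whole string. Below is a minimal sketch of a wrapper that handles this, assuming you want the whole string in text1 and NA in text2 for such rows. The name split_first and its delim argument are hypothetical, not part of any package.

```r
# Hypothetical helper: split each string at the first occurrence of `delim`.
# Assumption: rows without the delimiter keep the full string in text1
# and get NA in text2.
split_first <- function(x, delim = ":") {
  x <- as.character(x)
  # Position of the first literal match, -1 if there is none.
  # as.integer() drops the match.length attribute that regexpr attaches.
  i <- as.integer(regexpr(delim, x, fixed = TRUE))
  found <- i > 0
  text1 <- ifelse(found, substr(x, 1, i - 1), x)
  text2 <- ifelse(found, substring(x, i + nchar(delim)), NA_character_)
  data.frame(text1 = trimws(text1), text2 = trimws(text2),
             stringsAsFactors = FALSE)
}

split_first(c("separate1: and: more", "another 20: 42", "no delimiter"))
```

Passing fixed = TRUE treats the delimiter as a literal string, which avoids surprises if the delimiter is a regex metacharacter such as . or |.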
library(reshape2)
df <- data.frame(text = c("separate1: and: more","another 20: 42"))
colsplit(df$text, ":", c("text1", "text2"))
You can use str_split_fixed from the stringr package. With n = 2 it returns at most two pieces, so the split happens only at the first delimiter, i.e.
stringr::str_split_fixed(df$text, ':', 2)
# [,1] [,2]
#[1,] "separate1" " and: more"
#[2,] "another 20" " 42"
df <- data.frame(text = c("separate1: and: more","another 20: 42"))
df$text1 <- gsub(':.*', '', df$text)
df$text2 <- gsub('^[^:]+: ', '', df$text)
df
# text text1 text2
# 1 separate1: and: more separate1 and: more
# 2 another 20: 42 another 20 42
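Both patterns above can match at most once per string, so sub would do the same job as gsub. Note also that the second pattern requires exactly one space after the colon. Below is a minimal sketch of a variant that tolerates any amount of whitespace around the first colon, assuming that is the behavior you want. The [[:space:]]* parts are an addition to the original answer, and the second input string is changed to have no space after the colon to show the difference.

```r
df <- data.frame(text = c("separate1: and: more", "another 20:42"),
                 stringsAsFactors = FALSE)

# Drop everything from the first colon (and any whitespace before it) onward.
df$text1 <- sub("[[:space:]]*:.*$", "", df$text)

# Drop everything up to and including the first colon and whitespace after it.
df$text2 <- sub("^[^:]*:[[:space:]]*", "", df$text)

df
```

A string with no colon at all is left unchanged in both columns, so check for that case separately if it can occur in your data.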
Using tidyr:
library(dplyr)
library(tidyr)
df %>%
separate(text, c("a", "b"), sep = ": ", extra = "merge")
# a b
# 1 separate1 and: more
# 2 another 20 42
Another base R solution
df <- do.call(rbind,lapply(as.character(df$text), function(x) {
k <- head(unlist(gregexpr(":",x)),1)
data.frame(text1 = substr(x,1,k-1),
text2 = substr(x,k+1,nchar(x)))
}))
such that
> df
text1 text2
1 separate1 and: more
2 another 20 42
Sorry, @Sotos is right, this isn't a duplicate. Here is another base solution that splits on first occurrence of delimiter.
df <- data.frame(text = c("separate1: and: more","another 20: 42"))
# Named `parts` rather than `list` to avoid masking base::list
parts <- apply(df, 1, function(x) regmatches(x, regexpr(":", x), invert = TRUE))
df <- data.frame(matrix(unlist(parts), nrow = length(parts), byrow = TRUE))
df
#> X1 X2
#> 1 separate1 and: more
#> 2 another 20 42
Created on 2020-02-10 by the reprex package (v0.2.1)
Poor old ?utils::strcapture never gets any respect:
strcapture("^(.+?):(.+$)", df$text, proto=list(text1="", text2=""))
# text1 text2
#1 separate1 and: more
#2 another 20 42
Inserted back:
cbind(df, strcapture("^(.+?):(.+$)", df$text, proto=list(text1="", text2="")))
# text text1 text2
#1 separate1: and: more separate1 and: more
#2 another 20: 42 another 20 42