I have a column in my dataframe:
Colname
20151102
19920311
20130204
>=70
60-69
20-29
I wish to split this column into two columns like:
Col1      Col2
20151102
19920311
20130204
          >=70
          60-69
          20-29
How can I achieve this result?
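One possible solution is to use extract from tidyr. The idea is to add a delimiter (here a dot) in front of the range-like values and after all the others, then split each value on that delimiter. Note that the delimiter you choose must not appear anywhere in your initial data.frame.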
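library(magrittr)
library(tidyr)
# Prefix values containing >, = or - with ".", and suffix all other values with "."
df$colname = df$colname %>%
  grepl("[>=-]", .) %>%
  ifelse(paste0(".", df$colname), paste0(df$colname, "."))
# Split on the dot: the text before it goes to col1, the text after it to col2
extract(df, colname, c("col1","col2"), "(.*)\\.(.*)")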
#     col1  col2
#1  222222
#2 1111111
#3          >=70
#4         60-69
#5         20-29
Data:
df = data.frame(colname=c("222222","1111111",">=70","60-69","20-29"))
Without the need of any package:
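# Start with two empty character columns
df[,c("Col1", "Col2")] <- ""
# TRUE where the value can be converted to a number (conversion warnings are silenced)
isnum <- suppressWarnings(!is.na(as.numeric(df$colname)))
df$Col1[isnum] <- df$colname[isnum]
df$Col2[!isnum] <- df$colname[!isnum]
# Drop the original column
df <- df[,!(names(df) %in% "colname")]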
Data:
df = data.frame(colname=c("20151102","19920311","20130204",">=70","60-69","20-29"), stringsAsFactors=FALSE)
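A variant of the same idea, as a rough sketch only: if "numeric" here really means "digits only", a regular-expression test may be a stricter check than as.numeric, which also accepts strings such as "1e5". The names is_date and out are placeholders chosen for this illustration, and the sketch assumes the dates are stored as plain digit strings.

```r
df <- data.frame(colname = c("20151102", "19920311", "20130204", ">=70", "60-69", "20-29"),
                 stringsAsFactors = FALSE)

# TRUE for values made up of digits only (assumption: dates are plain digit strings)
is_date <- grepl("^[0-9]+$", df$colname)

# Route each value to one column and leave the other as an empty string
out <- data.frame(Col1 = ifelse(is_date, df$colname, ""),
                  Col2 = ifelse(is_date, "", df$colname),
                  stringsAsFactors = FALSE)
out
```

Building a new data frame rather than modifying df in place keeps the original column available for checking the result.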
Here is a single-statement solution. read.pattern captures the two field types separately in the parts of the regular expression surrounded by parentheses. (format can be omitted if the Colname column is already of class "character". Also, if you want the first column to be numeric, omit the colClasses argument.)
library(gsubfn)
read.pattern(text = format(DF$Colname), pattern = "(^\\d+$)|(.*)",
             col.names = c("Col1", "Col2"), colClasses = "character")
giving:
      Col1  Col2
1 20151102
2 19920311
3 20130204
4           >=70
5          60-69
6          20-29
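Note: The regular expression used is shown below. The first group, (^\d+$), matches values made up entirely of digits; the second group, (.*), catches everything else.
(^\d+$)|(.*)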
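To see how the two capture groups behave, here is a small base-R sketch. It is not part of the answer above and does not use gsubfn; x, pattern, group1 and group2 are illustrative names. It assumes perl = TRUE so that the alternatives are tried from left to right.

```r
x <- c("20151102", ">=70", "60-69")
pattern <- "(^\\d+$)|(.*)"

# Replace each whole match by the content of one capture group.
# A group that did not take part in the match yields an empty string.
group1 <- sub(pattern, "\\1", x, perl = TRUE)
group2 <- sub(pattern, "\\2", x, perl = TRUE)

data.frame(Col1 = group1, Col2 = group2, stringsAsFactors = FALSE)
```

Only one of the two groups is filled for any given value, which is what lets read.pattern place each value in exactly one column.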