The objective is to create two new columns from the Code column of the data below: one with the numbers and another with the codes (as a factor). How do I do that? I tried ifelse(), but it gave an error.
structure(list(Potreiro = structure(c(3L, 3L, 3L, 3L,
3L, 4L, 3L, 4L, 3L, 4L), .Label = c("1A", "6B", "7A", "7B"), class =
"factor"), Code = structure(c(4L, 1L, 8L, 3L, 2L, 4L, 6L, 5L, 8L, 7L ),
.Label = c("2", "3", "4", "5", "50%", "70%", "ac", "ad", "av", "cd", "de",
"Dem"), class = "factor")), .Names = c("Potreiro", "Code"), row.names =
c(NA, 10L), class = "data.frame")
Thanks!
library(dplyr)
library(stringr)
df <- structure(list(Potreiro = structure(c(3L, 3L, 3L, 3L, 3L, 4L, 3L, 4L, 3L, 4L),
.Label = c("1A", "6B", "7A", "7B"),
class = "factor"),
Code = structure(c(4L, 1L, 8L, 3L, 2L, 4L, 6L, 5L, 8L, 7L ),
.Label = c("2", "3", "4", "5", "50%", "70%", "ac", "ad", "av", "cd", "de", "Dem"),
class = "factor")),
.Names = c("Potreiro", "Code"),
row.names = c(NA, 10L),
class = "data.frame")
df %>%
  mutate(
    number = str_extract_all(Code, "\\d+"),
    word = str_extract(Code, "\\D[^%]")
  )
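The number variable uses the regex \\d+, which matches one or more digits. Note that str_extract_all() returns a list, so number is a list column; str_extract() would give a plain character vector instead.
The word variable uses the regex \\D[^%], which matches a non-digit followed by any character other than %. That is why values such as 70% yield NA rather than the % sign. The pattern matches exactly two characters, so a longer code such as Dem would be cut to De.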
The result:
   Potreiro Code number word
1        7A    5      5 <NA>
2        7A    2      2 <NA>
3        7A   ad          ad
4        7A    4      4 <NA>
5        7A    3      3 <NA>
6        7B    5      5 <NA>
7        7A  70%     70 <NA>
8        7B  50%     50 <NA>
9        7A   ad          ad
10       7B   ac          ac
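The two patterns from this answer can be checked on a few sample strings. This is only a sketch in base R (regexpr() and regmatches() instead of stringr, to avoid extra packages); the helper name first_match and the sample vector codes are made up for illustration.

```r
# Hypothetical sample values; "Dem" is one of the factor levels in the question
codes <- c("5", "70%", "ad", "Dem")

# Illustrative helper: first match of a pattern per element, NA where there is none
first_match <- function(x, pattern) {
  m <- regexpr(pattern, x, perl = TRUE)
  out <- rep(NA_character_, length(x))
  out[m > 0] <- regmatches(x, m)
  out
}

number <- first_match(codes, "\\d+")     # one or more digits
word <- first_match(codes, "\\D[^%]")    # a non-digit followed by a non-% character

print(data.frame(codes, number, word))
```

The last row shows the two-character limit of the word pattern: Dem comes back as De. A pattern such as [A-Za-z]+ might be a better fit if longer codes occur in the real data.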
I would do this:
df <-
structure(list(Potreiro = structure(c(3L, 3L, 3L, 3L,
3L, 4L, 3L, 4L, 3L, 4L), .Label = c("1A", "6B", "7A", "7B"), class =
"factor"), Code = structure(c(4L, 1L, 8L, 3L, 2L, 4L, 6L, 5L, 8L, 7L ),
.Label = c("2", "3", "4", "5", "50%", "70%", "ac", "ad", "av", "cd", "de",
"Dem"), class = "factor")), .Names = c("Potreiro", "Code"), row.names =
c(NA, 10L), class = "data.frame")
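# grepl() is vectorized, so sapply() is not needed; TRUE where Code contains a digit
arenum <- grepl('[[:digit:]]', as.character(df$Code))
# Use NA_character_ rather than NaN: in a character vector NaN becomes the string "NaN"
df$codenum <- ifelse(arenum, as.character(df$Code), NA_character_)
df$codechar <- ifelse(!arenum, as.character(df$Code), NA_character_)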
df
If you only want values that consist entirely of digits, change arenum to:
arenum <- sapply(df$Code, function (x) gsub('[[:digit:]]', '', x) == '')
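The difference between the two definitions of arenum shows up on values like 70%. A minimal sketch, using a made-up vector codes and illustrative names has_digit and only_digits:

```r
# Hypothetical sample values
codes <- c("5", "70%", "ad")

# TRUE if the value contains at least one digit
has_digit <- grepl("[[:digit:]]", codes)

# TRUE only if nothing is left after removing all digits
only_digits <- gsub("[[:digit:]]", "", codes) == ""

print(data.frame(codes, has_digit, only_digits))
```

With the first test 70% counts as numeric; with the second it is treated as a code, because the % sign remains after the digits are removed.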
Here is an option using extract() from tidyr:
library(dplyr)
library(tidyr)
df %>%
  extract(Code, into = c('number', 'word'), '(\\d*)([a-z]*)', remove = FALSE, convert = TRUE)
#   Potreiro Code number word
#1        7A    5      5
#2        7A    2      2
#3        7A   ad     NA   ad
#4        7A    4      4
#5        7A    3      3
#6        7B    5      5
#7        7A  70%     70
#8        7B  50%     50
#9        7A   ad     NA   ad
#10       7B   ac     NA   ac
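Since the question asks for a numeric column and a factor column, one possible follow-up in base R is sketched below. It assumes that a trailing % should simply be dropped and that anything that is not digits (optionally followed by %) is a code. The small data frame and the names code_chr and is_num are illustrative, not taken from the answers above.

```r
# Hypothetical subset of the data in the question
df <- data.frame(
  Potreiro = c("7A", "7A", "7A", "7B"),
  Code = c("5", "70%", "ad", "ac"),
  stringsAsFactors = TRUE
)

code_chr <- as.character(df$Code)

# TRUE for values made of digits, optionally followed by a percent sign
is_num <- grepl("^[0-9]+%?$", code_chr)

# Numeric column: convert only the numeric-looking rows, leave the rest NA
df$number <- NA_real_
df$number[is_num] <- as.numeric(sub("%", "", code_chr[is_num], fixed = TRUE))

# Factor column: keep only the non-numeric codes
df$word <- factor(ifelse(is_num, NA_character_, code_chr))

str(df)
```

Converting only the rows flagged by is_num avoids the coercion warning that as.numeric() would raise on values such as ad.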