I've tried to replace values that match a specific pattern (digits followed by "BT"), but my code failed. This is my code:
df <- data.frame(
exposure = c("123BT", "113BB", "116BB", "117BT")
)
df %>%
mutate(
exposure2 = case_when(exposure == regmatches("d+\\BT") ~ paste0("-", exposure),
TRUE ~ exposure)
)
The error is:
Error: Problem with `mutate()` column `exposure2`.
i `exposure2 = case_when(...)`.
x argument "m" is missing, with no default
Run `rlang::last_error()` to see where the error occurred.
Whereas my target is:
df <- data.frame(
exposure = c("123BT", "113BB", "116BB", "117BT"),
exposure2 = c(-123, 113, 116, -117)
)
I recommend using the stringr library; you can extract the numbers with the regex (\\d)+:
library(stringr)
library(dplyr)
df %>%
mutate(
exposure2 = case_when(str_detect(exposure,"BT") ~ paste0("-", str_extract(exposure, "(\\d)+")),
TRUE ~ str_extract(exposure, "(\\d)+"))
)
Output:
exposure exposure2
1 123BT -123
2 113BB 113
3 116BB 116
4 117BT -117
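Note that exposure2 above is a character column, whereas the target in the question is numeric. A minimal sketch of one way to get a numeric column, assuming every value contains a run of digits (the names result and digits are introduced here only for illustration):

```r
library(stringr)
library(dplyr)

df <- data.frame(exposure = c("123BT", "113BB", "116BB", "117BT"))

result <- df %>%
  mutate(
    # pull out the first run of digits and convert it to a number
    digits = as.numeric(str_extract(exposure, "\\d+")),
    # negate the number for rows that contain "BT"
    exposure2 = if_else(str_detect(exposure, "BT"), -digits, digits)
  ) %>%
  select(-digits)  # drop the helper column

result
```

Converting before applying the sign avoids building "-123" as a string and parsing it afterwards.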
If you still prefer to use regmatches, you can get the same result with:
df %>%
mutate(
exposure2 = case_when(exposure %in% regmatches(exposure, regexpr("\\d+BT", exposure)) ~ paste0("-", regmatches(exposure, regexpr("\\d+", exposure))),
TRUE ~ regmatches(exposure, regexpr("\\d+", exposure)))
)
First, a concise solution that you can easily use inside dplyr::mutate. Using gsub, we remove the non-digit characters and coerce the result with as.integer. We then multiply the result by 1 or -1 depending on whether the string contains "BT"; for this we use grepl (which returns a logical vector) and add 1L (which coerces it to integer) to get the indices 1 or 2.
c(1, -1)[grepl('BT', df$exposure) + 1L]*as.integer(gsub('\\D', '', df$exposure))
# [1] -123 113 116 -117
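For illustration, the one-liner above could be dropped into mutate roughly as follows. This is only a sketch, assuming dplyr is available; idx and out are names introduced here:

```r
library(dplyr)

df <- data.frame(exposure = c("123BT", "113BB", "116BB", "117BT"))

# grepl() gives TRUE/FALSE; adding 1L turns that into the index 2 or 1
idx <- grepl("BT", df$exposure) + 1L
idx
# [1] 2 1 1 2

out <- df %>%
  mutate(
    # c(1, -1)[index] picks the sign; gsub() strips everything but digits
    exposure2 = c(1, -1)[grepl("BT", exposure) + 1L] *
      as.integer(gsub("\\D", "", exposure))
  )

out
```

Because the sign vector c(1, -1) is a double, exposure2 comes out as a numeric (double) column.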
The above is the recommended solution. The approach you envision is more complex because it processes the information less efficiently. I implement its logic in a small function (see note 1) to demonstrate.
f <- \(x) {
rm <- regmatches(x, regexpr("\\d+BT", x))
o <- gsub('\\D', '', x)
o <- ifelse(x %in% rm, paste0('-', o), o)
as.integer(o)
}
f(df$exposure)
# [1] -123 113 116 -117
1 Notes: For regmatches you need matching info, e.g. from regexpr. The regex should actually look something like "\\d+BT".
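To make the note concrete, here is a small base R sketch of how regexpr and regmatches fit together (x, m and m_vals are names chosen here for illustration). Calling regmatches with only one argument is what produces the 'argument "m" is missing' error from the question:

```r
x <- c("123BT", "113BB", "116BB", "117BT")

# regexpr() returns the start position of the first match in each element,
# or -1 where the pattern does not match
m <- regexpr("\\d+BT", x)
as.integer(m)
# [1]  1 -1 -1  1

# regmatches() needs both the strings and the match data;
# it returns only the elements that actually matched
m_vals <- regmatches(x, m)
m_vals
# [1] "123BT" "117BT"
```

Since non-matching elements are dropped, the result can be shorter than the input, which is worth keeping in mind when using it inside mutate.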
Data:
df <- structure(list(exposure = c("123BT", "113BB", "116BB", "117BT"
)), class = "data.frame", row.names = c(NA, -4L))
library(readr)
(-1)^grepl('BT', df$exposure) * parse_number(df$exposure)
[1] -123 113 116 -117