Combining regmatches and replacement for a specific character
I tried to replace values that match a specific pattern (digits followed by "BT"), but my code failed. This is my code:
df <- data.frame(
exposure = c("123BT", "113BB", "116BB", "117BT")
)
df %>%
mutate(
exposure2 = case_when(exposure == regmatches("d+\\BT") ~ paste0("-", exposure),
TRUE ~ exposure)
)
The error is:
Error: Problem with `mutate()` column `exposure2`.
i `exposure2 = case_when(...)`.
x argument "m" is missing, with no default
Run `rlang::last_error()` to see where the error occurred.
My target is:
df <- data.frame(
exposure = c("123BT", "113BB", "116BB", "117BT"),
exposure2 = c(-123, 113, 116, -117)
)
I recommend using the stringr library; you can extract the numbers with the regex (\\d)+:
library(stringr)
library(dplyr)
df %>%
mutate(
exposure2 = case_when(str_detect(exposure,"BT") ~ paste0("-", str_extract(exposure, "(\\d)+")),
TRUE ~ str_extract(exposure, "(\\d)+"))
)
Output:
exposure exposure2
1 123BT -123
2 113BB 113
3 116BB 116
4 117BT -117
If you still prefer to use regmatches, you can get the same result with:
df %>%
mutate(
exposure2 = case_when(exposure %in% regmatches(exposure, regexpr("\\d+BT", exposure)) ~ paste0("-", regmatches(exposure, regexpr("\\d+", exposure))),
TRUE ~ regmatches(exposure, regexpr("\\d+", exposure)))
)
First, a concise solution that you can easily use inside dplyr::mutate. Using gsub we remove the non-digit characters and coerce the result with as.integer. We then multiply the result by 1 or -1 depending on whether the string contains "BT"; for this we use grepl (which gives a logical) and add 1L (which coerces it to integer) to get the index 1 or 2.
c(1, -1)[grepl('BT', df$exposure) + 1L]*as.integer(gsub('\\D', '', df$exposure))
# [1] -123 113 116 -117
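To see how the pieces of this one-liner fit together, the intermediate values can be inspected one at a time. This is a minimal sketch in base R; the variable names x, digits, idx, and sgn are illustrative and not part of the original answer.

```r
x <- c("123BT", "113BB", "116BB", "117BT")

# Strip every non-digit character, then coerce the remainder to integer
digits <- as.integer(gsub("\\D", "", x))

# grepl() returns TRUE/FALSE; adding 1L turns that into index 2 or 1
idx <- grepl("BT", x) + 1L

# Index into c(1, -1): 1 when "BT" is absent, -1 when it is present
sgn <- c(1, -1)[idx]

result <- sgn * digits
print(result)
```

Indexing into a two-element vector avoids an explicit ifelse; the same sign vector could equally be built with ifelse(grepl("BT", x), -1, 1) if that reads more clearly.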
Above is the recommended solution. The solution you envision is more complex because it processes the information less efficiently. I implement the logic in a small function [1] to demonstrate.
f <- \(x) {
rm <- regmatches(x, regexpr("\\d+BT", x))
o <- gsub('\\D', '', x)
o <- ifelse(x %in% rm, paste0('-', o), o)
as.integer(o)
}
f(df$exposure)
# [1] -123 113 116 -117
[1] Notes: regmatches needs match data as its second argument, e.g. from regexpr. The regex should actually look something like "\\d+BT".
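The note above can be illustrated with a short sketch: regexpr produces the match data that regmatches expects as its second argument (m), which is the argument the original attempt left out. The variable names are illustrative only.

```r
x <- c("123BT", "113BB", "116BB", "117BT")

# regexpr() returns the start position of the first match per element (-1 if none)
m <- regexpr("\\d+BT", x)

# regmatches() needs both the text and the match data;
# elements without a match are dropped from the result
hits <- regmatches(x, m)
print(hits)

# Because non-matches are dropped, hits is shorter than x,
# so membership is tested with %in% rather than ==
print(x %in% hits)
```

This is also why the regmatches-based answers compare with %in%: an element-wise == against the shorter vector would recycle values and could give misleading results.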
Data:
df <- structure(list(exposure = c("123BT", "113BB", "116BB", "117BT"
)), class = "data.frame", row.names = c(NA, -4L))
library(readr)
(-1)^grepl('BT', df$exposure) * parse_number(df$exposure)
# [1] -123 113 116 -117