Remove one number at position n of the number in a string of numbers separated by slashes
I have a character column with this configuration:
data <- data.frame(
id = 1:3,
codes = c("08001301001", "08002401002 / 08002601003 / 17134604034", "08004701005 / 08005101001"))
I want to remove the 6th digit of any number within the string. The numbers are always 10 characters long.
My code works. However, I believe it could be done more easily with a regex, but I couldn't figure it out.
library(stringr)
remove_6_digit <- function(x){
idxs <- str_locate_all(x,"/")[[1]][,1]
for (idx in c(rev(idxs+7), 6)){
str_sub(x, idx, idx) <- ""
}
return(x)
}
result <- sapply(data$codes, remove_6_digit, USE.NAMES = FALSE)
You can use
gsub("\\b(\\d{5})\\d", "\\1", data$codes)
See the regex demo. This will remove the 6th digit from the start of a digit sequence.
Details:
\b - word boundary
(\d{5}) - Capturing group 1 (\1): five digits
\d - a digit.
While a word boundary looks sufficient for the current scenario, a digit boundary is also an option in case the numbers are glued to word chars:
gsub("(?<!\\d)(\\d{5})\\d", "\\1", data$codes, perl=TRUE)
where perl=TRUE enables the PCRE regex syntax and (?<!\d) is a negative lookbehind that fails the match if there is a digit immediately to the left of the current location.
And if you must only change digit sequences of exactly 10 digits (no shorter and no longer), you can use either of
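To illustrate the difference between the two boundaries, here is a small sketch. The input vector x is made up for this example (the "ID" prefix is not from the question). With \b, a number glued to letters is left untouched because there is no word boundary between a letter and a digit, whereas the lookbehind version still matches it:

```r
# Hypothetical inputs: a plain code and the same code prefixed with letters
x <- c("08001301001", "ID08001301001")

# Word boundary: only the standalone number is changed
res_wb <- gsub("\\b(\\d{5})\\d", "\\1", x)

# Digit boundary (PCRE negative lookbehind): both numbers are changed
res_db <- gsub("(?<!\\d)(\\d{5})\\d", "\\1", x, perl = TRUE)

print(res_wb)
print(res_db)
```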
gsub("\\b(\\d{5})\\d(\\d{4})\\b", "\\1\\2", data$codes)
gsub("(?<!\\d)(\\d{5})\\d(?=\\d{4}(?!\\d))", "\\1", data$codes, perl=TRUE)
One remark though: your numbers consist of 11 digits, so you need to replace \\d{4} with \\d{5}, see this regex demo.
Another possible solution, using stringr::str_replace_all and lookaround:
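Written out for 11-digit codes like those in the question, the two patterns would look roughly like this. The 10- and 12-digit test values are invented here, only to show that shorter and longer digit runs are left alone:

```r
# First element is from the question; the 10- and 12-digit values are made up
x <- c("08002401002 / 08002601003 / 17134604034", "1234567890", "123456789012")

# Word-boundary version: 5 digits, the digit to drop, then 5 more digits
res_wb <- gsub("\\b(\\d{5})\\d(\\d{5})\\b", "\\1\\2", x)

# Lookaround version (PCRE): the trailing 5 digits are only asserted, not consumed
res_la <- gsub("(?<!\\d)(\\d{5})\\d(?=\\d{5}(?!\\d))", "\\1", x, perl = TRUE)

print(res_wb)
print(res_la)
```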
library(tidyverse)
data %>%
mutate(codes = str_replace_all(codes, "(?<=\\d{5})\\d(?=\\d{5})", ""))
#> id codes
#> 1 1 0800101001
#> 2 2 0800201002 / 0800201003 / 1713404034
#> 3 3 0800401005 / 0800501001
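For comparison, the same lookaround pattern should also work in base R through gsub with perl=TRUE. One caveat: the pattern matches every digit that has at least five digits on each side, so it removes exactly one digit only when each number is exactly 11 digits long. The 12-digit value below is made up to show that a longer run loses more than one digit:

```r
# Second value is hypothetical, added to show the behaviour on a 12-digit run
codes <- c("08004701005 / 08005101001", "123456789012")

# Same pattern as the stringr call, run through base R's PCRE engine
res <- gsub("(?<=\\d{5})\\d(?=\\d{5})", "", codes, perl = TRUE)
print(res)
```

If the input may contain digit runs of other lengths, a pattern anchored on digit boundaries at both ends is the safer choice.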