Remove alphanumeric words that start with 2 letters followed by 2 digits

Question

a <- c("it is ZZ10ASDJN123 and ZZ100DD22")

How can I remove the words that start with two letters followed by exactly two digits, without removing alphanumeric words in which the two letters are followed by more than two digits?

Expected output:

"it is and ZZ100DD22"

This code removes only the digits. Please help me get the expected output.

gsub('[[:digit:]]+', '', a)

Answer 1

You may use

gsub("\\s*\\b[A-Za-z]{2}\\d{2}(?!\\d)\\w*\\b", "", a, perl=TRUE)

See the regex demo. An alternative:

gsub("\\s*\\b[A-Za-z]{2}\\d{2}[A-Za-z_]\\w*\\b", "", a)
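The two patterns agree on the input from the question, but they are not fully equivalent. A minimal sketch, assuming a made-up input `x` (not from the question) in which a word consists of only two letters and two digits: the lookahead version removes it, while the alternative requires a letter or underscore after the digits and so leaves it in place.

```r
# Hypothetical input for illustration; "ZZ10" has nothing after its two digits.
x <- "it is ZZ10 and ZZ10AB"

# Lookahead version (needs perl = TRUE): removes both "ZZ10" and "ZZ10AB".
lookahead_result <- gsub("\\s*\\b[A-Za-z]{2}\\d{2}(?!\\d)\\w*\\b", "", x, perl = TRUE)
lookahead_result
## => [1] "it is and"

# Alternative (default regex engine): keeps "ZZ10", removes only "ZZ10AB".
alt_result <- gsub("\\s*\\b[A-Za-z]{2}\\d{2}[A-Za-z_]\\w*\\b", "", x)
alt_result
## => [1] "it is ZZ10 and"
```

If bare words such as "ZZ10" should also be removed, the lookahead version is probably the safer choice.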

Details

\\s* - 0 or more whitespace chars
\\b - a word boundary
[A-Za-z]{2} - two ASCII letters (use \\p{L} to match any Unicode letter)
\\d{2} - two digits
(?!\\d) - a negative lookahead: there can be no digit immediately to the right
\\w* - 0 or more letters, digits or underscores
\\b - a word boundary.
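The effect of each part can be checked by testing the pattern (without the leading \\s*) against single words. This is only a sketch; the test words other than the two from the question are made up for illustration.

```r
# Test words: the first two come from the question, the rest are hypothetical.
words <- c("ZZ10ASDJN123", "ZZ100DD22", "ZZ10", "Z10AB", "ab12_x")
pattern <- "\\b[A-Za-z]{2}\\d{2}(?!\\d)\\w*\\b"

# TRUE means the word would be removed by the gsub() call.
matches <- grepl(pattern, words, perl = TRUE)
setNames(matches, words)
## => ZZ10ASDJN123    ZZ100DD22         ZZ10        Z10AB       ab12_x
## =>         TRUE        FALSE         TRUE        FALSE         TRUE
```

"ZZ100DD22" fails because of the lookahead (a third digit follows), and "Z10AB" fails because it starts with only one letter.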

Add (*UCP) at the start of the regex to make it fully Unicode-aware.
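A hedged sketch of the Unicode-aware variant, combining (*UCP) with \\p{L}. The input string is invented for illustration and is written with \u escapes so that R marks it as UTF-8; the result may depend on the R version and on how the input is encoded.

```r
# Hypothetical input with non-ASCII letters (A-umlaut, O-umlaut, sharp s).
u <- "it is \u00c4\u00d610\u00dfabc and \u00c4\u00d6100DD22"

# (*UCP) makes \\w, \\d and \\b Unicode-aware; \\p{L} matches any Unicode letter.
ucp_result <- gsub("(*UCP)\\s*\\b\\p{L}{2}\\d{2}(?!\\d)\\w*\\b", "", u, perl = TRUE)
ucp_result
```

Here the first non-ASCII word (two letters, then exactly two digits) should be removed, while the second one, with three digits, should stay.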

R demo:

a <- c("it is ZZ10ASDJN123 and ZZ100DD22")
gsub("\\s*\\b[A-Za-z]{2}\\d{2}(?!\\d)\\w*", "", a, perl=TRUE)
## => [1] "it is and ZZ100DD22"
