Matching certain numbers at the end of a string

Question

I have a vector of strings:

s <- c('abc1',   'abc2',   'abc3',   'abc11',   'abc12', 
       'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12', 
       'nonsense')

I would like a regular expression that matches only the strings that begin with abc and end with 3, 11, or 12. In other words, the regex has to exclude abc1 but not abc11, abc2 but not abc12, and so on.

I thought that this would be easy to do with lookahead assertions, but I haven't found a way. Is there one?

EDIT: Thanks to the posters below for pointing out a serious ambiguity in the original post.

In reality, I have many strings. They all end in digits: some in 0, some in 9, some in the digits in between. I am looking for a regex that will match all strings except those that end with a letter followed by a 1 or a 2. (The regex should also match only those strings that start with abc, but that's an easy problem.)

I tried to use negative lookahead assertions to create such a regex, but I didn't have any success.

Thanks to all who replied and commented. Inspired by several of you, I ended up using this combination: grepl('^abc', s) & !grepl('[[:lower:]][12]$', s).
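For reference, here is a minimal sketch that runs this two-pattern combination next to a single-pattern variant using a negative lookahead. The lookahead form is not from the thread; it assumes grepl is called with perl = TRUE (PCRE), and it puts the lookahead directly after ^ so that it inspects the whole string rather than only the part after abc.

```r
s <- c('abc1',   'abc2',   'abc3',   'abc11',   'abc12',
       'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12',
       'nonsense')

# Two simple patterns: starts with abc, and does not end in
# a lowercase letter followed by 1 or 2.
two_step <- grepl('^abc', s) & !grepl('[[:lower:]][12]$', s)

# One pattern (assumption: PCRE via perl = TRUE). The negative lookahead
# rejects any string that ends in a lowercase letter followed by 1 or 2;
# the rest of the pattern then requires the abc prefix.
one_step <- grepl('^(?!.*[[:lower:]][12]$)abc', s, perl = TRUE)

s[two_step]
s[one_step]
identical(two_step, one_step)
```

Placing the lookahead after abc instead (as in '^abc(?!...)') would let abc1 through, because the letter that precedes the final digit would already have been consumed by the prefix.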

Answer 1

Is this what you want?

s[grepl("abc.*(3|11|12)", s)]
[1] "abc3"    "abc11"   "abc12"   "abcde3"  "abcde11" "abcde12"

And the excluded strings are:

s[!grepl("abc.*(3|11|12)", s)]
[1] "abc1"     "abc2"     "abcde1"   "abcde2"   "nonsense"

Edit: As the comments indicate, there is some ambiguity in your requirements. A more comprehensive regex tests for the start of the string with ^ and the end with $, and possibly allows only alphabetic characters [[:alpha:]] before the final digits:

s[grepl("^abc[[:alpha:]]*(3|11|12)$", s)]
[1] "abc3"    "abc11"   "abc12"   "abcde3"  "abcde11" "abcde12"

You can also get grep to return the values directly by passing the argument value=TRUE, which saves a bit of duplication in the code:

grep("^abc[[:alpha:]]*(3|11|12)$", s, value=TRUE)
[1] "abc3"    "abc11"   "abc12"   "abcde3"  "abcde11" "abcde12"
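To illustrate why the anchors matter, here is a small sketch comparing the unanchored pattern with an anchored one that allows only letters between abc and the final digits. The strings in extra are hypothetical and do not appear in the question.

```r
# Hypothetical test strings (not from the original vector)
extra <- c('abc31', 'xabc3', 'abc912', 'abcde12')

# Unanchored: matches abc followed anywhere by 3, 11 or 12
loose  <- grepl("abc.*(3|11|12)", extra)

# Anchored: abc at the start, letters only, then 3, 11 or 12 at the end
strict <- grepl("^abc[[:alpha:]]*(3|11|12)$", extra)

data.frame(extra, loose, strict)
```

The unanchored pattern accepts all four strings, while the anchored one accepts only abcde12.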

Answer 2

Instead of one complicated regular expression, in this case I think it's easier to use two simple regular expressions:

s <- c('abc1',   'abc2',   'abc3',   'abc11',   'abc12', 
       'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12', 
       'nonsense')

s[grepl("^abc", s) & grepl("(3|11|12)$", s)]
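If the prefix and the allowed endings vary, the same two-pattern idea can be wrapped in a small function. This is only a sketch: match_prefix_suffix is a hypothetical helper name, and it assumes that neither the prefix nor the suffixes contain regex metacharacters.

```r
# Hypothetical helper: keep the elements of x that start with `prefix`
# and end with one of `suffixes`. Inputs are pasted into the patterns
# unescaped, so they must not contain regex metacharacters.
match_prefix_suffix <- function(x, prefix, suffixes) {
  starts <- grepl(paste0("^", prefix), x)
  ends   <- grepl(paste0("(", paste(suffixes, collapse = "|"), ")$"), x)
  x[starts & ends]
}

s <- c('abc1',   'abc2',   'abc3',   'abc11',   'abc12',
       'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12',
       'nonsense')

result <- match_prefix_suffix(s, "abc", c("3", "11", "12"))
result
```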

Answer 3

You could use substring in this case too:

z <- nchar(s)
# Parentheses are needed: & binds more tightly than |
s[substring(s, 1, 3) == "abc" & (substring(s, z) == "3" | 
    substring(s, z-1) %in%  c("12", "11"))]
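One thing to watch with a condition of this shape is operator precedence: in R, & binds more tightly than |, so A & B | C is evaluated as (A & B) | C. The sketch below shows the difference on a few hypothetical strings that are not part of the question.

```r
# Hypothetical strings: xyz12 ends in 12 but does not start with abc
x <- c("abc3", "xyz12", "abc1")
z <- nchar(x)

starts_abc <- substring(x, 1, 3) == "abc"
ends_3     <- substring(x, z) == "3"
ends_11_12 <- substring(x, z - 1) %in% c("12", "11")

# Parsed as (starts_abc & ends_3) | ends_11_12, so xyz12 slips through
no_parens   <- starts_abc & ends_3 | ends_11_12

# The prefix test applies to both endings
with_parens <- starts_abc & (ends_3 | ends_11_12)

x[no_parens]
x[with_parens]
```

On the vector from the question both forms happen to return the same strings, because every string that ends in 11 or 12 also starts with abc.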

Answer 4

Looking specifically for the requested numbers gives this:

n <-  c(3,11,12)

s[sub('abc[^[:digit:]]*([[:digit:]]+)$',s, replacement='\\1') %in% n]
 [1] "abc3"    "abc11"   "abc12"   "abcde3"  "abcde11" "abcde12"

This doesn't confuse 11 for 1:

n <-  c(3,1,12)

s[sub('abc[^[:digit:]]*([[:digit:]]+)$',s, replacement='\\1') %in% n]
 [1] "abc1"    "abc3"    "abc12"   "abcde1"  "abcde3"  "abcde12"
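To make the mechanism easier to see, the sketch below splits the approach into its two steps: sub replaces each matching string with its trailing run of digits, and %in% then compares those digits with the wanted numbers as text. The variable name last_digits is introduced here only for illustration.

```r
s <- c('abc1',   'abc2',   'abc3',   'abc11',   'abc12',
       'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12',
       'nonsense')

# Step 1: replace the whole match with the captured trailing digits.
# Strings that do not match the pattern are returned unchanged.
last_digits <- sub('abc[^[:digit:]]*([[:digit:]]+)$', '\\1', s)
last_digits

# Step 2: %in% coerces the numbers to character, so "11" is compared
# with "3", "11" and "12" as a whole and never with "1".
keep <- last_digits %in% c(3, 11, 12)
s[keep]
```

Because unmatched strings such as nonsense pass through sub unchanged, they simply fail the %in% test.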

For your edit, excluding strings whose trailing number is 1 or 2 (and using two regular expressions):

s[grepl('^abc',s) & !(sub('.*[^[:digit:]]([[:digit:]]+)$',s, replacement='\\1') %in% c(1,2))]
[1] "abc3"    "abc11"   "abc12"   "abcde3"  "abcde11" "abcde12"

4 answers

Answer 1
3 2012-11-21 22:14:00

Answer 2
3 Accepted 2012-11-21 22:34:13

Answer 3
1 2012-11-21 22:18:38

Answer 4
0 2012-11-21 22:39:15
