Searching for a string in a line using a regular expression

Question

I am searching a line for a string in the format XXXXX_XXXXX, XXXXXX_XXXXX, or XXXXXX, where X is alphanumeric.

So the part before the "_" is 5 or 6 characters long and the part after the "_" is always 5 characters long; alternatively, the string may be just 6 characters long with no underscore at all. I am coding in Python.

Any help will be much appreciated.

Answer 1

How about this?

([a-zA-Z0-9]{5,6}_[a-zA-Z0-9]{5})|[a-zA-Z0-9]{6}

Full code example:

import re
pat = re.compile(r'^(([a-zA-Z0-9]{5,6}_[a-zA-Z0-9]{5})|[a-zA-Z0-9]{6})$')
print(pat.match('xxxxx_xxxxx') is not None)    # True, 5 chars, underscore, 5 chars
print(pat.match('xxxxxx_xxxxx') is not None)    # True, 6 chars, underscore, 5 chars
print(pat.match('xxxxxx') is not None)    # True, 6 chars

NOTE: I previously wrote the following, not realizing that Python doesn't support POSIX character classes:

([[:alnum:]]{5,6}_[[:alnum:]]{5})|[[:alnum:]]{6}

Answer 2

import re and then:

re.match("[a-zA-Z0-9]{5,6}(_[a-zA-Z0-9]{5})?", c).group()

Note that the predefined \w class also matches "_", so you cannot use it here.
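
A quick way to see the point about \w, as a minimal sketch: the sample string "ab_cd" and the variable names are made up for illustration. The check compares a \w-based pattern with the explicit [a-zA-Z0-9] class on a string that contains an underscore.

```python
import re

# \w covers letters, digits AND the underscore.
WORD = re.compile(r"\w{5,6}")
# The explicit class covers ASCII letters and digits only.
ALNUM = re.compile(r"[a-zA-Z0-9]{5,6}")

# "ab_cd" is 5 characters long but contains an underscore.
word_accepts_underscore = WORD.fullmatch("ab_cd") is not None
alnum_accepts_underscore = ALNUM.fullmatch("ab_cd") is not None

print(word_accepts_underscore)   # True
print(alnum_accepts_underscore)  # False
```

If you prefer to build on \w anyway, the class [^\W_] (word characters minus the underscore) is one way to exclude the underscore again.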

Answer 3

import re

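# Python's re module has no POSIX classes such as [[:alnum:]], so the class is spelled out.
regex = re.compile(r"([a-zA-Z0-9]{5,6}_[a-zA-Z0-9]{5})|[a-zA-Z0-9]{6}")
here = regex.search("your string")
if here:
    # pattern has been found
    print(here.group())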

Answer 4

If Python doesn't assume start and end boundary conditions by default,
or if you are searching for a string within a string, you may have to account for boundary conditions.
Otherwise, XXXXXXXXXXXXXXXXXXXXXX_XXXXXXXXXXXXXXXXXXXXXXX will be matched as well.

/ (?: ^ | [\W_] )              # beginning of line or non-alphameric
  (?:
       [^\W_]{5,6}_[^\W_]{5}   # 5-6 alphameric's, underscore, 5 alphameric's
    |  [^\W_]{6}               # or, 6 alphameric's
  )
  (?: [\W_] | $)               # non-alphameric or end of line
/
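
For Python specifically, the same idea might look like the sketch below. Two things here are my own substitutions rather than part of the pattern above: lookarounds replace the consumed boundary characters (so the match contains only the token itself), and re.ASCII is added so that [^\W_] stays limited to ASCII letters and digits. The names TOKEN and find_tokens are hypothetical.

```python
import re

TOKEN = re.compile(
    r"""
    (?<![^\W_])                 # not preceded by an alphanumeric
    (?:
        [^\W_]{5,6}_[^\W_]{5}   # 5-6 alphanumerics, underscore, 5 alphanumerics
      | [^\W_]{6}               # or exactly 6 alphanumerics
    )
    (?![^\W_])                  # not followed by an alphanumeric
    """,
    re.VERBOSE | re.ASCII,
)


def find_tokens(line):
    """Return every substring of `line` that has one of the three formats."""
    return TOKEN.findall(line)


print(find_tokens("id abc12_def34 ok"))
print(find_tokens("abc123_de456 and zzzzzz"))
print(find_tokens("XXXXXXXXXXXXXXXXXXXXXX_XXXXXXXXXXXXXXXXXXXXXXX"))
```

With the lookarounds in place, the long run of X characters from the example above should yield no match, because every candidate position has an alphanumeric neighbour on at least one side.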

Answer 5

I quite like Michał Šrajer's answer, but, as has been pointed out, his version also matches just 5 alnum characters (which we don't want).

Here's an edit of his version to compensate for that:

re.match("[a-zA-Z0-9]{5}([a-zA-Z0-9]?_[a-zA-Z0-9]{5}|[a-zA-Z0-9])", c)

Though some of the other answers are probably more readable...
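
The 5-character problem mentioned in this answer is easy to reproduce. The sketch below compares the pattern from Answer 2 with the two-branch pattern from Answer 1 using re.fullmatch; the sample strings and the helper name accepted are made up for illustration.

```python
import re

# Pattern from Answer 2: the underscore part is optional and {5,6} allows 5.
OPTIONAL_SUFFIX = re.compile(r"[a-zA-Z0-9]{5,6}(_[a-zA-Z0-9]{5})?")
# Pattern from Answer 1: the branch without an underscore requires exactly 6.
TWO_BRANCHES = re.compile(r"([a-zA-Z0-9]{5,6}_[a-zA-Z0-9]{5})|[a-zA-Z0-9]{6}")

# Three wanted formats, followed by one unwanted 5-character string.
samples = ["abcde_12345", "abcdef_12345", "abcdef", "abcde"]


def accepted(pattern, candidates):
    """Return the candidates that `pattern` matches in full."""
    return [s for s in candidates if pattern.fullmatch(s)]


print(accepted(OPTIONAL_SUFFIX, samples))  # all four, including "abcde"
print(accepted(TWO_BRANCHES, samples))     # the first three only
```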

5 answers

Answer 1  3  2011-08-04 18:27:48
Answer 2  1  2011-08-04 18:31:36
Answer 3  0  2011-08-04 18:28:10
Answer 4  0
Answer 5  0  2011-08-09 14:28:57
