Regular expression: string does not contain specific characters

Question

I want a regular expression that checks whether a string contains any character other than "A", "G", "C", or "U". A valid string would look like ggggugcccgcuagagagacagu.

I want the regex to check that the string contains only these characters, and the check should not be case sensitive.

What I tried:

match = re.match(r'[^GaAgUuCc]', seq2)

The goal is to find non-RNA characters in an RNA sequence.

Answer 1

Use re.search instead. re.match only matches at the beginning of the string, whereas re.search scans the whole string:

>>> re.search(r'[^GAUC]', 'acg', re.I)
>>> re.search(r'[^GAUC]', 'acgf', re.I)
<_sre.SRE_Match object at 0x7f1b6a9e32a0>

re.I makes the regex case-insensitive.
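To make the difference between the two calls concrete, here is a minimal sketch. The helper names first_invalid_with_match and first_invalid_with_search are hypothetical and exist only for illustration; the pattern is the one from this answer.

```python
import re

# Same pattern as above: any character that is not G, A, U or C (case-insensitive).
INVALID = re.compile(r'[^GAUC]', re.I)

def first_invalid_with_match(seq):
    """Hypothetical helper: re.match only tests at position 0."""
    m = INVALID.match(seq)
    return m.group() if m else None

def first_invalid_with_search(seq):
    """Hypothetical helper: re.search scans the whole string."""
    m = INVALID.search(seq)
    return m.group() if m else None

# 'f' is at the end, so match() misses it while search() finds it.
print(first_invalid_with_match('acgf'))   # None
print(first_invalid_with_search('acgf'))  # f
```

This is why the original attempt with re.match appeared to accept sequences whose first character happened to be valid.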

A faster way to do it would be to use sets to check whether the set of characters in the string is a subset of your allowed characters:

>>> set('acg'.upper()) <= set('GAUC')
True
>>> set('acgs'.upper()) <= set('GAUC')
False
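The set comparison above could be wrapped in a small function, roughly as follows. The names RNA_BASES and is_rna are assumptions made for this sketch, not part of the original answer.

```python
# Allowed bases, stored once as an immutable set (name is hypothetical).
RNA_BASES = frozenset('GAUC')

def is_rna(seq):
    """Return True if every character of seq is one of G, A, U, C (any case)."""
    # Upper-casing first makes the check case-insensitive.
    return set(seq.upper()) <= RNA_BASES

print(is_rna('ggggugcccgcuagagagacagu'))  # True
print(is_rna('acgs'))                     # False
```

Note that an empty string passes this check, because the empty set is a subset of every set. Add an explicit length test if empty sequences should be rejected.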

Answer 2

You need to use a quantifier in your regex to match more than one character:

>>> match = re.search("[^GAUC]+","ggggugcccgcuagrrragagacagu", re.I)
>>> match
<_sre.SRE_Match object at 0x01BCA8A8>
>>> match.group()
'rrr'
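If the position of each invalid run matters as well, the same quantified pattern can be combined with re.finditer. This is a sketch only; the function name invalid_runs is hypothetical.

```python
import re

def invalid_runs(seq):
    """Return (start_index, text) for every run of non-RNA characters in seq."""
    return [(m.start(), m.group())
            for m in re.finditer(r'[^GAUC]+', seq, re.I)]

print(invalid_runs("ggggugcccgcuagrrragagacagu"))  # [(14, 'rrr')]
```

An empty list means no invalid characters were found.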

Answer 3

You should use re.search() or re.findall() rather than re.match():

In [9]: seq2 = 'ggggugcccQgcuagagaZgacagu'

In [10]: re.findall(r'[^GaAgUuCc]',seq2)
Out[10]: ['Q', 'Z']
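A different technique, not used in any of the answers here, is to test the positive condition directly with re.fullmatch (available since Python 3.4), which succeeds only if the entire string consists of allowed characters. The names VALID_RNA and is_valid_rna below are assumptions made for this sketch.

```python
import re

# One or more allowed bases, case-insensitive (name is hypothetical).
VALID_RNA = re.compile(r'[GAUC]+', re.I)

def is_valid_rna(seq):
    """Return True only if the whole string is made of G, A, U, C."""
    return VALID_RNA.fullmatch(seq) is not None

print(is_valid_rna('ggggugcccgcuagagagacagu'))    # True
print(is_valid_rna('ggggugcccQgcuagagaZgacagu'))  # False
```

Because the pattern uses +, an empty string is rejected. Use * instead if empty input should count as valid.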


Solution 1 | 2 | 2012-11-30 19:57:03
Solution 2 | 1 | 2012-11-30 19:56:35
Solution 3 | 1 | accepted | 2012-11-30 19:56:47
