Python Regex - checking for a capital letter with a lowercase after
I am trying to check for a capital letter that has a lowercase letter coming directly after it. The trick is that there is going to be a bunch of garbage capital letters and numbers coming directly before it. For example:
AASKH317298DIUANFProgramming is fun
As you can see, there is a bunch of stuff we don't need coming directly before the phrase we do need, Programming is fun.
I am trying to use regex to do this by taking each string and replacing the unwanted prefix with '', as the original string does not have to be kept.
re.sub(r'^[A-Z0-9]*', '', string)
The problem with this code is that it leaves us with rogramming is fun, as the P is a capital letter.
How would I check that, if the next letter is lowercase, the capital before it is left untouched? (The P in Programming.)
Use a negative look-ahead:
re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)
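This matches the leading run of uppercase characters and digits, stopping so that the match is not followed by a lowercase character. The character class first grabs the whole run, including the P; the look-ahead (?![a-z]) then fails on the r, so the engine backtracks one character and the P is left in place.
Demo: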
>>> import re
>>> string = 'AASKH317298DIUANFProgramming is fun'
>>> re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)
'Programming is fun'
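To see how the look-ahead behaves on other inputs, here is a minimal sketch that wraps the same pattern in a helper. The name strip_prefix and the extra sample strings are assumptions made for illustration; they are not part of the original answer.

```python
import re

# Same pattern as in the answer above, compiled once for reuse.
PREFIX = re.compile(r'^[A-Z0-9]*(?![a-z])')

def strip_prefix(text):
    """Drop the leading run of capitals/digits, keeping a capital that starts a word.

    strip_prefix is a hypothetical helper name used only in this sketch.
    """
    return PREFIX.sub('', text)

samples = [
    'AASKH317298DIUANFProgramming is fun',  # garbage prefix, then the phrase
    'Programming is fun',                   # no garbage: only an empty match is replaced
    'ABC123',                               # nothing but garbage: everything is removed
]

for sample in samples:
    print(repr(sample), '->', repr(strip_prefix(sample)))
```

Note that a string made up only of capitals and digits is removed entirely, because the look-ahead also succeeds at the end of the string. If that is not wanted, a positive look-ahead such as (?=[A-Z][a-z]) may be a better fit.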
You can also use match like this:
>>> import re
>>> s = 'AASKH317298DIUANFProgramming is fun'
>>> r = r'^.*([A-Z][a-z].*)$'
>>> m = re.match(r, s)
>>> if m:
...     print(m.group(1))
...
Programming is fun
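One caveat with this pattern: the leading .* is greedy, so the group starts at the last capital that is followed by a lowercase letter. The sketch below uses a hypothetical input with several capitalised words to show the difference; making the quantifier lazy (.*?) is one possible adjustment, not something the original answer proposes.

```python
import re

# Hypothetical input: the wanted phrase contains more than one capitalised word.
s = 'AASKH317298DIUANFProgramming Is Fun'

# Greedy .* consumes as much as it can, so the group starts at the LAST
# capital that is followed by a lowercase letter.
greedy = re.match(r'^.*([A-Z][a-z].*)$', s)

# Lazy .*? consumes as little as it can, so the group starts at the FIRST
# capital that is followed by a lowercase letter.
lazy = re.match(r'^.*?([A-Z][a-z].*)$', s)

print(greedy.group(1))  # Fun
print(lazy.group(1))    # Programming Is Fun
```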