
Python Regex - checking for a capital letter with a lowercase after

I am trying to find a capital letter that is immediately followed by a lowercase letter. The catch is that a run of garbage capital letters and digits comes directly before it. For example:

AASKH317298DIUANFProgramming is fun

As you can see, there is a bunch of stuff we don't need directly before the phrase we do need, Programming is fun.

I am trying to do this with a regex, taking each string and substituting the unwanted prefix with '', since the original string does not have to be kept.

re.sub(r'^[A-Z0-9]*', '', string)

The problem with this code is that it leaves us with rogramming is fun, because the P is also a capital letter.

How would I check that, if the next letter is lowercase, the capital before it is left untouched (the P in Programming)?

Use a negative look-ahead:

re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)

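This matches the leading run of uppercase characters and digits, but only up to a point that is not followed by a lowercase character. The greedy run first swallows the P, the look-ahead then fails on the r, and the engine backtracks one character, so the P is left in place.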

Demo:

>>> import re
>>> string = 'AASKH317298DIUANFProgramming is fun'
>>> re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)
'Programming is fun'
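
To see how the look-ahead behaves on inputs other than the one in the question, here is a minimal sketch that wraps the substitution in a helper. The name strip_prefix and the two extra sample strings are made up for illustration and do not come from the question.

```python
import re

# Leading run of capitals/digits. The negative look-ahead forces the
# engine to backtrack so the run never ends right before a lowercase letter.
PREFIX = re.compile(r'^[A-Z0-9]*(?![a-z])')

def strip_prefix(text):
    """Remove the leading garbage run (hypothetical helper name)."""
    return PREFIX.sub('', text)

print(strip_prefix('AASKH317298DIUANFProgramming is fun'))  # Programming is fun
print(strip_prefix('Programming is fun'))  # unchanged: the P is kept
print(strip_prefix('ABC123'))              # empty string: nothing follows the run
```

Note that a string consisting only of capitals and digits is removed entirely, which may or may not be what you want.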

You can also use re.match with a capturing group, like this:

>>> import re
>>> s = 'AASKH317298DIUANFProgramming is fun'
>>> r = r'^.*([A-Z][a-z].*)$'
>>> m = re.match(r, s)
>>> if m:
...     print(m.group(1))
... 
Programming is fun
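
One caveat with the match approach: the leading .* is greedy, so the group starts at the last capital letter that is followed by a lowercase letter. That is fine for the sample string, which contains only one such pair, but the sketch below uses a made-up variant with several capitalized words to show the difference between .* and the lazy .*?.

```python
import re

# Hypothetical variant of the sample string with more than one
# capital-plus-lowercase pair.
s = 'AASKH317298DIUANFProgramming Is Fun'

greedy = re.match(r'^.*([A-Z][a-z].*)$', s)   # .* grabs as much as it can
lazy = re.match(r'^.*?([A-Z][a-z].*)$', s)    # .*? grabs as little as it can

print(greedy.group(1))  # Fun
print(lazy.group(1))    # Programming Is Fun
```

If the phrase you want to keep can contain more than one capitalized word, the lazy form is probably the one you want.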
