简体   繁体   中英

Python Regex - checking for a capital letter with a lowercase after

I am trying to check for a capital letter that has a lowercase letter coming directly after it. The trick is that there is going to be a bunch of garbage capital letters and number coming directly before it. For example:

AASKH317298DIUANFProgramming is fun

as you can see, there is a bunch of stuff we don't need coming directly before the phrase we do need, Programming is fun .

I am trying to use regex to do this by taking each string and then substituting it out with '' as the original string does not have to be kept.

re.sub(r'^[A-Z0-9]*', '', string)

The problem with this code is that it leaves us with rogramming is fun , as the P is a capital letter.

How would I go about checking to make sure that if the next letter is a lowercase, then I should leave that capital untouched. (The P in Programming )

Use a negative look-ahead:

re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)

This matches any uppercase character or digit that is not followed by a lowercase character.

Demo:

>>> import re
>>> string = 'AASKH317298DIUANFProgramming is fun'
>>> re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)
'Programming is fun'

You can also use match like this :

>>> import re
>>> s = 'AASKH317298DIUANFProgramming is fun'
>>> r = r'^.*([A-Z][a-z].*)$'
>>> m = re.match(r, s)
>>> if m:
...     print(m.group(1))
... 
Programming is fun

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM