I'm trying to extract specific tokens from a string, as shown below. I tried Python's re module with the expression r'[A-Z]{5}[A-Z0-9]{2}',
but it also returns unwanted text. The expected and actual output are shown below.
Conditions:
Given String: "DHKGNC1, DHDHK32, DHKGN1K, SOME, GARBAGE, TEXT"
Expected output: ['DHKGNC1', 'DHDHK32', 'DHKGN1K']
Actual output: ['DHKGNC1', 'DHDHK32', 'DHKGN1K', 'GARBAGE']
Don't use [A-Z0-9]{2}; use ([A-Z0-9][0-9])|([0-9][A-Z0-9]) instead.
That is, at least one of the last two characters has to be a digit.
>>> import re
>>> re.findall(r'([A-Z]{5}(?:(?:[A-Z0-9][0-9])|(?:[0-9][A-Z0-9])))', "DHKGNC1, DHDHK32, DHKGN1K, SOME, GARBAGE, TEXT")
['DHKGNC1', 'DHDHK32', 'DHKGN1K']
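The same idea can be wrapped in a small helper. This is only a sketch: the name `find_codes` is made up for illustration, and the `\b` word boundaries are an extra assumption (not in the answer above) that a code should not be matched inside a longer run of letters and digits.

```python
import re

# Five uppercase letters, then two characters of which at least one is a digit.
# The \b anchors are an added assumption: they reject matches inside longer tokens.
CODE_RE = re.compile(r'\b[A-Z]{5}(?:[A-Z0-9][0-9]|[0-9][A-Z0-9])\b')

def find_codes(text):
    """Return every 7-character code found in text (hypothetical helper)."""
    # No capturing groups in the pattern, so findall returns whole matches.
    return CODE_RE.findall(text)

print(find_codes("DHKGNC1, DHDHK32, DHKGN1K, SOME, GARBAGE, TEXT"))
# ['DHKGNC1', 'DHDHK32', 'DHKGN1K']
```

If codes may legitimately appear inside longer strings, drop the `\b` anchors.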
I think the following regex may work:
[A-Z]{5}([A-Z]?[0-9]{1,2}[A-Z]?)
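Two things are worth checking before relying on this pattern. Because it contains a capturing group, re.findall returns only the captured suffix, so the sketch below uses re.finditer and group(0) to get whole tokens. The pattern is also looser than "five letters plus two characters": it accepts tokens of 6 to 9 characters. The sample strings "ABCDE1" and "ABCDEF12G" are my own illustrations, not from the question.

```python
import re

pattern = re.compile(r'[A-Z]{5}([A-Z]?[0-9]{1,2}[A-Z]?)')
text = "DHKGNC1, DHDHK32, DHKGN1K, SOME, GARBAGE, TEXT"

# findall returns only the text of the capturing group (the suffix).
suffixes = pattern.findall(text)

# group(0) is the whole match, which is what the question asks for.
full_matches = [m.group(0) for m in pattern.finditer(text)]

print(suffixes)      # ['C1', '32', '1K']
print(full_matches)  # ['DHKGNC1', 'DHDHK32', 'DHKGN1K']

# The pattern also accepts tokens that are not 7 characters long.
print(bool(pattern.fullmatch("ABCDE1")))     # True (6 characters)
print(bool(pattern.fullmatch("ABCDEF12G")))  # True (9 characters)
```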