
Python doesn't recognize regex pattern - Number between 1 and 6 digits

I am trying to recognize a pattern in a list of words: I need to find a number of 1 to 6 digits, with or without other characters around it.

My input is this image: https://i.stack.imgur.com/RNOdL.png

With the OCR I obtained:

Kundennummer:
21924

The pattern r"(\D|\A)+ \d{5} (\D|\Z)+" works, but when I change it to r"(\D|\A)+ \d{1,6} (\D|\Z)+" it no longer matches.

I tried re.match, re.findall and re.search, and none of them works.

The repr() of the OCR output:

'Kundennummer:'
'21924'

Assuming you only need the first match:

import re

ocr_result = """
Kundennummer:
21924
"""

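# Scan every run of digits and keep the first one that is 1 to 6 digits long.
for result in re.findall(r'\d+', ocr_result):
    if 1 <= len(result) <= 6:
        break
else:
    # The loop finished without a break: no suitable number was found.
    result = None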

print(result)

Result:

21924
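
The same first-match logic can be written more compactly. The sketch below is only one possible variant: the helper name `first_short_number` and its `max_digits` parameter are made up for illustration and do not come from the answer above.

```python
import re

def first_short_number(text, max_digits=6):
    """Return the first run of digits that is 1..max_digits long, else None."""
    # \d+ never yields an empty string, so only the upper bound needs checking.
    runs = re.findall(r'\d+', text)
    return next((run for run in runs if len(run) <= max_digits), None)

ocr_result = """
Kundennummer:
21924
"""

print(first_short_number(ocr_result))
```

Here `next(..., None)` plays the role of the `for`/`else` construct: it returns the first digit run that passes the length check, or `None` when there is none.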
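
Another option is a single pattern that also covers a number attached directly to the label:

import re

ocr_result1 = """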
Kundennummer:
21924
"""

ocr_result2 = """
Kundennummer:3000
"""

for e in [ocr_result1, ocr_result2]:
    print(re.findall(r'\w*\d{1,6}\w*', e))


['21924']
['3000']
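
One thing to keep in mind with `\w*\d{1,6}\w*`: because `\w` also matches digits, the pattern does not actually cap the length of the number, and it returns any letters attached to it. If a strict 1 to 6 digit limit matters, lookarounds are one way to enforce it. The sketch below is an assumption about what the asker wants, not part of the answer above; the names `loose`, `strict` and `samples`, and the third sample string, are illustrative only.

```python
import re

# Pattern from the answer above: \w* can absorb extra digits and letters.
loose = re.compile(r'\w*\d{1,6}\w*')
# Stricter variant: 1 to 6 digits with no digit directly before or after.
strict = re.compile(r'(?<!\d)\d{1,6}(?!\d)')

samples = [
    "\nKundennummer:\n21924\n",
    "\nKundennummer:3000\n",
    "Nr:1234567",  # seven digits, longer than the allowed maximum
]

for text in samples:
    print(loose.findall(text), strict.findall(text))
```

For the two OCR samples both patterns return the same number. For the seven-digit sample the loose pattern still returns `['1234567']`, while the strict one returns an empty list.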
