Python regex for exact match

I want to check that my input is between 5 and 50 characters long, where any kind of character is allowed. I used this code:

re.match('.{5,50}', my_string)

The problem is that it doesn't return None for strings with more than 50 characters. What should I do?

In this specific case, there's no need to use regex. Instead

5 <= len(my_string) <= 50

is sufficient. If you insist on using regex, make sure to include ^ for the beginning and $ for the end of the string:

re.match(r'^.{5,50}$', my_string)

^ is optional here, since re.match - in contrast to re.search - always starts matching at the start of the string.
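
As a rough sketch of the difference between these checks (the helper names length_ok_len and length_ok_regex are made up for illustration), the snippet below compares the unanchored pattern, the anchored pattern, and re.fullmatch. It assumes Python 3.4 or later, where re.fullmatch is available, and uses re.DOTALL on the assumption that newlines should count as characters too.

```python
import re

def length_ok_len(s):
    # Plain length check, no regex involved.
    return 5 <= len(s) <= 50

def length_ok_regex(s):
    # fullmatch requires the whole string to match;
    # re.DOTALL lets '.' match newline characters as well.
    return re.fullmatch(r'.{5,50}', s, flags=re.DOTALL) is not None

too_long = 'x' * 51
with_newline = 'x' * 50 + '\n'   # 51 characters in total

# The unanchored pattern matches the first 50 characters and stops there.
print(re.match(r'.{5,50}', too_long) is not None)        # True
print(length_ok_regex(too_long))                         # False
print(length_ok_len(too_long))                           # False

# '$' also matches just before a trailing newline, so this 51-character
# string still passes the anchored pattern.
print(re.match(r'^.{5,50}$', with_newline) is not None)  # True
print(length_ok_regex(with_newline))                     # False
```

If trailing newlines cannot occur in the input, the anchored pattern and re.fullmatch behave the same for this check.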

If your user input can be something other than ASCII, such as u'เมื่อแรกเริ่ม', then the user-perceived length of the string (9 Thai characters) can be quite different from the length Python reports (13 code points, because the combining marks are counted separately).

>>> s=u'เมื่อแรกเริ่ม'    # 9 graphemes
>>> len(s)
13
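
To make the different notions of "length" concrete, here is a small sketch assuming Python 3, where str holds Unicode code points. The variable names are only illustrative.

```python
s = 'เมื่อแรกเริ่ม'

code_points = len(s)                  # len() on a str counts code points
utf8_bytes = len(s.encode('utf-8'))   # each Thai code point takes 3 bytes in UTF-8

print(code_points)  # 13
print(utf8_bytes)   # 39
```

Neither number is the 9 characters a reader would count, which is why a plain len() check can reject or accept input unexpectedly.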

There is no instant, easy solution to this in Python, since the standard library lacks support for iterating over a string by grapheme.

The easiest approach is to use the third-party regex module, which has more extensive Unicode support than Python's re module. Then you can get the length of a string in graphemes:

>>> import regex
>>> regex.findall(r'\X', s)
['เ', 'มื่', 'อ', 'แ', 'ร', 'ก', 'เ', 'ริ่', 'ม']
>>> len(regex.findall(r'\X', s))
9
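
Putting this together with the original 5-to-50 check might look like the sketch below. The function names grapheme_len and length_ok are hypothetical. The fallback for when regex is not installed is my own assumption, not part of the answer above: it only skips code points in the Unicode "Mark" categories, which gives 9 for this string but is not full grapheme segmentation.

```python
import unicodedata

try:
    import regex  # third-party package: pip install regex
except ImportError:
    regex = None

def grapheme_len(s):
    """Count user-perceived characters in s."""
    if regex is not None:
        # \X matches one extended grapheme cluster.
        return len(regex.findall(r'\X', s))
    # Rough fallback (assumption): count code points that are not marks.
    return sum(1 for ch in s if not unicodedata.category(ch).startswith('M'))

def length_ok(s, lo=5, hi=50):
    """True if s has between lo and hi graphemes, inclusive."""
    return lo <= grapheme_len(s) <= hi

print(grapheme_len('เมื่อแรกเริ่ม'))  # 9
print(length_ok('เมื่อแรกเริ่ม'))     # True
```

The bounds are parameters here only so that the same helper can be reused with other limits.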
