How to match a USN using a regular expression in Python?

Given a USN such as 1722AB3401 and a valid range of 3401 to 3470:

If 1722AB3433 is given as input, the program should print Valid USN. If the number is outside the range (for instance 1722AB3499), it should print Invalid USN.

How can I solve this in Python?

I tried the approach below (using Python 3.6.x):

import re

pattern = r"1722AB34[0-7][0-9]"

if re.search(pattern, "1722AB3471"):
    print("Valid USN")
else:
    print("Invalid USN")

But with the input 1722AB3471 this prints Valid USN, which is wrong, because the range only runs from 3401 to 3470.

Note: USN stands for University Serial Number.

Your expectation is wrong: your regular expression clearly allows 3400 through 3479, so 1722AB3471 matches.
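To see which inputs the pattern from the question actually accepts, one can enumerate every four-digit suffix and keep the ones that match. This is only a small sketch; the helper name `accepted_suffixes` is made up for illustration.

```python
import re

pattern = r"1722AB34[0-7][0-9]"  # pattern from the question

def accepted_suffixes(pattern):
    """Return every four-digit suffix the pattern accepts after '1722AB'."""
    return [n for n in range(10000)
            if re.fullmatch(pattern, "1722AB{:04d}".format(n))]

matches = accepted_suffixes(pattern)
print(matches[0], matches[-1], len(matches))  # 3400 3479 80
```

The pattern accepts 80 suffixes, ten more than the 70 values in the intended range.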

I would not try to do the whole validation inside the regular expression, even though it can be made to work with a more complicated pattern such as:

pattern = r"1722AB34(0[1-9]|[1-6][0-9]|70)"

Instead, I would extract the number that follows the letters and compare it numerically.

Regexes for mixed number ranges tend to be quite complicated. In your case, you would need the following for the range 3401 to 3470:

34(0[1-9]|[1-6][0-9]|70)

It only gets more complicated as the ranges get longer and their boundaries cut across more digit positions.
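Because such patterns are easy to get subtly wrong, a quick sanity check is to compare the regex against the plain numeric comparison for every possible four-digit suffix. The sketch below assumes the suffix is always exactly four digits; the names `range_pattern` and `mismatches` are illustrative.

```python
import re

range_pattern = re.compile(r"34(0[1-9]|[1-6][0-9]|70)")

# Collect every suffix where the regex and the numeric check disagree
mismatches = [
    n for n in range(10000)
    if bool(range_pattern.fullmatch("{:04d}".format(n))) != (3401 <= n <= 3470)
]
print(mismatches)  # []
```

An empty list means the pattern and the numeric range agree on all 10,000 suffixes.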

A better way is to simply extract the number part and do the validation outside of the regular expression:

import re

usn = '1722AB3471'

# fullmatch anchors both ends, so extra leading or trailing characters are rejected
m = re.fullmatch(r'1722AB(\d{4})', usn)
if m and 3401 <= int(m.group(1)) <= 3470:
    print('Valid USN')
else:
    print('Invalid USN')
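The same idea can be wrapped in a reusable function. This is a minimal sketch: the function name `is_valid_usn` and the `low`/`high` parameters are assumptions for illustration, and `re.ASCII` is used so that `\d` only matches the digits 0 to 9.

```python
import re

# Fixed prefix followed by exactly four ASCII digits
USN_RE = re.compile(r"1722AB(\d{4})", re.ASCII)

def is_valid_usn(usn, low=3401, high=3470):
    """Return True if usn is '1722AB' plus a four-digit number in [low, high]."""
    m = USN_RE.fullmatch(usn)
    return m is not None and low <= int(m.group(1)) <= high

for candidate in ["1722AB3401", "1722AB3433", "1722AB3470",
                  "1722AB3471", "1722AB3499", "1722AB34011"]:
    print(candidate, "Valid USN" if is_valid_usn(candidate) else "Invalid USN")
```

Keeping the bounds as parameters means a different range only needs different arguments, not a new regular expression.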

I would favor simply testing the last four digits of your USN. Add an additional condition to your if statement:

import re

pattern = r"1722AB34[0-7][0-9]"

usn = "1722AB3470"

if re.fullmatch(pattern, usn) and int(usn[-4:]) in range(3401, 3471):
    print("Valid USN")
else:
    print("Invalid USN")

[0-7][0-9] matches everything from 00 to 79. You need to use:

pattern = r"1722AB34(0[1-9]|[1-6][0-9]|70)"

This matches 01 to 09, 10 to 69, or 70.

But taking the last four digits, converting them to an integer, and comparing numerically would be a better way.

Building on the suggestions in the other answers, the best way to find out whether the USN is valid is to check the range of its last four digits.

usn = '1722AB3469'  # Given USN
last_four_digits = int(usn[-4:])  # Slice off the last four characters and convert them to an integer
unchanged_pattern = usn[:6]  # The prefix that is the same for every USN
# Check the range of the extracted number, and also the unchanged prefix
if 3401 <= last_four_digits <= 3470 and unchanged_pattern == "1722AB":
    print('Valid USN')
else:
    print('Invalid USN')
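One caveat with plain slicing is that `int()` raises `ValueError` on input such as `1722AB34x1`, and a string of the wrong length can slip through. A hedged sketch of a regex-free check that guards against both follows; the function name `is_valid_usn_sliced` and its parameters are hypothetical.

```python
import string

def is_valid_usn_sliced(usn, prefix="1722AB", low=3401, high=3470):
    """Check a USN by slicing: fixed prefix plus four digits in [low, high]."""
    head, tail = usn[:len(prefix)], usn[len(prefix):]
    if head != prefix or len(tail) != 4:
        return False
    # Reject anything that is not a plain ASCII digit before calling int()
    if any(ch not in string.digits for ch in tail):
        return False
    return low <= int(tail) <= high

for candidate in ["1722AB3469", "1722AB3471", "1722AB34x1", "1722AB346"]:
    print(candidate, "Valid USN" if is_valid_usn_sliced(candidate) else "Invalid USN")
```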
