Regex: Differentiating underscore (_) and dash (-)

I want to construct a pattern that identifies a valid domain name. A valid domain name contains alphanumeric characters and dashes. The only rule is that the name must not begin or end with a dash.

I have this regular expression for the validation: ^\w((\w|-)*\w)?$

However, the expression also accepts strings with underscores (e.g. cake_centre), which is wrong. Can anyone tell me why this is happening and how to correct it?

PS: I am using the preg_match() function in PHP to do the validation.

The metacharacter \w includes the underscore. You can instead write a character class that allows only the characters you listed:

[a-zA-Z\d-]

or, applied to your regex:

^[a-zA-Z\d]([a-zA-Z\d-]*[a-zA-Z\d])?$
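For use with preg_match(), the pattern above could be wrapped roughly as follows. This is a minimal sketch: the function name isValidDomainLabel is made up for illustration, and the test strings are only examples.

```php
<?php
// Hypothetical helper name; the pattern is the one given above.
function isValidDomainLabel(string $name): bool
{
    // First and last characters must be alphanumeric;
    // dashes are allowed only between them.
    return preg_match('/^[a-zA-Z\d]([a-zA-Z\d-]*[a-zA-Z\d])?$/', $name) === 1;
}

var_dump(isValidDomainLabel('cake-centre')); // bool(true)
var_dump(isValidDomainLabel('cake_centre')); // bool(false)
var_dump(isValidDomainLabel('-cake'));       // bool(false)
var_dump(isValidDomainLabel('cake-'));       // bool(false)
```

One caveat: in PCRE, $ also matches just before a trailing newline, so adding the D modifier (or using \z) is worth considering if the input may end with "\n".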

(Also note that the position of - in a character class matters: a - at the start or end is a literal dash, whereas in the middle it can create a range. See also: What special characters must be escaped in regular expressions?)
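The effect of the dash position can be checked with a few small patterns. The helper name matches and the sample classes below are chosen only for illustration.

```php
<?php
// Hypothetical helper: true when $subject matches $pattern.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

// Dash last in the class: a literal dash, so the class is '+', '9' and '-'.
var_dump(matches('/^[+9-]$/', '-'));  // bool(true)
var_dump(matches('/^[+9-]$/', '.'));  // bool(false)

// Dash between two characters: the range '+' through '9' in ASCII order,
// which also covers ',', '-', '.', '/' and the digits.
var_dump(matches('/^[+-9]$/', '.'));  // bool(true)

// Escaping the dash makes it literal wherever it sits.
var_dump(matches('/^[+\-9]$/', '.')); // bool(false)
var_dump(matches('/^[+\-9]$/', '-')); // bool(true)
```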

Underscores are accepted because they are part of the \w character class. If you want to exclude them, try:

/^[a-z0-9]+[a-z0-9\-]*[a-z0-9]+$/i
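One thing to be aware of: because the pattern above has two + quantifiers, it needs at least two characters, so a one-character name is rejected. If single-character names should pass (the question's original regex allowed them, but that is an assumption about intent), the last part can be made optional. A quick comparison, with variable names chosen for illustration:

```php
<?php
// Pattern from the answer above: needs at least two alphanumeric characters.
$strict = '/^[a-z0-9]+[a-z0-9\-]*[a-z0-9]+$/i';
// Variant (an assumption, not from the answer): the tail is optional,
// so a single alphanumeric character is also accepted.
$relaxed = '/^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/i';

var_dump(preg_match($strict, 'a'));            // int(0)
var_dump(preg_match($relaxed, 'a'));           // int(1)
var_dump(preg_match($strict, 'cake-centre'));  // int(1)
var_dump(preg_match($relaxed, 'cake-centre')); // int(1)
var_dump(preg_match($strict, 'cake_centre'));  // int(0)
var_dump(preg_match($relaxed, 'cake-'));       // int(0)
```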

Here is a regexp using the lookaround approach:

 (?<!-)([a-zA-Z0-9_]+)(?!-)

The pattern has three parts:

First, (?<!-) is a negative lookbehind that ensures the matched characters are not preceded by a dash.

Second, ([a-zA-Z0-9_]+) captures the matching characters. Note that this class, as written, still accepts the underscore; remove the _ if underscores must be rejected.

Third, (?!-) is a negative lookahead that ensures the match is not followed by a dash.
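The pattern above is unanchored, so it finds runs of characters inside a longer string. For validating a whole string, as the question asks, lookarounds can be combined with anchors instead. The following is a sketch of that idea rather than the pattern from this answer; the function name and the character class (alphanumerics plus dash, no underscore) are assumptions based on the question's rules.

```php
<?php
// Hypothetical helper using anchored lookarounds.
function isValidWithLookarounds(string $name): bool
{
    // (?!-)  at the start: the first character must not be a dash.
    // (?<!-) at the end:   the last character must not be a dash.
    return preg_match('/^(?!-)[a-zA-Z\d-]+(?<!-)$/', $name) === 1;
}

var_dump(isValidWithLookarounds('cake-centre')); // bool(true)
var_dump(isValidWithLookarounds('cake_centre')); // bool(false)
var_dump(isValidWithLookarounds('-cake'));       // bool(false)
var_dump(isValidWithLookarounds('cake-'));       // bool(false)
```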
