
Regex, ignore results with preceding character

I have a regex for matching URLs, ((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/\S*)?), and it does the job for what I want. However, it also matches the domain of an e-mail address, which I don't want.

Currently matches:

  • http://www.foo.bar
  • foo.bar
  • website: foo.bar (matches the foo.bar part)
  • info@foo.bar (matches the foo.bar part)

I don't want it to match that last one, only the first three. I tried adding (?!=@) to the front, but that didn't do it. How can I get it to ignore results preceded by an @ symbol?

Add anchors to your regex:

^((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/\S*)?)$

See the example: http://regex101.com/r/lI8kZ6/1

Explanation

^ anchors the match to the start of the string (or of each line, when the multiline flag is set).

$ anchors the match to the end of the string (or of each line, when the multiline flag is set).
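
As a quick check of what the anchors do, here is a minimal Python sketch. Python itself and the names URL, ANCHORED and is_url are assumptions made for illustration; they are not part of the answer. It runs the anchored pattern against the four sample strings from the question.

```python
import re

# The URL pattern from the question, unchanged.
URL = r"((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/\S*)?)"

# Anchored version: the whole string must be a URL.
ANCHORED = re.compile(r"^" + URL + r"$")


def is_url(text):
    """Return True if the entire string matches the anchored pattern."""
    return ANCHORED.search(text) is not None


samples = ["http://www.foo.bar", "foo.bar", "website: foo.bar", "info@foo.bar"]
results = {s: is_url(s) for s in samples}

for sample, matched in results.items():
    print(f"{sample!r}: {matched}")
```

With the anchors in place, "info@foo.bar" is rejected, but so is "website: foo.bar", because the string as a whole is not a URL.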

EDIT

If the URLs are embedded within text, use \s to delimit the match:

(\s|^)((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/\S*)?)\s

See the example: http://regex101.com/r/lI8kZ6/3
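
To see how the whitespace-delimited pattern behaves on embedded text, here is a small Python sketch. The names DELIMITED and find_urls and the sample sentences are hypothetical, chosen only for illustration. Because the surrounding whitespace is part of the match, the URL itself is read from capture group 2.

```python
import re

# Pattern from the EDIT above: whitespace (or start of string) before the URL,
# and a whitespace character after it.
DELIMITED = re.compile(
    r"(\s|^)((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/\S*)?)\s"
)


def find_urls(text):
    """Return the URL part (group 2) of every match in text."""
    return [m.group(2) for m in DELIMITED.finditer(text)]


embedded = find_urls("website: foo.bar or info@foo.bar ")
at_end = find_urls("see foo.bar")

print(embedded)
print(at_end)
```

The e-mail domain is skipped because it is preceded by @ rather than whitespace. The second call returns nothing: the trailing \s needs an actual whitespace character, so a URL at the very end of the string is not matched by this pattern.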

Anchors only work if your string consists solely of the URL you want to match. This is probably not the case.

Instead, what you really want is to match where there is whitespace (or nothing) before the URL. Try:

(?:^|(?<=\s))YOUR REGEX HERE

This checks that there is either nothing or a whitespace character before the regex you already have.

Demo on regex101

Consider also adding (?=\s|$) to the end of the regex, to ensure it doesn't match half a word.
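
A minimal Python sketch of the lookaround approach, combining the leading (?:^|(?<=\s)) with the trailing (?=\s|$). The names URL, BOUNDED and find_urls and the sample sentence are assumptions for illustration. Since lookarounds consume no characters, the whole match is the URL itself.

```python
import re

# The URL pattern from the question, unchanged.
URL = r"((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/\S*)?)"

# Require start-of-string or whitespace before, and whitespace or
# end-of-string after, without consuming either.
BOUNDED = re.compile(r"(?:^|(?<=\s))" + URL + r"(?=\s|$)")


def find_urls(text):
    """Return every URL in text that is delimited by whitespace or string edges."""
    return [m.group(0) for m in BOUNDED.finditer(text)]


question_samples = {
    s: find_urls(s)
    for s in ["http://www.foo.bar", "foo.bar", "website: foo.bar", "info@foo.bar"]
}
sentence = find_urls(
    "Visit foo.bar or http://www.foo.bar/path and mail info@foo.bar"
)

print(question_samples)
print(sentence)
```

On the four strings from the question this matches the first three and skips the e-mail domain, which is the behaviour asked for.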

^((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/[\S]*)?)$

Just add anchors to rule out partial matches, and enable the m (multiline) flag. See the demo:

http://regex101.com/r/sU3fA2/43
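
The effect of the multiline flag can be sketched in Python, where it is spelled re.MULTILINE. The names PATTERN and text and the sample input (one candidate per line) are assumptions for illustration.

```python
import re

# Anchored pattern from the answer above, unchanged.
PATTERN = r"^((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/[\S]*)?)$"

# One candidate per line, taken from the question's examples.
text = "http://www.foo.bar\nfoo.bar\nwebsite: foo.bar\ninfo@foo.bar"

# Without the flag, ^ and $ only apply to the string as a whole.
without_flag = [m.group(1) for m in re.finditer(PATTERN, text)]

# With re.MULTILINE, ^ and $ apply to every line.
with_flag = [m.group(1) for m in re.finditer(PATTERN, text, re.MULTILINE)]

print(without_flag)
print(with_flag)
```

With the flag, only lines that consist entirely of a URL are matched, so this fits input with one candidate per line rather than URLs embedded in running text.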
