Perl regular expression
I have some URLs in this format. Some URLs contain &abc=4 and some do not.
xxxxxxxxxxxxxxxxxxxxxxxxxxx&abc=4
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx&abc=4
xxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Here, xxxxxxxxxxxxxxxxxxxxx stands for an arbitrary string.
I want to match URLs which have only xxxxxxxxxxxxxxxxx and not &abc=4 (meaning I want to get these types of URLs: only xxxxxxxxxxxxxx, xxxxxxxxxxxxxx, xxx).
I know how to write a regular expression which matches the entire URL. For example:
/x.*abc=4/
But how do I write a regular expression that matches only xxxxxxxxxx and not &abc=4?
I would use a negative look-ahead assertion (it looks ahead for what is not allowed to follow the pattern):
^(?!.*&abc=4$).*$
This pattern will match any string that does not end with &abc=4.
You can verify it online here: http://www.rubular.com/
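As a minimal sketch of how the look-ahead pattern could be used to filter a list, assuming the URLs are already in an array with no trailing newlines (the sample data and the helper name without_abc are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample data; the x-runs stand in for real URLs.
my @urls = (
    'xxxxxxxxxxxx&abc=4',
    'xxxxxxxxxxxxxxxx&abc=4',
    'xxxxxxxxx',
    'xxxxxxxxxxxxx',
);

# Keep only the URLs that do NOT end with &abc=4.
# The negative look-ahead at the start of the string rejects any
# string in which ".*&abc=4" can reach the end.
sub without_abc {
    my @list = @_;
    return grep { /^(?!.*&abc=4$).*$/ } @list;
}

# Prints only the two URLs that have no &abc=4 suffix.
print "$_\n" for without_abc(@urls);
```

Note that this selects whole URLs; it does not cut &abc=4 off the ones that have it.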
Use a negative lookbehind assertion. The form is:
(?<![&?]abc=4)
(this will also exclude ?abc=4).
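A small sketch of the lookbehind in use. Anchoring it with $ so that it only checks the end of the URL is an assumption about the intended use, and the helper name lacks_abc_suffix is hypothetical:

```perl
use strict;
use warnings;

# True when the URL does not end with &abc=4 or ?abc=4.
# Assumes the URL has already been chomp-ed (no trailing newline).
sub lacks_abc_suffix {
    my ($url) = @_;
    # At the end of the string, look back six characters and fail
    # if they are "&abc=4" or "?abc=4".
    return $url =~ /(?<![&?]abc=4)$/ ? 1 : 0;
}

for my $url ('xxxxxxxx&abc=4', 'xxxxxxxx?abc=4', 'xxxxxxxx') {
    printf "%-16s %s\n", $url, lacks_abc_suffix($url) ? 'kept' : 'skipped';
}
```

The lookbehind works here because its content has a fixed length, which is what Perl traditionally requires of lookbehind assertions.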
Assuming your URLs are one per line, you can use:
^([^&]+)
This basically will match anything up to the first instance of &.
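A short sketch of capturing everything before the first &. It uses an anchored, greedy form, ^([^&]+), since a lazy +? with nothing after it would be satisfied by a single character. The helper name base_part is made up for illustration:

```perl
use strict;
use warnings;

# Return the part of the URL before the first "&".
# If the URL contains no "&", the whole URL is returned.
# Returns undef for an empty string or one that starts with "&".
sub base_part {
    my ($url) = @_;
    my ($base) = $url =~ /^([^&]+)/;
    return $base;
}

for my $url ('xxxxxxxxxx&abc=4', 'xxxxxxxxxx') {
    print base_part($url), "\n";
}
```

This approach drops everything after the first &, not just &abc=4, so it only fits if no other parameters need to be kept.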
As @Benoit said, you can do this using a zero-width expression to keep the query string out of the match, but you would be after a positive lookahead, not a negative lookbehind. Syntax example below:
(?=(&[^=]+?=\d+)+)
As you can see though, this would complicate the expression a touch.
Hope this helps.
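One way the positive lookahead might be combined with a lazy capture is sketched below. The exact pattern is an assumption, not the answer's own: it expects the query string to be a trailing run of &name=digits pairs, and the helper name strip_numeric_params is hypothetical:

```perl
use strict;
use warnings;

# Return the URL without a trailing run of &name=digits parameters.
# URLs that do not end in such a run are returned unchanged.
sub strip_numeric_params {
    my ($url) = @_;
    # Lazily capture from the start, stopping at the first position
    # from which only &name=digits pairs remain up to the end.
    # The lookahead checks those pairs without consuming them.
    if ($url =~ /^(.*?)(?=(?:&[^=&]+=\d+)+$)/) {
        return $1;
    }
    return $url;
}

for my $url ('xxxxxxxxxx&abc=4', 'xxxxxxxxxx') {
    print strip_numeric_params($url), "\n";
}
```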