
How do I avoid capturing the primary group of a given regex pattern?

I have a regexp pattern:

<^(([a-z]+)\:([0-9]+)\/?.*)$>

How do I avoid capturing the primary group? I tried making it non-capturing:

<^(?:([a-z]+)\:([0-9]+)\/?.*)$>

The above pattern will still put the whole string 'localhost:8080' into the first (0) group. But I need to get only 2 matched groups, so that the first (0) group is populated with 'localhost' and the second (1) with '8080'.

Where did I make a mistake?

Group 0 will always be the entire match.

That's just the way the regex functions work. The first group is always the entire match. You can use array_shift() to get rid of it.

http://www.php.net/manual/en/function.array-shift.php

In a regex, $0 is always equal to the entire matched string and is not one of the capture groups. Capture groups always start at $1. So look at $1 and $2 instead of $0 and $1.
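A minimal sketch of this, using the pattern and the sample string from the question. The variable names ($pattern, $subject, $host, $port) are only illustrative and not part of the original post:

```php
<?php
// Pattern from the question, with the outer group made non-capturing.
$pattern = '<^(?:([a-z]+)\:([0-9]+)\/?.*)$>';
$subject = 'localhost:8080';

$host = null;
$port = null;

if (preg_match($pattern, $subject, $matches)) {
    // $matches[0] holds the whole match; capture groups start at index 1.
    $host = $matches[1]; // 'localhost'
    $port = $matches[2]; // '8080' (a string)
    echo $host, ' ', $port, PHP_EOL;
}
```

Note that the non-capturing (?: ... ) only prevents the outer parentheses from taking a group number; index 0 is still filled with the full match.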

If you are dealing with URLs, you can try using PEAR Net_URL, or, what might be better for you in this case, parse_url():

print_r(parse_url($url));
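As a rough illustration of what parse_url() returns, here is a small sketch. The URL is a made-up example with an explicit scheme, which is an assumption on my part; the string in the question has no scheme, so check how your PHP version handles that form before relying on it:

```php
<?php
// Hypothetical example URL (not from the question); it includes a scheme.
$url = 'http://localhost:8080/index.php';

// parse_url() returns an associative array of the components it found.
$parts = parse_url($url);

$host = $parts['host']; // 'localhost'
$port = $parts['port']; // 8080, returned as an integer rather than a string
echo $host, ' ', $port, PHP_EOL;
```

This avoids the regex entirely, so there is no group 0 to discard.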

From the docs:

matches

If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.

If you don't care about the full match, you can use array_shift() to remove the unwanted element.

array_shift($matches);
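Put together with the pattern from the question, that could look like the sketch below. The variable names are illustrative, and list() is used here only as one convenient way to unpack the remaining elements:

```php
<?php
// Pattern and sample string taken from the question.
$pattern = '<^(?:([a-z]+)\:([0-9]+)\/?.*)$>';
$matches = [];

if (preg_match($pattern, 'localhost:8080', $matches)) {
    // Remove element 0 (the full match); the numeric keys are renumbered from 0.
    array_shift($matches);

    // $matches now contains only the two captured groups.
    list($host, $port) = $matches;
    echo $host, ' ', $port, PHP_EOL;
}
```

After the shift, index 0 holds 'localhost' and index 1 holds '8080', which is the layout the question asks for.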
