How does this RegEx for parsing emails work in PHP?

Okay, I have the following PHP code to extract an email address from a string in one of the following two forms:

Random Stranger <email@domain.com>
email@domain.com

Here is the PHP code:

// The first example
$sender = "Random Stranger <email@domain.com>";

$pattern = '/([\w_-]*@[\w-\.]*)|.*<([\w_-]*@[\w-\.]*)>/';

preg_match($pattern,$sender,$matches,PREG_OFFSET_CAPTURE);

echo "<pre>";
print_r($matches);
echo "</pre><hr>";

// The second example
$sender = "user@domain.com";

preg_match($pattern,$sender,$matches,PREG_OFFSET_CAPTURE);

echo "<pre>";
print_r($matches);
echo "</pre>";

My question is... what is in $matches? It seems to be a strange collection of arrays. Which index holds the match from the parentheses? How can I be sure I'm getting the email address and only the email address?

Update:

Here is the output:

Array
(
    [0] => Array
        (
            [0] => Random Stranger 
            [1] => 0
        )

    [1] => Array
        (
            [0] => 
            [1] => -1
        )

    [2] => Array
        (
            [0] => user@domain.com
            [1] => 5
        )

)
Array
(
    [0] => Array
        (
            [0] => user@domain.com
            [1] => 0
        )

    [1] => Array
        (
            [0] => user@domain.com
            [1] => 0
        )

)

This doesn't help you with your preg question, but it will simplify your code. Since those are the only two options, don't use regular expressions:

$parts = explode( '<', rtrim( $sender, '>' ) );
echo end( $parts );
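As a minimal sketch, the same idea can be wrapped in a function and tried on both input forms from the question. The name `extractEmail` is hypothetical, and the function does no validation; it only assumes the input is one of the two forms.

```php
<?php
// Hypothetical helper: returns the address from "Name <addr>" or from a bare "addr".
function extractEmail(string $sender): string
{
    // Drop a trailing '>' if there is one, then split on '<'.
    $parts = explode('<', rtrim($sender, '>'));
    // The address is the last piece; for a bare address it is the only piece.
    return trim(end($parts));
}

echo extractEmail('Random Stranger <email@domain.com>'), "\n";
echo extractEmail('email@domain.com'), "\n";
```

Storing the result of explode() in a variable first matters because end() takes its argument by reference.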

The preg_match() manual page explains how $matches works. It is an optional parameter that gets filled with the text captured by each parenthesized sub-expression in your regexp, numbered by the position of its opening parenthesis in the pattern. $matches[0] is always the entire match, followed by the sub-expressions.

So for example, that pattern contains two sub-expressions, ([\w_-]*@[\w-\.]*) and ([\w_-]*@[\w-\.]*) . The parts matching those two expressions will be put into $matches[1] and $matches[2], respectively. Because the two groups sit in different branches of the alternation, only one of them takes part in any given match, and the other is left empty. For Random Stranger <email@domain.com>, called without any flags, you would have this in $matches:

Array( 
    0 => "Random Stranger <email@domain.com>",
    1 => "",
    2 => "email@domain.com"
)
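To answer "which index holds the email" in code, here is a hedged sketch that checks group 2 first and falls back to group 1. The function name `emailViaRegex` is hypothetical. The pattern is the one from the question with one change that is my own assumption: the hyphen in the second character class is moved to the end (`[\w.-]`), because some PCRE2-based PHP builds may reject a hyphen placed between `\w` and another character.

```php
<?php
// Hypothetical helper built around the question's pattern (hyphen moved to
// the end of the second character class; see the note above).
function emailViaRegex(string $sender): ?string
{
    $pattern = '/([\w_-]*@[\w.-]*)|.*<([\w_-]*@[\w.-]*)>/';
    if (preg_match($pattern, $sender, $matches) !== 1) {
        return null; // no match, or the pattern failed to compile
    }
    // Group 2 is filled only for the "Name <addr>" form;
    // a bare address is captured by group 1 instead.
    return !empty($matches[2]) ? $matches[2] : $matches[1];
}

echo emailViaRegex('Random Stranger <email@domain.com>'), "\n";
echo emailViaRegex('user@domain.com'), "\n";
```

Note that for the bare form $matches has no index 2 at all, since trailing groups that did not take part are dropped, which is why the check uses empty() rather than reading $matches[2] directly.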

Think of it as passing an array named $matches by reference, which gets filled with all the sub-parts that are matched.

Edit - note that you are using the PREG_OFFSET_CAPTURE flag, which changes how $matches gets filled, so your result won't match my example. The manual explains this flag as well. In this case, instead of a plain string, each entry is a two-element array: the matched text at index 0 and the offset at which it was found in the subject string at index 1. A group that did not take part in the match shows up as an empty string with offset -1, as in your first output.
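A short sketch of reading those pairs, assuming the question's first input. The pattern again has the hyphen moved to the end of the second character class, which is my adjustment and not part of the original code.

```php
<?php
$pattern = '/([\w_-]*@[\w.-]*)|.*<([\w_-]*@[\w.-]*)>/';
$sender  = 'Random Stranger <email@domain.com>';

preg_match($pattern, $sender, $matches, PREG_OFFSET_CAPTURE);

// With the flag set, each entry is [matched text, offset in $sender].
[$text, $offset] = $matches[2];
echo $text, ' at offset ', $offset, "\n";

// The offset points back into the subject string.
echo substr($sender, $offset, strlen($text)), "\n";

// Group 1 belongs to the other branch, so it is reported as unmatched.
[$unusedText, $unusedOffset] = $matches[1];
echo 'group 1 offset: ', $unusedOffset, "\n";
```

So with this flag the address itself is at $matches[2][0], or at $matches[1][0] for a bare address.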

The following is copied directly from the help doc at http://us.php.net/preg_match

If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
