
Multiple regex capture groups using quantifiers

Is there a way to get multiple capture groups out of a regex that is using quantifiers? For example, say I have this data (simplified from what I have to deal with):

<td>Data 1</td>
<td>data 2</td>
<td>data 3</td>
<td>data 4</td>

Right now, if I write a regex like this:

(?:<td>(.+?)<\/td>\s*){4}

I end up with only one capture group, holding the last value, "data 4". Is there a way to use the quantifier and end up with 4 capture groups, or am I forced to write the regex like this to get what I want:

<td>(.+?)<\/td>\s*<td>(.+?)<\/td>\s*<td>(.+?)<\/td>\s*<td>(.+?)<\/td>

Yes, I am well aware that I could break this simple example up much more easily programmatically and then apply any necessary regexes or simpler pattern matches. The data I am working with is far more complex, and I would really like to use a regex to handle all of the parsing.
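
For reference, the behaviour described above can be reproduced with a short sketch. PHP is an assumption here (the question does not name a language), and the variable names are illustrative only. The quantified group runs four times, but group 1 is a single slot that is overwritten on each pass, so only the last cell survives:

```php
<?php
// Sample input from the question.
$str = "<td>Data 1</td>\n<td>data 2</td>\n<td>data 3</td>\n<td>data 4</td>\n";

// The non-capturing group repeats four times; the inner capture group
// keeps only the text from its final iteration.
$matched = preg_match('/(?:<td>(.+?)<\/td>\s*){4}/', $str, $m);

var_dump($matched);   // int(1): the pattern matched
var_dump(count($m));  // int(2): the full match plus one capture group
var_dump($m[1]);      // string(6) "data 4"
```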

With PHP you can use preg_match_all. Drop the {4} quantifier so that the pattern matches a single cell, and let the function collect every match:

$str = '<td>Data 1</td>
<td>data 2</td>
<td>data 3</td>
<td>data 4</td>
';
preg_match_all('/(?:<td>(.+?)<\/td>\s*)/', $str, $m);
print_r($m);

Output:

Array
(
    [0] => Array
        (
            [0] => <td>Data 1</td>

            [1] => <td>data 2</td>

            [2] => <td>data 3</td>

            [3] => <td>data 4</td>

        )

    [1] => Array
        (
            [0] => Data 1
            [1] => data 2
            [2] => data 3
            [3] => data 4
        )

)
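
If the count of exactly four cells matters, the return value of preg_match_all can stand in for the dropped quantifier. The following is a minimal sketch under that assumption; the names $sets and $cells and the choice of exception are illustrative, not part of the original answer. It also uses the PREG_SET_ORDER flag, which groups results per match rather than per capture group and may be easier to work with once a pattern has several groups:

```php
<?php
// Sample input from the question.
$str = "<td>Data 1</td>\n<td>data 2</td>\n<td>data 3</td>\n<td>data 4</td>\n";

// preg_match_all returns the number of full matches (or false on error).
$count = preg_match_all('/<td>(.+?)<\/td>/', $str, $sets, PREG_SET_ORDER);

// Enforce the "exactly four cells" rule that {4} used to express.
if ($count !== 4) {
    throw new UnexpectedValueException(
        'Expected 4 cells, found ' . var_export($count, true)
    );
}

// With PREG_SET_ORDER, each $sets[$i] is [full match, group 1].
$cells = array_column($sets, 1);

print_r($cells); // Data 1, data 2, data 3, data 4
```

Checking the count after the fact does not verify that the cells are adjacent, so input with other content between the cells would still pass; a separate validation step would be needed for that.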
