PHP regex: match all lines that begin with a number followed by a period

I used simple_html_dom to parse some HTML and now have the following HTML table in an array called $pre.

How can I use a PHP regex to get only the lines that match the result below?

<table>
    <tr>
        <td>
            <pre>1.   APEAL/890/2010     HUSSAIN ISMAIL SATWILKAR        SHRI C.K. PENDSE</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>     [Criminal]                                         MS.ROHINI DANDEKAR ADV.AP</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                        V/S THE STATE OF MAHARASH       PTD AS PER CTS ORD 7/9/17</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                        TRA                             P.P.FOR  P. P</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre></pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>        REMARK : (By Accused against Conviction) Note: (1) Matter is Ready for final</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                 hearing. (2) Accd. is in jail. (3) R & P with PB received. (4)</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                 Muddemal article are to be called for. (5) Report received from</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                 Nashik Central Prison stated therein that "Orig. accd. death dated</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                 20/11/2015 (Report kept at flag "A") . ....... Court (DB) for final</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                 hearing.</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre></pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre></pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre></pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>2.   APEAL/966/2011     ABDUL MALIK SHAIKH              SHRI S. R. MITHARE</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>     [Criminal]</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                        V/S THE STATE OF MAHARASH</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                        TRA</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre></pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>        REMARK : (By Accused Against Conviction) Note:- (1) Matter is ready for</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                 Final Hearing. (2) Original Accused is in Jail. (3) R & P received</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                 with PaperBooks. (4) Muddemal Articles are to be called for. (5)</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                 Report received from Kolhapur central Prison stated therein that</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                 "Orig. Accused expired on 19/04/2015 (Report kept at flag "A")</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>                 - Court D.B. for Final Hearing.</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre></pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre></pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre></pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>3.   APEAL/486/2012     AJAY SITARAM BHARATI            MISS. TANU KHATTRI</pre>
        </td>
    </tr>
    <tr>
        <td>
            <pre>     [Criminal]</pre>
        </td>
    </tr>
</table>

Desired result after applying the regex:

<pre>1.   APEAL/890/2010     HUSSAIN ISMAIL SATWILKAR        SHRI C.K. PENDSE</pre>
<pre>2.   APEAL/966/2011     ABDUL MALIK SHAIKH              SHRI S. R. MITHARE</pre>
<pre>3.   APEAL/486/2012     AJAY SITARAM BHARATI            MISS. TANU KHATTRI</pre>

Using this code: preg_match('^\<pre\>\d2*\./gm', $pre[$i]) returns: preg_match(): No ending delimiter '^' found

This looks like the correct regex to use. Here is the explanation from regex101:

^ asserts position at start of the string
\< matches the character < literally (case sensitive)
pre matches the characters pre literally (case sensitive)
\> matches the character > literally (case sensitive)
\d matches a digit (equal to [0-9])
    2* matches the character 2 literally (case sensitive)
    * Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\. matches the character . literally (case sensitive)

Global pattern flags
g modifier: global. All matches (don't return after first match)

Here's what you need:

#<pre>(?<line>\d+\..+)<\/pre>#

Obviously you know what pre is. The parentheses denote a capture group, which I have named 'line' by putting ?<line> at the start of the group.

Then it looks for a number \d+, a literal dot \., and anything .+, followed by the closing tag.

$regex = '#<pre>(?<line>\d+\..+)<\/pre>#';

preg_match_all($regex, $html, $matches);

foreach($matches['line'] as $line) {
    echo $line ."\n";
}

Output:

1. APEAL/890/2010 HUSSAIN ISMAIL SATWILKAR SHRI C.K. PENDSE 
2. APEAL/966/2011 ABDUL MALIK SHAIKH SHRI S. R. MITHARE

Here it is in action: https://regex101.com/r/6U8S9C/1

And again, running in PHP: https://3v4l.org/QoVsY
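For readers who want to run the idea without following the links, here is a minimal self-contained sketch. The `$html` string is made-up sample data modeled on the table in the question, not the real page, and the `$lines` variable is only introduced for illustration.

```php
<?php
// Hypothetical sample input, modeled on the table in the question.
$html = <<<'HTML'
<table>
    <tr><td><pre>1.   APEAL/890/2010     HUSSAIN ISMAIL SATWILKAR        SHRI C.K. PENDSE</pre></td></tr>
    <tr><td><pre>     [Criminal]</pre></td></tr>
    <tr><td><pre></pre></td></tr>
    <tr><td><pre>2.   APEAL/966/2011     ABDUL MALIK SHAIKH              SHRI S. R. MITHARE</pre></td></tr>
</table>
HTML;

// Named group "line": one or more digits, a literal dot, then the rest of the <pre> content.
$regex = '#<pre>(?<line>\d+\..+)</pre>#';

preg_match_all($regex, $html, $matches);

// $matches['line'] holds only the captured text, without the surrounding tags.
$lines = $matches['line'];

foreach ($lines as $line) {
    echo $line, "\n";
}
```

Because `.` does not match a newline by default, each match stays within a single line of the input; the captured text keeps the original column spacing.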

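The PHP preg_* functions require the pattern to be wrapped in delimiters: a symbol that is placed at both ends of the pattern and is not used within it (or is escaped where it is). In your call, the leading ^ was taken as the opening delimiter, which is why PHP complains that no ending '^' was found.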
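As a rough illustration of the delimiter rule, the sketch below runs the same kind of expression with three different delimiters. The `$subject` string and the variable names are assumptions made up for this example.

```php
<?php
// Hypothetical one-row subject, modeled on the data in the question.
$subject = '<pre>1.   APEAL/890/2010</pre>';

// The first character of the pattern string is the delimiter;
// the same character must close the pattern.
$withHash  = preg_match('#<pre>\d+\.#', $subject);
$withTilde = preg_match('~<pre>\d+\.~', $subject);

// With "/" as the delimiter, the slash inside </pre> has to be escaped.
$withSlash = preg_match('/<pre>\d+\..*<\/pre>/', $subject);

var_dump($withHash, $withTilde, $withSlash);
```

Choosing `#` or `~` for HTML patterns avoids having to escape the `/` in closing tags. Note also that PHP has no `g` modifier; preg_match_all() is what returns every match.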

Also, your pattern won't match correctly. The reason is that ^ matches only at the very start of the line, and the pre tag is preceded by several tabs of indentation, so it never sits at that position.
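The effect of the anchor can be sketched as follows. This assumes the string being tested still carries its leading indentation, which depends on how the rows were extracted; `$row` and the other variable names are hypothetical.

```php
<?php
// Hypothetical row that still has its leading indentation.
$row = "            <pre>1.   APEAL/890/2010     HUSSAIN ISMAIL SATWILKAR</pre>";

// Anchored directly before <pre>: fails, because whitespace comes first.
$anchored = preg_match('#^<pre>\d+\.#', $row);

// Anchored, but allowing optional leading whitespace: matches.
$allowIndent = preg_match('#^\s*<pre>\d+\.#', $row);

// Not anchored at all: matches wherever the tag appears.
$unanchored = preg_match('#<pre>\d+\.#', $row);

var_dump($anchored, $allowIndent, $unanchored);
```

If the anchor is wanted for strictness, `^\s*` keeps it while tolerating indentation; otherwise dropping `^` is the simpler option.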

This regex will match any pre tag that opens and closes on the same line and whose content starts with at least one digit (for example, 1, 16, 256, etc.) followed by a period.

// $i is the index of the current row in $pre
preg_match('#(<pre>\d+\..*</pre>)#', $pre[$i], $matches);
var_dump($matches);

In this example, I've used # as the delimiter.
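If `$pre` really is a plain array of strings, one per row, the whole array can be filtered in one call instead of looping with preg_match(). The sketch below uses preg_grep() for that; this is an alternative to the loop, not something the answer above proposes, and the contents of `$pre` here are invented sample data.

```php
<?php
// Hypothetical contents of $pre: one string per table row.
$pre = [
    '<pre>1.   APEAL/890/2010     HUSSAIN ISMAIL SATWILKAR        SHRI C.K. PENDSE</pre>',
    '<pre>     [Criminal]</pre>',
    '<pre></pre>',
    '<pre>2.   APEAL/966/2011     ABDUL MALIK SHAIKH              SHRI S. R. MITHARE</pre>',
    '<pre>3.   APEAL/486/2012     AJAY SITARAM BHARATI            MISS. TANU KHATTRI</pre>',
];

// preg_grep() returns the entries that match and preserves their keys;
// array_values() reindexes the result from 0.
$numbered = array_values(preg_grep('#<pre>\d+\..*</pre>#', $pre));

foreach ($numbered as $row) {
    echo $row, "\n";
}
```

If the entries are simple_html_dom element objects rather than strings, they would need to be converted to strings first.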
