
PHP: Cleaner preg_match_all results

I'm trying to fetch a value from an external HTML page.

Right now, this does the magic for me:

preg_match_all('/id="localWeather">(.*?)<\/div>/',$returnedPage,$returnValues,PREG_SET_ORDER);

But after this line I need to loop over the results and clean them.

Why? Because I only need the (.*?) part, and for some reason it also returns the extra </div> closing tag, so I need to loop over the array and clean it afterwards.

My question is: how do I force it to return only the (.*?) part?

Get rid of PREG_SET_ORDER. Example:

<?php

$returnedPage = '<div id="localWeather">test</div><div id="localWeather">test2</div>';

preg_match_all('/id="localWeather">(.*?)<\/div>/',$returnedPage,$returnValues);

print_r($returnValues);

Output:

Array
(
    [0] => Array
        (
            [0] => id="localWeather">test</div>
            [1] => id="localWeather">test2</div>
        )

    [1] => Array
        (
            [0] => test
            [1] => test2
        )

)

So in this case, $returnValues[1] is an array of matches containing just the content between the divs (without the closing div), while $returnValues[0] is an array of the entire portions of the string that matched your regex.
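For comparison, here is a small sketch of both orderings side by side. It reuses the sample string from the example above; the variable names ($byPattern, $bySet, and so on) are made up for illustration. Note that PREG_SET_ORDER does not itself add the closing tag: it only groups the results per match, so the capture is still available as element 1 of each set.

```php
<?php

$returnedPage = '<div id="localWeather">test</div><div id="localWeather">test2</div>';
$pattern = '/id="localWeather">(.*?)<\/div>/';

// Default ordering (PREG_PATTERN_ORDER): index 1 holds every capture of group 1.
preg_match_all($pattern, $returnedPage, $byPattern);
$captures = $byPattern[1];

// PREG_SET_ORDER: one sub-array per match; element 0 is the full match,
// element 1 is the capture. array_column() pulls element 1 out of each set.
preg_match_all($pattern, $returnedPage, $bySet, PREG_SET_ORDER);
$capturesFromSets = array_column($bySet, 1);

print_r($captures);
print_r($capturesFromSets);
```

Both print_r calls should show the same two values, so either flag works once you read the right index.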

Also, using regular expressions to parse HTML isn't recommended. I'd look at PHP's DOMDocument class instead; it's more robust.
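A minimal sketch of the DOMDocument route, assuming the dom extension is enabled and using the same sample string as above. The XPath expression and the $weatherValues name are illustrative choices, not something from the question; a real page would be loaded from wherever $returnedPage comes from.

```php
<?php

$returnedPage = '<div id="localWeather">test</div><div id="localWeather">test2</div>';

// Real-world HTML is often malformed (here the id is even duplicated),
// so collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($returnedPage);
libxml_clear_errors();

// Select every div whose id attribute is "localWeather" and keep its text.
$xpath = new DOMXPath($doc);
$weatherValues = [];
foreach ($xpath->query('//div[@id="localWeather"]') as $div) {
    $weatherValues[] = trim($div->textContent);
}

print_r($weatherValues);
```

Matching on the attribute through XPath, rather than calling getElementById, means all elements are returned even when the page reuses the same id.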
