
simple_html_dom.php parser not working

I downloaded simple_html_dom.php, uploaded it to my web server, and tested it right away with a simple script. It doesn't seem to work, or at least I assume so, because the script outputs nothing.

Here's the script:

<?php
require('simple_html_dom.php');

// Retrieve the DOM from a given URL
$html = file_get_html('http://davidwalsh.name/');

// Find all "A" tags and print their inner text
foreach($html->find('a') as $e) {
    echo $e->innertext . '<br>';
}
?>

I got the script from http://davidwalsh.name/php-notifications, and other sites present the same code, so I don't understand why it outputs nothing.

My guess is that the script can't retrieve any data from the other website, similar to the problem I ran into here: Retrieving cross-domain data php.

If that is the case, is there a way around it?

In BIno Carlos' answer to the question I linked above, he stated "it's not really a cross domain problem because you are loading the data from the server not the browser". So is there, for example, a way to load the data from the browser instead?

As user868766 suggested in his answer, I tried the two ini_set calls. They reported no errors, so the script appears to run without problems. When I called print_r() on $html, it output the following:

simple_html_dom Object ( [root] => simple_html_dom_node Object ( [nodetype] => 5 [tag] => root [attr] => Array ( ) [children] => Array ( ) [nodes] => Array ( ) [parent] => [_] => Array ( [0] => -1 [1] => 1 ) [dom:simple_html_dom_node:private] => simple_html_dom Object *RECURSION* ) [nodes] => Array ( [0] => simple_html_dom_node Object ( [nodetype] => 5 [tag] => root [attr] => Array ( ) [children] => Array ( ) [nodes] => Array ( ) [parent] => [_] => Array ( [0] => -1 [1] => 1 ) [dom:simple_html_dom_node:private] => simple_html_dom Object *RECURSION* ) ) [callback] => [lowercase] => 1 [pos:protected] => 0 [char:protected] => [size:protected] => 0 [cursor:protected] => 1 [parent:protected] => simple_html_dom_node Object ( [nodetype] => 5 [tag] => root [attr] => Array ( ) [children] => Array ( ) [nodes] => Array ( ) [parent] => [_] => Array ( [0] => -1 [1] => 1 ) [dom:simple_html_dom_node:private] => simple_html_dom Object *RECURSION* ) [token_blank:protected] => [token_equal:protected] => =/> [token_slash:protected] => /> [token_attr:protected] => > [self_closing_tags:protected] => Array ( [img] => 1 [br] => 1 [input] => 1 [meta] => 1 [link] => 1 [hr] => 1 [base] => 1 [embed] => 1 [spacer] => 1 ) [block_tags:protected] => Array ( [root] => 1 [body] => 1 [form] => 1 [div] => 1 [span] => 1 [table] => 1 ) [optional_closing_tags:protected] => Array ( [tr] => Array ( [tr] => 1 [td] => 1 [th] => 1 ) [th] => Array ( [th] => 1 ) [td] => Array ( [td] => 1 ) [li] => Array ( [li] => 1 ) [dt] => Array ( [dt] => 1 [dd] => 1 ) [dd] => Array ( [dd] => 1 [dt] => 1 ) [dl] => Array ( [dd] => 1 [dt] => 1 ) [p] => Array ( [p] => 1 ) [nobr] => Array ( [nobr] => 1 ) ) [doc:protected] => [noise:protected] => Array ( ) )

The code you posted works fine for me.

You can try adding the lines below at the top of the script:

    ini_set('error_reporting', E_ALL);
    ini_set('display_errors', 1);

Also check the print_r output of $html:

    print_r($html);
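The dump in the question shows [size:protected] => 0 and an empty [doc:protected], which would be consistent with the parser having been handed an empty document, although that is only a guess from the output. One way to narrow it down is to fetch the page separately and look at why the fetch failed before parsing anything. The sketch below is an illustration under that assumption: fetch_html_source() is a hypothetical helper, not part of simple_html_dom, and it uses only built-in PHP functions.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): fetch a page
// and report why the fetch failed instead of failing silently.
function fetch_html_source(string $url): array
{
    // Remote URLs need allow_url_fopen to be enabled; local paths do not.
    $isRemote = (bool) preg_match('#^https?://#i', $url);
    $urlFopen = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
    if ($isRemote && !$urlFopen) {
        return ['ok' => false, 'body' => '', 'error' => 'allow_url_fopen is disabled'];
    }

    // Suppress the warning here and read it back through error_get_last().
    $body = @file_get_contents($url);
    if ($body === false) {
        $last = error_get_last();
        return ['ok' => false, 'body' => '', 'error' => $last['message'] ?? 'unknown error'];
    }
    if ($body === '') {
        return ['ok' => false, 'body' => '', 'error' => 'the response body was empty'];
    }

    return ['ok' => true, 'body' => $body, 'error' => null];
}

// Usage with the script from the question (left commented out because
// it needs network access and simple_html_dom.php):
//
// require 'simple_html_dom.php';
// $result = fetch_html_source('http://davidwalsh.name/');
// if (!$result['ok']) {
//     die('Fetch failed: ' . $result['error']);
// }
// $html = str_get_html($result['body']);
// foreach ($html->find('a') as $e) {
//     echo $e->innertext . '<br>';
// }
```

If the helper reports that allow_url_fopen is disabled, the host is probably blocking remote fopen wrappers, and the page would have to be fetched another way (for example with cURL) before being passed to str_get_html().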

Hope this helps.
