
php - Query a table with DOMXPath

I am trying to read the values of a table on a web page with PHP's DOMXPath::query. When I open the page in my browser I can see the table, but when I run my query the table is missing and does not seem to be accessible.

The table has an id, but when I specify it in my query a different table is returned. I want to read the table with the id 'totals', but I only get the one with the id 'per_game'. When I inspect the page source, a lot of elements appear to be inside comments.

Here is my script:

<?php
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->strictErrorChecking = false;
$doc->recover = true;
$doc->loadHTMLFile('https://www.basketball-reference.com/players/j/jokicni01.html');
$xpath = new DOMXPath($doc);
$table = $xpath->query("//div[@id='totals']")->item(0);
$elem = $doc->saveXML($table);
echo $elem;
?>

How can I read the elements of the table with the id 'totals'?

The full path is /html/body/div[@id="wrap"]/div[@id="content"]/div[@id="all_totals"]/div[@class="table_outer_container"]/div[@id="div_totals"]/table[@id="totals"]

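The table you want sits inside an HTML comment, so it is not part of the parsed DOM and XPath cannot reach it directly. You can split your query into two parts: first, retrieve the comment in the correct div, then create a new document from its content and query that document for the element you want: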

$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->strictErrorChecking = false;
$doc->recover = true;
@$doc->loadHTMLFile('https://www.basketball-reference.com/players/j/jokicni01.html');
$xpath = new DOMXPath($doc);

// retrieve the comment section in 'all_totals' div
$all_totals_element = $xpath->query('/html/body/div[@id="wrap"]/div[@id="content"]/div[@id="all_totals"]/comment()')->item(0);
$all_totals_table = $doc->saveXML($all_totals_element);

// strip comment tags to keep the content inside
$all_totals_table = substr($all_totals_table, strpos($all_totals_table, '<!--') + strlen('<!--'));
$all_totals_table = substr($all_totals_table, 0, strpos($all_totals_table, '-->'));

// create a new document from the content of the comment
// (@ suppresses parser warnings, as in the first load above)
$tableDoc = new DOMDocument;
@$tableDoc->loadHTML($all_totals_table);
$xpath = new DOMXPath($tableDoc);

// second part of the query: loadHTML() wraps the fragment in <html><body>,
// so an absolute path starting at /div matches nothing; search from anywhere instead
$totals = $xpath->query('//table[@id="totals"]')->item(0);

echo $tableDoc->saveXML($totals);
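
The same two-step idea can be tried offline. The sketch below is only an illustration under stated assumptions: the sample HTML, its cell values, and the helper name readCommentedTable are made up for the example and are not the site's actual markup. It reads the comment text through nodeValue (which already excludes the comment delimiters) instead of saveXML() plus substr(), and returns each table row as an array of cell strings.

```php
<?php
// Hypothetical sample markup standing in for the real page: the table
// with id "totals" exists only inside an HTML comment.
$html = <<<'HTML'
<html><body>
<div id="all_totals">
<!--
<div class="table_outer_container"><div id="div_totals">
<table id="totals">
<tr><th>Season</th><th>PTS</th></tr>
<tr><td>S1</td><td>10</td></tr>
</table>
</div></div>
-->
</div>
</body></html>
HTML;

/**
 * Hypothetical helper: look through the comments below the div with id
 * $wrapperId, parse each one as HTML, and return the rows of the first
 * table with id $tableId as arrays of cell text.
 * Assumes both ids contain no quote characters.
 */
function readCommentedTable(string $html, string $wrapperId, string $tableId): array
{
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    $rows = [];
    foreach ($xpath->query("//div[@id='$wrapperId']//comment()") as $comment) {
        // nodeValue of a comment node is its text without <!-- and -->.
        $fragment = $comment->nodeValue;
        if (trim($fragment) === '') {
            continue; // loadHTML() rejects an empty string
        }

        $inner = new DOMDocument();
        $inner->loadHTML($fragment);
        $innerXpath = new DOMXPath($inner);

        // loadHTML() wraps the fragment in <html><body>, hence the leading //.
        foreach ($innerXpath->query("//table[@id='$tableId']//tr") as $tr) {
            $cells = [];
            foreach ($innerXpath->query('th|td', $tr) as $cell) {
                $cells[] = trim($cell->textContent);
            }
            $rows[] = $cells;
        }
        if ($rows !== []) {
            break; // stop at the first comment that contained the table
        }
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $rows;
}

$rows = readCommentedTable($html, 'all_totals', 'totals');
foreach ($rows as $row) {
    echo implode("\t", $row), "\n";
}
```

Against the real page you would pass the downloaded HTML instead of $html. That the ids 'all_totals' and 'totals' still match the live markup is an assumption taken from the path given in the question.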

