As an intern, I have to scrape the price from a product sheet on a site. The price div contains several spans with prices, but only one of them holds the price that is actually displayed. Their absolute left positions change with each refresh. So when I do something like
$productPrice = $packtPageXpath->query('//div[@class="bloc_price"]');
if ($productPrice->length > 0) {
    $productSheet['name'] = $productPrice->item(0)->nodeValue;
}
the array entry ends up containing all of the prices, in a different order on each refresh.
Here is an example of a product sheet: https://www.pompes-direct.com/pompage/pompe-electrique/surface/jet/jet-102-t/4432.html
The sensible thing to do is to extract the left position from the appropriate div elements in
//div[@id="top_produit"]//div[@class="bloc_price"]/div
by taking the value between "left:" and "px;" in each style attribute, comparing all the values in PHP, and using the div with the lowest value.
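The approach above can be sketched roughly as follows. This is only an illustration: the sample markup, the prices, and the helper name findDisplayedPrice are made up for the example, and the regular expression assumes the offset is written as "left:<number>px" in the style attribute, which may not match the real page exactly.

```php
<?php
// Hypothetical sample markup; the real page's style attributes may differ.
$html = <<<'HTML'
<div id="top_produit">
  <div class="bloc_price">
    <div style="position:absolute; left:350px;"><span>199.00</span></div>
    <div style="position:absolute; left:12px;"><span>249.00</span></div>
    <div style="position:absolute; left:480px;"><span>179.00</span></div>
  </div>
</div>
HTML;

/**
 * Hypothetical helper: return the text of the candidate div with the
 * smallest "left" offset, or null if no div carries a usable offset.
 */
function findDisplayedPrice(DOMXPath $xpath): ?string
{
    $nodes = $xpath->query('//div[@id="top_produit"]//div[@class="bloc_price"]/div');
    $bestLeft = null;
    $bestText = null;

    foreach ($nodes as $node) {
        $style = $node->getAttribute('style');

        // Capture the number between "left:" and "px"; the lookbehind
        // avoids matching properties such as "margin-left".
        if (!preg_match('/(?<![-\w])left:\s*(-?\d+(?:\.\d+)?)px/', $style, $m)) {
            continue;
        }

        $left = (float) $m[1];
        if ($bestLeft === null || $left < $bestLeft) {
            $bestLeft = $left;
            $bestText = trim($node->textContent);
        }
    }

    return $bestText;
}

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // tolerate sloppy real-world HTML
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
echo findDisplayedPrice($xpath), PHP_EOL;   // prints 249.00 for the sample above
```

Comparing the offsets in PHP keeps working if the real offset ever reaches three digits, as long as the displayed price remains the one with the lowest left value.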
The less sensible thing to do is to assume that they'll never have a real position of 100 or greater or a fake position lower than 100, conclude that since the style attributes are otherwise identical, the real one must be shorter than all the others, and write an XPath to find that one:
//div[@id="top_produit"]//div[@class="bloc_price"]/div[(string-length(@style) =81)]/span[not(contains(@style, "text-decoration:line-through"))]
Might work!
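If you go the string-length route, a slightly less brittle variant is to measure the shortest style attribute at run time instead of hard-coding 81. The sketch below uses made-up markup and prices and relies on the same assumption as above: the styles are identical except for the number of digits in the left offset. It also assumes the style attributes are plain ASCII, so that PHP's strlen and XPath's string-length agree.

```php
<?php
// Hypothetical markup: the styles differ only in the digits of "left".
$html = <<<'HTML'
<div id="top_produit">
  <div class="bloc_price">
    <div style="position:absolute; left:350px;"><span>199.00</span></div>
    <div style="position:absolute; left:12px;"><span style="text-decoration:line-through">299.00</span><span>249.00</span></div>
    <div style="position:absolute; left:480px;"><span>179.00</span></div>
  </div>
</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

$base = '//div[@id="top_produit"]//div[@class="bloc_price"]/div';

// Collect the length of every candidate style attribute.
$lengths = [];
foreach ($xpath->query($base) as $div) {
    $lengths[] = strlen($div->getAttribute('style'));
}

$price = null;
if ($lengths !== []) {
    // Plug the shortest length into the XPath instead of a fixed number.
    $query = sprintf(
        '%s[string-length(@style) = %d]/span[not(contains(@style, "text-decoration:line-through"))]',
        $base,
        min($lengths)
    );
    $spans = $xpath->query($query);
    if ($spans->length > 0) {
        $price = trim($spans->item(0)->nodeValue);
    }
}

echo $price, PHP_EOL;   // prints 249.00 for the sample above
```

This still breaks as soon as a fake offset has as few digits as the real one, so comparing the numeric left values remains the safer option.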