
How can I scrape the span with the lowest absolute left position inside a div?

As an intern, I have to scrape the price from a product sheet on a site. The div contains multiple spans with prices, but only one of them holds the actual price that is displayed. Their absolute left positions change with each refresh. So when I do something like $productPrice = $packtPageXpath->query('//div[@class="bloc_price"]'); if ($productPrice->length > 0) { $productSheet['name'] = $productPrice->item(0)->nodeValue; } this returns all the prices, in a different order at each refresh.

Here is an example of a product sheet: https://www.pompes-direct.com/pompage/pompe-electrique/surface/jet/jet-102-t/4432.html

The sensible thing to do is to extract the left position from the appropriate div elements in //div[@id="top_produit"]//div[@class="bloc_price"]/div by getting the value between "left:" and "px;" in the style attributes, comparing all the values in PHP, and using the div with the lowest value.
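A minimal sketch of that approach, assuming each candidate div carries an inline style containing something like "left:12px". The sample markup, its style values and prices, and the function name findLeftmostPrice are invented for illustration; only the XPath comes from the answer above.

```php
<?php
// Hypothetical sample markup: the id and class come from the XPath above,
// the inline styles and the prices are made up for this sketch.
$html = <<<HTML
<div id="top_produit"><div class="bloc_price">
  <div style="position:absolute; left:340px;"><span>199,00</span></div>
  <div style="position:absolute; left:12px;"><span>149,00</span></div>
  <div style="position:absolute; left:1250px;"><span>99,00</span></div>
</div></div>
HTML;

// Returns the text of the candidate div with the lowest "left" value,
// or null when no candidate has a parsable "left" declaration.
function findLeftmostPrice(DOMXPath $xpath): ?string
{
    $best = null;
    $bestLeft = INF;
    $nodes = $xpath->query('//div[@id="top_produit"]//div[@class="bloc_price"]/div');
    foreach ($nodes as $div) {
        // Take the number between "left:" and "px"; the lookbehind skips
        // properties such as margin-left or padding-left.
        $style = $div->getAttribute('style');
        if (!preg_match('/(?<![\w-])left\s*:\s*(-?\d+(?:\.\d+)?)px/', $style, $m)) {
            continue;
        }
        $left = (float) $m[1];
        if ($left < $bestLeft) {
            $bestLeft = $left;
            $best = trim($div->textContent);
        }
    }
    return $best;
}

$doc = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid; silence parser warnings
$doc->loadHTML($html);
$price = findLeftmostPrice(new DOMXPath($doc));
echo $price, PHP_EOL;
```

On the real page, $html would be the fetched product sheet. If the positions are set in a stylesheet rather than in inline style attributes, this sketch will not find them.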

The less sensible thing to do is to assume that they'll never have a real position of 100 or greater or a fake position lower than 100, conclude that, since the style attributes are otherwise identical, one must be shorter than all the others, and write an XPath expression to find that one.

//div[@id="top_produit"]//div[@class="bloc_price"]/div[(string-length(@style) =81)]/span[not(contains(@style, "text-decoration:line-through"))]

Might work!
