I am pulling HTML from a page with Selenium, and then extracting data from that HTML using XPath.
This is the XPath:
/html/body/div[2]/div[1]/div/div/div/div/ul/li/div[1]/h3/a
This is my code:
$data = $webdriver->getPageSource();
d($data, $urltemplate);
$doc = new DOMDocument();
$doc->loadHTML($data);
$xp = "/html/body/div[2]/div[1]/div/div/div/div/ul/li/div[1]/h3/a";
$xpatho = new DOMXpath($doc);
$elementsn = $xpatho->query($xp);
d(get_class($elementsn),$elementsn->count(),$xp,$name);
// d() is a custom function like var_dump().
I always get $elementsn->count() = 0.
This is $data:
I am trying to extract those strings like "NAD M10 BLUOS...", "NAD M12 DIRECT DIGITAL..." and so on...
I saved the HTML into a file and opened it in my browser. I am attaching a screenshot of the data I want to retrieve (highlighted in blue):
Basically, the HTML page is a product listing, and I want to extract all the product names. To confirm, I used Chrome Developer Tools and its "Copy full XPath" function. I have the following XPaths for some of the product names:
/html/body/div[2]/div[1]/div/div/div/div/ul/li[1]/div[1]/h3/a
/html/body/div[2]/div[1]/div/div/div/div/ul/li[3]/div[1]/h3/a
I would guess that this would generalise to:
/html/body/div[2]/div[1]/div/div/div/div/ul/li/div[1]/h3/a
However, I keep on getting a DOMNodeList with count = 0. Why is this so, and how can I check what the error is, if any?
PS: This is the original webpage: http://lenbrook.com.sg/3-shop-by-brand#/page-4/price-49-8667
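Absolute paths copied from Chrome describe the browser's live DOM, which may not match the tree that DOMDocument::loadHTML() builds from the same source, so one mismatched step is enough to return zero nodes. Try changing your $xp to a relative expression that matches the links by class instead:
$xp = '//a[@class="product_link"]/text()';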
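To show how the class-based expression above can be run, and how to check for errors along the way, here is a minimal sketch. The HTML string is a hypothetical stand-in for $webdriver->getPageSource(), with the product names kept truncated as in the question; the real page's markup may differ. It collects libxml parse warnings rather than printing them, and checks for the false that DOMXPath::query() returns when the expression itself is malformed.

```php
<?php
// Hypothetical stand-in for $webdriver->getPageSource(); the real markup may differ.
$data = <<<HTML
<html><body>
<ul>
  <li><div><h3><a class="product_link" href="#">NAD M10 BLUOS...</a></h3></div></li>
  <li><div><h3><a class="product_link" href="#">NAD M12 DIRECT DIGITAL...</a></h3></div></li>
</ul>
</body></html>
HTML;

// Collect parser warnings in a buffer instead of emitting them as PHP warnings.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($data);
$parseErrors = libxml_get_errors(); // array of LibXMLError objects, empty if the markup parsed cleanly
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Select the <a> elements and read their text in PHP, rather than selecting text() nodes.
$nodes = $xpath->query('//a[@class="product_link"]');

// query() returns false when the expression is malformed;
// an empty DOMNodeList only means that nothing matched.
if ($nodes === false) {
    throw new RuntimeException('Invalid XPath expression');
}

$names = [];
foreach ($nodes as $node) {
    $names[] = trim($node->textContent);
}

print_r($names);
```

An empty list with no exception means the expression is valid but matches nothing in the parsed tree. Note that @class="product_link" is an exact match, so if the anchors carry more than one class, contains(concat(' ', normalize-space(@class), ' '), ' product_link ') is the usual alternative.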