Getting data from an iframe with XPath (PHP DOMXPath)

Question

I am experimenting with web-scraping techniques. For one example link, the description always comes back empty. The reason is that it is populated by JavaScript with the following code. How should we handle these kinds of scenarios?

// Frontend JS
P.when('DynamicIframe').execute(function(DynamicIframe){
    var BookDescriptionIframe = null,
        bookDescEncodedData = "book desc data",
        bookDescriptionAvailableHeight,
        minBookDescriptionInitialHeight = 112,
        options = {},
        iframeId = "bookDesc_iframe";

I am using PHP DOMXPath as below:

    $file = 'sample.html';
    $dom = new DOMDocument();
    $dom->preserveWhiteSpace = false;
    // I am saving the returned html to a file and reading the file.
    @$dom->loadHTMLFile($file);
    $xpath = new DOMXPath($dom);

    // This xpath works on chrome console, but not here
    // because the content is dynamically created via js
    $desc  = $xpath->query('//*[@id="bookDesc_iframe"]');

Answer 1

Whenever you see this kind of JavaScript-generated content, especially from big players like Amazon or Google, you should immediately suspect that it has a graceful-degradation implementation.

That is, the page will usually ship a fallback for clients where JavaScript doesn't work, such as the Links browser, for better browser coverage.

Look out for a <noscript> element. You may find one, and with that you can solve the problem.
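As a rough sketch of that idea, the snippet below queries a <noscript> fallback with DOMXPath instead of the JS-built iframe. The sample markup, the wrapper id `desc_wrapper`, and the description text are all made up for illustration; the real page may structure its fallback differently, so check the saved HTML first.

```php
<?php
// Hypothetical markup standing in for the saved page (sample.html).
// The wrapper id "desc_wrapper" and the text are assumptions.
$html = <<<HTML
<html><body>
<div id="desc_wrapper">
  <script>/* builds the bookDesc_iframe element at runtime */</script>
  <noscript>Plain-text book description.</noscript>
</div>
</body></html>
HTML;

// Collect parser warnings instead of silencing them with @.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Target the static fallback rather than the iframe created by JavaScript.
$nodes = $xpath->query('//div[@id="desc_wrapper"]//noscript');

// Fall back to an empty string when no <noscript> is present.
$desc = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : '';

echo $desc, PHP_EOL;
```

If the query comes back empty, the page may simply not provide a <noscript> fallback, in which case this approach does not apply.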

Answer 1 · 0 · Accepted · 2019-05-26 05:53:31
