Get the source code of a web page from a URL with JavaScript using JSONP

I'm trying to get the source code from a web page URL using JSONP. This is the code:

<script type="text/javascript">
var your_url = '';

$(document).ready(function(){
jQuery.ajax = (function(_ajax){

var protocol = location.protocol,
    hostname = location.hostname,
    exRegex = RegExp(protocol + '//' + hostname),
    YQL = 'http' + (/^https/.test(protocol)?'s':'') + '://query.yahooapis.com/v1/public/yql?callback=?',
    query = 'select * from html where url="{URL}" and xpath="*"';

function isExternal(url) {
    return !exRegex.test(url) && /:\/\//.test(url);
}

return function(o) {

    var url = o.url;

    if ( /get/i.test(o.type) && !/json/i.test(o.dataType) && isExternal(url) ) {
        // Manipulate options so that JSONP-x request is made to YQL

        o.url = YQL;
        o.dataType = 'json';

        o.data = {
            q: query.replace(
                '{URL}',
                url + (o.data ?
                    (/\?/.test(url) ? '&' : '?') + jQuery.param(o.data)
                : '')
            ),
            format: 'xml'
        };

        // Since it's a JSONP request
        // complete === success
        if (!o.success && o.complete) {
            o.success = o.complete;
            delete o.complete;
        }

        o.success = (function(_success){
            return function(data) {

                if (_success) {
                    // Fake XHR callback.
                    _success.call(this, {
                        responseText: data.results[0]
                            // YQL screws with <script>s
                            // Get rid of them
                            .replace(/<script[^>]+?\/>|<script(.|\s)*?\/script>/gi, '')
                    }, 'success');
                }

            };
        })(o.success);

    }

    return _ajax.apply(this, arguments);

};

})(jQuery.ajax);

$.ajax({
    url: your_url,
    type: 'GET',
    success: function(res) {
        var text = res.responseText;
        //document.getElementById("contenuto").innerHTML = text;

        alert(text);
    }
});


});
</script>

I printed all of the source code from the URL with an alert:

alert(text);

First, how can I tell whether the printed code is the complete source of the page? If I try it this way:

document.getElementById("contenuto").innerHTML = text;

this is the result:

\ \ <'+'/ins>\ \ \ '); } ]]>

I tried to use the HTML DOM to print just one element, like this:

 document.getElementById("contenuto").innerHTML = text;
 var elem = text.getElementById("strip_adv").innerHTML;
 document.getElementById("contenuto_1").innerHTML = elem;

}

But this is the error in the JS console:

text.getElementById is not a function

Recap: I would like to get the source code of a web page from its URL using JSONP. I would then use the HTML DOM on the returned text to keep only the element/class I need. I'm a newbie at JS, and I'm trying to learn more and more about it.

getElementById() is available only on the document object. What you are doing is calling getElementById() on a string, which has no such method.
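A minimal sketch of why the call fails, assuming the response text is an ordinary string. The sample HTML below is a hypothetical stand-in for the real response:

```javascript
// Hypothetical stand-in for the responseText returned by the request.
var text = '<html><body><h1 id="strip_adv">Heading</h1></body></html>';

console.log(typeof text);                 // "string"
console.log(typeof text.getElementById);  // "undefined": strings have no DOM methods
console.log(typeof text.indexOf);         // "function": only string methods are available
```

Calling something whose type is "undefined" is what produces the "is not a function" error in the console.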

Instead, I suggest inserting the returned HTML string into an iframe, so that you can access the elements within the iframe. Otherwise, you can use some kind of HTML parser in your application.
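As a rough sketch of the parser option, the browser's built-in DOMParser can turn the string into a document that does have getElementById(). The helper names extractInnerHtml and browserParseHtml are made up for this example, and the element ids are taken from the question:

```javascript
// Returns the innerHTML of the element with the given id, or null if absent.
// `parseHtml` is a hypothetical parameter: any function that turns an HTML
// string into an object exposing getElementById().
function extractInnerHtml(htmlText, id, parseHtml) {
    var doc = parseHtml(htmlText);
    var el = doc.getElementById(id);
    return el ? el.innerHTML : null;
}

// In a browser, the built-in DOMParser can play the role of parseHtml.
function browserParseHtml(htmlText) {
    return new DOMParser().parseFromString(htmlText, 'text/html');
}

// Usage inside the success callback (browser only):
// var elem = extractInnerHtml(res.responseText, 'strip_adv', browserParseHtml);
// document.getElementById('contenuto_1').innerHTML = elem;
```

Passing the parser in as an argument is only a convenience for this sketch. It keeps the lookup logic separate from the browser-only DOMParser call.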

Let's say your HTML looks like this after you insert the HTML string into the iframe:

<body>
    <iframe id="one">
      <html>
        <body> <h1 id="strip_adv">Heading</h1> </body>
      </html>
    </iframe>
</body>

function iframeObj( frameEle ) {
    return frameEle.contentWindow
        ? frameEle.contentWindow.document
        : frameEle.contentDocument
}

// Look up the iframe by its id, then search inside the iframe's document.
var element = iframeObj( document.getElementById('one') ).getElementById('strip_adv');
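The answer does not show how the string gets into the iframe. One possible way, sketched here under the assumption that an empty iframe with id "one" already exists on the page, is to write the string into the iframe's document. The helper writeHtmlToFrame is hypothetical and not part of the original answer:

```javascript
// Same helper as in the answer: returns the document inside an iframe element.
function iframeObj(frameEle) {
    return frameEle.contentWindow
        ? frameEle.contentWindow.document
        : frameEle.contentDocument;
}

// Hypothetical helper: writes an HTML string into the iframe's document
// and returns that document so it can be queried.
function writeHtmlToFrame(frameEle, htmlText) {
    var doc = iframeObj(frameEle);
    doc.open();
    doc.write(htmlText);
    doc.close();
    return doc;
}

// Usage in a browser, assuming <iframe id="one"></iframe> exists on the page:
// var frameDoc = writeHtmlToFrame(document.getElementById('one'), text);
// var elem = frameDoc.getElementById('strip_adv').innerHTML;
// document.getElementById('contenuto_1').innerHTML = elem;
```

Note that any scripts left in the written HTML would run inside the iframe, so this approach is best used with markup you trust.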
