
Get HTML source code

I am trying to put the HTML source code of any webpage into a string using JavaScript. Please tell me if there is something else I can do to solve my problem. I am using the following code, which I found in another post:

function httpGet(theUrl)
{
    // Synchronous GET request; returns the response body as a string.
    var xmlHttp = new XMLHttpRequest();
    xmlHttp.open( "GET", theUrl, false ); // false = synchronous
    xmlHttp.send( null );
    return xmlHttp.responseText;
}

I tried this in IE, Firefox, and Chrome, but I always get the following source code, which is the source of a "PAGE NOT FOUND" page. If you need any other information, please let me know in a comment. What I am trying to do is get the HTML of any webpage, such as google.com and other pages. If I can't do that, what can I do instead?

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">  
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head profile="http://gmpg.org/xfn/11">
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>404 - PAGE NOT FOUND</title>
            <style type="text/css">
            body{padding:0;margin:0;font-family:helvetica;}
            #container{margin:20px auto;width:868px;}
            #container #top404{background-image:url('http://74.53.143.237/images/404top.gif');background-repeat:no-repeat;width:868px;height:168px;}
            #container #mid404{background-image:url('http://74.53.143.237/images/404mid.gif');background-repeat:repeat-y;width:868px;}
            #container #mid404 #gatorbottom{position:relative;left:39px;float:left;}
            #container #mid404 #xxx{float:left;padding:40px 237px 10px;}
            #container #mid404 #content{float:left;text-align:center;width:868px;}
            #container #mid404 #content #errorcode{font-size:30px;font-weight:800;}
            #container #mid404 #content p{font-weight:800;}
            #container #mid404 #content #banner{margin:20px 0 0 ;}
            #container #mid404 #content #hostedby{font-weight:800;font-size:25px;font-style:italic;margin:20px 0 0;}
            #container #mid404 #content #coupon{color:#AB0000;font-size:22px;font-style:italic;}
            #container #mid404 #content #getstarted a{color:#AB0000;font-size:31px;font-style:italic;font-weight:800;}
            #container #mid404 #content #getstarted {margin:0 0 35px;}
            #container #bottom404{background-image:url('http://74.53.143.237/images/404bottom.gif');background-repeat:no-repeat;width:868px;height:14px;}
            </style>
</head>
<body>
<div id="container">
    <div id="top404"></div>
    <div id="mid404">

            <div id="gatorbottom"><img src="http://74.53.143.237/images/gatorbottom.png" alt="" /></div>
            <div id="xxx"><img src="http://74.53.143.237/images/x.png" alt="" /></div>
    <div id="content">
            <div id="errorcode">ERROR 404 - PAGE NOT FOUND</div>
            <p>Oops! Looks like the page you're looking for was moved or never existed.<br />Make sure you typed the correct URL or followed a valid link.</p>

            <div id="banner">

                    <object width="728" height="90"><param name="movie" value="http://74.53.143.237/images/hg728x90.swf">

                            <embed src="http://74.53.143.237/images/hg728x90.swf?clickTAG=http://secure.hostgator.com/cgi-bin/affiliates/clickthru.cgi?id=page404" width="728" height="90"></embed>
                    </object>
            </div>

            <div id="hostedby">This site is hosted by HostGator!</div>
            <div id="coupon">Build your website today for 1 cent!   Coupon code: "404PAGE"</div>

            <div id="getstarted"><a href="http://www.hostgator.com/?utm_source=internal&utm_medium=link&utm_campaign=page404" title="HostGator Web Hosting" >CLICK HERE TO GET STARTED</a></div>

    </div>

    <div style="clear:left;"></div>
    </div>
    <div id="bottom404"></div>
</div>

</body>

</html>

I am trying to put the html source code for any webpage in a string using Javascript

If by "any" you mean pages from origins other than the one your document is served from, you can't do that from JavaScript running in a browser, because you're using an ajax call, and those are restricted by the Same Origin Policy, which says that (for instance) script running in a document on http://stackoverflow.com can't use ajax to load content from http://example.com. (An "origin" is more than just the domain name: it also covers the scheme and the port.)
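To make the notion of an origin concrete, here is a minimal sketch that compares two URLs by origin using the standard URL API. The helper name sameOrigin is hypothetical and not part of any library; it only illustrates the comparison the policy is based on, not how a browser enforces it.

```javascript
// Hypothetical helper (an illustration, not a browser API):
// two URLs share an origin only if scheme, host, and port all match.
function sameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

console.log(sameOrigin("http://stackoverflow.com/questions/1", "http://stackoverflow.com/"));
console.log(sameOrigin("http://stackoverflow.com/", "http://example.com/"));
console.log(sameOrigin("http://example.com/", "https://example.com/"));
```

The first call compares two paths on the same site, the second compares different hosts, and the third compares the same host under different schemes; only the first pair shares an origin.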

Some of the pages you might request (but probably very few) might support Cross-Origin Resource Sharing, in which case, if they allow your origin (probably by allowing all origins), you could use ajax to load their content.
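For a page that does allow your origin through Cross-Origin Resource Sharing, the request could look roughly like the sketch below. It uses the standard fetch API rather than the XMLHttpRequest code from the question; the function name getHtml and the fetchImpl parameter are hypothetical, and the parameter exists only so the function can be exercised without a network. In a browser, fetch rejects the promise when the server does not permit your origin.

```javascript
// A sketch, assuming the target server sends CORS headers that allow your origin.
// `getHtml` and `fetchImpl` are illustrative names, not part of any library.
async function getHtml(url, fetchImpl = fetch) {
  // mode: "cors" is already the default for cross-origin requests; stated for clarity.
  const response = await fetchImpl(url, { mode: "cors" });
  if (!response.ok) {
    // A 404 such as the "PAGE NOT FOUND" page in the question ends up here.
    throw new Error("HTTP " + response.status + " for " + url);
  }
  return response.text();
}
```

A caller would use it as getHtml("http://example.com/").then(html => ...).catch(err => ...), where the catch branch covers both a refused cross-origin request and a non-2xx status.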

If you're running JavaScript outside the browser (NodeJS, SilkJS, RingoJS, Rhino, Windows Scripting Host, etc.), then the SOP wouldn't apply, but I suspect you'd need to use something other than the XMLHttpRequest object to do it.
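As one possible illustration for the NodeJS case, the sketch below fetches a page with Node's built-in http and https modules instead of XMLHttpRequest. It reuses the name httpGet from the question, but the promise-based return value and the { status, body } shape are assumptions made for this example. It does not follow redirects.

```javascript
// A minimal sketch for NodeJS (CommonJS), assuming no redirects need to be followed.
const http = require("node:http");
const https = require("node:https");

function httpGet(theUrl) {
  return new Promise((resolve, reject) => {
    // Pick the module that matches the URL scheme.
    const client = theUrl.startsWith("https:") ? https : http;
    client
      .get(theUrl, (res) => {
        let body = "";
        res.setEncoding("utf8");
        res.on("data", (chunk) => { body += chunk; });
        // Resolve with the status code too, so a 404 page is not mistaken for the real one.
        res.on("end", () => resolve({ status: res.statusCode, body }));
      })
      .on("error", reject);
  });
}
```

Checking the status field before using body helps distinguish the page you asked for from an error page such as the 404 shown in the question.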

But fundamentally, in a web page (not an extension/add-on) in a browser, you can't do that.

...but i always get the ... source code for "PAGE NOT FOUND" page

But that sounds like the URL is just wrong.
