
Read page XML using JavaScript

Hey guys, this is driving me absolutely insane, so I wanted to ask the experts on this site whether you know how to do it =)

I'm trying to write some JavaScript code that can read out elements of a web page (e.g. what does the first paragraph say?). Here's what I have so far, but it doesn't work and I can't figure out why:

<script type="text/javascript">
<!--
var req;
// handle onreadystatechange event of req object
function processReqChange() {
    // only if req shows "loaded"
    if (req.readyState == 4) {
        // only if "OK"
        if (req.status == 200) {
            //document.write(req.responseText);
            alert("done loading");

            var responseDoc = new DOMParser().parseFromString(req.responseText, "text/xml");
            alert(responseDoc.evaluate("//title",responseDoc,null,
                        XPathResult.FIRST_ORDERED_NODE_TYPE,null).singleNodeValue);
         } 
         else {
            document.write("<error>could not load page</error>");
         }
    }
}

req = new XMLHttpRequest();
req.onreadystatechange = processReqChange;
req.open("GET", "http://www.apple.com", true);
req.send(null);
// -->
</script>

The alert that keeps appearing is "null" and I can't figure out why. Any ideas?

This may be due to the cross-domain restriction... unless you're hosting your web page on apple.com. :) You could also use jQuery and avoid writing all that out and/or dealing with the common cross-browser XML loading/parsing issues. http://api.jquery.com/category/ajax/
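To make the cross-domain point concrete, here is a minimal sketch of the comparison the same-origin policy is based on: two URLs share an origin only when their scheme, host, and port all match. The helper name `isSameOrigin` and the `example.com` page URL are made up for illustration; only the standard `URL` class is used.

```javascript
// Hypothetical helper (not part of any library): returns true when
// targetUrl has the same scheme, host, and port as pageUrl.
// Relative target URLs are resolved against pageUrl first.
function isSameOrigin(pageUrl, targetUrl) {
  const page = new URL(pageUrl);
  const target = new URL(targetUrl, pageUrl);
  return page.origin === target.origin;
}

// A page hosted elsewhere requesting apple.com is cross-origin.
console.log(isSameOrigin("http://example.com/index.html", "http://www.apple.com"));

// A relative URL always resolves to the page's own origin.
console.log(isSameOrigin("http://www.apple.com/page.html", "/other.html"));
```

When a check like this comes out false, the browser will normally refuse to hand the response to the script unless the remote server explicitly allows it.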

Update: It looks like this may have something to do with the source web site's Content-Type or something similar... For example, this code seems to work (notice the domain being loaded):


var req;
// handle onreadystatechange event of req object
function processReqChange() {
    // only if req shows "loaded"
    if (req.readyState == 4) {
        // only if "OK"

        if (req.status == 200) {
            //document.write(req.responseText);
            //alert("done loading");
            //alert(req.responseText);

            // parse the response text into an XML document
            var parser = new DOMParser();
            var xmlDoc = parser.parseFromString(req.responseText, "text/xml");
            try{
              alert(xmlDoc.evaluate("//title",xmlDoc,null,XPathResult.FIRST_ORDERED_NODE_TYPE,null).singleNodeValue);
            }catch(e){
              alert("error");
            }
         } 
         else {
            document.write("could not load page");
         }
    }
}

req = new XMLHttpRequest();
req.onreadystatechange = processReqChange;
req.open("GET", "http://www.jquery.com", true);
req.send(null);

I also tried loading espn.com and google.com, and noticed they both send "Content-Encoding: gzip", so maybe that's the issue. Just guessing, though.

