
How to access the DOM of a user selected web address

I need to do what a bookmarklet does, but directly from my page.

Given a URL, I need to pull the document.title property of that web page.

So say a user types in www.google.com. I want to be able to somehow pull up google.com, maybe in an iframe, and then access its document.title property.

I know that bookmarklets (the JavaScript that runs from the bookmark bar) can access the document.title property of any site the user is on and then send that information to the server via Ajax.

This is essentially what I want to do, but directly from my web page, without the use of a bookmarklet.

According to this question, you can achieve this using PHP. Try this code:

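<?php

// Fetch a page and return the text of its <title> element, or null on failure.
function getTitle($url) {
    $str = @file_get_contents($url);
    if ($str === false || strlen($str) === 0) {
        return null;
    }
    // Case-insensitive, non-greedy; allows attributes and a title that spans lines.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $str, $title)) {
        return trim($title[1]);
    }
    return null;
}

// Example:
echo getTitle("http://www.washingtontimes.com/");

?>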

However, I assume it is also possible to read the page content with JS and apply the same logic of searching for the tags.

Try searching here.
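As a rough sketch of that idea, the same tag search could be written in JavaScript as below. It only covers the string-matching step and assumes the page's HTML is already available as a string; the function name extractTitle is made up for illustration. A browser will normally not let a page download another site's HTML directly, so the text would still have to come from somewhere the page is allowed to read.

```javascript
// Hypothetical helper: return the text inside the first <title> element
// of an HTML string, or null if no title is found.
function extractTitle(html) {
  // Case-insensitive; [\s\S] lets the title span several lines.
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return match ? match[1].trim() : null;
}
```

A regular expression is fine for a quick title lookup, but it is not a real HTML parser and can be fooled by unusual markup.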

Unfortunately, it's not that easy. For security reasons, JavaScript is not allowed to access the document object of a frame or window that is not on the same domain (the browser's same-origin policy). This sort of thing has to be done with a request to a backend PHP script that can fetch the requested page, go through the DOM, and retrieve the text in the <title> tag. If you don't have that capability, what you're asking will be much harder.
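To make the restriction concrete, here is a small sketch of what "same domain" means in practice and what happens when a script tries to read a frame it is not allowed to. Both function names are hypothetical, and the second one assumes it is handed an iframe element from the page.

```javascript
// Two URLs count as the same origin only if scheme, host, and port all match.
function isSameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

// Hypothetical helper: try to read the title of an iframe's document.
// For a cross-origin frame the browser throws a security error on the
// property access, so the helper returns null instead.
function readFrameTitle(frame) {
  try {
    return frame.contentWindow.document.title;
  } catch (err) {
    return null;
  }
}
```

So a page on your own site can load www.google.com in an iframe, but readFrameTitle would come back with null for it, which is why the work has to move to the server.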

Here is the basic PHP script, which will fetch the page and use PHP's DOM extension to parse the page's title:

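<?php
// Accept only http(s) URLs so the script cannot be pointed at local files.
$url = isset($_GET["url"]) ? $_GET["url"] : "";
$html = preg_match('#^https?://#i', $url) ? @file_get_contents($url) : false;
if ($html === false || $html === "") {
    http_response_code(400);
    exit("Could not fetch the requested URL");
}

$dom = new DOMDocument;
// loadHTML() tolerates real-world markup; loadXML() fails unless the page is well-formed XML.
@$dom->loadHTML($html);
$titles = $dom->getElementsByTagName('title');

foreach ($titles as $title) {
    echo $title->nodeValue;
}
?>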

Demo: http://www.dstrout.net/pub/title.htm

You could write a server-side script that retrieves the page for you (e.g. using curl), parses the DOM, and returns the desired properties as JSON. Then call it with Ajax.
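The client half of that could look roughly like the sketch below. The endpoint path, the url query parameter, and the { "title": ... } response shape are all assumptions chosen for illustration, not something the answers above specify; adjust them to whatever your server-side script actually returns.

```javascript
// Build the request URL for a hypothetical server-side title script.
// The target address is percent-encoded so it survives as one parameter.
function buildTitleRequestUrl(endpoint, targetUrl) {
  return endpoint + "?url=" + encodeURIComponent(targetUrl);
}

// Ask the server-side script for the title and return it.
// Assumes the script answers with JSON such as {"title": "Google"}.
async function fetchTitle(endpoint, targetUrl) {
  const response = await fetch(buildTitleRequestUrl(endpoint, targetUrl));
  if (!response.ok) {
    throw new Error("Title lookup failed: " + response.status);
  }
  const data = await response.json();
  return data.title;
}
```

For example, fetchTitle("/title.php", "http://www.google.com/") would request /title.php?url=http%3A%2F%2Fwww.google.com%2F from your own server, which is a same-origin call and therefore allowed.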
