
Bookmarklet: how to get the HTML of the page you want to run the bookmarklet on?

JavaScript bookmarklets can use the document object to get selected text or links. But how do I get the whole HTML?

I want the whole HTML so that I can retrieve specific data between HTML tags.

For example, I want to get the following text from this page: http://www.linguee.de/englisch-deutsch/uebersetzung/futile.html

futile;

zwecklos;

There is no manual removal or exception for end customers at LEVEL 3 - Requests are futile;

Es gibt weder manuelle Entfernungen noch Ausnahmen für Endkunden aus LEVEL 3. Anfragen sind absolut zwecklos;

After getting the HTML with document.documentElement.innerHTML, how do I extract the specific text above from it?

document.documentElement.innerHTML
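One possible approach, sketched under assumptions: since a bookmarklet runs inside the page, it can query the live DOM directly instead of searching the innerHTML string. The helper names collectText and collectTextFromHtml are made up for illustration, and the selector in the usage comment is hypothetical, because the question does not say which elements on the Linguee page hold the text.

```javascript
// Sketch: collect the trimmed text of every element matching a CSS selector.
// `root` is anything that offers querySelectorAll; in a bookmarklet that is `document`.
function collectText(root, selector) {
  return Array.from(root.querySelectorAll(selector), function (el) {
    return el.textContent.trim();
  }).filter(function (text) {
    return text.length > 0; // drop elements that contain only whitespace
  });
}

// If only an HTML string is available (for example the value of
// document.documentElement.innerHTML), parse it into a detached document
// first and query that. DOMParser is a browser API.
function collectTextFromHtml(html, selector) {
  var doc = new DOMParser().parseFromString(html, 'text/html');
  return collectText(doc, selector);
}

// Usage inside a bookmarklet (the selector is an assumption, not taken from the page):
// alert(collectText(document, '.example .sentence').join('\n'));
```

Querying the DOM avoids writing fragile string or regular-expression matching against markup, which is why the string version above still goes through a parser.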

You can get the tag name with element.tagName, and any attribute of that tag with element.getAttribute('attributename'). Would that get you the data you are trying to get?

If not, can you give us an example of the data you are trying to get?
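As a rough illustration of the two calls mentioned above, the following sketch lists the tag name, one attribute, and the text of each matching element. The function name describeElements and the choice of the href attribute are assumptions made for the example.

```javascript
// Sketch: summarise each element matching `selector` using tagName and getAttribute.
// `root` is anything that offers querySelectorAll; in a bookmarklet that is `document`.
function describeElements(root, selector) {
  return Array.from(root.querySelectorAll(selector), function (el) {
    return {
      tag: el.tagName.toLowerCase(),   // tagName is upper case in HTML documents
      href: el.getAttribute('href'),   // null when the attribute is absent
      text: el.textContent.trim()
    };
  });
}

// Usage inside a bookmarklet:
// console.log(describeElements(document, 'a'));
```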

You can check my bookmarklet as an example. It has the following features:
1. It uses a POST request, so the data sent to the server is not constrained by URL length limits.
2. It uses the user's selection from the page.
3. It extracts the page's text content using the boilerpipe library.

<a class="button" href="javascript:function post_to_url(path,params,method){method=method||'post';var form=document.createElement('form');form.setAttribute('method', method);form.setAttribute('action',path);form.setAttribute('accept-charset','UTF-8');for(var key in params){var hiddenField=document.createElement('input');hiddenField.setAttribute('type','hidden');hiddenField.setAttribute('name',key);hiddenField.setAttribute('value',params[key]);form.appendChild(hiddenField);}document.body.appendChild(form); form.submit();};var t=((window.getSelection&&window.getSelection())||(document.getSelection&&document.getSelection())||(document.selection&&document.selection.createRange&&document.selection.createRange().text)||location.href);if(t=='')t=location.href;post_to_url('http://g-calendar.appspot.com/analyze/analyze', {withKeywords:'true', message:t, sumSize:5, return_type:'list', title:document.title, url:location.href}, 'post')">
    Summarise
</a>
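The selection handling in the link above is hard to read in its one-line form. A simplified sketch of the same idea, assuming a modern browser where window.getSelection exists (the legacy document.selection branch is left out), could look like this. The name selectedTextOrUrl is made up for illustration.

```javascript
// Sketch: return the user's selected text, or the page URL when nothing is selected.
// `win` is a window-like object; in a bookmarklet, pass `window`.
function selectedTextOrUrl(win) {
  // getSelection() returns a Selection object; String() yields the selected text.
  var selected = win.getSelection ? String(win.getSelection()) : '';
  return selected !== '' ? selected : win.location.href;
}

// Usage inside a bookmarklet:
// var message = selectedTextOrUrl(window);
```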
