
How to get an array of all links on a website in JavaScript

I need a list/collection/array/whatever of all links on a website. Currently I'm using window.content.document.links, but that doesn't work on all websites: some of them produce an empty array (example: dctp.ws). I'm guessing that's because those sites contain frames. Is there any way to access the links inside the frames?

Also, this is a FireGestures script, so it'll run "inside the browser". I don't want to download the website again, since the browser has already downloaded and parsed it.

You can get a NodeList of all a elements from a document using getElementsByTagName, like this:

var list = document.getElementsByTagName("a");

So you'd do that for the main document, and for all frames in the document. To access the frames, you can use the window.frames pseudo-array. Each entry is the window object of that frame, so:

var listInFrame = window.frames[n].document.getElementsByTagName("a");

So create a blank array, add in the elements from the document itself, then loop through the windows adding the links from their documents.
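The steps above could be sketched roughly as follows. This is a minimal sketch, not a tested FireGestures script: collectLinks is a hypothetical helper name, and it assumes that reading the document of a frame from another origin throws, in which case that frame is simply skipped.

```javascript
// Hypothetical helper: gathers every <a> element from a window and,
// recursively, from the windows listed in its frames pseudo-array.
function collectLinks(win, out) {
  out = out || [];
  try {
    var anchors = win.document.getElementsByTagName("a");
    for (var i = 0; i < anchors.length; i++) {
      out.push(anchors[i]);
    }
  } catch (e) {
    // Assumption: a cross-origin frame throws when its document is read.
    // Skip that frame (and anything nested inside it).
    return out;
  }
  // frames is array-like: it has a length and numeric indices.
  for (var j = 0; j < win.frames.length; j++) {
    collectLinks(win.frames[j], out);
  }
  return out;
}
```

In a script like the one in the question, this would presumably be called as collectLinks(window.content), which returns a plain array of elements from the top-level document and every frame that could be read.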

I'm not familiar with FireGestures, so I don't know if the Same Origin Policy applies to the scripts it runs.


Update: From your comment below, it sounds like FireGestures scripts are subject to the SOP. So you won't be able to directly access the content of documents from different origins in a FireGestures script.

You might be able to do something combining FireGestures and GreaseMonkey. GreaseMonkey has an API call, GM_xmlhttpRequest, that bypasses the SOP. Note, though, that it would issue another GET; you wouldn't be reading the copy of the page that's already in memory, which you said you wanted to do. Unfortunately, it's entirely possible that you can't do what you want with FireGestures. You may have to write your own add-on entirely (and have it request the relevant permissions).

You can use document.getElementsByTagName('a').

This does exactly what it sounds like--you get a NodeList of all the a elements on the page.
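Since the question asks for an array, note that the collection returned by getElementsByTagName is array-like and live rather than a true array. A small sketch of copying the link targets into a plain array is shown here; toHrefArray is a hypothetical helper name used only for illustration.

```javascript
// Hypothetical helper: copies the href of each element in an array-like
// collection into a plain array of strings.
function toHrefArray(collection) {
  var hrefs = [];
  for (var i = 0; i < collection.length; i++) {
    // Anchors without an href attribute report an empty string; skip them.
    if (collection[i].href) {
      hrefs.push(collection[i].href);
    }
  }
  return hrefs;
}
```

It could be used as toHrefArray(document.getElementsByTagName('a')).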
