I'm writing a program in Java that needs to extract links from web pages. I'm using HTML Parser (http://htmlparser.sourceforge.net/), but I can only extract HTML links (those defined with <a href="...">). I don't know how to handle JavaScript code so that I can extract links from it as well. Can you help me?
You can use Rhino with a DOM environment written in JavaScript.
By the way, that DOM environment was written by John Resig.
Rhino (Mozilla's JavaScript engine, written in Java) is probably the most comprehensive tool out there. Everything you want to do can be done with Rhino.
HTML Parser from SourceForge is useful; I have used it to parse a whole bunch of HTML already. However, parsing JS is different. Cheers.
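To illustrate why parsing JS is different, here is a rough sketch of a stopgap that does not parse or run the script at all: it only scans the script text for quoted string literals that look like absolute URLs. The class `JsLinkSketch` and the method `extractUrlLiterals` are made-up names for this example, not part of HTML Parser or Rhino, and the regular expression is an assumption about what a "link" looks like.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of HTML Parser or Rhino.
final class JsLinkSketch {

    // Matches a single- or double-quoted string literal whose content starts
    // with http:// or https://. Group 1 is the quote character, group 2 the URL.
    private static final Pattern URL_LITERAL =
            Pattern.compile("(['\"])(https?://[^'\"\\s]+)\\1");

    // Returns every absolute URL that appears as a complete string literal
    // in the given script text, in order of appearance.
    static List<String> extractUrlLiterals(String script) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL_LITERAL.matcher(script);
        while (m.find()) {
            urls.add(m.group(2));
        }
        return urls;
    }

    public static void main(String[] args) {
        String script = "window.location = 'http://example.com/a';\n"
                + "var next = \"https://example.com/b?x=1\";\n"
                + "var built = 'http://' + host + '/c';"; // assembled at run time
        System.out.println(extractUrlLiterals(script));
    }
}
```

On the sample script this finds the first two URLs and misses the third, because that one is assembled by concatenation at run time. Links built that way can only be recovered by actually executing the script, which is where an engine such as Rhino comes in.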