
How to extract javascript links in an HTML document?

I am writing a small web spider for a website that uses a lot of JavaScript for links:

<htmlTag onclick="someFunction();">Click here</htmlTag>

where the function looks like:

function someFunction() {
  var _url;
  ...
  // _url constructed, maybe with reference to a value in the HTML doc
  // and/or a value passed as argument(s) to this function
  ...
  window.location.href = _url;
}

What is the best way of evaluating this function server-side so I can construct the value for _url?

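You could also use env.js with Rhino to actually evaluate the JavaScript in the HTML, and detect the change to the location object after manually triggering the click event.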
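The same idea (run the handler, then look at what it assigned to the location object) can be sketched without env.js or Rhino. The example below swaps in Node's built-in `vm` module and hand-written stubs for `window`, `location` and `document`; it is an illustration under those assumptions, not the env.js API. The page script, the `resolveOnclick` helper and the `ref` field are all hypothetical. Note that `vm` is not a security sandbox, so this should only be run against scripts you trust.

```javascript
const vm = require("node:vm");

// Hypothetical page script; the URL-building logic is invented for illustration.
const pageScript = `
  function someFunction(id) {
    var _url = "/items/" + id + "?ref=" + document.getElementById("ref").value;
    window.location.href = _url;
  }
`;

// Evaluate an onclick handler against stubbed browser globals and return
// whatever it assigned to window.location.href (null if nothing was assigned).
function resolveOnclick(script, onclickCode, fieldValues) {
  const location = { href: null };
  const document = {
    // Minimal stand-in for the DOM: only getElementById(id).value is supported.
    getElementById: (id) => ({ value: fieldValues[id] }),
  };
  const sandbox = { window: { location }, location, document };
  vm.createContext(sandbox);
  // First run defines the page's functions inside the sandbox.
  vm.runInContext(script, sandbox, { timeout: 1000 });
  // Second run plays the role of the click by executing the onclick code.
  vm.runInContext(onclickCode, sandbox, { timeout: 1000 });
  return location.href;
}

const url = resolveOnclick(pageScript, "someFunction(42);", { ref: "home" });
console.log(url);
```

A real page will usually touch more of the browser API than these stubs cover, which is where a fuller browser emulation such as env.js becomes useful.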

I'm not exactly sure what you're trying to accomplish.

If you need to send these values to the server for processing, Ajax would be your best option.

This is likely to be messy to do, but it depends on several factors:

  1. Where is the link stored? Inside the element, in a JavaScript variable, etc.
  2. Is the JavaScript function always your own?

One approach that might do the trick: simply parse your HTML and use a regex to catch the http links in elements where the onclick="someFunction();" attribute is present.
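A minimal sketch of that regex approach, assuming the handlers are written as quoted onclick attributes and that any URL of interest appears as a literal http(s) string. The function names and the sample markup are hypothetical, and a regex will miss URLs that are assembled at runtime.

```javascript
// Collect the contents of every quoted onclick attribute in an HTML string.
function extractOnclickHandlers(html) {
  const pattern = /\bonclick\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  const handlers = [];
  for (const match of html.matchAll(pattern)) {
    // Group 1 holds double-quoted values, group 2 single-quoted ones.
    handlers.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return handlers;
}

// Pull literal http(s) URLs out of a snippet of JavaScript source.
function extractHttpLinks(source) {
  return source.match(/https?:\/\/[^\s"'<>)]+/g) || [];
}

// Hypothetical sample markup, for illustration only.
const sampleHtml =
  '<a onclick="someFunction();">Click here</a>' +
  "<span onclick='go(\"http://example.com/a\")'>Other</span>";

const handlers = extractOnclickHandlers(sampleHtml);
const links = handlers.flatMap(extractHttpLinks);
console.log(handlers, links);
```

Handlers that yield no literal link, such as `someFunction();` here, are the ones that would still need to be evaluated to recover their target.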

If you need server-side processing, you need to either:

  1. Do the processing before the content is delivered to the user, and include its output in the response, or
  2. Use something like AJAX to make a new request back to the server

