Getting HTML from a web page after executing a JavaScript link

Question

There is a web page that loads a table when you click a JavaScript link (an <a> tag with javascript:... in it). I need to get this table into my ASP.NET website. There is no URL that returns the table without executing any scripts. This is what I am currently using:

public string GetFromUrl(string path)
{
    WebClient web = new WebClient();
    return web.DownloadString(path);
}

public string GetTagHTML(string html)
{
    Regex regex = new Regex("<table>(.*)</table>");
    var v = regex.Match(html);
    return v.Groups[1].ToString();
}

More info

The website I am trying to get data from is http://beitbiram.iscool.co.il/default.aspx (it's in Hebrew; the link I am trying to click is one of the table titles).

The website is an ASP.NET website.

The function that the link calls is __doPostBack. I have no idea what it does and can't find any information about it online, but this is its code:

var theForm = document.forms['Form'];
if (!theForm) {
    theForm = document.Form;
}
function __doPostBack(eventTarget, eventArgument) {
    if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
        theForm.__EVENTTARGET.value = eventTarget;
        theForm.__EVENTARGUMENT.value = eventArgument;
        theForm.submit();
    }
}

Thanks in advance.

Answer 1

In general, the only way to get the HTML of a page after running JavaScript is to run the JavaScript, and that requires a browser.

The direct answer to your question, then, is to use something like Headless Chrome to spin up a browser, load the page, click the link, and export the HTML. This has historically been a massive pain to get working, although Headless Chrome is supposed to be rather less painful.
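As a rough illustration of the load, click, export sequence, here is a minimal sketch that drives Headless Chrome through Selenium WebDriver. Selenium is my own choice of driver here, not something the question or the site requires; the sketch assumes the Selenium.WebDriver and Selenium.Support NuGet packages and a matching chromedriver are installed. The class name and both selector parameters are hypothetical: you would have to supply selectors that fit the real page.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

public static class HeadlessFetcher
{
    // linkSelector: CSS selector for the javascript: link to click (hypothetical).
    // resultSelector: CSS selector that only matches once the table has loaded (hypothetical).
    public static string GetHtmlAfterClick(string url, string linkSelector, string resultSelector)
    {
        var options = new ChromeOptions();
        options.AddArgument("--headless"); // run Chrome without a visible window

        // Disposing the driver closes the browser process.
        using (var driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl(url);
            driver.FindElement(By.CssSelector(linkSelector)).Click();

            // Poll until the expected element appears, or give up after 15 seconds.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(15));
            wait.Until(d => d.FindElements(By.CssSelector(resultSelector)).Count > 0);

            // HTML of the page as the browser currently sees it, after the script ran.
            return driver.PageSource;
        }
    }
}
```

Starting a browser per request is slow, so if you go this way you would probably want to cache the result rather than call this on every page view.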

However, the javascript: link you run must get its data from somewhere in order to put it into the table, so I would strongly advise looking for that source and building the table yourself. I certainly wouldn't want to maintain a website with an embedded browser if I didn't absolutely have to.
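In this particular case the __doPostBack code quoted in the question only sets two hidden form fields (__EVENTTARGET and __EVENTARGUMENT) and submits the form, so the "source" is most likely a plain form POST back to the same page. The sketch below tries to reproduce that without a browser: it downloads the page, copies its hidden inputs (ASP.NET WebForms pages normally carry state such as __VIEWSTATE there), fills in the two fields, and posts the form. It is untested against the real site and rests on several assumptions: the form posts to the same URL, the hidden inputs list type, name, and value in that order, and no other script has to run. The class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public static class PostBackScraper
{
    // One shared client; its default handler keeps cookies between requests,
    // which matters if the site tracks a session.
    private static readonly HttpClient Client = new HttpClient();

    // Collects name/value pairs of <input type="hidden"> elements.
    // Assumes the attributes appear in the order type, name, value.
    public static Dictionary<string, string> ReadHiddenFields(string html)
    {
        var fields = new Dictionary<string, string>();
        var pattern = new Regex(
            "<input[^>]*type=\"hidden\"[^>]*name=\"([^\"]*)\"[^>]*value=\"([^\"]*)\"",
            RegexOptions.IgnoreCase);
        foreach (Match m in pattern.Matches(html))
        {
            fields[m.Groups[1].Value] = WebUtility.HtmlDecode(m.Groups[2].Value);
        }
        return fields;
    }

    // Mimics __doPostBack(eventTarget, eventArgument) with plain HTTP requests.
    public static async Task<string> PostBackAsync(string url, string eventTarget, string eventArgument)
    {
        // Step 1: fetch the page to obtain the current hidden form state.
        string page = await Client.GetStringAsync(url);
        Dictionary<string, string> form = ReadHiddenFields(page);

        // Step 2: set the two fields that __doPostBack fills in before submitting.
        form["__EVENTTARGET"] = eventTarget;
        form["__EVENTARGUMENT"] = eventArgument;

        // Step 3: submit the form as a URL-encoded POST to the same address (assumed).
        using (var body = new FormUrlEncodedContent(form))
        using (HttpResponseMessage response = await Client.PostAsync(url, body))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

The two arguments are the ones that appear in the link's own href, i.e. javascript:__doPostBack('<target>','<argument>'); pass them through unchanged, then feed the returned HTML to your existing GetTagHTML. If the page's __VIEWSTATE is very large, FormUrlEncodedContent on .NET Framework may reject it for length, in which case the body has to be encoded by hand.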

Answer 1
0 2017-12-30 11:30:03