How do I get the source HTML for a page after executing its associated JavaScript?

There have been quite a few posts on this issue, but none of them seems to really answer my question.

I use TIdHTTP to load the source code of this website: http://www.nationalgeographic.com/

I tried to extract some data but realized that the data is generated by a script. There is an inline script in the source code, plus a few links to external js files.

How can I run some or all of the scripts on the page and get the source code they generate?

This code runs in a secondary thread, and I would like to avoid using a WebBrowser component.

I could extract the scripts or links from the source code returned by TIdHTTP and then fetch each js file with idhttp.get(*.js), but I presume that would be too simple to work.

In the end, the answer was very basic:

document := webBrowser.Document as IHTMLDocument2;
result := document.body.innerHTML;

This retrieves the HTML of the page body, including the content that scripts generate dynamically at runtime.
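For anyone wanting a slightly fuller version, here is a rough sketch of how those two lines might be wrapped up. It is a fragment for a VCL unit, not a tested implementation. GetRenderedHtml is a hypothetical helper name, and the sketch assumes a TWebBrowser that already sits on a form and is used from the thread that created it (normally the main VCL thread), so it does not solve the secondary-thread requirement from the question. It reads documentElement.outerHTML through IHTMLDocument3 instead of body.innerHTML so that the head is included as well.

```pascal
uses
  SysUtils, Forms, SHDocVw, MSHTML;

// Hypothetical helper, not part of the original answer.
// Call it from the thread that owns Browser (usually the main VCL thread).
function GetRenderedHtml(Browser: TWebBrowser; const Url: string): string;
var
  Doc: IHTMLDocument3;
begin
  Result := '';
  Browser.Navigate(Url);

  // Pump messages until the control reports the document as complete.
  // Assumption: the page eventually reaches READYSTATE_COMPLETE; a real
  // implementation would probably add a timeout here.
  while Browser.ReadyState <> READYSTATE_COMPLETE do
  begin
    Application.ProcessMessages;
    Sleep(10);
  end;

  // documentElement.outerHTML covers <html>, <head> and <body>,
  // whereas body.innerHTML returns only the contents of <body>.
  if Supports(Browser.Document, IHTMLDocument3, Doc) and
     (Doc.documentElement <> nil) then
    Result := Doc.documentElement.outerHTML;
end;
```

Note that READYSTATE_COMPLETE only means the initial load has finished. Scripts that keep changing the page afterwards (timers, AJAX requests) may not be done yet, so the returned HTML can still be missing late content.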
