简体   繁体   English

Python机械化等待并单击

[英]Python mechanize wait and click

Using mechanize, how can I wait for some time after page load (some websites have a timer before links appear, like in download pages), and after the links have been loaded, click on a specific link? 使用mechanize,如何在页面加载后等待一些时间(某些网站在显示链接之前就存在计时器,例如在下载页面中),并且在链接加载后,单击特定的链接? Since it's an anchor tag and not a submit button, will browser.submit() work(I got errors while doing that)? 由于这是一个锚标记,而不是提交按钮,所以browser.submit()会起作用(这样做时出现错误)?

Mechanize does not offer javascript functionality, so you will not see dynamic content (like a timer that turns into a link). Mechanize不提供javascript功能,因此您将看不到动态内容(例如变成链接的计时器)。

As far as clicking on a link, you have to find the element, and then you can call click_link on it. 只要单击链接,就必须找到该元素,然后可以在其上调用click_link。 See the Finding Links section of this site . 请参阅本网站的“ Finding Links部分。

If you are looking for something to handle such sites, a good option is PhantomJS . 如果您正在寻找可以处理此类网站的东西,那么PhantomJS是一个不错的选择。 It uses nodejs, but runs on the webkit engine, allowing you parse dynamic content. 它使用nodejs,但是在webkit引擎上运行,允许您解析动态内容。 If you have your heart set on python, using Selenium to programatically drive a real browser may be your best bet. 如果您对python充满信心,那么使用Selenium以编程方式驱动真正的浏览器可能是最好的选择。

If it's an anchor tag, then just GET/POST whatever it is. 如果它是一个锚标记,则无论它是什么,都只需GET / POST。

The timer between links appearing is generally done in javascript - some sites you are attempting to scrape may not be usable without javascript, or requires a token generated in javascript with clientside math. 链接之间出现的计时器通常是用javascript完成的-您尝试抓取的某些网站如果没有javascript可能无法使用,或者需要在javascript中通过客户端数学生成令牌。

Depending on the site, you can either extract the wait time in msec/sec and time.sleep() for that long, or you'll have to use something that can execute javascript 根据站点的不同,您可以提取等待时间(以毫秒/秒为单位)和time.sleep()这么长的时间,或者必须使用可以执行javascript的内容

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM