浏览器实例是否从机械化缓存？

Question

I am doing some webscraping with the mechanize browser and using the following code. 我正在使用机械化浏览器进行一些网络爬虫，并使用以下代码。 I realized in some cases we keep getting the same page, although the remote page is already changed. 我意识到在某些情况下，尽管远程页面已经更改，但我们仍会获得相同的页面。 So my question is: 所以我的问题是：

Does mechanize broswer instance cache page by default (in some config)? 默认情况下（在某些配置中）机械化浏览器实例缓存页面吗？
If so, how can we change it, or is there a way to avoid caching (apart from creating the browser instance every time in the loop we webscrape) 如果是这样，我们如何更改它，或者有一种避免缓存的方法（除了每次在Webscrape循环中创建浏览器实例之外）
```
 # put int login detail and submit, return a mechanize.Browser instance browser = _login() # main loop while True: rsp = browser.open(URL) html = rsp.read() 
```

thanks 谢谢

Answer 1

According to this thread , 根据这个线程，

Mechanize instances do cache pages you've visited, but you can clear that with agent.history.clear; 机械化实例确实会缓存您访问过的页面，但是您可以使用agent.history.clear;清除它。 or prevent history from being saved by setting agent.history.max_size = 0. Or, you can use a new Mechanize instance altogether. 或通过设置agent.history.max_size = 0阻止保存历史记录。或者，您可以完全使用新的Mechanize实例。

Particularly, 尤其，

Currently Mechanize reuses pages in the history of the session if a request with an If-Modified-Since header results in 304 Not Modified. 当前，如果带有If-Modified-Since标头的请求导致304 Not Modified，则Mechanize重用会话历史记录中的页面。

And by the documentation here , in Python, the following code will prevent the caching-like behavior (seekable responses): 并且通过此处的文档（在Python中），以下代码将防止类似缓存的行为（可寻求的响应）：

import mechanize
ua = mechanize.UserAgent()
ua.set_seekable_responses(False)
ua.set_handle_equiv(False)
ua.set_debug_responses(False)

Hope that provides some insight. 希望能提供一些见识。

浏览器实例是否从机械化缓存？

问题描述

1 个解决方案

解决方案1
3 2013-10-04 14:08:09

浏览器实例是否从机械化缓存？

问题描述

1 个解决方案

解决方案1 3 2013-10-04 14:08:09

解决方案1
3 2013-10-04 14:08:09