Error in Python Mechanize - "mechanize._mechanize.BrowserStateError: not viewing HTML"

Question

for link in br.links(url_regex="inquiry-results.jsp"):
    cb[link.url] = link

for page_link in cb.values():
    for link in br.links(url_regex="inquiryDetail.jis"):
        ....................
        url = link.absolute_url
        br.follow_link(link)
        ......................
    br.follow_link(page_link)

This is my code. Basically, it extracts the page links [links to pages 1, 2, 3, 4, 5...] and the data links from a particular page. Then it follows each data link, extracts some data, and when it is done it moves on to the next page. But I always get this error:

Traceback (most recent call last):
  File "C:\python27\test.py", line 95, in <module>
    for link in br.links(url_regex="inquiryDetail.jis"):
  File "build\bdist.win32\egg\mechanize\_mechanize.py", line 405, in links
mechanize._mechanize.BrowserStateError: not viewing HTML

Can anyone help?

Answer 1

Thanks to the link posted by loevborg, I've been using this:

br.open('http://example.com')
br._factory.is_html = True

Now br.viewing_html() will evaluate to True.
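
In a loop like the one in the question, the browser navigates many times, and the override appears to apply only to the response that is currently loaded, so it would likely need to be reapplied after each open or follow_link call. A minimal sketch of a wrapper, assuming br is a mechanize.Browser; open_forcing_html is a hypothetical helper name, and _factory is a private attribute whose behavior may differ between mechanize versions:

```python
def open_forcing_html(br, url):
    """Open url and make the browser treat the response as HTML.

    Assumption: br is a mechanize.Browser. br._factory.is_html is a
    private attribute (taken from the answer above), not a public API.
    """
    response = br.open(url)
    if not br.viewing_html():
        # Same override as above, applied only when the check fails.
        br._factory.is_html = True
    return response
```

Usage would be open_forcing_html(br, 'http://example.com') in place of br.open('http://example.com'); after that, br.links() should no longer raise BrowserStateError for that response.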

Answer 2

This seems to be related to a check on whether the response is HTML:

http://github.com/jjlee/mechanize/blob/master/mechanize/_mechanize.py#L440

Perhaps the response you get is XHTML, or has invalid headers? There may be some way to override the is_html attribute (like here).
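
To illustrate why an XHTML response or a missing header could trip such a check, here is a rough standalone approximation of a Content-Type based test. It is a sketch, not mechanize's actual code: looks_like_html, HTML_TYPES and XHTML_TYPES are hypothetical names, and the lists of media types and file extensions are assumptions.

```python
import os

try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit      # Python 2

# Assumed media types; the real library may use a different list.
HTML_TYPES = {"text/html"}
XHTML_TYPES = {"text/xhtml", "text/xml",
               "application/xml", "application/xhtml+xml"}


def looks_like_html(content_type, url, allow_xhtml=False):
    """Guess whether a response is HTML from its Content-Type header.

    Falls back to the URL's file extension when no header was sent.
    """
    if not content_type:
        # No header at all: guess from the extension of the URL path.
        ext = os.path.splitext(urlsplit(url).path)[1].lower()
        html_exts = {".htm", ".html"}
        if allow_xhtml:
            html_exts.add(".xhtml")
        return ext in html_exts
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    allowed = HTML_TYPES | (XHTML_TYPES if allow_xhtml else set())
    return media_type in allowed
```

Under these assumptions, a page served as application/xhtml+xml fails the check unless XHTML is allowed, and a .jsp URL with no Content-Type header fails it as well, which would be consistent with the error in the question.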

Answer 3

Presenting your application as a browser before calling br.open may help:

br.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/45.0.2454101')]
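
To see whether the User-agent header actually changes what the server returns, it may help to look at the Content-Type of the response right after opening it. A minimal sketch, assuming br is a mechanize.Browser and that response.info() returns a header object with a get method; open_and_report is a hypothetical helper name:

```python
def open_and_report(br, url, user_agent):
    """Open url with a browser-like User-agent and report what came back.

    Returns (response, content_type, is_html) so the caller can check
    whether the server sent HTML. Assumes br is a mechanize.Browser.
    """
    # Replaces the browser's default headers, as in the answer above.
    br.addheaders = [("User-agent", user_agent)]
    response = br.open(url)
    content_type = response.info().get("Content-Type")
    return response, content_type, br.viewing_html()
```

If the reported content type is something other than text/html even with a browser-like User-agent, the header is probably not the cause of the error.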

3 answers

Answer 1: score 6, posted 2010-11-17 02:41:40
Answer 2: score 2, accepted, posted 2010-08-12 09:25:09
Answer 3: score 0, posted 2015-11-04 05:04:21
