
Error when trying to open a webpage with mechanize

I'm trying to learn mechanize so that I can later build a chat-logging bot, so I tested out some basic code:

import mechanize as mek
import re

br = mek.Browser()
br.open("google.com")

However, whenever I run it, I get this error:

Traceback (most recent call last):
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/mechanize/_mechanize.py", line 262, in _mech_open
    url.get_full_url
AttributeError: 'str' object has no attribute 'get_full_url'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 5, in <module>
    br.open("google.com")
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/mechanize/_mechanize.py", line 253, in open
    return self._mech_open(url_or_request, data, timeout=timeout)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/mechanize/_mechanize.py", line 269, in _mech_open
    raise BrowserStateError("can't fetch relative reference: "
mechanize._mechanize.BrowserStateError: can't fetch relative reference: not viewing any document

I double-checked against the documentation on the mechanize page and my code seems consistent with it. What am I doing wrong?

You have to include a URL scheme; otherwise mechanize thinks you are trying to open a local/relative path (as the error message suggests).

br.open("google.com") should be br.open("http://google.com") .

Then you will see a different error, mechanize._response.httperror_seek_wrapper: HTTP Error 403: b'request disallowed by robots.txt', because google.com's robots.txt disallows crawlers. This can be remedied by calling br.set_handle_robots(False) before open().
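
Putting both fixes together, a minimal sketch of the corrected script might look like the following (the geturl() call is just an assumed way to confirm the fetch worked; any use of the response is fine):

import mechanize as mek

br = mek.Browser()
br.set_handle_robots(False)       # skip the robots.txt check that causes the 403
response = br.open("http://google.com")  # note the explicit http:// scheme
print(response.geturl())          # print the final URL to confirm the page loaded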
