I'm trying to learn mechanize to create a chat logging bot later, so I tested out some basic code
import mechanize as mek
import re
br = mek.Browser()
br.open("google.com")
However, whenever I run it, I get this error.
Traceback (most recent call last):
File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/mechanize/_mechanize.py", line 262, in _mech_open
url.get_full_url
AttributeError: 'str' object has no attribute 'get_full_url'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 5, in <module>
br.open("google.com")
File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/mechanize/_mechanize.py", line 253, in open
return self._mech_open(url_or_request, data, timeout=timeout)
File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/mechanize/_mechanize.py", line 269, in _mech_open
raise BrowserStateError("can't fetch relative reference: "
mechanize._mechanize.BrowserStateError: can't fetch relative reference: not viewing any document
I double checked with the documentation on the mechanize page and it seems consistent. What am I doing wrong?
You have to use a schema, otherwise mechanize thinks you are trying to open a local/relative path (as the error suggests).
br.open("google.com")
should be br.open("http://google.com")
.
Then you will see an error mechanize._response.httperror_seek_wrapper: HTTP Error 403: b'request disallowed by robots.txt'
, because google.com
does not allow crawlers. This can be remedied with br.set_handle_robots(False)
before open
.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.