python mechanize - submit custom form

I'm using mechanize to work with a page that requires a login. The front page uses some JavaScript, which makes using mechanize directly more difficult. I know which form I have to submit to log in: it is the one the JS always generates, and it is the same every time. How can I make mechanize submit a custom form that isn't on the page? This is basically equivalent to this Perl problem, but in Python.

(NOTE: This came up again recently and I actually got it to work now.)

This seems to work:

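import mechanize

br = mechanize.Browser()
br.open(URL)
res = mechanize._form.ParseString(FORM_HTML, BASE_URL)
# res[0] is an empty GET form; res[1] is the form parsed from FORM_HTML
br.form = res[1]
# Continue as if the form were on the page and selected with .select_form()
br['username'] = 'foo'
br['password'] = 'bar'
br.submit()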

URL is the full URL of the visited site. BASE_URL is the directory the URL is in. FORM_HTML is any HTML that contains a form element, e.g.:

<form method='post' action='/login.aspx'>
    <input type='text' name='username'>
    <input type='text' name='password'>
    <input type='hidden' name='important_js_thing' value='processed_with_python TM'>
</form>

For some reason, mechanize._form.ParseString returns two forms. The first is a GET request to the base URL with no inputs; the second is the properly parsed form from FORM_HTML.
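If BASE_URL is not already at hand, one way to derive "the directory the URL is in" is the standard library's urljoin. This is only a sketch: the example.com address is a made-up placeholder, and the last step just illustrates how a root-relative action such as /login.aspx would resolve under ordinary URL rules, not what mechanize does internally.

```python
from urllib.parse import urljoin

# Hypothetical address, used only for illustration.
URL = "http://example.com/app/index.html"

# Resolving "." against URL drops the last path segment, leaving the directory.
BASE_URL = urljoin(URL, ".")

# Under standard URL resolution, a root-relative action replaces the whole path.
action_url = urljoin(BASE_URL, "/login.aspx")

print(BASE_URL)
print(action_url)
```

BASE_URL computed this way can then be passed as the second argument in the snippet above.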

Parse the page, extract the elements you want, rebuild the page, and inject it back into mechanize.

For a project I worked on, I had to employ a simulated browser and found Mechanize to be very poor at form handling. It would yank uninterpreted elements out of JavaScript blocks and die. I had to write a workaround that used BeautifulSoup to strip out all the bits that would cause it to die before it reached the form parser.

You may or may not run into that problem, but it's something to keep in mind. I ultimately ended up abandoning the Mechanize approach and went with Selenium. Its form handler was far superior and it could handle JS. It's got its issues (the browser adds a layer of complexity), but I found it much easier to work with.
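For anyone who stays with Mechanize, the script-stripping workaround described above might look roughly like this. It is not the original code: that used BeautifulSoup, while this sketch swaps in a standard-library regular expression so it runs without extra packages. The strip_scripts helper and the sample page are hypothetical, and a regex will not cope with every malformed page.

```python
import re

# Matches a whole <script ...>...</script> block, case-insensitively,
# including blocks that span several lines.
SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_scripts(html):
    """Return html with every <script>...</script> block removed."""
    return SCRIPT_RE.sub("", html)


# Made-up page: the script contains form markup that a form parser
# could mistake for a real form.
page = (
    "<html><body>"
    "<script>document.write('<form><input name=\"x\"></form>');</script>"
    "<form method='post' action='/login.aspx'>"
    "<input type='text' name='username'></form>"
    "</body></html>"
)

cleaned = strip_scripts(page)
print(cleaned)
```

The cleaned HTML is what would then be handed to the form parser in place of the raw page.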
