
Python HTML fill and submit form

I am trying to write a Python script that goes to a form on an internal website (the form is named "DefaultForm"), fills in the input name="username" field with 'user001' and the input name="password" field with 'pass001', and then clicks submit.

I tried doing this with Selenium and it works, but I want to accomplish the same task with Requests (and BeautifulSoup for some HTML scraping later on).

The code I wrote, which does NOT work:

url = 'http://SERVER:PORT/dashboard/portal'
payload = {'username':'user001','password':'pass001'}
r = requests.get(url, params=payload)
print(r.text)

I checked the content of r.text before and after the requests.get(..) call, and both are the same.

Can anyone help me with how to do this?

Edit/Update: I also tried submitting the form using lxml, but I get an error that I can't get my head around.

page = parse(url).getroot()
page.forms[0].fields['username'] = 'user001'
page.forms[0].fields['password'] = 'pass001'
result = parse(submit_form(page.forms[0]).encode('utf-8')).getroot()
print(result.text)

This is the console output I get:

runfile('C:/Users/mgreza/Downloads/WinPython-64bit-3.5.1.3/notebooks/temp.py', wdir='C:/Users/mgreza/Downloads/WinPython-64bit-3.5.1.3/notebooks')
Traceback (most recent call last):
  File "", line 1, in <module>
    runfile('C:/Users/mgreza/Downloads/WinPython-64bit-3.5.1.3/notebooks/temp.py', wdir='C:/Users/mgreza/Downloads/WinPython-64bit-3.5.1.3/notebooks')
  File "C:\Users\mgreza\Downloads\WinPython-64bit-3.5.1.3\python-3.5.1.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 841, in runfile
    execfile(filename, namespace)
  File "C:\Users\mgreza\Downloads\WinPython-64bit-3.5.1.3\python-3.5.1.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 103, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/mgreza/Downloads/WinPython-64bit-3.5.1.3/notebooks/temp.py", line 13, in <module>
    result = parse(submit_form(page.forms[0]).encode('utf-8')).getroot()
  File "C:\Users\mgreza\Downloads\WinPython-64bit-3.5.1.3\python-3.5.1.amd64\lib\site-packages\lxml\html\__init__.py", line 1110, in submit_form
    return open_http(form.method, url, values)
  File "C:\Users\mgreza\Downloads\WinPython-64bit-3.5.1.3\python-3.5.1.amd64\lib\site-packages\lxml\html\__init__.py", line 1131, in open_http_urllib
    return urlopen(url, data)
  File "C:\Users\mgreza\Downloads\WinPython-64bit-3.5.1.3\python-3.5.1.amd64\lib\urllib\request.py", line 162, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\mgreza\Downloads\WinPython-64bit-3.5.1.3\python-3.5.1.amd64\lib\urllib\request.py", line 463, in open
    req = meth(req)
  File "C:\Users\mgreza\Downloads\WinPython-64bit-3.5.1.3\python-3.5.1.amd64\lib\urllib\request.py", line 1170, in do_request_
    raise TypeError(msg)
TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.
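
The TypeError is raised by urllib, which on Python 3 only accepts bytes as a POST body. As a rough illustration of that requirement (not a fix for lxml's submit_form itself), here is a minimal standard-library sketch that encodes the form fields to bytes before building a POST request. The URL and credentials are the placeholders from the question, and the request is only constructed, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder URL and credentials from the question; nothing is sent here.
url = "http://SERVER:PORT/dashboard/portal"
fields = {"username": "user001", "password": "pass001"}

# urlencode() returns a str; urllib on Python 3 requires bytes for a POST body,
# so the encoded string is converted explicitly.
body = urlencode(fields).encode("ascii")

# Build (but do not send) the request to inspect what would be posted.
req = Request(url, data=body, method="POST")
print(type(req.data).__name__, req.get_method())
```

Passing the un-encoded str as data is what produces the error shown in the traceback once the request is actually opened.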

Please help!

This is the page I am trying to submit:

 <html> <head> <base href="http://SERVER/DIRECTORY"> <link href="css/default.css" type="text/css" rel="stylesheet"> <title>XYZ</title> </head> <body bgcolor="#ffffff"> <p>&nbsp;</p> <table width="100%" cellpadding="0" cellspacing="0"> <tbody> <tr> <td> <form method="POST" name="DefaultForm" action="http://SERVER/DIRECTORY" onsubmit="return (isReady(this));" autocomplete="off" _lpchecked="1"> <input name="action" type="hidden" value="JLoginUser"> <input name="serverTimeStamp" type="hidden" value="1467104268529"> <input name="clientTimeStamp" type="hidden" value="1467104268904"> <input name="clientIP" type="hidden" value="10.221.12.67"> <table height="400" cellspacing="0" cellpadding="0" width="540" align="center" background="images/bkground.gif" border="0"> <tbody> <tr> <td> <table heigh="395" cellspacing="0" cellpadding="0" width="100%" background="images/Transparent.gif" border="0"> <tbody> <tr> <td> <img height="19" src="images/Transparent.gif"> </td> <td> <img src="images/logo.gif" align="top" border="0"> </td> <td> <img height="19" src="images/Transparent.gif" width="8"> </td> </tr> <tr> <td width="10"> <img height="1" src="images\\Transparent.gif" width="10"> </td> <td valign="top" width="497"> <table cellspacing="0" cellpadding="0" width="100%" background="images/Transparent.gif" border="0"> <tbody> <tr> <td valign="top" align="left" width="90"><span>&nbsp;</span> <table cellspacing="0" cellpadding="0" width="100%" background="images/Transparent.gif" border="0"> <tbody> <tr> <td colspan="3">&nbsp;</td> </tr> <tr> <td width="4">&nbsp;</td> <td> <img height="1" src="images/Transparent.gif" width="5"> </td> <td><span>&nbsp;</span> </td> </tr> <tr> <td colspan="3">&nbsp;</td> </tr> </tbody> </table> </td> <td width="18"> <img height="1" src="images/Transparent.gif" width="18"> </td> <td valign="top"> <table cellspacing="0" cellpadding="0" width="100%" background="images/Transparent.gif" border="0"> <tbody> <tr> <td colspan="2">&nbsp;</td> </tr> <tr> <td 
valign="top" colspan="2" height="15"> <p> <img height="15" src="images/Transparent.gif" width="1"> </p> </td> </tr> <tr> <td valign="top" align="center" colspan="2"> <img src="images/Integrator_login.gif"> </td> </tr> <tr> <td colspan="2"> <table cellspacing="0" cellpadding="0" width="100%" background="images/Transparent.gif" border="0"> <tbody> <tr> <td valign="top" align="center"> <div id="xboxLogin"> <b class="xtop"><b class="xb1"></b><b class="xb2"></b><b class="xb3"></b><b class="xb4"></b><b class="xb5"></b></b> <div class="xboxLoginContent"> <table cellspacing="0" cellpadding="0" width="200" background="images/Transparent.gif" border="0"> <tbody> <tr> <td valign="bottom" align="left" width="11" height="11"></td> <td height="11"> <img height="11" src="images/Transparent.gif" width="1"> </td> <td height="11"> <img height="11" src="images/Transparent.gif" width="1"> </td> <td valign="bottom" align="right" width="11" height="11"></td> </tr> <tr> <td width="11">&nbsp;</td> <td class="loginForm" colspan="2">Please sign in</td> <td width="11">&nbsp;</td> </tr> <tr> <td width="11">&nbsp;</td> <td class="loginForm">&nbsp;</td> <td height="11"> <img height="11" src="images/Transparent.gif" width="1"> </td> <td width="11">&nbsp;</td> </tr> <tr> <td width="11">&nbsp;</td> <td class="loginForm">User ID</td> <td height="11"> <img height="11" src="images/Transparent.gif" width="1"> </td> <td width="11">&nbsp;</td> </tr> <tr> <td width="11">&nbsp;</td> <td class="loginForm" colspan="2"> <input class="inputStyle" type="Input" name="username" size="20" style="cursor: pointer; background-image: 
url(&quot;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAASCAYAAABSO15qAAAAAXNSR0IArs4c6QAAAUBJREFUOBGVVE2ORUAQLvIS4gwzEysHkHgnkMiEc4zEJXCMNwtWTmDh3UGcYoaFhZUFCzFVnu4wIaiE+vvq6+6qTgthGH6O4/jA7x1OiCAIPwj7CoLgSXDxSjEVzAt9k01CBKdWfsFf/2WNuEwc2YqigKZpK9glAlVVwTTNbQJZlnlCkiTAZnF/mePB2biRdhwHdF2HJEmgaRrwPA+qqoI4jle5/8XkXzrCFoHg+/5ICdpm13UTho7Q9/0WnsfwiL/ouHwHrJgQR8WEwVG+oXpMPaDAkdzvd7AsC8qyhCiKJjiRnCKwbRsMw9hcQ5zv9maSBeu6hjRNYRgGFuKaCNwjkjzPoSiK1d1gDDecQobOBwswzabD/D3Np7AHOIrvNpHmPI+Kc2RZBm3bcp8wuwSIot7QQ0PznoR6wYSK0Xb/AGVLcWwc7Ng3AAAAAElFTkSuQmCC&quot;); background-attachment: scroll; background-size: contain; background-position: 98% 50%; background-repeat: no-repeat;" autocomplete="off"> </td> <td width="11">&nbsp;</td> </tr> <tr> <td width="11" height="5">&nbsp;</td> <td height="5">&nbsp;</td> <td height="11"> <img height="11" src="images/Transparent.gif" width="1"> </td> <td width="11" height="5">&nbsp;</td> </tr> <tr> <td width="11">&nbsp;</td> <td class="loginForm">Password</td> <td height="11"> <img height="11" src="images/Transparent.gif" width="1"> </td> <td width="11">&nbsp;</td> </tr> <tr> <td width="11">&nbsp;</td> <td colspan="2"> <input class="inputStyle" type="password" maxlength="28" name="password" size="20" style="cursor: auto; background-image: url(&quot;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAASCAYAAABSO15qAAAAAXNSR0IArs4c6QAAAUBJREFUOBGVVE2ORUAQLvIS4gwzEysHkHgnkMiEc4zEJXCMNwtWTmDh3UGcYoaFhZUFCzFVnu4wIaiE+vvq6+6qTgthGH6O4/jA7x1OiCAIPwj7CoLgSXDxSjEVzAt9k01CBKdWfsFf/2WNuEwc2YqigKZpK9glAlVVwTTNbQJZlnlCkiTAZnF/mePB2biRdhwHdF2HJEmgaRrwPA+qqoI4jle5/8XkXzrCFoHg+/5ICdpm13UTho7Q9/0WnsfwiL/ouHwHrJgQR8WEwVG+oXpMPaDAkdzvd7AsC8qyhCiKJjiRnCKwbRsMw9hcQ5zv9maSBeu6hjRNYRgGFuKaCNwjkjzPoSiK1d1gDDecQobOBwswzabD/D3Np7AHOIrvNpHmPI+Kc2RZBm3bcp8wuwSIot7QQ0PznoR6wYSK0Xb/AGVLcWwc7Ng3AAAAAElFTkSuQmCC&quot;); background-attachment: scroll; background-size: contain; background-position: 98% 50%; background-repeat: no-repeat;" autocomplete="off"> </td> <td 
width="11">&nbsp;</td> </tr> <tr> <td width="11" height="5">&nbsp;</td> <td height="5">&nbsp;</td> <td height="11"> <img height="11" src="images/Transparent.gif" width="1"> </td> <td width="11" height="5">&nbsp;</td> </tr> <tr> <td width="6">&nbsp;</td> <td class="loginForm"> <input class="submit" name="submit" type="submit" value="Sign In" style="font-size:10"> </td> <td width="50%" align="center">&nbsp;</td> <td width="11">&nbsp;</td> </tr> <tr> <td valign="top" align="left" width="11" height="11"></td> <td height="11"> <img height="11" src="images/Transparent.gif" width="1"> </td> <td height="11"> <img height="11" src="images/Transparent.gif" width="1"> </td> <td valign="top" align="right" width="11" height="11"></td> </tr> </tbody> </table> </div> <b class="xbottom"><b class="xb5"></b><b class="xb4"></b><b class="xb3"></b><b class="xb2"></b><b class="xb1"></b></b> </div> </td> <td width="43"> <img height="8" src="images/Transparent.gif" width="43"> </td> </tr> </tbody> </table> </td> </tr> <tr> <td valign="top" align="left">&nbsp;</td> <td valign="top" align="right" colspan="3"> </td> </tr> </tbody> </table> </td> </tr> </tbody> </table> </td> </tr> </tbody> </table> </td> </tr> </tbody> </table> </form> </td> </tr> </tbody> </table> </body> </html> 

EDIT/UPDATE: Logs when POST-ing the form

Request Header:-
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate
Accept-Language:en-US,en;q=0.8,en-GB;q=0.6
Cache-Control:max-age=0
Connection:keep-alive
Content-Length:140
Content-Type:application/x-www-form-urlencoded
Cookie:JSESSIONID=1ax740u3chasqa4rmen8ifq5b; SCI_DLSSO=U2Vzc2lvbklEaGl1YWFseGJqeGx1MTN1OGFtaXpva3Yybw==
Host:10.1.28.189:5010
Origin:http://SERVER:PORT
Referer:http://SERVER:PORT/dashboard/
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36

Response Header:-
Cache-Control:no-cache
Content-Type:text/html; charset=utf-8
Expires:Thu, 01 Jan 1970 00:00:00 GMT
Last-Modified:Thu, 30 Jun 2016 01:47:48 GMT
Pragma:no-cache
Set-Cookie:JSESSIONID=17zql1r4hdylrfg54lardx14p;Path=/dashboard/;HttpOnly
Set-Cookie:SCI_DLSSO=U2Vzc2lvbklEMTd6cWwxcjRoZHlscmZnNTRsYXJkeDE0cA==;Path=/;HttpOnly
Transfer-Encoding:chunked
X-FRAME-OPTIONS:SAMEORIGIN

Form Data:-
action=JLoginUser&serverTimeStamp=1467251264480&clientTimeStamp=2146&clientIP=10.220.12.101&username=user001&password=pass001&submit=Sign+In
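
The logged form data shows which fields the browser actually posts, including the hidden ones. A small sketch, assuming only the standard library, that decodes that logged body into a dictionary so the field names and values are easy to compare against whatever a script sends. The variable names are illustrative.

```python
from urllib.parse import parse_qsl

# Form data copied from the browser log in the question.
logged_body = (
    "action=JLoginUser&serverTimeStamp=1467251264480&clientTimeStamp=2146"
    "&clientIP=10.220.12.101&username=user001&password=pass001&submit=Sign+In"
)

# parse_qsl decodes the urlencoded pairs; "+" is decoded to a space.
fields = dict(parse_qsl(logged_body))

for name, value in fields.items():
    print(f"{name} = {value}")
```

Besides username and password, the browser sends action, serverTimeStamp, clientTimeStamp, clientIP and submit, so a script that posts only the two credentials is sending a different request from the one the browser makes.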

This should get you logged in:

import requests
from lxml import html

url = "http://SERVER:PORT/dashboard/portal"

with requests.Session() as s:
    # Fetch the login page inside the session so its cookies are kept.
    xml = html.fromstring(s.get(url).content)
    form = xml.xpath("//form[@name='DefaultForm']")[0]
    print(form.xpath("./input[@name]"))
    # Copy the hidden inputs that are direct children of the form.
    data = {i.xpath("@name")[0]: i.xpath("@value")[0] for i in form.xpath("./input[@name]")}
    # The form's action attribute; use it as the POST target if it differs from url.
    post_url = form.xpath("@action")[0]
    data["username"] = "username"
    data["password"] = "password"
    data["submit"] = "Sign In"
    r = s.post(url, data=data)
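
The field-collection step can also be tried offline. The sketch below swaps lxml for the standard library's html.parser (so it is not the answer's method) and, unlike the `./input[@name]` XPath, walks inputs nested anywhere inside the form. `FormFieldCollector` and `sample_page` are hypothetical names, and the page is a trimmed-down stand-in for the one in the question.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs of <input> tags inside one named form."""

    def __init__(self, form_name):
        super().__init__()
        self.form_name = form_name
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("name") == self.form_name:
            self.in_form = True
        elif tag == "input" and self.in_form and "name" in attrs:
            # Inputs without a value attribute are stored as empty strings.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


# Trimmed-down stand-in for the login page shown in the question.
sample_page = """
<form method="POST" name="DefaultForm" action="http://SERVER/DIRECTORY">
  <input name="action" type="hidden" value="JLoginUser">
  <input name="serverTimeStamp" type="hidden" value="1467104268529">
  <table><tr><td><input type="Input" name="username" size="20"></td></tr>
  <tr><td><input type="password" name="password"></td></tr>
  <tr><td><input name="submit" type="submit" value="Sign In"></td></tr></table>
</form>
"""

collector = FormFieldCollector("DefaultForm")
collector.feed(sample_page)

# Start from the values found in the page, then fill in the credentials.
data = collector.fields
data["username"] = "user001"
data["password"] = "pass001"
print(data)
```

The resulting dictionary has the same shape as the `data` passed to `s.post(...)`, so it can be inspected before any request is made.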


 