Using Python's requests to get data from a password-protected ASP site

Question

I'm trying to get the whole content of a password-protected ASP site using Python's requests library.

The developer of the ASP site told me that he is able to get the data with the following PowerShell script:

$c = $host.UI.PromptForCredential('Your Credentials', 'Enter Credentials','','')
$r = Invoke-WebRequest 'https://server.com/app/login.aspx' -SessionVariable my_session
$form = $r.Forms[0]
$form.fields['xUsername']=$c.UserName
$form.fields['xPassword']=$c.GetNetworkCredential().Password
$r = Invoke-WebRequest -Uri ("https://server.com/app/login.aspx?ReturnUrl=%2Fapp%2FgetData.aspx%3Ftype%3DGETDATA%26id%3D123") -WebSession $my_session -Method POST -Body $form.Fields

I'm trying to achieve this using Python's requests library, but it does not seem to work properly. Instead of getting the data, I get the HTML you would normally see when accessing the site without a password.

import getpass
import requests
requests.packages.urllib3.disable_warnings()
import re
from bs4 import BeautifulSoup

user="my_username"
password=getpass.getpass()

data = {"xUsername":user, "xPassword": password}
with requests.Session() as s:
    page = s.get('https://server.com/app/login.aspx',verify=False).content
    soup = BeautifulSoup(page)
    data["___VIEWSTATE"] = soup.select_one("#__VIEWSTATE")["value"]
    data["__VIEWSTATEGENERATOR"] = soup.select_one("#__VIEWSTATEGENERATOR")["value"]
    s.post('https://server.com/app/login.aspx', data=data)
    open_page = s.post(
        "https://server.com/app/login.aspx?ReturnUrl=/app/getData.aspx?type=GETDATA&id=123")

What am I doing wrong?

Answer 1

I found the following problems:

Headers were missing. I opened the website in Chrome and copied the User-Agent from there. In my case: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36"
All the fields found under "Form Data" must be included in the Python request. Again, I went to Chrome and logged in to the website normally. In Chrome: Inspect > Network > search for login.asp > at the bottom I found "Form Data", which in my case looked like this (parsed view):
__EVENTTARGET:
__EVENTARGUMENT:
__VIEWSTATE: random long string
__VIEWSTATEGENERATOR: random hex number
__EVENTVALIDATION: random long string
xUsername: user
xPassword: password
btnLogin: Login

So, the correct Python code looks like this:

import getpass
import requests
from bs4 import BeautifulSoup

# Certificate verification is skipped for the GET below, so silence the warning.
requests.packages.urllib3.disable_warnings()

user = "my_username"
password = getpass.getpass()
# ReturnUrl must be percent-encoded so that it is sent as a single parameter.
url = "https://server.com/app/login.aspx?ReturnUrl=%2fapp%2fgetData.aspx%3ftype%3dGETDATA%26id%3d123"
data = {"xUsername": user, "xPassword": password}
with requests.Session() as s:
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36"}
    # Load the login page first to pick up the session cookie and hidden fields.
    r = s.get('https://server.com/app/login.aspx', verify=False, headers=headers)
    soup = BeautifulSoup(r.content, "html.parser")
    # Field names must match the "Form Data" list exactly (two leading underscores).
    data["__VIEWSTATE"] = soup.select_one("#__VIEWSTATE")["value"]
    data["__VIEWSTATEGENERATOR"] = soup.select_one("#__VIEWSTATEGENERATOR")["value"]
    data["__EVENTTARGET"] = ""
    data["__EVENTARGUMENT"] = ""
    data["__EVENTVALIDATION"] = soup.select_one("#__EVENTVALIDATION")["value"]
    data["btnLogin"] = "Login"

    # Post the complete form; the server then redirects to ReturnUrl.
    response = s.post(url, data=data, headers=headers, allow_redirects=True)
    print(response.content)
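
The hidden fields copied by hand in the script above could also be collected generically, so the form data stays complete if the page adds another hidden input. The following is only a minimal sketch and not part of the original answer: it uses the standard library's html.parser instead of BeautifulSoup, and the names HiddenInputCollector and collect_hidden_fields, the sample HTML, and its field values are all hypothetical.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            # A missing value attribute is stored as an empty string.
            self.fields[attributes["name"]] = attributes.get("value") or ""


def collect_hidden_fields(html):
    """Return a dict of all hidden form fields found in the HTML text."""
    parser = HiddenInputCollector()
    parser.feed(html)
    return parser.fields


# Hypothetical, shortened login page used only for illustration.
sample_html = """
<form method="post" action="login.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="C2EE9ABB" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="xUsername" />
  <input type="password" name="xPassword" />
  <input type="submit" name="btnLogin" value="Login" />
</form>
"""

data = collect_hidden_fields(sample_html)
# The visible fields and the submit button are still added by hand.
data.update({"xUsername": "my_username", "xPassword": "secret", "btnLogin": "Login"})
print(sorted(data))
```

In the real script the HTML would come from the GET response (r.text). __EVENTTARGET and __EVENTARGUMENT may still have to be added as empty strings if the page does not render them as hidden inputs.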

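I must include the URL in encoded form; otherwise the server returns an error message saying that a parameter is missing. Presumably this is because an unencoded "&id=123" is parsed as a separate query parameter of login.aspx rather than as part of ReturnUrl, so it is lost on the redirect, i.e.: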

url = "https://server.com/app/login.aspx?ReturnUrl=/app/getData.aspx?type=GETDATA&id=123"
... SAME SCRIPT AS ABOVE ...
>>> print(response.url)
https://server.com/app/getData.aspx?type=GETUSER
>>> print(response.content)
ERROR   Some parameter is missing

Maybe someone knows a better approach that avoids having to encode the URL by hand.
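
One possible way to avoid encoding the URL by hand is to let the standard library build the query string. This is a sketch under the assumption that the server only needs ReturnUrl percent-encoded as in the PowerShell script; the variable names login_url and return_url are introduced here for illustration.

```python
from urllib.parse import urlencode

login_url = "https://server.com/app/login.aspx"
return_url = "/app/getData.aspx?type=GETDATA&id=123"

# urlencode percent-encodes "/", "?", "=" and "&" inside the value, so the
# whole target path stays inside the single ReturnUrl parameter.
url = login_url + "?" + urlencode({"ReturnUrl": return_url})
print(url)
# https://server.com/app/login.aspx?ReturnUrl=%2Fapp%2FgetData.aspx%3Ftype%3DGETDATA%26id%3D123
```

Passing the same dictionary to requests through the params argument, for example s.post(login_url, params={"ReturnUrl": return_url}, data=data, headers=headers), should produce an equivalent URL, although that has not been tested against this server.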

Answer 1 (accepted, 2018-01-19 08:26:37)
